Vitor Silva

An Experimental Exploration of In-Memory Computing for Multi-Layer Perceptrons

Aug 10, 2025

Pedro Carrinho, Hamid Moghadaspour, Oscar Ferraz, João Dinis Ferreira, Yann Falevoz, Vitor Silva, Gabriel Falcao

Abstract: In modern computer architectures, the performance of many memory-bound workloads (e.g., machine learning, graph processing, databases) is limited by the data movement bottleneck that emerges when transferring large amounts of data between the main memory and the central processing unit (CPU). Processing-in-memory (PiM) is an emerging computing paradigm that aims to alleviate this data movement bottleneck by performing computation close to or within the memory units, where data resides. One example of a prevalent workload whose performance is bound by the data movement bottleneck is the training and inference of artificial neural networks. In this work, we analyze the potential of modern general-purpose PiM architectures to accelerate neural networks. To this end, we selected the UPMEM PiM system, the first commercially available real-world general-purpose PiM architecture. We compared the implementation of multilayer perceptrons (MLPs) in PiM with a sequential baseline running on an Intel Xeon CPU. The UPMEM implementation achieves up to $259\times$ better performance than the CPU for inference with large batch sizes that exploit the size of the available PiM memory. Additionally, two smaller MLPs were implemented using UPMEM's working SRAM (WRAM), a scratchpad memory, to evaluate their performance against a low-power Nvidia Jetson graphics processing unit (GPU), providing further insights into the efficiency of UPMEM's PiM for neural network inference. Results show that using WRAM achieves kernel execution times for MLP inference of under $3$ ms, which is within the same order of magnitude as low-power GPUs.

* 19 pages, 1 figure, and 2 tables

Streamlined Swift Allocation Strategies for Radio Stripe Networks

Dec 10, 2024

Filipe Conceição, Marco Gomes, Vitor Silva, Rui Dinis

Abstract: This paper proposes the use of an access point (AP) selection scheme to improve the total uplink (UL) spectral efficiency (SE) of a radio stripe (RS) network. This scheme optimizes the allocation matrix between the APs' antennas and the user equipment (UEs) while considering two state-of-the-art and two newly proposed equalization approaches: centralized maximum ratio combining (CMRC), centralized optimal sequence linear processing (COSLP), sequential MRC (SMRC), and parallel MRC (PMRC). The optimization problem is solved through a low-complexity and adaptive genetic algorithm (GA), which aims to output an efficient solution for the AP-UE association matrix. We evaluate the proposed schemes in several network scenarios in terms of SE performance, convergence speed, computational complexity, and fronthaul signalling capacity requirements. The COSLP exhibits the best SE performance at the expense of high computational complexity and fronthaul signalling. The SMRC and PMRC are efficient alternatives to the CMRC, improving on its computational complexity and convergence speed. Additionally, we assess the adaptability of the MRC schemes for two different instances of network change: when a new randomly located UE must connect to the RS network and when a random UE is removed from it. We have found that in some cases, by reusing the allocation matrix from the original instance as an initial solution, the SMRC and/or the PMRC can significantly boost the optimization performance of the GA-based AP selection scheme.

* 16 pages, 24 figures

Named Entity Recognition in Twitter: A Dataset and Analysis on Short-Term Temporal Shifts

Oct 07, 2022

Asahi Ushio, Leonardo Neves, Vitor Silva, Francesco Barbieri, Jose Camacho-Collados

Abstract: Recent progress in language model pre-training has led to important improvements in Named Entity Recognition (NER). Nonetheless, this progress has been mainly tested on well-formatted documents such as news, Wikipedia, or scientific articles. In social media the landscape is different: its noisy and dynamic nature adds another layer of complexity. In this paper, we focus on NER in Twitter, one of the largest social media platforms, and construct a new NER dataset, TweetNER7, which contains seven entity types annotated over 11,382 tweets from September 2019 to August 2021. The dataset was constructed by carefully distributing the tweets over time and taking representative trends as a basis. Along with the dataset, we provide a set of language model baselines and analyze language model performance on the task, with particular attention to the impact of different time periods. In particular, we focus on three important temporal aspects in our analysis: short-term degradation of NER models over time, strategies to fine-tune a language model over different periods, and self-labeling as an alternative when recently labeled data is lacking. TweetNER7 is released publicly (https://huggingface.co/datasets/tner/tweetner7) along with the models fine-tuned on it (NER models have been integrated into TweetNLP and can be found at https://github.com/asahi417/tner/tree/master/examples/tweetner7_paper).

* AACL 2022 main conference

Joint Channel Estimation and Synchronization Techniques for Time Interleaved Block Windowed Burst OFDM

Mar 30, 2021

João Martins, Filipe Conceição, Marco Gomes, Vitor Silva, Rui Dinis

Abstract: From a conceptual perspective, 5G technology promises to deliver low latency, high data rates, and more reliable connections for the next generations of communication systems. Modulation schemes based on Orthogonal Frequency Division Multiplexing (OFDM) can accommodate these requirements for wireless systems. In addition, several hybrid OFDM-based systems, such as the Time-Interleaved Block Windowed Burst Orthogonal Frequency Division Multiplexing (TIBWB-OFDM), are capable of achieving even better spectral confinement and power efficiency. This paper addresses the implementation of the TIBWB-OFDM system in more realistic and practical wireless link scenarios by tackling the challenges of proper and reliable channel estimation and frame synchronization. We propose to incorporate a preamble formed by optimum correlation training sequences, such as the Zadoff-Chu (ZC) sequences. The added ZC preamble sequence is used to jointly estimate the frame beginning, through signal correlation strategies and a threshold decision device, and acquire the channel state information (CSI), by employing estimators based on the preamble sequence and the transmitted data. Results with the employed receiver estimators show that it is possible to detect the beginning of the TIBWB-OFDM frame and to achieve a bit error rate (BER) close to that obtained with perfect channel knowledge.
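
The preamble-based synchronization described in this abstract can be illustrated with a minimal sketch. It is not the paper's receiver: the function names, the normalized sliding-correlation metric, the threshold value, and the single-tap (flat) channel model are all assumptions chosen for illustration. It only shows the general idea of correlating against a known ZC preamble, applying a threshold decision, and then reusing the preamble for a simple channel gain estimate.

```python
import cmath
import math


def zadoff_chu(root, length):
    """Zadoff-Chu sequence of odd length; root must be coprime with length."""
    if length % 2 == 0 or math.gcd(root, length) != 1:
        raise ValueError("use an odd length and a root coprime with it")
    return [cmath.exp(-1j * cmath.pi * root * n * (n + 1) / length)
            for n in range(length)]


def detect_frame_start(received, preamble, threshold):
    """Slide the known preamble over the received samples and return the first
    index whose normalized correlation magnitude reaches the threshold
    (hypothetical decision rule), or None if no window qualifies."""
    n = len(preamble)
    preamble_energy = sum(abs(p) ** 2 for p in preamble)
    for start in range(len(received) - n + 1):
        window = received[start:start + n]
        window_energy = sum(abs(r) ** 2 for r in window)
        if window_energy == 0:
            continue
        corr = sum(r * p.conjugate() for r, p in zip(window, preamble))
        metric = abs(corr) / math.sqrt(preamble_energy * window_energy)
        if metric >= threshold:
            return start
    return None


def estimate_flat_gain(received, preamble, start):
    """Least-squares estimate of a single complex channel gain from the
    preamble, assuming a flat (single-tap) channel."""
    window = received[start:start + len(preamble)]
    corr = sum(r * p.conjugate() for r, p in zip(window, preamble))
    return corr / sum(abs(p) ** 2 for p in preamble)
```

A caller would generate the preamble once (for example `zadoff_chu(5, 63)`), pass the received samples to `detect_frame_start` with a threshold such as 0.8, and feed the returned index to `estimate_flat_gain`. A frequency-selective channel would need a per-subcarrier or multi-tap estimator instead of the single gain used here.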
