Eleonora Grassucci

Beyond Answers: How LLMs Can Pursue Strategic Thinking in Education

Apr 07, 2025

Eleonora Grassucci, Gualtiero Grassucci, Aurelio Uncini, Danilo Comminiello

Abstract: Artificial Intelligence (AI) holds transformative potential in education, enabling personalized learning, enhancing inclusivity, and encouraging creativity and curiosity. In this paper, we explore how Large Language Models (LLMs) can act as both patient tutors and collaborative partners to enhance education delivery. As tutors, LLMs personalize learning by offering step-by-step explanations and addressing individual needs, making education more inclusive for students with diverse backgrounds or abilities. As collaborators, they expand students' horizons, supporting them in tackling complex, real-world problems and co-creating innovative projects. However, to fully realize these benefits, LLMs must be leveraged not as tools for providing direct solutions but rather to guide students in developing problem-solving strategies and finding learning paths together. Therefore, a strong emphasis should be placed on educating students and teachers on the successful use of LLMs to ensure their effective integration into classrooms. Through practical examples and real-world case studies, this paper illustrates how LLMs can make education more inclusive and engaging while empowering students to reach their full potential.

Gramian Multimodal Representation Learning and Alignment

Dec 16, 2024

Giordano Cicchetti, Eleonora Grassucci, Luigi Sigillo, Danilo Comminiello

Abstract: Human perception integrates multiple modalities, such as vision, hearing, and language, into a unified understanding of the surrounding reality. While recent multimodal models have achieved significant progress by aligning pairs of modalities via contrastive learning, their solutions are unsuitable when scaling to multiple modalities. These models typically align each modality to a designated anchor without ensuring the alignment of all modalities with each other, leading to suboptimal performance in tasks requiring a joint understanding of multiple modalities. In this paper, we structurally rethink the conventional pairwise approach to multimodal learning and we present the novel Gramian Representation Alignment Measure (GRAM), which overcomes the above-mentioned limitations. GRAM learns and then aligns $n$ modalities directly in the higher-dimensional space in which modality embeddings lie by minimizing the Gramian volume of the $k$-dimensional parallelotope spanned by the modality vectors, ensuring the geometric alignment of all modalities simultaneously. GRAM can replace cosine similarity in any downstream method, holding for 2 to $n$ modalities and providing more meaningful alignment with respect to previous similarity measures. The novel GRAM-based contrastive loss function enhances the alignment of multimodal models in the higher-dimensional embedding space, leading to new state-of-the-art performance in downstream tasks such as video-audio-text retrieval and audio-video classification. The project page, the code, and the pretrained models are available at https://ispamm.github.io/GRAM/.
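
The Gramian volume mentioned in the abstract can be illustrated with a minimal NumPy sketch. It assumes the modality embeddings are L2-normalized and computes the volume as the square root of the determinant of their Gram matrix; the function name `gram_volume` and the toy vectors are hypothetical and are not taken from the authors' code, which is available from the project page.

```python
import numpy as np

def gram_volume(vectors: np.ndarray) -> float:
    """Volume of the parallelotope spanned by the rows of `vectors`.

    `vectors` has shape (k, d): k modality embeddings of dimension d.
    Rows are L2-normalized first (an assumption of this sketch).
    """
    v = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    gram = v @ v.T                          # (k, k) matrix of inner products
    det = np.linalg.det(gram)
    return float(np.sqrt(max(det, 0.0)))    # clamp tiny negative round-off

# Three embeddings pointing in the same direction: volume near 0.
aligned = np.array([[1.0, 2.0, 3.0, 4.0]] * 3)
# Three mutually orthogonal embeddings: volume near 1.
orthogonal = np.eye(4)[:3]

print(gram_volume(aligned))
print(gram_volume(orthogonal))
```

With two unit vectors the volume reduces to the absolute sine of the angle between them, which is one way to see how such a measure can stand in for cosine similarity while also extending to more than two modalities.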

Lightweight Diffusion Models for Resource-Constrained Semantic Communication

Oct 03, 2024

Giovanni Pignata, Eleonora Grassucci, Giordano Cicchetti, Danilo Comminiello

Abstract: Recently, generative semantic communication models have proliferated as they are revolutionizing semantic communication frameworks, improving their performance, and opening the way to novel applications. Despite their impressive ability to regenerate content from the compressed semantic information received, generative models pose crucial challenges for communication systems in terms of high memory footprints and heavy computational load. In this paper, we present a novel Quantized GEnerative Semantic COmmunication framework, Q-GESCO. The core method of Q-GESCO is a quantized semantic diffusion model capable of regenerating transmitted images from the received semantic maps while simultaneously reducing computational load and memory footprint thanks to the proposed post-training quantization technique. Q-GESCO is robust to different channel noises and obtains performance comparable to its full-precision counterpart in different scenarios, while saving up to 75% of memory and 79% of floating-point operations. This allows resource-constrained devices to exploit the generative capabilities of Q-GESCO, widening the range of applications and systems for generative semantic communication frameworks. The code is available at https://github.com/ispamm/Q-GESCO.
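
To make the memory figure concrete, here is a minimal sketch of generic post-training quantization: uniform affine quantization of a float32 weight tensor to 8-bit integers. The abstract does not state the bit width or the details of the proposed technique, so the 8-bit choice, the function names, and the random weights are assumptions made only for illustration; the authors' actual method is in their repository.

```python
import numpy as np

def quantize_uint8(w: np.ndarray):
    """Uniform affine quantization of a float32 tensor to 8-bit integers."""
    w_min, w_max = float(w.min()), float(w.max())
    scale = (w_max - w_min) / 255.0
    if scale == 0.0:
        scale = 1.0  # constant tensor: any positive scale works
    q = np.clip(np.round((w - w_min) / scale), 0, 255).astype(np.uint8)
    return q, scale, w_min

def dequantize(q: np.ndarray, scale: float, w_min: float) -> np.ndarray:
    """Map 8-bit codes back to approximate float32 values."""
    return q.astype(np.float32) * scale + w_min

# Hypothetical weight matrix standing in for one layer of a trained model.
rng = np.random.default_rng(0)
weights = rng.standard_normal((256, 256)).astype(np.float32)

q, scale, w_min = quantize_uint8(weights)
restored = dequantize(q, scale, w_min)

saving = 1.0 - q.nbytes / weights.nbytes          # 1 byte vs 4 bytes per weight
max_err = float(np.abs(weights - restored).max())
print(f"memory saving: {saving:.0%}")
print(f"max abs error: {max_err:.4f}, half a quantization step: {scale / 2:.4f}")
```

Storing one byte per weight instead of four gives a 75% reduction for the quantized tensor, and the reconstruction error stays within about half a quantization step.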

Language-Oriented Semantic Latent Representation for Image Transmission

May 16, 2024

Giordano Cicchetti, Eleonora Grassucci, Jihong Park, Jinho Choi, Sergio Barbarossa, Danilo Comminiello

Abstract: In the new paradigm of semantic communication (SC), the focus is on delivering meanings behind bits by extracting semantic information from raw data. Recent advances in data-to-text models facilitate language-oriented SC, particularly for text-transformed image communication via image-to-text (I2T) encoding and text-to-image (T2I) decoding. However, although semantically aligned, the text is too coarse to precisely capture sophisticated visual features such as spatial locations, color, and texture, incurring a significant perceptual difference between intended and reconstructed images. To address this limitation, in this paper, we propose a novel language-oriented SC framework that communicates both text and a compressed image embedding and combines them using a latent diffusion model to reconstruct the intended image. Experimental results validate the potential of our approach, which transmits only 2.09% of the original image size while achieving higher perceptual similarities in noisy communication channels compared to a baseline SC method that communicates only through text. The code is available at https://github.com/ispamm/Img2Img-SC/.

* Under review at IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2024

Rethinking Multi-User Semantic Communications with Deep Generative Models

May 16, 2024

Eleonora Grassucci, Jinho Choi, Jihong Park, Riccardo F. Gramaccioni, Giordano Cicchetti, Danilo Comminiello

Abstract: In recent years, novel communication strategies have emerged to face the challenges posed by the increased number of connected devices and the higher quality of transmitted information. Among them, semantic communication has obtained promising results, especially when combined with state-of-the-art deep generative models, such as large language or diffusion models, able to regenerate content from extremely compressed semantic information. However, most of these approaches focus on single-user scenarios, processing the received content at the receiver on top of conventional communication systems. In this paper, we propose to go beyond these methods by developing a novel generative semantic communication framework tailored for multi-user scenarios. This system assigns the channel to users knowing that the lost information can be filled in with a diffusion model at the receivers. Under this innovative perspective, OFDMA systems should not aim to transmit the largest part of information, but solely the bits necessary for the generative model to semantically regenerate the missing ones. The thorough experimental evaluation shows the capabilities of the novel diffusion model and the effectiveness of the proposed framework, leading towards a GenAI-based next generation of communications.

* Under review in IEEE Journal on Selected Areas in Communications

Demystifying the Hypercomplex: Inductive Biases in Hypercomplex Deep Learning

May 11, 2024

Danilo Comminiello, Eleonora Grassucci, Danilo P. Mandic, Aurelio Uncini

Abstract: Hypercomplex algebras have recently been gaining prominence in the field of deep learning owing to the advantages of their division algebras over real vector spaces and their superior results when dealing with multidimensional signals in real-world 3D and 4D paradigms. This paper provides a foundational framework that serves as a roadmap for understanding why hypercomplex deep learning methods are so successful and how their potential can be exploited. Such a theoretical framework is described in terms of inductive bias, i.e., a collection of assumptions, properties, and constraints that are built into training algorithms to guide their learning process toward more efficient and accurate solutions. We show that it is possible to derive specific inductive biases in the hypercomplex domains, which extend complex numbers to encompass diverse numbers and data structures. These biases prove effective in managing the distinctive properties of these domains, as well as the complex structures of multidimensional and multimodal signals. This novel perspective for hypercomplex deep learning promises to both demystify this class of methods and clarify their potential, under a unifying framework, and in this way promotes hypercomplex models as viable alternatives to traditional real-valued deep learning for multidimensional signal processing.

* Accepted for Publication in IEEE Signal Processing Magazine

Towards Explaining Hypercomplex Neural Networks

Mar 26, 2024

Eleonora Lopez, Eleonora Grassucci, Debora Capriotti, Danilo Comminiello

Abstract: Hypercomplex neural networks are gaining increasing interest in the deep learning community. The attention directed towards hypercomplex models originates from several aspects, spanning from purely theoretical and mathematical characteristics to the practical advantage of lightweight models over conventional networks, and their unique ability to capture both global and local relations. In particular, a branch of these architectures, parameterized hypercomplex neural networks (PHNNs), has also gained popularity due to their versatility across a multitude of application domains. Nonetheless, only a few attempts have been made to explain or interpret their intricacies. In this paper, we propose inherently interpretable PHNNs and quaternion-like networks, which do not need any post-hoc method. To achieve this, we define a type of cosine-similarity transform within the parameterized hypercomplex domain. This PHB-cos transform induces weight alignment with relevant input features and allows the model to be reduced to a single linear transform, rendering it directly interpretable. In this work, we start to draw insights into how this unique branch of neural models operates. We observe that hypercomplex networks exhibit a tendency to concentrate on the shape around the main object of interest, in addition to the shape of the object itself. We provide a thorough analysis, studying single neurons of different layers and comparing them against how real-valued networks learn. The code of the paper is available at https://github.com/ispamm/HxAI.

* The paper has been accepted at IEEE WCCI 2024
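
The claim in the abstract above that a cosine-similarity transform lets the model collapse into a single linear transform can be illustrated with a small sketch. It uses the ordinary real-valued B-cos formulation (a unit-norm linear response scaled by |cos|^(B-1)), not the PHB-cos transform from the paper, whose hypercomplex definition is not given in the abstract; the function name `bcos_layer`, the layer sizes, and B = 2 are assumptions for illustration only.

```python
import numpy as np

def bcos_layer(x: np.ndarray, w: np.ndarray, b: float = 2.0):
    """One real-valued B-cos layer.

    Returns the layer output and the input-dependent matrix `w_eff`
    such that output == w_eff @ x.
    """
    w_hat = w / np.linalg.norm(w, axis=1, keepdims=True)  # unit-norm rows
    lin = w_hat @ x                                       # plain linear response
    cos = lin / (np.linalg.norm(x) + 1e-12)               # cosine between x and each row
    scale = np.abs(cos) ** (b - 1.0)                      # alignment-dependent scaling
    w_eff = scale[:, None] * w_hat
    return scale * lin, w_eff

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
w1 = rng.standard_normal((6, 8))
w2 = rng.standard_normal((4, 6))

h, m1 = bcos_layer(x, w1)
y, m2 = bcos_layer(h, w2)

# The two-layer network equals one linear map computed for this input.
w_total = m2 @ m1
print(np.allclose(y, w_total @ x))
```

Because each layer is a linear map whose matrix depends on the input, the product of those matrices summarizes the whole forward pass for that input, and its rows can be inspected as an explanation.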

Generative AI Meets Semantic Communication: Evolution and Revolution of Communication Tasks

Jan 10, 2024

Eleonora Grassucci, Jihong Park, Sergio Barbarossa, Seong-Lyun Kim, Jinho Choi, Danilo Comminiello

Abstract: While deep generative models are showing exciting abilities in computer vision and natural language processing, their adoption in communication frameworks is still largely underestimated. These methods have been shown to advance solutions to classic communication problems such as denoising, restoration, or compression. Nevertheless, generative models can unveil their real potential in semantic communication frameworks, in which the receiver is not asked to recover the sequence of bits used to encode the transmitted (semantic) message, but only to regenerate content that is semantically consistent with the transmitted message. Disclosing generative models' capabilities in semantic communication paves the way for a paradigm shift with respect to conventional communication systems, which has great potential to reduce the amount of data traffic and offers a revolutionary versatility to novel tasks and applications that were not even conceivable a few years ago. In this paper, we present a unified perspective of deep generative models in semantic communication and we unveil their revolutionary role in future communication frameworks, enabling emerging applications and tasks. Finally, we analyze the challenges and opportunities involved in developing generative models specifically tailored for communication systems.

* Under consideration in IEEE Network Special Issue "The Interplay Between Generative AI and 5G-Advanced toward 6G"

Generalizing Medical Image Representations via Quaternion Wavelet Networks

Oct 16, 2023

Luigi Sigillo, Eleonora Grassucci, Aurelio Uncini, Danilo Comminiello

Abstract: Neural network generalizability is becoming a broad research field due to the increasing availability of datasets from different sources and for various tasks. This issue is even wider when processing medical data, where a lack of methodological standards causes large variations among data provided by different imaging centers or acquired with various devices and cofactors. To overcome these limitations, we introduce a novel, generalizable, data- and task-agnostic framework able to extract salient features from medical images. The proposed quaternion wavelet network (QUAVE) can be easily integrated with any pre-existing medical image analysis or synthesis task, and it can be used with real, quaternion, or hypercomplex-valued models, generalizing their adoption to single-channel data. QUAVE first extracts different sub-bands through the quaternion wavelet transform, resulting in both low-frequency/approximation bands and high-frequency/fine-grained features. Then, it weighs the most representative set of sub-bands to be used as input to any other neural model for image processing, replacing standard data samples. We conduct an extensive experimental evaluation comprising different datasets and diverse image analysis and synthesis tasks, including reconstruction, segmentation, and modality translation. We also evaluate QUAVE in combination with both real and quaternion-valued models. Results demonstrate the effectiveness and the generalizability of the proposed framework, which improves network performance while being flexible enough to be adopted in manifold scenarios.

* This paper has been submitted to IEEE Transactions on Medical Imaging

Hypercomplex Multimodal Emotion Recognition from EEG and Peripheral Physiological Signals

Oct 11, 2023

Eleonora Lopez, Eleonora Chiarantano, Eleonora Grassucci, Danilo Comminiello

Abstract: Multimodal emotion recognition from physiological signals is receiving an increasing amount of attention because, unlike behavioral reactions, these signals cannot be controlled at will and thus provide more reliable information. Existing deep learning-based methods still rely on extracted handcrafted features, not taking full advantage of the learning ability of neural networks, and often adopt a single-modality approach, while human emotions are inherently expressed in a multimodal way. In this paper, we propose a hypercomplex multimodal network equipped with a novel fusion module comprising parameterized hypercomplex multiplications. Indeed, by operating in a hypercomplex domain, the operations follow algebraic rules that allow modeling latent relations among learned feature dimensions for a more effective fusion step. We perform classification of valence and arousal from electroencephalogram (EEG) and peripheral physiological signals, employing the publicly available database MAHNOB-HCI and surpassing a multimodal state-of-the-art network. The code of our work is freely available at https://github.com/ispamm/MHyEEG.

* Published at IEEE ICASSP workshops 2023
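
As a rough illustration of the parameterized hypercomplex multiplications mentioned in the abstract above, the sketch below builds a layer weight as a sum of Kronecker products, which is the usual construction for such layers. It is not the authors' fusion module; the function name `phm_weight`, the value n = 4, and the layer sizes are assumptions chosen for the example.

```python
import numpy as np

def phm_weight(a: np.ndarray, f: np.ndarray) -> np.ndarray:
    """Build a weight matrix as a sum of Kronecker products.

    a: (n, n, n) matrices that encode the learned algebra rules.
    f: (n, d_out // n, d_in // n) filter blocks shared across positions.
    Returns a (d_out, d_in) matrix.
    """
    return sum(np.kron(a[i], f[i]) for i in range(a.shape[0]))

n, d_in, d_out = 4, 64, 32            # hypothetical sizes
rng = np.random.default_rng(0)
a = rng.standard_normal((n, n, n))
f = rng.standard_normal((n, d_out // n, d_in // n))

w = phm_weight(a, f)
x = rng.standard_normal(d_in)
y = w @ x                              # forward pass of the layer

phm_params = a.size + f.size
dense_params = d_out * d_in
print(w.shape, y.shape)
print(phm_params, "parameters instead of", dense_params)
```

The small `a` matrices mix the n blocks of the input according to learned multiplication rules, while the filter blocks in `f` are reused across those blocks, which is where the parameter saving over a dense layer comes from.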
