Abstract:We present the first application of modern Hopfield networks to the problem of portfolio optimization. We performed an extensive study based on combinatorial purged cross-validation over several datasets and compared our results to both traditional and deep-learning-based methods for portfolio selection. Compared to state-of-the-art deep-learning methods such as Long-Short Term Memory networks and Transformers, we find that the proposed approach performs on par or better, while providing faster training times and better stability. Our results show that Modern Hopfield Networks represent a promising approach to portfolio optimization, allowing for an efficient, scalable, and robust solution for asset allocation, risk management, and dynamic rebalancing.
Abstract:Large Language Models (LLMs) demonstrate an impressive capacity to recall a vast range of common factual knowledge information. However, unravelling the underlying reasoning of LLMs and explaining their internal mechanisms of exploiting this factual knowledge remain active areas of investigation. Our work analyzes the factual knowledge encoded in the latent representation of LLMs when prompted to assess the truthfulness of factual claims. We propose an end-to-end framework that jointly decodes the factual knowledge embedded in the latent space of LLMs from a vector space to a set of ground predicates and represents its evolution across the layers using a temporal knowledge graph. Our framework relies on the technique of activation patching which intervenes in the inference computation of a model by dynamically altering its latent representations. Consequently, we neither rely on external models nor training processes. We showcase our framework with local and global interpretability analyses using two claim verification datasets: FEVER and CLIMATE-FEVER. The local interpretability analysis exposes different latent errors from representation to multi-hop reasoning errors. On the other hand, the global analysis uncovered patterns in the underlying evolution of the model's factual knowledge (e.g., store-and-seek factual information). By enabling graph-based analyses of the latent representations, this work represents a step towards the mechanistic interpretability of LLMs.
Abstract:A substantial gap persists in understanding the reasons behind the exceptional performance of the Transformer architecture in NLP. A particularly unexplored area involves the mechanistic description of how the distribution of parameters evolves over time during training. In this work we suggest that looking at the time evolution of the statistic distribution of model parameters, and specifically at bifurcation effects, can help understanding the model quality, potentially reducing training costs and evaluation efforts and empirically showing the reasons behind the effectiveness of weights sparsification.
Abstract:Over the last decade, several regulatory bodies have started requiring the disclosure of non-financial information from publicly listed companies, in light of the investors' increasing attention to Environmental, Social, and Governance (ESG) issues. Such information is publicly released in a variety of non-structured and multi-modal documentation. Hence, it is not straightforward to aggregate and consolidate such data in a cohesive framework to further derive insights about sustainability practices across companies and markets. Thus, it is natural to resort to Information Extraction (IE) techniques to provide concise, informative and actionable data to the stakeholders. Moving beyond traditional text processing techniques, in this work we leverage Large Language Models (LLMs), along with prominent approaches such as Retrieved Augmented Generation and in-context learning, to extract semantically structured information from sustainability reports. We then adopt graph-based representations to generate meaningful statistical, similarity and correlation analyses concerning the obtained findings, highlighting the prominent sustainability actions undertaken across industries and discussing emerging similarity and disclosing patterns at company, sector and region levels. Lastly, we investigate which factual aspects impact the most on companies' ESG scores using our findings and other company information.