Abstract:Bayesian optimization (BO) methods based on information theory have obtained state-of-the-art results in several tasks. These techniques rely heavily on the Kullback-Leibler (KL) divergence to compute the acquisition function. In this work, we introduce a novel information-based class of acquisition functions for BO called Alpha Entropy Search (AES). AES is based on the $\alpha$-divergence, which generalizes the KL divergence. At each iteration, AES selects the next evaluation point as the one whose associated target value has the highest level of dependency with respect to the location and associated value of the global maximum of the optimization problem. Dependency is measured in terms of the $\alpha$-divergence, as an alternative to the KL divergence. Intuitively, this favors evaluating the objective function at the points that are most informative about the global maximum. The $\alpha$-divergence has a free parameter $\alpha$, which determines the behavior of the divergence, trading off evaluating differences between distributions at a single mode against evaluating differences globally. Therefore, different values of $\alpha$ result in different acquisition functions. The AES acquisition function lacks a closed-form expression. However, we propose an efficient and accurate approximation using a truncated Gaussian distribution. In practice, the value of $\alpha$ can be chosen by the practitioner, but here we suggest using a combination of acquisition functions obtained by simultaneously considering a range of values of $\alpha$. We provide an implementation of AES in BoTorch and we evaluate its performance in synthetic, benchmark and real-world experiments involving the tuning of the hyper-parameters of a deep neural network. These experiments show that the performance of AES is competitive with respect to other information-based acquisition functions such as JES, MES or PES.
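To make the role of the free parameter concrete, the following is a minimal sketch of an alpha-divergence between two discrete distributions. It assumes Amari's convention, D_alpha(p||q) = (1 - sum_i p_i^alpha q_i^(1-alpha)) / (alpha (1 - alpha)), which recovers KL(p||q) as alpha tends to 1 and KL(q||p) as alpha tends to 0. The abstract does not state which convention AES uses, and the function name `alpha_divergence` is illustrative; this is not the AES acquisition function itself.

```python
import numpy as np

def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence between two discrete distributions.

    Assumption: p and q are strictly positive and each sums to 1.
    The convention used here is one common choice, not necessarily
    the one adopted in the AES paper.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):
        # Limit alpha -> 1: KL(p || q)
        return float(np.sum(p * np.log(p / q)))
    if np.isclose(alpha, 0.0):
        # Limit alpha -> 0: KL(q || p)
        return float(np.sum(q * np.log(q / p)))
    overlap = np.sum(p**alpha * q ** (1.0 - alpha))
    return float((1.0 - overlap) / (alpha * (1.0 - alpha)))
```

Sweeping `alpha` over a grid for a fixed pair of distributions shows how the same pair is scored differently as the parameter changes, which is the property the abstract exploits to obtain different acquisition functions.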
Abstract:Traditional economic models often rely on fixed assumptions about market dynamics, limiting their ability to capture the complexities and stochastic nature of real-world scenarios. Real markets are noisy, so the assumptions of traditional models are frequently not met. In this paper, we explore the application of deep reinforcement learning (DRL) to obtain optimal production strategies in microeconomic market environments and thereby overcome the limitations of traditional models. Concretely, we propose a DRL-based approach to obtain an effective policy in competitive markets with multiple producers, each optimizing their production decisions in response to fluctuating demand, supply, prices, subsidies, fixed costs, total production curve, elasticities and other effects contaminated by noise. Across several simulations, our framework enables agents to learn adaptive production policies that consistently outperform static and random strategies. As the deep neural networks used by the agents are universal function approximators, DRL algorithms can represent in the network complex patterns, learnt from data by trial and error, that explain the market. Through extensive simulations, we demonstrate how DRL can capture the intricate interplay between production costs, market prices, and competitor behavior, providing insights into optimal decision-making in dynamic economic settings. The results show that agents trained with DRL can strategically adjust production levels to maximize long-term profitability, even in the face of volatile market conditions. We believe that the study bridges the gap between theoretical economic modeling and practical market simulation, illustrating the potential of DRL to revolutionize decision-making in market strategies.
Abstract:Large Language Models (LLMs), such as OpenAI's ChatGPT-4 Turbo, are revolutionizing several industries, including higher education. In this context, LLMs can be personalized through a fine-tuning process to meet students' demands in each particular subject, such as statistics. Recently, OpenAI launched the option to fine-tune its model through a natural language web interface, making it possible to create customized GPT versions deliberately conditioned to meet the demands of a specific task. The objective of this research is to assess the potential of the customized GPTs that have recently been launched by OpenAI. After developing a Business Statistics Virtual Professor (BSVP), tailored for students at the Universidad Pontificia Comillas, its behavior was evaluated and compared with that of ChatGPT-4 Turbo. The results lead to several conclusions. Firstly, a substantial modification in the style of communication was observed. Following the instructions it was trained with, BSVP provided responses in a more relatable and friendly tone, even incorporating a few minor jokes. Secondly, and this is a matter of relevance, when explicitly asked for something like, "I would like to practice a programming exercise similar to those in R practice 4," BSVP was capable of providing a far superior response: having access to contextual documentation, it could fulfill the request, something beyond ChatGPT-4 Turbo's capabilities. On the downside, the response times were generally longer. Lastly, regarding overall performance, quality, depth, and alignment with the specific content of the course, no statistically significant differences were observed in the responses between BSVP and ChatGPT-4 Turbo. It appears that customized assistants trained with prompts present advantages as virtual aids for students, yet they do not constitute a substantial improvement over ChatGPT-4 Turbo.
Abstract:Generative AI has experienced remarkable growth in recent years, leading to a wide array of applications across diverse domains. In this paper, we present a comprehensive survey of more than 350 generative AI applications, providing a structured taxonomy and concise descriptions of various unimodal and even multimodal generative AIs. The survey is organized into sections, covering a wide range of unimodal generative AI applications such as text, images, video, gaming and brain information. Our survey aims to serve as a valuable resource for researchers and practitioners to navigate the rapidly expanding landscape of generative AI, facilitating a better understanding of the current state of the art and fostering further innovation in the field.
Abstract:In this paper, we present a novel approach to simulating H.P. Lovecraft's horror literature using the ChatGPT large language model, specifically the GPT-4 architecture. Our study aims to generate text that emulates Lovecraft's unique writing style and themes, while also examining the effectiveness of prompt engineering techniques in guiding the model's output. To achieve this, we curated a prompt containing several specialized literature references and employed advanced prompt engineering methods. We conducted an empirical evaluation of the generated text by administering a survey to a sample of undergraduate students. Using statistical hypothesis testing, we assessed the students' ability to distinguish between genuine Lovecraft works and those generated by our model. Our findings demonstrate that the participants were unable to reliably differentiate between the two, indicating the effectiveness of the GPT-4 model and our prompt engineering techniques in emulating Lovecraft's literary style. In addition to presenting the GPT model's capabilities, this paper provides a comprehensive description of its underlying architecture and offers a comparative analysis with related work that simulates other notable authors and philosophers, such as Dennett. By exploring the potential of large language models in the context of literary emulation, our study contributes to the body of research on the applications and limitations of these models in various creative domains.
Abstract:This article explores the ethical problems arising from the use of ChatGPT as a kind of generative AI and suggests responses based on the Human-Centered Artificial Intelligence (HCAI) framework. The HCAI framework is appropriate because it understands technology above all as a tool to empower, augment, and enhance human agency while referring to human wellbeing as a grand challenge, thus perfectly aligning itself with ethics, the science of human flourishing. Further, HCAI provides objectives, principles, procedures, and structures for reliable, safe, and trustworthy AI, which we apply to our ChatGPT assessments. The main danger ChatGPT presents is its propensity to be used as a weapon of mass deception (WMD) and an enabler of criminal activities involving deceit. We review technical specifications to better comprehend its potential and limitations. We then suggest both technical (watermarking, styleme, detectors, and fact-checkers) and non-technical measures (terms of use, transparency, educator considerations, HITL) to mitigate ChatGPT misuse or abuse and recommend best uses (creative writing, non-creative writing, teaching and learning). We conclude with considerations regarding the role of humans in ensuring the proper use of ChatGPT for individual and social wellbeing.
Abstract:In recent years there has been a growing demand from financial agents, especially from individual and institutional investors, for companies to report on climate-related financial risks. A vast amount of information, in text format, can be expected to be disclosed in the short term by firms in order to identify these types of risks in their financial and non-financial reports, particularly in response to the growing regulation that is being passed on the matter. To this end, this paper applies state-of-the-art NLP techniques to detect climate-change-related content in text corpora. We use transfer learning to fine-tune two transformer models, BERT and ClimateBert (a recently published DistilRoBERTa-based model that has been specifically tailored for climate text classification). Both models are based on the transformer architecture, which enables learning the contextual relationships between words in a text. We fine-tune both models on the novel ClimaText database, consisting of data collected from Wikipedia, 10-K filing reports and web-based claims. The text classification model obtained by fine-tuning ClimateBert on ClimaText outperforms the models created with BERT and the current state-of-the-art transformer for this particular problem. Our study is the first to apply the recently published ClimateBert algorithm to the ClimaText database. Based on our results, ClimateBert fine-tuned on ClimaText is an outstanding tool among pre-trained NLP transformer models, one that may and should be used by investors, institutional agents and companies themselves to monitor the disclosure of climate risk in financial reports. In addition, our transfer learning methodology is computationally cheap, thus allowing any organization to perform it.
Abstract:Financial experts and analysts seek to predict the variability of financial markets. In particular, correctly predicting this variability helps investors make successful investments. However, a major trend has emerged in finance in recent years: ESG criteria. Concretely, ESG (Environmental, Social and Governance) criteria have become more significant in finance due to the growing importance of socially responsible investment, and because of the financial impact companies suffer when they do not comply with them. Consequently, creating a stock portfolio should take into account not only its performance but also its compliance with ESG criteria. Hence, this paper combines mathematical modelling with ESG and finance. In more detail, we use Bayesian optimization (BO), a state-of-the-art sequential design strategy for optimizing black-box functions whose analytical expressions are unknown and which are costly to evaluate, to maximize the performance of a stock portfolio in the presence of ESG criteria, incorporated into the objective function as soft constraints. In an illustrative experiment, we use the Sharpe ratio, which takes into consideration the portfolio returns and their variance; in other words, it balances the trade-off between maximizing returns and minimizing risks. In the present work, ESG criteria have been divided into fourteen independent categories, which are used in a linear combination to estimate a firm's total ESG score. Most importantly, our approach would scale to alternative black-box methods of estimating the performance and ESG compliance of the stock portfolio. In particular, this research opens the door to many new research lines, as it shows that a portfolio can be optimized using BO while taking into consideration both financial performance and the fulfilment of ESG criteria.
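As an illustration of the kind of objective described above, the sketch below combines a Sharpe ratio with an ESG soft constraint. The abstract does not specify how the constraint enters the objective, so the hinge penalty, the `esg_target` threshold, the `penalty` weight and all function names here are assumptions made for the example, not the authors' formulation.

```python
import numpy as np

def sharpe_ratio(weights, mean_returns, cov, risk_free=0.0):
    """Excess portfolio return divided by portfolio standard deviation."""
    w = np.asarray(weights, dtype=float)
    excess_return = w @ np.asarray(mean_returns, dtype=float) - risk_free
    volatility = np.sqrt(w @ np.asarray(cov, dtype=float) @ w)
    return float(excess_return / volatility)

def firm_esg_scores(category_scores, category_weights):
    """Linear combination of per-category scores, one total per firm.

    category_scores: array of shape (n_firms, n_categories),
    e.g. fourteen categories as in the abstract.
    """
    return np.asarray(category_scores, dtype=float) @ np.asarray(
        category_weights, dtype=float
    )

def penalized_objective(weights, mean_returns, cov, firm_esg,
                        esg_target, penalty=1.0):
    """Sharpe ratio minus a hinge penalty on ESG shortfall (assumed form)."""
    w = np.asarray(weights, dtype=float)
    portfolio_esg = w @ np.asarray(firm_esg, dtype=float)
    shortfall = max(0.0, esg_target - portfolio_esg)
    return sharpe_ratio(w, mean_returns, cov) - penalty * shortfall
```

A black-box optimizer such as BO would then search over `weights` to maximize `penalized_objective`; a soft constraint of this form lowers the score of non-compliant portfolios instead of excluding them outright.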
Abstract:Integrated information theory (IIT) is a theoretical framework that provides a quantitative measure to estimate when a physical system is conscious, its degree of consciousness, and the complexity of the qualia space that the system is experiencing. Formally, IIT rests on the assumption that if a surrogate physical system can fully embed the phenomenological properties of consciousness, then the system properties must be constrained by the properties of the qualia being experienced. Following this assumption, IIT represents the physical system as a network of interconnected elements that can be thought of as a probabilistic causal graph, $\mathcal{G}$, where each node has an input-output function and the whole graph is encoded in a transition probability matrix. Consequently, IIT's quantitative measure of consciousness, $\Phi$, is computed with respect to the transition probability matrix and the present state of the graph. In this paper, we provide a random search algorithm that is able to optimize $\Phi$ in order to investigate, as the number of nodes increases, the structure of the graphs that have higher $\Phi$. We also provide arguments that show the difficulties of applying more complex black-box search algorithms, such as Bayesian optimization or metaheuristics, to this particular problem. Additionally, we suggest specific research lines for these techniques to enhance the search algorithm that guarantees maximal $\Phi$.
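The random search idea can be sketched generically as follows. The abstract does not describe how candidate graphs are sampled or how $\Phi$ is evaluated, so this example assumes uniformly sampled binary adjacency matrices and takes the scoring function as an argument; `random_search` and `score_fn` are hypothetical names, and the toy score in the usage note is a stand-in, not $\Phi$.

```python
import numpy as np

def random_search(score_fn, n_nodes, n_iter=100, seed=0):
    """Keep the best-scoring graph among n_iter random candidates.

    score_fn: callable mapping an (n_nodes, n_nodes) 0/1 adjacency
    matrix to a float. In the setting of the abstract it would
    compute Phi; here it is left to the caller (assumption).
    """
    rng = np.random.default_rng(seed)
    best_graph, best_score = None, -np.inf
    for _ in range(n_iter):
        # Sample each directed edge independently with probability 0.5.
        candidate = rng.integers(0, 2, size=(n_nodes, n_nodes))
        score = score_fn(candidate)
        if score > best_score:
            best_graph, best_score = candidate, score
    return best_graph, best_score
```

For instance, `random_search(lambda adj: float(adj.sum()), n_nodes=3)` runs the loop with edge count as a placeholder score. Replacing that lambda with a real $\Phi$ computation is where the cost of the approach lies, since each candidate requires a full evaluation.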
Abstract:Are intelligent machines really intelligent? Is the underlying philosophical concept of intelligence satisfactory for describing how the present systems work? Is understanding a necessary and sufficient condition for intelligence? If a machine could understand, should we attribute subjectivity to it? This paper addresses the problem of deciding whether the so-called "intelligent machines" are capable of understanding, instead of merely processing signs. It deals with the relationship between syntax and semantics. The main thesis concerns the inevitability of semantics for any discussion about the possibility of building conscious machines, condensed into the following two tenets: "If a machine is capable of understanding (in the strong sense), then it must be capable of combining rules and intuitions"; "If semantics cannot be reduced to syntax, then a machine cannot understand." Our conclusion states that it is not necessary to attribute understanding to a machine in order to explain its exhibited "intelligent" behavior; a merely syntactic and mechanistic approach to intelligence as a task-solving tool suffices to justify the range of operations that it can display in the current state of technological development.