Abstract:Large language models (LLMs) such as ChatGPT, GPT-4, Claude-3, and Llama are being integrated across a variety of industries. Despite this rapid proliferation, experts are calling for caution in the interpretation and adoption of LLMs, owing to numerous associated ethical concerns. Research has also uncovered shortcomings in LLMs' reasoning and logical abilities, raising questions on the potential of LLMs as evaluation tools. In this paper, we investigate LLMs' self-evaluation capabilities on a novel proverb reasoning task. We introduce a novel proverb database consisting of 300 proverb pairs that are similar in intent but different in wordings, across topics spanning gender, wisdom, and society. We propose tests to evaluate textual consistencies as well as numerical consistencies across similar proverbs, and demonstrate the effectiveness of our method and dataset in identifying failures in LLMs' self-evaluation which in turn can highlight issues related to gender stereotypes and lack of cultural understanding in LLMs.
Abstract:In recent years, there has been an increased emphasis on understanding and mitigating adverse impacts of artificial intelligence (AI) technologies on society. Across academia, industry, and government bodies, a variety of endeavours are being pursued towards enhancing AI ethics. A significant challenge in the design of ethical AI systems is that there are multiple stakeholders in the AI pipeline, each with their own set of constraints and interests. These different perspectives are often not understood, due in part to communication gaps.For example, AI researchers who design and develop AI models are not necessarily aware of the instability induced in consumers' lives by the compounded effects of AI decisions. Educating different stakeholders about their roles and responsibilities in the broader context becomes necessary. In this position paper, we outline some potential ways in which generative artworks can play this role by serving as accessible and powerful educational tools for surfacing different perspectives. We hope to spark interdisciplinary discussions about computational creativity broadly as a tool for enhancing AI ethics.
Abstract:With rapid progress in artificial intelligence (AI), popularity of generative art has grown substantially. From creating paintings to generating novel art styles, AI based generative art has showcased a variety of applications. However, there has been little focus concerning the ethical impacts of AI based generative art. In this work, we investigate biases in the generative art AI pipeline right from those that can originate due to improper problem formulation to those related to algorithm design. Viewing from the lens of art history, we discuss the socio-cultural impacts of these biases. Leveraging causal models, we highlight how current methods fall short in modeling the process of art creation and thus contribute to various types of biases. We illustrate the same through case studies. To the best of our knowledge, this is the first extensive analysis that investigates biases in the generative art AI pipeline from the perspective of art history. We hope our work sparks interdisciplinary discussions related to accountability of generative art.
Abstract:Financial decisions impact our lives, and thus everyone from the regulator to the consumer is interested in fair, sound, and explainable decisions. There is increasing competitive desire and regulatory incentive to deploy AI mindfully within financial services. An important mechanism towards that end is to explain AI decisions to various stakeholders. State-of-the-art explainable AI systems mostly serve AI engineers and offer little to no value to business decision makers, customers, and other stakeholders. Towards addressing this gap, in this work we consider the scenario of explaining loan denials. We build the first-of-its-kind dataset that is representative of loan-applicant friendly explanations. We design a novel Generative Adversarial Network (GAN) that can accommodate smaller datasets, to generate user-friendly textual explanations. We demonstrate how our system can also generate explanations serving different purposes: those that help educate the loan applicants, or help them take appropriate action towards a future approval.