Abstract: We study the use of large language models (LLMs) to both evaluate and greenwash corporate climate disclosures. First, we investigate the use of the LLM-as-a-Judge (LLMJ) methodology for scoring company-submitted reports on emissions reduction targets and progress. Second, we probe the behavior of an LLM when it is prompted to greenwash a response subject to accuracy and length constraints. Finally, we test the robustness of the LLMJ methodology against responses that may be greenwashed using an LLM. We find that two LLMJ scoring systems, numerical rating and pairwise comparison, are effective in distinguishing high-performing companies from others, with the pairwise comparison system showing greater robustness against LLM-greenwashed responses.
Abstract: When auditing a redistricting plan, a persuasive method is to compare the plan with an ensemble of neutrally drawn redistricting plans. Ensembles are generated via algorithms that sample distributions on balanced graph partitions. To audit the partisan difference between the ensemble and a given plan, one must ensure that the non-partisan criteria are matched, so that one may conclude that partisan differences come from bias rather than, for example, levels of compactness or differences in community preservation. Certain sampling algorithms allow one to explicitly state the policy-based probability distribution on plans; however, these algorithms have shown poor mixing times for large graphs (i.e., redistricting spaces) for all but a few specialized measures. In this work, we develop a multiscale parallel tempering approach that makes local moves at each scale. The local moves allow us to adopt a wide variety of policy-based measures. We examine our method in the state of Connecticut and succeed at achieving fast mixing on a policy-based distribution that has never before been sampled at this scale. Our algorithm shows promise to expand to a significantly wider class of measures that will (i) allow for more principled and situation-based comparisons and (ii) probe for the typical partisan impact that policy can have on redistricting.
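For readers unfamiliar with parallel tempering, the replica-exchange step that the approach above builds on can be sketched as follows. This is a minimal illustration of the generic, textbook swap rule only, assuming each replica k targets a distribution proportional to exp(-beta_k * E(x)). The function names, the energy values, and the temperature ladder are hypothetical, and the sketch does not reproduce the multiscale structure or the local moves described in the abstract.

```python
import math
import random


def swap_acceptance(beta_a, beta_b, energy_a, energy_b):
    """Metropolis probability of exchanging the states of two replicas.

    Assumes replica k samples a density proportional to exp(-beta_k * E(x)),
    so the acceptance ratio is exp((beta_a - beta_b) * (energy_a - energy_b)).
    """
    log_ratio = (beta_a - beta_b) * (energy_a - energy_b)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)


def attempt_adjacent_swaps(states, energies, betas, rng):
    """Propose one exchange between each pair of neighboring temperatures.

    `states` and `energies` are modified in place; `betas` stays fixed,
    so a state moves along the ladder while each rung keeps its temperature.
    """
    for k in range(len(betas) - 1):
        p = swap_acceptance(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            states[k], states[k + 1] = states[k + 1], states[k]
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return states, energies


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical ladder: high beta = target distribution, low beta = flattened.
    betas = [1.0, 0.5, 0.25]
    states = ["plan_A", "plan_B", "plan_C"]  # placeholder labels for partitions
    energies = [3.0, 2.0, 4.0]               # placeholder policy scores
    attempt_adjacent_swaps(states, energies, betas, rng)
    print(states, energies)
```

In a full sampler, each replica would also take its own local moves between swap attempts; the exchange step is what lets well-mixed, low-beta replicas feed new configurations to the replica that samples the target distribution.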