Abstract: Efficiently extracting data from tables in the scientific literature is pivotal for building large-scale databases. However, tables in materials science papers take highly diverse forms, which makes rule-based extraction ineffective. To overcome this challenge, we present MaTableGPT, a GPT-based extractor of table data from the materials science literature. MaTableGPT combines table data representation and table splitting, which improve GPT comprehension, with follow-up questions that filter out hallucinated information. When applied to a large body of water splitting catalysis literature, MaTableGPT achieved an extraction accuracy (total F1 score) of up to 96.8%. Through comprehensive evaluations of GPT usage cost, labeling cost, and extraction accuracy for zero-shot, few-shot, and fine-tuning learning methods, we present a Pareto-front mapping in which few-shot learning was found to be the most balanced solution, owing to both its high extraction accuracy (total F1 score > 95%) and low cost (GPT usage cost of 5.97 US dollars and labeling cost of 10 I/O paired examples). Statistical analyses of the database generated by MaTableGPT revealed valuable insights into the distribution of overpotentials and the utilization of elements across the catalysts reported in the water splitting literature.
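The Pareto-front comparison described in the abstract above can be illustrated with a minimal sketch. It assumes each learning method is summarized by three numbers (usage cost, labeling cost, and total F1 score), and it uses the standard definition of Pareto dominance. The class name `Method`, the function names, and all numeric values below are hypothetical placeholders chosen for illustration; they are not results from the paper and this is not the authors' code.

```python
from typing import List, NamedTuple


class Method(NamedTuple):
    """One candidate extraction setup (hypothetical container)."""
    name: str
    usage_cost: float     # lower is better
    labeling_cost: float  # lower is better
    f1: float             # higher is better


def dominates(a: Method, b: Method) -> bool:
    """True if `a` is no worse than `b` on every objective and strictly better on one."""
    no_worse = (
        a.usage_cost <= b.usage_cost
        and a.labeling_cost <= b.labeling_cost
        and a.f1 >= b.f1
    )
    strictly_better = (
        a.usage_cost < b.usage_cost
        or a.labeling_cost < b.labeling_cost
        or a.f1 > b.f1
    )
    return no_worse and strictly_better


def pareto_front(methods: List[Method]) -> List[Method]:
    """Keep every method that no other method dominates."""
    return [m for m in methods if not any(dominates(o, m) for o in methods)]


# Placeholder values for illustration only; NOT taken from the paper.
candidates = [
    Method("method_a", usage_cost=1.0, labeling_cost=0.0, f1=0.80),
    Method("method_b", usage_cost=2.0, labeling_cost=10.0, f1=0.90),
    Method("method_c", usage_cost=3.0, labeling_cost=10.0, f1=0.85),
    Method("method_d", usage_cost=1.0, labeling_cost=5.0, f1=0.70),
]
front_names = [m.name for m in pareto_front(candidates)]
```

In this toy input, `method_c` is dominated by `method_b` and `method_d` is dominated by `method_a`, so only the first two remain on the front. Choosing a single "most balanced" point from the front is a separate judgment that this sketch does not attempt.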
Abstract: This paper introduces a first-order method for solving optimal powered descent guidance (PDG) problems that directly handles the nonconvex constraints associated with the maximum and minimum thrust bounds under varying mass and with the pointing angle constraints on thrust vectors. These nonconvex constraints have conventionally been circumvented via lossless convexification (LCvx), which lifts a nonconvex feasible set to a higher-dimensional convex set, and via linear approximation of another nonconvex feasible set defined by exponential functions. However, this approach sometimes yields an infeasible solution when the solution obtained in the higher-dimensional space is projected back to the original space, especially when the problem involves a nonoptimal time of flight. Additionally, the Taylor series approximation introduces an error that grows with both flight time and deviation from the reference trajectory. In this paper, we introduce a first-order approach that makes use of orthogonal projections onto nonconvex sets, allowing expansive projection (ExProj). We show that 1) this approach produces a feasible solution with better performance even in the nonoptimal time-of-flight cases for which conventional techniques fail and 2) the proposed method compensates for the linearization error that arises from the Taylor series approximation. We claim that the proposed approach offers more flexibility in generating feasible trajectories for a wide variety of planetary soft landing problems.
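To make the idea of an orthogonal projection onto a nonconvex set concrete, the sketch below projects a thrust vector onto the shell of vectors whose norm lies between a minimum and a maximum bound. This is a simplifying assumption made for illustration: it uses fixed bounds, ignores the varying mass and the pointing angle constraint mentioned in the abstract, and is not the authors' ExProj algorithm. The function name and the `fallback_dir` parameter are hypothetical.

```python
import math
from typing import List, Optional, Sequence


def project_onto_thrust_shell(
    u: Sequence[float],
    rho_min: float,
    rho_max: float,
    fallback_dir: Optional[Sequence[float]] = None,
) -> List[float]:
    """Return a nearest point to `u` in {v : rho_min <= ||v|| <= rho_max}."""
    if not 0.0 < rho_min <= rho_max:
        raise ValueError("require 0 < rho_min <= rho_max")

    norm = math.sqrt(sum(x * x for x in u))

    if rho_min <= norm <= rho_max:
        # Already feasible: the projection is the point itself.
        return list(u)
    if norm > rho_max:
        # Too large: scale radially down to the outer sphere.
        return [x * rho_max / norm for x in u]
    if norm > 0.0:
        # Too small: scale radially up to the inner sphere.
        return [x * rho_min / norm for x in u]

    # Zero vector: every point on the inner sphere is equally close, so the
    # projection is not unique. Pick a direction (an arbitrary choice here).
    d = list(fallback_dir) if fallback_dir is not None else [1.0] + [0.0] * (len(u) - 1)
    d_norm = math.sqrt(sum(x * x for x in d))
    if d_norm == 0.0:
        raise ValueError("fallback_dir must be nonzero")
    return [x * rho_min / d_norm for x in d]


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two nearby inputs on opposite sides of the origin land far apart.
p = project_onto_thrust_shell([0.01, 0.0, 0.0], rho_min=1.0, rho_max=2.0)
q = project_onto_thrust_shell([-0.01, 0.0, 0.0], rho_min=1.0, rho_max=2.0)
```

Here the inputs are 0.02 apart while their projections `p` and `q` are 2 apart, which shows why a projection onto a nonconvex set can be expansive, unlike a projection onto a convex set.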