Univ. Tokyo
Abstract: We construct MaterialBENCH, a college-level benchmark dataset for large language models (LLMs) in materials science. The dataset consists of problem-answer pairs based on university textbooks. Problems come in two types: free-response and multiple-choice. Each multiple-choice problem is constructed by adding three incorrect answers to the correct answer, so that the LLM chooses one of four options as its response. Most free-response and multiple-choice problems overlap, differing only in the format of the answers. We also conduct experiments using MaterialBENCH on LLMs, including ChatGPT-3.5, ChatGPT-4, Bard (at the time of the experiments), and GPT-3.5 and GPT-4 via the OpenAI API. We analyze and discuss the differences and similarities in LLM performance as measured by MaterialBENCH. We also study performance differences between the free-response and multiple-choice types within the same models, and the influence of system messages on multiple-choice problems. We anticipate that MaterialBENCH will encourage further development of LLM reasoning abilities for solving more complicated problems and will eventually contribute to materials research and discovery.
Abstract: Recent advances in natural language processing have enabled large language models such as ChatGPT, which can facilitate knowledge management in the design process by giving designers access to a vast array of relevant information. However, integrating ChatGPT into the design process also presents new challenges. In this paper, we provide a concise review of the classification and representation of design knowledge and of past efforts to support designers in acquiring knowledge. We analyze the opportunities and challenges that ChatGPT presents for knowledge management in design and propose promising future research directions. A case study is conducted to validate the advantages and drawbacks of ChatGPT. It shows that designers can acquire targeted knowledge from various domains, but that the quality of the acquired knowledge depends heavily on the prompt.
Abstract: Previous research has developed various types of aerial robots to improve either maneuverability or manipulation ability. However, achieving both mobility and manipulation capability simultaneously has remained a challenge: aerial robots with high mobility lack the rotors needed to perform manipulation tasks, while those with manipulation ability are too large to achieve high mobility. To address this issue, this article introduces a new aerial robot called TRADY. TRADY is a tilted-rotor-equipped aerial robot that can autonomously assemble and disassemble in flight, allowing its control model to switch between under-actuated and fully-actuated. The system features a novel docking mechanism and an optimized rotor configuration, as well as a control system that can transition between under-actuated and fully-actuated modes and compensate for discrete changes. Additionally, we introduce a new motion strategy for assembly and disassembly that includes recovery behavior from hazardous conditions. Experimental results show that TRADY can execute aerial assembly and disassembly motions with a 90% success rate and, in the assembled state, generate more than nine times the torque of a single unit. This is the first robot system capable of performing both assembly and disassembly while seamlessly transitioning between fully-actuated and under-actuated models.