Analogical reasoning is essential for human cognition, allowing us to comprehend new concepts by relating them to familiar ones based on common relational structures. Previous work mainly focuses on word analogies, which do not fully represent the analogical reasoning ability of language models (LMs) aligning with humans. This paper first examines analogy prompting for large language models (LLMs) in scientific question-answering tasks. Then we discover that LLMs tend to ignore relational structures when performing word analogies, casting doubt on their utility for evaluating analogical reasoning. For better evaluation aligning with humans, we propose an analogical structure abduction task based on cognitive psychology, which aims to abduct structures between two systems to establish an analogy. Then we create a benchmark of scientific analogical reasoning with structure abduction, SCAR, consisting of 400 scientific analogies across 13 domains for this task. Empirical results reveal that LLMs struggle with this task, but the Chain-of-Thought (CoT) method with background knowledge and explanations can improve their capability.