Ju Li

ReV, LS2N

Wan-Animate: Unified Character Animation and Replacement with Holistic Replication

Sep 17, 2025

Gang Cheng, Xin Gao, Li Hu, Siqi Hu, Mingyang Huang, Chaonan Ji, Ju Li, Dechao Meng, Jinwei Qi, Penchong Qiao (+16 more)

Abstract: We introduce Wan-Animate, a unified framework for character animation and replacement. Given a character image and a reference video, Wan-Animate can animate the character by precisely replicating the expressions and movements of the character in the video to generate high-fidelity character videos. Alternatively, it can integrate the animated character into the reference video to replace the original character, replicating the scene's lighting and color tone to achieve seamless environmental integration. Wan-Animate is built upon the Wan model. To adapt it for character animation tasks, we employ a modified input paradigm to differentiate between reference conditions and regions for generation. This design unifies multiple tasks into a common symbolic representation. We use spatially-aligned skeleton signals to replicate body motion and implicit facial features extracted from source images to reenact expressions, enabling the generation of character videos with high controllability and expressiveness. Furthermore, to enhance environmental integration during character replacement, we develop an auxiliary Relighting LoRA. This module preserves the character's appearance consistency while applying the appropriate environmental lighting and color tone. Experimental results demonstrate that Wan-Animate achieves state-of-the-art performance. We are committed to open-sourcing the model weights and its source code.

* Project Page: https://humanaigc.github.io/wan-animate/

Collaborative AI Enhances Image Understanding in Materials Science

Mar 17, 2025

Ruoyan Avery Yin, Zhichu Ren, Zongyou Yin, Zhen Zhang, So Yeon Kim, Chia-Wei Hsu, Ju Li

Abstract: The Copilot for Real-world Experimental Scientist (CRESt) system empowers researchers to control autonomous laboratories through conversational AI, providing a seamless interface for managing complex experimental workflows. We have enhanced CRESt by integrating a multi-agent collaboration mechanism that utilizes the complementary strengths of the ChatGPT and Gemini models for precise image analysis in materials science. This innovative approach significantly improves the accuracy of experimental outcomes by fostering structured debates between the AI models, which enhances decision-making processes in materials phase analysis. Additionally, to evaluate the generalizability of this approach, we tested it on a quantitative task of counting particles. Here, the collaboration between the AI models also led to improved results, demonstrating the versatility and robustness of this method. By harnessing this dual-AI framework, this approach stands as a pioneering method for enhancing experimental accuracy and efficiency in materials research, with applications extending beyond CRESt to broader scientific experimentation and analysis.

* 10 pages, 4 figures

Frankenstein Optimizer: Harnessing the Potential by Revisiting Optimization Tricks

Mar 04, 2025

Chia-Wei Hsu, Nien-Ti Tsou, Yu-Cheng Chen, Yang Jeong Park, Ju Li

Abstract: Gradient-based optimization drives the unprecedented performance of modern deep neural network models across diverse applications. Adaptive algorithms have accelerated neural network training due to their rapid convergence rates; however, they struggle to find "flat minima" reliably, resulting in suboptimal generalization compared to stochastic gradient descent (SGD). By revisiting various adaptive algorithms' mechanisms, we propose the Frankenstein optimizer, which combines their advantages. The proposed Frankenstein dynamically adjusts first- and second-momentum coefficients according to the optimizer's current state to directly maintain consistent learning dynamics and immediately reflect sudden gradient changes. Extensive experiments across several research domains such as computer vision, natural language processing, few-shot learning, and scientific simulations show that Frankenstein surpasses existing adaptive algorithms and SGD empirically regarding convergence speed and generalization performance. Furthermore, this research deepens our understanding of adaptive algorithms through centered kernel alignment analysis and loss landscape visualization during the learning process.
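
The abstract does not state Frankenstein's update rule, so the following is background only: a minimal sketch of a standard Adam step, in which `beta1` and `beta2` are the fixed first- and second-moment coefficients that an optimizer of the kind described would adjust during training instead. The function names and the toy objective f(x) = x^2 are illustrative assumptions, not taken from the paper.

```python
import math


def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Adam update for a scalar parameter.

    This is plain Adam, NOT the Frankenstein rule: beta1 and beta2 stay fixed.
    Returns the updated (param, m, v).
    """
    # Exponential moving averages of the gradient and the squared gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero-initialised averages (t starts at 1).
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


def minimize_quadratic(steps=200, lr=0.1, start=5.0):
    """Run Adam on the toy objective f(x) = x^2 (hypothetical example)."""
    x, m, v = start, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * x  # derivative of x^2
        x, m, v = adam_step(x, grad, m, v, t, lr=lr)
    return x
```

On the first step the bias-corrected ratio has magnitude close to 1, so the parameter moves by roughly `lr` regardless of the gradient scale. That scale invariance is one reason the choice of moment coefficients shapes the learning dynamics.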

Contrastive Learning of English Language and Crystal Graphs for Multimodal Representation of Materials Knowledge

Feb 23, 2025

Yang Jeong Park, Mayank Kumaran, Chia-Wei Hsu, Elsa Olivetti, Ju Li

Abstract: Artificial intelligence (AI) is increasingly used for the inverse design of materials, such as crystals and molecules. Existing AI research on molecules has integrated chemical structures of molecules with textual knowledge to adapt to complex instructions. However, this approach has been unattainable for crystals due to data scarcity from the biased distribution of investigated crystals and the lack of semantic supervision in peer-reviewed literature. In this work, we introduce a contrastive language-crystals model (CLaC) pre-trained on a newly synthesized dataset of 126k crystal structure-text pairs. To demonstrate the advantage of using synthetic data to overcome data scarcity, we constructed a comparable dataset extracted from academic papers. We evaluate CLaC's generalization ability through various zero-shot cross-modal tasks and downstream applications. In experiments, CLaC achieves state-of-the-art zero-shot generalization performance in understanding crystal structures, surpassing the latest large language models.
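
The abstract does not specify CLaC's training objective. A common choice for contrastive pre-training on paired modalities is a symmetric InfoNCE (CLIP-style) loss, and the sketch below illustrates that general technique under the assumption that each crystal-graph embedding is paired with the text embedding at the same index. All names, the temperature value, and the toy embeddings are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _cross_entropy_diag(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        mx = max(row)  # subtract the max for numerical stability
        log_sum = mx + math.log(sum(math.exp(z - mx) for z in row))
        total += log_sum - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(graph_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    graph_emb[i] and text_emb[i] are assumed to describe the same crystal.
    This is a generic sketch, not the loss reported for CLaC.
    """
    g = [_normalize(v) for v in graph_emb]
    t = [_normalize(v) for v in text_emb]
    # Temperature-scaled cosine similarity between every graph and every text.
    logits = [[sum(a * b for a, b in zip(gi, tj)) / temperature for tj in t]
              for gi in g]
    logits_t = [list(col) for col in zip(*logits)]  # text-to-graph direction
    return 0.5 * (_cross_entropy_diag(logits) + _cross_entropy_diag(logits_t))


# Toy batch: correctly paired embeddings give a much lower loss than swapped ones.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[1.0, 0.0], [0.0, 1.0]])
swapped = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]],
                                     [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing a loss of this form pulls matching structure-text pairs together and pushes mismatched pairs apart, which is what makes zero-shot cross-modal retrieval possible afterwards.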

* 24 pages, 14 figures

Bounds of Block Rewards in Honest PinFi Systems

Apr 01, 2024

Qi He, Yunwei Mao, Ju Li

Abstract: PinFi is a class of novel protocols for decentralized pricing of dissipative assets, whose value naturally declines over time. Central to the protocol's functionality and its market efficiency is the role of liquidity providers (LPs). This study addresses critical stability and sustainability challenges within the protocol, namely: the propensity of LPs to prefer selling in external markets over participation in the protocol; a similar inclination towards selling within the PinFi system rather than contributing as LPs; and a scenario where LPs are disinclined to sell within the protocol. Employing a game-theoretic approach, we explore PinFi's mechanisms and its broader ramifications. Our findings reveal that, under a variety of common conditions and with an assumption of participant integrity, PinFi is capable of fostering a dynamic equilibrium among LPs, sellers, and buyers. This balance is maintained through a carefully calibrated range of block rewards for LPs, ensuring the protocol's long-term stability and utility.

Generative Model for Constructing Reaction Path from Initial to Final States

Jan 19, 2024

Akihide Hayashi, So Takamoto, Ju Li, Daisuke Okanohara

Abstract: Mapping out reaction pathways and their corresponding activation barriers is a significant aspect of molecular simulation. Given their inherent complexity and nonlinearity, even generating an initial guess of these paths remains a challenging problem. Presented in this paper is an innovative approach that utilizes neural networks to generate initial guesses for these reaction pathways. The proposed method is initiated by inputting the coordinates of the initial state, followed by progressive alterations to its structure. This iterative process culminates in the generation of the approximate representation of the reaction path and the coordinates of the final state. The application of this method extends to complex reaction pathways illustrated by organic reactions. Training was executed on the Transition1x dataset, an organic reaction pathway dataset. The results revealed generation of reactions that bore substantial similarities with the corresponding test data. The method's flexibility allows for reactions to be generated either to conform to predetermined conditions or in a randomized manner.

Blind quantum machine learning with quantum bipartite correlator

Oct 19, 2023

Changhao Li, Boning Li, Omar Amer, Ruslan Shaydulin, Shouvanik Chakrabarti, Guoqing Wang, Haowei Xu, Hao Tang, Isidor Schoch, Niraj Kumar (+4 more)

Abstract: Distributed quantum computing is a promising computational paradigm for performing computations that are beyond the reach of individual quantum devices. Privacy in distributed quantum computing is critical for maintaining confidentiality and protecting the data in the presence of untrusted computing nodes. In this work, we introduce novel blind quantum machine learning protocols based on the quantum bipartite correlator algorithm. Our protocols have reduced communication overhead while preserving the privacy of data from untrusted parties. We introduce robust algorithm-specific privacy-preserving mechanisms with low computational overhead that do not require complex cryptographic techniques. We then validate the effectiveness of the proposed protocols through complexity and privacy analysis. Our findings pave the way for advancements in distributed quantum computing, opening up new possibilities for privacy-aware machine learning applications in the era of quantum technologies.

* 11 pages, 3 figures

1.5 million materials narratives generated by chatbots

Aug 25, 2023

Yang Jeong Park, Sung Eun Jerng, Jin-Sung Park, Choah Kwon, Chia-Wei Hsu, Zhichu Ren, Sungroh Yoon, Ju Li

Abstract: The advent of artificial intelligence (AI) has enabled a comprehensive exploration of materials for various applications. However, AI models often prioritize frequently encountered materials in the scientific literature, limiting the selection of suitable candidates based on inherent physical and chemical properties. To address this imbalance, we have generated a dataset of 1,494,017 natural language-material paragraphs based on combined OQMD, Materials Project, JARVIS, COD and AFLOW2 databases, which are dominated by ab initio calculations and tend to be much more evenly distributed on the periodic table. The generated text narratives were then polled and scored by both human experts and ChatGPT-4, based on three rubrics: technical accuracy, language and structure, and relevance and depth of content, showing similar scores but with human-scored depth of content being the most lagging. The merger of multi-modality data sources and large language model (LLM) holds immense potential for AI frameworks to help the exploration and discovery of solid-state materials for specific applications.

A new 3-DOF 2T1R parallel mechanism: Topology design and kinematics

Jun 22, 2023

Huiping Shen, Zhongqiu Du, Damien Chablat, Ju Li, Guanglei Wu

Abstract: This article presents a new three-degree-of-freedom (3-DOF) parallel mechanism (PM) with two translations and one rotation (2T1R), designed based on the topological design theory of the parallel mechanism using position and orientation characteristics (POC). The PM is primarily intended for use in package sorting and delivery. The mobile platform of the PM moves along a translation axis, picks up objects from a conveyor belt, and tilts them to either side of the axis. We first calculate the PM's topological characteristics, such as the degree of freedom (DOF) and the degree of coupling, and provide its topological analytical formula to represent the topological information of the PM. Next, we solve the direct and inverse kinematic models based on the kinematic modelling principle using the topological features. The models are purely analytic and are broken down into a series of quadratic equations, making them suitable for use in an industrial robot. We also study the singular configurations to identify the serial and parallel singularities. Using the decoupling properties, we size the mechanism to address the package sorting and depositing problem using an algebraic approach. To determine the smallest segment lengths, we use a cylindrical algebraic decomposition to solve a system with inequalities.

* IDETC-CIE 2023 International Design Engineering Technical Conferences & Computers and Information in Engineering Conference, ASME, Aug 2023, Boston, United States

Can ChatGPT be used to generate scientific hypotheses?

Mar 30, 2023

Yang Jeong Park, Daniel Kaplan, Zhichu Ren, Chia-Wei Hsu, Changhao Li, Haowei Xu, Sipei Li, Ju Li

Abstract: We investigate whether large language models can perform the creative hypothesis generation that human researchers regularly do. While the error rate is high, generative AI seems to be able to effectively structure vast amounts of scientific knowledge and provide interesting and testable hypotheses. The future scientific enterprise may include synergistic efforts with a swarm of "hypothesis machines", challenged by automated experimentation and adversarial peer reviews.
