Zhenyu Wu

School of Computing and Artificial Intelligence, Southwest Jiaotong University

VLA-Reasoner: Empowering Vision-Language-Action Models with Reasoning via Online Monte Carlo Tree Search

Sep 26, 2025

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Sep 18, 2025

Schema Inference for Tabular Data Repositories Using Large Language Models

Sep 04, 2025

SafeBimanual: Diffusion-based Trajectory Optimization for Safe Bimanual Manipulation

Aug 25, 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025

Superpixel-informed Continuous Low-Rank Tensor Representation for Multi-Dimensional Data Recovery

Aug 17, 2025

UniLGL: Learning Uniform Place Recognition for FOV-limited/Panoramic LiDAR Global Localization

Jul 16, 2025

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

May 26, 2025

Towards NSFW-Free Text-to-Image Generation via Safety-Constraint Direct Preference Optimization

Apr 19, 2025

Bearing fault diagnosis based on multi-scale spectral images and convolutional neural network

Mar 27, 2025