Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yunkai Wang

SpeHeatal: A Cluster-Enhanced Segmentation Method for Sperm Morphology Analysis

Feb 18, 2025

Yi Shi, Yunkai Wang, Xupeng Tian, Tieyi Zhang, Bing Yao, Hui Wang, Yong Shao, Cencen Wang, Rong Zeng

Abstract:The accurate assessment of sperm morphology is crucial in andrological diagnostics, where the segmentation of sperm images presents significant challenges. Existing approaches frequently rely on large annotated datasets and often struggle with the segmentation of overlapping sperm and the presence of dye impurities. To address these challenges, this paper first analyzes the issue of overlapping sperm tails from a geometric perspective and introduces a novel clustering algorithm, Con2Dis, which effectively segments overlapping tails by considering three essential factors: CONnectivity, CONformity, and DIStance. Building on this foundation, we propose an unsupervised method, SpeHeatal, designed for the comprehensive segmentation of the SPErm HEAd and TAiL. SpeHeatal employs the Segment Anything Model(SAM) to generate masks for sperm heads while filtering out dye impurities, utilizes Con2Dis to segment tails, and then applies a tailored mask splicing technique to produce complete sperm masks. Experimental results underscore the superior performance of SpeHeatal, particularly in handling images with overlapping sperm.

* AAAI2025

Via

Access Paper or Ask Questions

Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow

Apr 25, 2023

Dongkun Zhang, Jintao Xue, Yuxiang Cui, Yunkai Wang, Eryun Liu, Wei Jing, Junbo Chen, Rong Xiong, Yue Wang

Figure 1 for Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow

Figure 2 for Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow

Figure 3 for Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow

Figure 4 for Zero-shot Transfer Learning of Driving Policy via Socially Adversarial Traffic Flow

Abstract:Acquiring driving policies that can transfer to unseen environments is challenging when driving in dense traffic flows. The design of traffic flow is essential and previous studies are unable to balance interaction and safety-criticism. To tackle this problem, we propose a socially adversarial traffic flow. We propose a Contextual Partially-Observable Stochastic Game to model traffic flow and assign Social Value Orientation (SVO) as context. We then adopt a two-stage framework. In Stage 1, each agent in our socially-aware traffic flow is driven by a hierarchical policy where upper-level policy communicates genuine SVOs of all agents, which the lower-level policy takes as input. In Stage 2, each agent in the socially adversarial traffic flow is driven by the hierarchical policy where upper-level communicates mistaken SVOs, taken by the lower-level policy trained in Stage 1. Driving policy is adversarially trained through a zero-sum game formulation with upper-level policies, resulting in a policy with enhanced zero-shot transfer capability to unseen traffic flows. Comprehensive experiments on cross-validation verify the superior zero-shot transfer performance of our method.

Via

Access Paper or Ask Questions

Least Square Estimation Network for Depth Completion

Mar 07, 2022

Xianze Fang, Zexi Chen, Yunkai Wang, Yue Wang, Rong Xiong

Figure 1 for Least Square Estimation Network for Depth Completion

Figure 2 for Least Square Estimation Network for Depth Completion

Figure 3 for Least Square Estimation Network for Depth Completion

Figure 4 for Least Square Estimation Network for Depth Completion

Abstract:Depth completion is a fundamental task in computer vision and robotics research. Many previous works complete the dense depth map with neural networks directly but most of them are non-interpretable and can not generalize to different situations well. In this paper, we propose an effective image representation method for depth completion tasks. The input of our system is a monocular camera frame and the synchronous sparse depth map. The output of our system is a dense per-pixel depth map of the frame. First we use a neural network to transform each pixel into a feature vector, which we call base functions. Then we pick out the known pixels' base functions and their depth values. We use a linear least square algorithm to fit the base functions and the depth values. Then we get the weights estimated from the least square algorithm. Finally, we apply the weights to the whole image and predict the final depth map. Our method is interpretable so it can generalize well. Experiments show that our results beat the state-of-the-art on NYU-Depth-V2 dataset both in accuracy and runtime. Moreover, experiments show that our method can generalize well on different numbers of sparse points and different datasets.

Via

Access Paper or Ask Questions

Domain Generalization for Vision-based Driving Trajectory Generation

Sep 22, 2021

Yunkai Wang, Dongkun Zhang, Yuxiang Cui, Zexi Chen, Wei Jing, Junbo Chen, Rong Xiong, Yue Wang

Figure 1 for Domain Generalization for Vision-based Driving Trajectory Generation

Figure 2 for Domain Generalization for Vision-based Driving Trajectory Generation

Figure 3 for Domain Generalization for Vision-based Driving Trajectory Generation

Figure 4 for Domain Generalization for Vision-based Driving Trajectory Generation

Abstract:One of the challenges in vision-based driving trajectory generation is dealing with out-of-distribution scenarios. In this paper, we propose a domain generalization method for vision-based driving trajectory generation for autonomous vehicles in urban environments, which can be seen as a solution to extend the Invariant Risk Minimization (IRM) method in complex problems. We leverage an adversarial learning approach to train a trajectory generator as the decoder. Based on the pre-trained decoder, we infer the latent variables corresponding to the trajectories, and pre-train the encoder by regressing the inferred latent variable. Finally, we fix the decoder but fine-tune the encoder with the final trajectory loss. We compare our proposed method with the state-of-the-art trajectory generation method and some recent domain generalization methods on both datasets and simulation, demonstrating that our method has better generalization ability.

Via

Access Paper or Ask Questions

Learn to Differ: Sim2Real Small Defection Segmentation Network

Mar 07, 2021

Zexi Chen, Zheyuan Huang, Yunkai Wang, Xuecheng Xu, Yue Wang, Rong Xiong

Figure 1 for Learn to Differ: Sim2Real Small Defection Segmentation Network

Figure 2 for Learn to Differ: Sim2Real Small Defection Segmentation Network

Figure 3 for Learn to Differ: Sim2Real Small Defection Segmentation Network

Figure 4 for Learn to Differ: Sim2Real Small Defection Segmentation Network

Abstract:Recent studies on deep-learning-based small defection segmentation approaches are trained in specific settings and tend to be limited by fixed context. Throughout the training, the network inevitably learns the representation of the background of the training data before figuring out the defection. They underperform in the inference stage once the context changed and can only be solved by training in every new setting. This eventually leads to the limitation in practical robotic applications where contexts keep varying. To cope with this, instead of training a network context by context and hoping it to generalize, why not stop misleading it with any limited context and start training it with pure simulation? In this paper, we propose the network SSDS that learns a way of distinguishing small defections between two images regardless of the context, so that the network can be trained once for all. A small defection detection layer utilizing the pose sensitivity of phase correlation between images is introduced and is followed by an outlier masking layer. The network is trained on randomly generated simulated data with simple shapes and is generalized across the real world. Finally, SSDS is validated on real-world collected data and demonstrates the ability that even when trained in cheap simulation, SSDS can still find small defections in the real world showing the effectiveness and its potential for practical applications.

Via

Access Paper or Ask Questions

Pose Randomization for Weakly Paired Image Style Translation

Oct 31, 2020

Zexi Chen, Jiaxin Guo, Xuecheng Xu, Yunkai Wang, Yue Wang, Rong Xiong

Figure 1 for Pose Randomization for Weakly Paired Image Style Translation

Figure 2 for Pose Randomization for Weakly Paired Image Style Translation

Figure 3 for Pose Randomization for Weakly Paired Image Style Translation

Figure 4 for Pose Randomization for Weakly Paired Image Style Translation

Abstract:Utilizing the trained model under different conditions without data annotation is attractive for robot applications. Towards this goal, one class of methods is to translate the image style from the training environment to the current one. Conventional studies on image style translation mainly focus on two settings: paired data on images from two domains with exactly aligned content, and unpaired data, with independent content. In this paper, we would like to propose a new setting, where the content in the two images is aligned with error in poses. We consider that this setting is more practical since robots with various sensors are able to align the data up to some error level, even with different styles. To solve this problem, we propose PRoGAN to learn a style translator by intentionally transforming the original domain images with a noisy pose, then matching the distribution of translated transformed images and the distribution of the target domain images. The adversarial training enforces the network to learn the style translation, avoiding being entangled with other variations. In addition, we propose two pose estimation based self-supervised tasks to further improve the performance. Finally, PRoGAN is validated on both simulated and real-world collected data to show the effectiveness. Results on down-stream tasks, classification, road segmentation, object detection, and feature matching show its potential for real applications. https://github.com/wrld/PRoGAN .

Via

Access Paper or Ask Questions

Imitation Learning of Hierarchical Driving Model: from Continuous Intention to Continuous Trajectory

Oct 20, 2020

Yunkai Wang, Dongkun Zhang, Jingke Wang, Zexi Chen, Yue Wang, Rong Xiong

Figure 1 for Imitation Learning of Hierarchical Driving Model: from Continuous Intention to Continuous Trajectory

Figure 2 for Imitation Learning of Hierarchical Driving Model: from Continuous Intention to Continuous Trajectory

Figure 3 for Imitation Learning of Hierarchical Driving Model: from Continuous Intention to Continuous Trajectory

Figure 4 for Imitation Learning of Hierarchical Driving Model: from Continuous Intention to Continuous Trajectory

Abstract:One of the challenges to reduce the gap between the machine and the human level driving is how to endow the system with the learning capacity to deal with the coupled complexity of environments, intentions, and dynamics. In this paper, we propose a hierarchical driving model with explicit model of continuous intention and continuous dynamics, which decouples the complexity in the observation-to-action reasoning in the human driving data. Specifically, the continuous intention module takes the route planning and perception to generate a potential map encoded with obstacles and goals being expressed as grid based potentials. Then, the potential map is regarded as a condition, together with the current dynamics, to generate the trajectory. The trajectory is modeled by a network based continuous function approximator, which naturally reserves the derivatives for high-order supervision without any additional parameters. Finally, we validate our method on both datasets and simulators, demonstrating superior performance. The method is also deployed on the real vehicle with loop latency, validating its effectiveness.

Via

Access Paper or Ask Questions

Collaborative Localization of Aerial and Ground Mobile Robots through Orthomosaic Map

Jul 22, 2020

Zexi Chen, Xuecheng Xu, Yue Wang, Yunkai Wang, Rong Xiong

Figure 1 for Collaborative Localization of Aerial and Ground Mobile Robots through Orthomosaic Map

Figure 2 for Collaborative Localization of Aerial and Ground Mobile Robots through Orthomosaic Map

Figure 3 for Collaborative Localization of Aerial and Ground Mobile Robots through Orthomosaic Map

Figure 4 for Collaborative Localization of Aerial and Ground Mobile Robots through Orthomosaic Map

Abstract:With the deepening of research on the SLAM system, the possibility of cooperative SLAM with multi-robots has been proposed. This paper presents a map matching and localization approach considering the cooperative SLAM of an aerial-ground system. The proposed approach aims to help precisely matching the map constructed by two independent systems that have large scale variance of viewpoints of the same route and eventually enables the ground mobile robot to localize itself in the global map given by the drone. It contains dense mapping with Elevation Map and software "Metashape", map matching with a proposed template matching algorithm, weighted normalized cross-correlation (WNCC) and localization with particle filter. The approach enables map matching for cooperative SLAM with the feasibility of multiple scene sensors, varies from stereo cameras to lidars, and is insensitive to the synchronization of the two systems. We demonstrate the accuracy, robustness, and the speed of the approach under experiments of the Aero-Ground Dataset.

Via

Access Paper or Ask Questions

Multi-agent Collaboration for Feasible Collaborative Behavior Construction and Evaluation

Sep 30, 2019

Yunkai Wang, Shenhan Jia, Zexi Chen, Zheyuan Huang, Rong Xiong

Figure 1 for Multi-agent Collaboration for Feasible Collaborative Behavior Construction and Evaluation

Figure 2 for Multi-agent Collaboration for Feasible Collaborative Behavior Construction and Evaluation

Figure 3 for Multi-agent Collaboration for Feasible Collaborative Behavior Construction and Evaluation

Figure 4 for Multi-agent Collaboration for Feasible Collaborative Behavior Construction and Evaluation

Abstract:In the case of the two-person zero-sum stochastic game with a central controller, this paper proposes a best collaborative behavior search and selection algorithm based on reinforcement learning, in response to how to choose the best collaborative object and action for the central controller. In view of the existing multi-agent collaboration and confrontation reinforcement learning methods, the methods of traversing all actions in a certain state leads to the problem of long calculation time and unsafe policy exploration. This paper proposes to construct a feasible collaborative behavior set by using action space discretization, establishing models of both sides, model-based prediction and parallel search. Then, we use the deep q-learning method in reinforcement learning to train the scoring function to select the optimal collaboration behavior from the feasible collaborative behavior set. This method enables efficient and accurate calculation in an environment with strong confrontation, high dynamics and a large number of agents, which is verified by the RoboCup Small Size League robots passing collaboration.

* 7 pages, IEEE ROBIO 2019

Via

Access Paper or Ask Questions

Champion Team Paper: Dynamic Passing-Shooting Algorithm Based on CUDA of The RoboCup SSL 2019 Champion

Sep 17, 2019

Zexi Chen, Haodong Zhang, Dashun Guo, Shenhan Jia, Xianze Fang, Zheyuan Huang, Yunkai Wang, Peng Hu, Licheng Wen, Lingyun Chen(+2 more)

Figure 1 for Champion Team Paper: Dynamic Passing-Shooting Algorithm Based on CUDA of The RoboCup SSL 2019 Champion

Figure 2 for Champion Team Paper: Dynamic Passing-Shooting Algorithm Based on CUDA of The RoboCup SSL 2019 Champion

Figure 3 for Champion Team Paper: Dynamic Passing-Shooting Algorithm Based on CUDA of The RoboCup SSL 2019 Champion

Figure 4 for Champion Team Paper: Dynamic Passing-Shooting Algorithm Based on CUDA of The RoboCup SSL 2019 Champion

Abstract:ZJUNlict became the Small Size League Champion of RoboCup 2019 with 6 victories and 1 tie for their 7 games. The overwhelming ability of ball-handling and passing allows ZJUNlict to greatly threaten its opponent and almost kept its goal clear without being threatened. This paper presents the core technology of its ball-handling and robot movement which consist of hardware optimization, dynamic passing and shooting strategy, and multi-agent cooperation and formation. We first describe the mechanical optimization on the placement of the capacitors, the redesign of the damping system of the dribbler and the electrical optimization on the replacement of the core chip. We then describe our passing point algorithm. The passing and shooting strategy can be separated into two different parts, where we search the passing point on SBIP-DPPS and evaluate the point based on the ball model. The statements and the conclusion should be supported by the performances and log of games on Small Size League RoboCup 2019.

* RoboCup SSL 2019 Champion Paper

Via

Access Paper or Ask Questions