
Ye Chen

Long-Distance Field Demonstration of Imaging-Free Drone Identification in Intracity Environments

Apr 26, 2025

Junran Guo, Tonglin Mu, Keyuan Li, Jianing Li, Ziyang Luo, Ye Chen, Xiaodong Fan, Jinquan Huang, Minjie Liu, Jinbei Zhang(+3 more)

Abstract:Detecting small objects, such as drones, over long distances presents a significant challenge with broad implications for security, surveillance, environmental monitoring, and autonomous systems. Traditional imaging-based methods rely on high-resolution image acquisition, but are often constrained by range, power consumption, and cost. In contrast, data-driven single-photon-single-pixel light detection and ranging (D²SP²-LiDAR) provides an imaging-free alternative, directly enabling target identification while reducing system complexity and cost. However, its detection range has been limited to a few hundred meters. Here, we introduce a novel integration of residual neural networks (ResNet) with D²SP²-LiDAR, incorporating a refined observation model to extend the detection range to 5 km in an intracity environment while enabling high-accuracy identification of drone poses and types. Experimental results demonstrate that our approach not only outperforms conventional imaging-based recognition systems, but also achieves 94.93% pose identification accuracy and 97.99% type classification accuracy, even under weak signal conditions with long distances and low signal-to-noise ratios (SNRs). These findings highlight the potential of imaging-free methods for robust long-range detection of small targets in real-world scenarios.

* 15 pages, 9 figures

InstantSticker: Realistic Decal Blending via Disentangled Object Reconstruction

Apr 09, 2025

Yi Zhang, Xiaoyang Huang, Yishun Dou, Yue Shi, Rui Shi, Ye Chen, Bingbing Ni, Wenjun Zhang

Abstract:We present InstantSticker, a disentangled reconstruction pipeline based on Image-Based Lighting (IBL), which focuses on highly realistic decal blending, simulates stickers attached to the reconstructed surface, and allows for instant editing and real-time rendering. To achieve a stereoscopic impression of the decal, we introduce a shadow factor into IBL, which can be adaptively optimized during training. This allows the shadow brightness of surfaces to be accurately decomposed rather than baked into the diffuse color, ensuring that the edited texture exhibits authentic shading. To address the issues of warping and blurriness in previous methods, we apply As-Rigid-As-Possible (ARAP) parameterization to pre-unfold a specified area of the mesh and use the local UV mapping combined with a neural texture map to enhance the ability to express high-frequency details in that area. For instant editing, we utilize the Disney BRDF model, explicitly defining material colors with 3-channel diffuse albedo. This enables instant replacement of albedo RGB values during the editing process, avoiding the prolonged optimization required in previous approaches. In our experiments, we introduce the Ratio Variance Warping (RVW) metric to evaluate the local geometric warping of the decal area. Extensive experimental results demonstrate that our method surpasses previous decal blending methods in terms of editing quality, editing speed and rendering speed, achieving state-of-the-art results.

* Accepted by AAAI 2025

AMR-Transformer: Enabling Efficient Long-range Interaction for Complex Neural Fluid Simulation

Mar 13, 2025

Zeyi Xu, Jinfan Liu, Kuangxu Chen, Ye Chen, Zhangli Hu, Bingbing Ni

Abstract:Accurately and efficiently simulating complex fluid dynamics is a challenging task that has traditionally relied on computationally intensive methods. Neural network-based approaches, such as convolutional and graph neural networks, have partially alleviated this burden by enabling efficient local feature extraction. However, they struggle to capture long-range dependencies due to limited receptive fields, and Transformer-based models, while providing global context, incur prohibitive computational costs. To tackle these challenges, we propose AMR-Transformer, an efficient and accurate neural CFD-solving pipeline that integrates a novel adaptive mesh refinement scheme with a Navier-Stokes constraint-aware fast pruning module. This design encourages long-range interactions between simulation cells and facilitates the modeling of global fluid wave patterns, such as turbulence and shockwaves. Experiments show that our approach achieves significant gains in efficiency while preserving critical details, making it suitable for high-resolution physical simulations with long-range dependencies. On CFDBench, PDEBench and a new shockwave dataset, our pipeline demonstrates up to an order-of-magnitude improvement in accuracy over baseline models. Additionally, compared to ViT, our approach achieves a reduction in FLOPs of up to 60 times.

MedCT: A Clinical Terminology Graph for Generative AI Applications in Healthcare

Jan 11, 2025

Ye Chen, Dongdong Huang, Haoyun Xu, Cong Fu, Lin Sheng, Qingli Zhou, Yuqiang Shen, Kai Wang

Abstract:We introduce the world's first clinical terminology for the Chinese healthcare community, namely MedCT, accompanied by a clinical foundation model MedBERT and an entity linking model MedLink. The MedCT system enables standardized and programmable representation of Chinese clinical data, successively stimulating the development of new medicines, treatment pathways, and better patient outcomes for the populous Chinese community. Moreover, the MedCT knowledge graph provides a principled mechanism to minimize the hallucination problem of large language models (LLMs), therefore achieving significant levels of accuracy and safety in LLM-based clinical applications. By leveraging the LLMs' emergent capabilities of generativeness and expressiveness, we were able to rapidly build a production-quality terminology system and deploy it to the real-world clinical field within three months, whereas classical terminologies like SNOMED CT have gone through more than twenty years of development. Our experiments show that the MedCT system achieves state-of-the-art (SOTA) performance in semantic matching and entity linking tasks, not only for Chinese but also for English. We also conducted a longitudinal field experiment by applying MedCT and LLMs in a representative spectrum of clinical tasks, including electronic health record (EHR) auto-generation and medical document search for diagnostic decision making. Our study shows a multitude of values of MedCT for clinical workflows and patient outcomes, especially in the new genre of clinical LLM applications. We present our approach in sufficient engineering detail, such that implementing a clinical terminology for other non-English societies should be readily reproducible. We openly release our terminology, models and algorithms, along with real-world clinical datasets for the development.

SoftTiger: A Clinical Foundation Model for Healthcare Workflows

Mar 01, 2024

Ye Chen, Igor Couto, Wei Cai, Cong Fu, Bruno Dorneles

Abstract:We release and introduce SoftTiger, a clinical large language model (CLaM) designed as a foundation model for healthcare workflows. The narrative and unstructured nature of clinical notes is a major obstacle for healthcare intelligentization. We address a critical problem of structuring clinical notes into clinical data, according to international interoperability standards. We collect and annotate data for three critical subtasks, namely, international patient summary, clinical impression and medical encounter. We then perform supervised fine-tuning of a state-of-the-art LLM using public and credentialed clinical data. The training is orchestrated in a way that the target model can first support basic clinical tasks such as abbreviation expansion and temporal information extraction, and then learn to perform more complex downstream clinical tasks such as impression and encounter summary. Moreover, we address several modeling challenges in the healthcare context, e.g., an extra-long context window. Our blind pairwise evaluation shows that SoftTiger outperforms other popular open-source models and GPT-3.5, is comparable to Gemini-pro, and has only a mild gap from GPT-4. We believe that LLMs may become a stepping stone towards healthcare digitalization and democratization. Therefore, we publicly release SoftTiger models at scales of 13 billion and 70 billion parameters, as well as datasets and code for our innovative scalable evaluation, hopefully making a significant contribution to the healthcare industry.

TigerBot: An Open Multilingual Multitask LLM

Dec 15, 2023

Ye Chen, Wei Cai, Liangmin Wu, Xiaowei Li, Zhanxuan Xin, Cong Fu

Abstract:We release and introduce the TigerBot family of large language models (LLMs), consisting of base and chat models, with sizes of 7, 13, 70 and 180 billion parameters. We develop our models starting from Llama-2 and BLOOM, and push the boundary further in data, training algorithm, infrastructure, and application tools. Our models yield meaningful performance gains over SOTA open-source models, e.g., Llama-2, specifically a 6% gain in English and a 20% gain in Chinese. The TigerBot model family also achieves leading performance in major academic and industrial benchmarks and leaderboards. We believe that TigerBot represents just a snapshot of lightning-fast progression in the LLM open-source community. Therefore, we are thrilled to give back by publicly releasing our models and reporting the approach behind them, with additional emphases on building SOTA LLMs in a democratized way and making LLMs of use in real-world applications.

Inferring Fluid Dynamics via Inverse Rendering

Apr 10, 2023

Jinxian Liu, Ye Chen, Bingbing Ni, Jiyao Mao, Zhenbo Yu

Abstract:Humans have a strong intuitive understanding of physical processes such as falling fluid from just a glimpse of a picture of the scene, i.e., an understanding quickly derived from our immersive visual experiences in memory. This work achieves such a photo-to-fluid-dynamics reconstruction functionality learned from unannotated videos, without any supervision of ground-truth fluid dynamics. In a nutshell, a differentiable Euler simulator modeled with a ConvNet-based pressure projection solver is integrated with a volumetric renderer, supporting end-to-end/coherent differentiable dynamic simulation and rendering. By endowing each sampled point with a fluid volume value, we derive a NeRF-like differentiable renderer dedicated to fluid data; and thanks to this volume-augmented representation, fluid dynamics can be inversely inferred from the error signal between the rendered result and the ground-truth video frame (i.e., inverse rendering). Experiments on our generated Fluid Fall datasets and the DPI Dam Break dataset are conducted to demonstrate both the effectiveness and the generalization ability of our method.

Deep Learning-Based Autoencoder for Data-Driven Modeling of an RF Photoinjector

Feb 18, 2021

Jun Zhu, Ye Chen, Frank Brinker, Winfried Decking, Sergey Tomin, Holger Schlarb

Abstract:We adopt a data-driven approach to model the longitudinal phase-space diagnostic beamline at the European XFEL photoinjector. A deep convolutional neural network (decoder) is used to build a 2D distribution from a small feature space learned by another neural network (encoder). We demonstrate that the autoencoder trained on experimental data can make very high-quality predictions of megapixel images for the longitudinal phase-space measurement. The prediction significantly outperforms existing methods. We also show the explicability of the autoencoder by sharing the same decoder with more than one encoder used for different setups of the photoinjector. This opens the door to a new way of accurately modeling a photoinjector using neural networks. The approach can possibly be extended to the whole accelerator and even the photon beamlines.

Limitation of Acyclic Oriented Graphs Matching as Cell Tracking Accuracy Measure when Evaluating Mitosis

Dec 22, 2020

Ye Chen, Yuankai Huo

Abstract:Multi-object tracking (MOT) in computer vision and cell tracking in biomedical image analysis are two similar research fields, whose common aim is to achieve instance-level object detection/segmentation and associate such objects across different video frames. However, one major difference between these two tasks is that cell tracking also aims to detect mitosis (cell division), which is typically not considered in MOT tasks. Therefore, acyclic oriented graphs matching (AOGM) has been used as the de facto standard evaluation metric for cell tracking, rather than directly using the evaluation metrics in computer vision, such as multiple object tracking accuracy (MOTA), ID Switches (IDS), ID F1 Score (IDF1), etc. However, based on our experiments, we realized that AOGM did not always function as expected for mitosis events. In this paper, we exhibit the limitations of evaluating mitosis with AOGM using both simulated and real cell tracking data.

Self-Prediction for Joint Instance and Semantic Segmentation of Point Clouds

Jul 27, 2020

Jinxian Liu, Minghui Yu, Bingbing Ni, Ye Chen

Abstract:We develop a novel learning scheme named Self-Prediction for 3D instance and semantic segmentation of point clouds. Distinct from most existing methods that focus on designing convolutional operators, our method designs a new learning scheme to enhance point relation exploration for better segmentation. More specifically, we divide a point cloud sample into two subsets and construct a complete graph based on their representations. Then we use a label propagation algorithm to predict labels of one subset when given labels of the other subset. By training with this Self-Prediction task, the backbone network is constrained to fully explore relational context/geometric/shape information and learn more discriminative features for segmentation. Moreover, a general associated framework equipped with our Self-Prediction scheme is designed for enhancing instance and semantic segmentation simultaneously, where instance and semantic representations are combined to perform Self-Prediction. In this way, instance and semantic segmentation collaborate and are mutually reinforced. Significant performance improvements on instance and semantic segmentation compared with the baseline are achieved on S3DIS and ShapeNet. Our method achieves state-of-the-art instance segmentation results on S3DIS and semantic segmentation results comparable with state-of-the-art methods on S3DIS and ShapeNet when we only take PointNet++ as the backbone network.

* Accepted to ECCV 2020
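
The label propagation step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes the standard closed-form label propagation on a fully connected graph with a Gaussian affinity, and the function name `propagate_labels` and the parameters `alpha` and `sigma` are hypothetical choices made for illustration only.

```python
import numpy as np

def propagate_labels(features, labels, known_mask, num_classes,
                     alpha=0.99, sigma=1.0):
    """Predict labels for all points from the labels of a known subset.

    Assumed formulation (not taken from the paper): closed-form label
    propagation F = (I - alpha * S)^{-1} Y on a complete graph.
    """
    n = features.shape[0]
    # Complete graph: Gaussian affinity between every pair of points.
    diff = features[:, None, :] - features[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    # Symmetric normalization S = D^{-1/2} W D^{-1/2}.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # One-hot rows for the labeled subset, zero rows for the other subset.
    Y = np.zeros((n, num_classes))
    idx = np.flatnonzero(known_mask)
    Y[idx, labels[idx]] = 1.0
    # Solve the linear system instead of forming the inverse explicitly.
    F = np.linalg.solve(np.eye(n) - alpha * S, Y)
    return F.argmax(axis=1)
```

In a Self-Prediction-style setup, `features` would be the backbone's per-point representations, and the labels predicted for the held-out subset would be compared against their ground truth to form a training signal; how the paper defines that loss is not stated in the abstract.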
