Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Zikang Xu

MTFM: A Scalable and Alignment-free Foundation Model for Industrial Recommendation in Meituan

Feb 13, 2026

Xin Song, Zhilin Guan, Ruidong Han, Binghao Tang, Tianwen Chen, Bing Li, Zihao Li, Han Zhang, Fei Jiang, Qing Wang(+5 more)

Abstract:Industrial recommendation systems typically involve multiple scenarios, yet existing cross-domain (CDR) and multi-scenario (MSR) methods often require prohibitive resources and strict input alignment, limiting their extensibility. We propose MTFM (Meituan Foundation Model for Recommendation), a transformer-based framework that addresses these challenges. Instead of pre-aligning inputs, MTFM transforms cross-domain data into heterogeneous tokens, capturing multi-scenario knowledge in an alignment-free manner. To enhance efficiency, we first introduce a multi-scenario user-level sample aggregation that significantly enhances training throughput by reducing the total number of instances. We further integrate Grouped-Query Attention and a customized Hybrid Target Attention to minimize memory usage and computational complexity. Furthermore, we implement various system-level optimizations, such as kernel fusion and the elimination of CPU-GPU blocking, to further enhance both training and inference throughput. Offline and online experiments validate the effectiveness of MTFM, demonstrating that significant performance gains are achieved by scaling both model capacity and multi-scenario training data.

Via

Access Paper or Ask Questions

FAIR-ESI: Feature Adaptive Importance Refinement for Electrophysiological Source Imaging

Jan 22, 2026

Linyong Zou, Liang Zhang, Xiongfei Wang, Jia-Hong Gao, Yi Sun, Shurong Sheng, Kuntao Xiao, Wanli Yang, Pengfei Teng, Guoming Luan(+2 more)

Abstract:An essential technique for diagnosing brain disorders is electrophysiological source imaging (ESI). While model-based optimization and deep learning methods have achieved promising results in this field, the accurate selection and refinement of features remains a central challenge for precise ESI. This paper proposes FAIR-ESI, a novel framework that adaptively refines feature importance across different views, including FFT-based spectral feature refinement, weighted temporal feature refinement, and self-attention-based patch-wise feature refinement. Extensive experiments on two simulation datasets with diverse configurations and two real-world clinical datasets validate our framework's efficacy, highlighting its potential to advance brain disorder diagnosis and offer new insights into brain function.

Via

Access Paper or Ask Questions

U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Oct 08, 2025

Fenghe Tang, Chengqi Dong, Wenxin Ma, Zikang Xu, Heqin Zhu, Zihang Jiang, Rongsheng Wang, Yuhao Wang, Chenxu Wu, Shaohua Kevin Zhou

Figure 1 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Figure 2 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Figure 3 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Figure 4 for U-Bench: A Comprehensive Understanding of U-Net through 100-Variant Benchmarking

Abstract:Over the past decade, U-Net has been the dominant architecture in medical image segmentation, leading to the development of thousands of U-shaped variants. Despite its widespread adoption, there is still no comprehensive benchmark to systematically evaluate their performance and utility, largely because of insufficient statistical validation and limited consideration of efficiency and generalization across diverse datasets. To bridge this gap, we present U-Bench, the first large-scale, statistically rigorous benchmark that evaluates 100 U-Net variants across 28 datasets and 10 imaging modalities. Our contributions are threefold: (1) Comprehensive Evaluation: U-Bench evaluates models along three key dimensions: statistical robustness, zero-shot generalization, and computational efficiency. We introduce a novel metric, U-Score, which jointly captures the performance-efficiency trade-off, offering a deployment-oriented perspective on model progress. (2) Systematic Analysis and Model Selection Guidance: We summarize key findings from the large-scale evaluation and systematically analyze the impact of dataset characteristics and architectural paradigms on model performance. Based on these insights, we propose a model advisor agent to guide researchers in selecting the most suitable models for specific datasets and tasks. (3) Public Availability: We provide all code, models, protocols, and weights, enabling the community to reproduce our results and extend the benchmark with future methods. In summary, U-Bench not only exposes gaps in previous evaluations but also establishes a foundation for fair, reproducible, and practically relevant benchmarking in the next decade of U-Net-based segmentation models. The project can be accessed at: https://fenghetan9.github.io/ubench. Code is available at: https://github.com/FengheTan9/U-Bench.

* 54 pages. The project can be accessed at: https://fenghetan9.github.io/ubench. Code is available at: https://github.com/FengheTan9/U-Bench

Via

Access Paper or Ask Questions

Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Jun 26, 2025

Qiuyi Qi, Xin Li, Ming Kong, Zikang Xu, Bingdi Chen, Qiang Zhu, S Kevin Zhou

Figure 1 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Figure 2 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Figure 3 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Figure 4 for Style-Aligned Image Composition for Robust Detection of Abnormal Cells in Cytopathology

Abstract:Challenges such as the lack of high-quality annotations, long-tailed data distributions, and inconsistent staining styles pose significant obstacles to training neural networks to detect abnormal cells in cytopathology robustly. This paper proposes a style-aligned image composition (SAIC) method that composes high-fidelity and style-preserved pathological images to enhance the effectiveness and robustness of detection models. Without additional training, SAIC first selects an appropriate candidate from the abnormal cell bank based on attribute guidance. Then, it employs a high-frequency feature reconstruction to achieve a style-aligned and high-fidelity composition of abnormal cells and pathological backgrounds. Finally, it introduces a large vision-language model to filter high-quality synthesis images. Experimental results demonstrate that incorporating SAIC-synthesized images effectively enhances the performance and robustness of abnormal cell detection for tail categories and styles, thereby improving overall detection performance. The comprehensive quality evaluation further confirms the generalizability and practicality of SAIC in clinical application scenarios. Our code will be released at https://github.com/Joey-Qi/SAIC.

* MIDL 2025 Oral

Via

Access Paper or Ask Questions

LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Sep 14, 2024

Qiyuan Wang, Shang Zhao, Zikang Xu, S Kevin Zhou

Figure 1 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Figure 2 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Figure 3 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Figure 4 for LACOSTE: Exploiting stereo and temporal contexts for surgical instrument segmentation

Abstract:Surgical instrument segmentation is instrumental to minimally invasive surgeries and related applications. Most previous methods formulate this task as single-frame-based instance segmentation while ignoring the natural temporal and stereo attributes of a surgical video. As a result, these methods are less robust against the appearance variation through temporal motion and view change. In this work, we propose a novel LACOSTE model that exploits Location-Agnostic COntexts in Stereo and TEmporal images for improved surgical instrument segmentation. Leveraging a query-based segmentation model as core, we design three performance-enhancing modules. Firstly, we design a disparity-guided feature propagation module to enhance depth-aware features explicitly. To generalize well for even only a monocular video, we apply a pseudo stereo scheme to generate complementary right images. Secondly, we propose a stereo-temporal set classifier, which aggregates stereo-temporal contexts in a universal way for making a consolidated prediction and mitigates transient failures. Finally, we propose a location-agnostic classifier to decouple the location bias from mask prediction and enhance the feature semantics. We extensively validate our approach on three public surgical video datasets, including two benchmarks from EndoVis Challenges and one real radical prostatectomy surgery dataset GraSP. Experimental results demonstrate the promising performances of our method, which consistently achieves comparable or favorable results with previous state-of-the-art approaches.

* Preprint submitted to Medical Image Analysis

Via

Access Paper or Ask Questions

FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Jul 01, 2024

Ruinan Jin, Zikang Xu, Yuan Zhong, Qiongsong Yao, Qi Dou, S. Kevin Zhou, Xiaoxiao Li

Figure 1 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Figure 2 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Figure 3 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Figure 4 for FairMedFM: Fairness Benchmarking for Medical Imaging Foundation Models

Abstract:The advent of foundation models (FMs) in healthcare offers unprecedented opportunities to enhance medical diagnostics through automated classification and segmentation tasks. However, these models also raise significant concerns about their fairness, especially when applied to diverse and underrepresented populations in healthcare applications. Currently, there is a lack of comprehensive benchmarks, standardized pipelines, and easily adaptable libraries to evaluate and understand the fairness performance of FMs in medical imaging, leading to considerable challenges in formulating and implementing solutions that ensure equitable outcomes across diverse patient populations. To fill this gap, we introduce FairMedFM, a fairness benchmark for FM research in medical imaging.FairMedFM integrates with 17 popular medical imaging datasets, encompassing different modalities, dimensionalities, and sensitive attributes. It explores 20 widely used FMs, with various usages such as zero-shot learning, linear probing, parameter-efficient fine-tuning, and prompting in various downstream tasks -- classification and segmentation. Our exhaustive analysis evaluates the fairness performance over different evaluation metrics from multiple perspectives, revealing the existence of bias, varied utility-fairness trade-offs on different FMs, consistent disparities on the same datasets regardless FMs, and limited effectiveness of existing unfairness mitigation methods. Checkout FairMedFM's project page and open-sourced codebase, which supports extendible functionalities and applications as well as inclusive for studies on FMs in medical imaging over the long term.

* 29 pages, 17 figures

Via

Access Paper or Ask Questions

APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Mar 08, 2024

Zikang Xu, Fenghe Tang, Quan Quan, Qingsong Yao, S. Kevin Zhou

Figure 1 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Figure 2 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Figure 3 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Figure 4 for APPLE: Adversarial Privacy-aware Perturbations on Latent Embedding for Unfairness Mitigation

Abstract:Ensuring fairness in deep-learning-based segmentors is crucial for health equity. Much effort has been dedicated to mitigating unfairness in the training datasets or procedures. However, with the increasing prevalence of foundation models in medical image analysis, it is hard to train fair models from scratch while preserving utility. In this paper, we propose a novel method, Adversarial Privacy-aware Perturbations on Latent Embedding (APPLE), that can improve the fairness of deployed segmentors by introducing a small latent feature perturber without updating the weights of the original model. By adding perturbation to the latent vector, APPLE decorates the latent vector of segmentors such that no fairness-related features can be passed to the decoder of the segmentors while preserving the architecture and parameters of the segmentor. Experiments on two segmentation datasets and five segmentors (three U-Net-like and two SAM-like) illustrate the effectiveness of our proposed method compared to several unfairness mitigation methods.

Via

Access Paper or Ask Questions

Slide-SAM: Medical SAM Meets Sliding Window

Dec 05, 2023

Quan Quan, Fenghe Tang, Zikang Xu, Heqin Zhu, S. Kevin Zhou

Figure 1 for Slide-SAM: Medical SAM Meets Sliding Window

Figure 2 for Slide-SAM: Medical SAM Meets Sliding Window

Figure 3 for Slide-SAM: Medical SAM Meets Sliding Window

Figure 4 for Slide-SAM: Medical SAM Meets Sliding Window

Abstract:The Segment Anything Model (SAM) has achieved a notable success in two-dimensional image segmentation in natural images. However, the substantial gap between medical and natural images hinders its direct application to medical image segmentation tasks. Particularly in 3D medical images, SAM struggles to learn contextual relationships between slices, limiting its practical applicability. Moreover, applying 2D SAM to 3D images requires prompting the entire volume, which is time- and label-consuming. To address these problems, we propose Slide-SAM, which treats a stack of three adjacent slices as a prediction window. It firstly takes three slices from a 3D volume and point- or bounding box prompts on the central slice as inputs to predict segmentation masks for all three slices. Subsequently, the masks of the top and bottom slices are then used to generate new prompts for adjacent slices. Finally, step-wise prediction can be achieved by sliding the prediction window forward or backward through the entire volume. Our model is trained on multiple public and private medical datasets and demonstrates its effectiveness through extensive 3D segmetnation experiments, with the help of minimal prompts. Code is available at \url{https://github.com/Curli-quan/Slide-SAM}.

Via

Access Paper or Ask Questions

Inspecting Model Fairness in Ultrasound Segmentation Tasks

Dec 05, 2023

Zikang Xu, Fenghe Tang, Quan Quan, Jianrui Ding, Chunping Ning, S. Kevin Zhou

Figure 1 for Inspecting Model Fairness in Ultrasound Segmentation Tasks

Figure 2 for Inspecting Model Fairness in Ultrasound Segmentation Tasks

Figure 3 for Inspecting Model Fairness in Ultrasound Segmentation Tasks

Figure 4 for Inspecting Model Fairness in Ultrasound Segmentation Tasks

Abstract:With the rapid expansion of machine learning and deep learning (DL), researchers are increasingly employing learning-based algorithms to alleviate diagnostic challenges across diverse medical tasks and applications. While advancements in diagnostic precision are notable, some researchers have identified a concerning trend: their models exhibit biased performance across subgroups characterized by different sensitive attributes. This bias not only infringes upon the rights of patients but also has the potential to lead to life-altering consequences. In this paper, we inspect a series of DL segmentation models using two ultrasound datasets, aiming to assess the presence of model unfairness in these specific tasks. Our findings reveal that even state-of-the-art DL algorithms demonstrate unfair behavior in ultrasound segmentation tasks. These results serve as a crucial warning, underscoring the necessity for careful model evaluation before their deployment in real-world scenarios. Such assessments are imperative to ensure ethical considerations and mitigate the risk of adverse impacts on patient outcomes.

* Submitted to ISBI 2024

Via

Access Paper or Ask Questions

FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification

Mar 15, 2023

Zikang Xu, Shang Zhao, Quan Quan, Qingsong Yao, S. Kevin Zhou

Figure 1 for FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification

Figure 2 for FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification

Figure 3 for FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification

Figure 4 for FairAdaBN: Mitigating unfairness with adaptive batch normalization and its application to dermatological disease classification

Abstract:Deep learning is becoming increasingly ubiquitous in medical research and applications while involving sensitive information and even critical diagnosis decisions. Researchers observe a significant performance disparity among subgroups with different demographic attributes, which is called model unfairness, and put lots of effort into carefully designing elegant architectures to address unfairness, which poses heavy training burden, brings poor generalization, and reveals the trade-off between model performance and fairness. To tackle these issues, we propose FairAdaBN by making batch normalization adaptive to sensitive attribute. This simple but effective design can be adopted to several classification backbones that are originally unaware of fairness. Additionally, we derive a novel loss function that restrains statistical parity between subgroups on mini-batches, encouraging the model to converge with considerable fairness. In order to evaluate the trade-off between model performance and fairness, we propose a new metric, named Fairness-Accuracy Trade-off Efficiency (FATE), to compute normalized fairness improvement over accuracy drop. Experiments on two dermatological datasets show that our proposed method outperforms other methods on fairness criteria and FATE.

* Submitted to MICCAI 2023

Via

Access Paper or Ask Questions