Jin Qi

Fragility-aware Classification for Understanding Risk and Improving Generalization

Feb 18, 2025

Chen Yang, Zheng Cui, Daniel Zhuoyu Long, Jin Qi, Ruohan Zhan

Abstract: Classification models play a critical role in data-driven decision-making applications such as medical diagnosis, user profiling, recommendation systems, and default detection. Traditional performance metrics, such as accuracy, focus on overall error rates but fail to account for the confidence of incorrect predictions, thereby overlooking the risk of confident misjudgments. This risk is particularly significant in cost-sensitive and safety-critical domains like medical diagnosis and autonomous driving, where overconfident false predictions may cause severe consequences. To address this issue, we introduce the Fragility Index (FI), a novel metric that evaluates classification performance from a risk-averse perspective by explicitly capturing the tail risk of confident misjudgments. To enhance generalizability, we define FI within the robust satisficing (RS) framework, incorporating data uncertainty. We further develop a model training approach that optimizes FI while maintaining tractability for common loss functions. Specifically, we derive exact reformulations for cross-entropy loss, hinge-type loss, and Lipschitz loss, and extend the approach to deep learning models. Through synthetic experiments and real-world medical diagnosis tasks, we demonstrate that FI effectively identifies misjudgment risk and FI-based training improves model robustness and generalizability. Finally, we extend our framework to deep neural network training, further validating its effectiveness in enhancing deep learning models.
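
To illustrate why accuracy alone can hide confident misjudgments, here is a minimal sketch that compares two hypothetical classifiers with the same accuracy using a generic tail-risk measure. It is not the paper's Fragility Index, whose definition is not given in the abstract; the empirical CVaR of the cross-entropy loss is swapped in as a stand-in, and the function names, the probabilities, and the `alpha` level are all assumptions made for this example.

```python
import math


def cross_entropy(prob_true_class):
    """Per-sample cross-entropy loss given the probability assigned to the true class."""
    return -math.log(max(prob_true_class, 1e-12))


def empirical_cvar(losses, alpha):
    """Mean of the worst (1 - alpha) fraction of losses (a generic tail-risk measure,
    used here as a stand-in and NOT the paper's Fragility Index)."""
    if not losses:
        raise ValueError("losses must be non-empty")
    ordered = sorted(losses, reverse=True)
    # round() guards against floating-point noise before taking the ceiling
    k = max(1, math.ceil(round((1 - alpha) * len(ordered), 9)))
    return sum(ordered[:k]) / k


# Hypothetical probabilities assigned to the true class by two binary classifiers.
# Both misclassify only the last sample (probability below 0.5), so accuracy is 3/4 for each.
model_a = [0.9, 0.9, 0.9, 0.4]   # mildly wrong on its one error
model_b = [0.9, 0.9, 0.9, 0.01]  # confidently wrong on its one error

cvar_a = empirical_cvar([cross_entropy(p) for p in model_a], alpha=0.75)
cvar_b = empirical_cvar([cross_entropy(p) for p in model_b], alpha=0.75)
print(f"tail risk A: {cvar_a:.3f}, tail risk B: {cvar_b:.3f}")
```

Under these assumed numbers, the two models are indistinguishable by accuracy, while the tail-risk measure penalizes the model whose single error is made with high confidence.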

Swin-X2S: Reconstructing 3D Shape from 2D Biplanar X-ray with Swin Transformers

Jan 10, 2025

Kuan Liu, Zongyuan Ying, Jie Jin, Dongyan Li, Ping Huang, Wenjian Wu, Zhe Chen, Jin Qi, Yong Lu, Lianfu Deng(+1 more)

Abstract: The conversion from 2D X-ray to 3D shape holds significant potential for improving diagnostic efficiency and safety. However, existing reconstruction methods often rely on hand-crafted features, manual intervention, and prior knowledge, resulting in unstable shape errors and additional processing costs. In this paper, we introduce Swin-X2S, an end-to-end deep learning method for directly reconstructing 3D segmentation and labeling from 2D biplanar orthogonal X-ray images. Swin-X2S employs an encoder-decoder architecture: the encoder leverages a 2D Swin Transformer for X-ray information extraction, while the decoder employs 3D convolution with cross-attention to integrate structural features from orthogonal views. A dimension-expanding module is introduced to bridge the encoder and decoder, ensuring a smooth conversion from 2D pixels to 3D voxels. We evaluate the proposed method through extensive qualitative and quantitative experiments across nine publicly available datasets covering four anatomies (femur, hip, spine, and rib), with a total of 54 categories. Significant improvements over previous methods are observed not only in the segmentation and labeling metrics but also in the clinically relevant parameters that are of primary concern in practical applications, which demonstrates the promise of Swin-X2S as an effective option for anatomical shape reconstruction in clinical scenarios. Code implementation is available at: https://github.com/liukuan5625/Swin-X2S.

Multi-modal Multi-label Facial Action Unit Detection with Transformer

Mar 28, 2022

Lingfeng Wang, Shisen Wang, Jin Qi

Abstract: The Facial Action Coding System is an important approach to facial expression analysis. This paper describes our submission to the third Affective Behavior Analysis (ABAW) 2022 competition. We propose a transformer-based model to detect facial action units (FAUs) in video. Specifically, we first train a multi-modal model to extract both audio and visual features. We then propose an action unit correlation module that learns the relationships between action unit labels and refines the detection results. Experimental results on the validation dataset show that our method outperforms the baseline model, which verifies the effectiveness of the proposed network.

Suction Grasp Region Prediction using Self-supervised Learning for Object Picking in Dense Clutter

Apr 24, 2019

Quanquan Shao, Jie Hu, Weiming Wang, Yi Fang, Wenhai Liu, Jin Qi, Jin Ma

Abstract: This paper focuses on robotic picking tasks in cluttered scenarios. Because of the diversity of object poses, the types of stacking, and the complicated backgrounds in bin-picking situations, it is very difficult to recognize objects and estimate their poses before grasping them. This paper combines ResNet with the U-Net structure, a special framework of convolutional neural networks (CNNs), to predict the picking region without recognition or pose estimation, which lets the robotic picking system learn picking skills from scratch. We train the network end to end with online samples. Finally, several experiments are conducted to demonstrate the performance of our method.

* 6 pages, 7 figures, conference

The Liver Tumor Segmentation Benchmark (LiTS)

Jan 13, 2019

Patrick Bilic, Patrick Ferdinand Christ, Eugene Vorontsov, Grzegorz Chlebus, Hao Chen, Qi Dou, Chi-Wing Fu, Xiao Han, Pheng-Ann Heng, Jürgen Hesser(+49 more)

Abstract: In this work, we report the set-up and results of the Liver Tumor Segmentation Benchmark (LiTS), organized in conjunction with the IEEE International Symposium on Biomedical Imaging (ISBI) 2016 and the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2017. Twenty-four valid state-of-the-art liver and liver tumor segmentation algorithms were applied to a set of 131 computed tomography (CT) volumes with different types of tumor contrast levels (hyper-/hypo-intense), abnormalities in tissues (metastasectomy), sizes, and varying numbers of lesions. The submitted algorithms were tested on 70 undisclosed volumes. The dataset was created in collaboration with seven hospitals and research institutions and manually reviewed by three independent radiologists. We found that no single algorithm performed best for both liver and tumors. The best liver segmentation algorithm achieved a Dice score of 0.96 (MICCAI), whereas for tumor segmentation the best algorithms achieved 0.67 (ISBI) and 0.70 (MICCAI). The LiTS image data and manual annotations continue to be publicly available through an online evaluation system as an ongoing benchmarking resource.

* conference
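
For readers unfamiliar with the Dice score reported in the LiTS abstract above, here is a minimal sketch of the standard definition, Dice = 2|A ∩ B| / (|A| + |B|), for two binary masks. The function name, the flat-list mask representation, and the toy masks are assumptions made for this example; the benchmark's own evaluation code may differ (for instance, in how per-case and global scores are aggregated).

```python
def dice_score(pred, target):
    """Dice coefficient between two flat binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        # Both masks empty: treated as perfect agreement here (a common convention,
        # assumed for this sketch).
        return 1.0
    return 2.0 * intersection / total


# Toy masks: 1 marks a voxel labeled as tumor, 0 marks background.
predicted = [1, 1, 0, 0, 1, 0]
reference = [1, 0, 0, 0, 1, 1]
print(f"Dice = {dice_score(predicted, reference):.3f}")
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they share no foreground voxels.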

Global and Local Information Based Deep Network for Skin Lesion Segmentation

Mar 16, 2017

Jin Qi, Miao Le, Chunming Li, Ping Zhou

Abstract: With a large influx of dermoscopy images and a growing shortage of dermatologists, automatic dermoscopic image analysis plays an essential role in skin cancer diagnosis. In this paper, a new deep fully convolutional neural network (FCNN) is proposed to automatically segment melanoma out of skin images by end-to-end learning with only pixels and labels as inputs. Our proposed FCNN is capable of using both local and global information to segment melanoma by adopting skipping layers. The public benchmark database, consisting of 150 validation images, 600 test images, and 2000 training images from the melanoma detection challenge 2017 at the International Symposium on Biomedical Imaging 2017, is used to test the performance of our algorithm. All large images (for example, $4000\times 6000$ pixels) are reduced to much smaller images of $384\times 384$ pixels (more than 10 times smaller). We obtained and submitted preliminary results to the challenge without any pre- or post-processing. The performance of our proposed method could be further improved by data augmentation and by avoiding image size reduction.

* 4 pages, 3 figures. ISIC2017
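
The "more than 10 times smaller" figure in the abstract above can be checked with a short calculation. The sketch below computes the per-side reduction and the reduction in total pixel count for the example sizes quoted in the abstract; the function and variable names are assumptions made for this example.

```python
def downscale_factors(src_size, dst_size):
    """Return per-side reduction factors and the total pixel-count reduction factor."""
    src_a, src_b = src_size
    dst_a, dst_b = dst_size
    side_a_factor = src_a / dst_a
    side_b_factor = src_b / dst_b
    pixel_factor = (src_a * src_b) / (dst_a * dst_b)
    return side_a_factor, side_b_factor, pixel_factor


# Example sizes quoted in the abstract: 4000 x 6000 reduced to 384 x 384.
side_a_factor, side_b_factor, pixel_factor = downscale_factors((4000, 6000), (384, 384))
print(f"side reductions: {side_a_factor:.2f}x and {side_b_factor:.2f}x")
print(f"pixel-count reduction: {pixel_factor:.2f}x")
```

Each side shrinks by more than a factor of 10, which matches the abstract's statement, while the total number of pixels drops by a much larger factor because the two per-side reductions multiply.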
