Fei Feng

3rd Workshop on Maritime Computer Vision (MaCVi) 2025: Challenge Results

Jan 17, 2025

Benjamin Kiefer, Lojze Žust, Jon Muhovič, Matej Kristan, Janez Perš, Matija Teršek, Uma Mudenagudi, Chaitra Desai, Arnold Wiliem, Marten Kreis, Nikhil Akalwadi(+36 more)

Abstract:The 3rd Workshop on Maritime Computer Vision (MaCVi) 2025 addresses maritime computer vision for Unmanned Surface Vehicles (USV) and underwater. This report offers a comprehensive overview of the findings from the challenges. We provide both statistical and qualitative analyses, evaluating trends from over 700 submissions. All datasets, evaluation code, and the leaderboard are available to the public at https://macvi.org/workshop/macvi25.

* Part of the MaCVi 2025 workshop

Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

Jun 07, 2024

Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao(+15 more)

Abstract:The emergence of Large Language Models (LLMs) has necessitated the adoption of parallel training techniques, involving the deployment of thousands of GPUs to train a single model. Unfortunately, we have found that the efficiency of current parallel training is often suboptimal, largely due to the following two main issues. Firstly, hardware failures are inevitable, leading to interruptions in the training tasks. The inability to quickly identify the faulty components results in a substantial waste of GPU resources. Secondly, since GPUs must wait for parameter synchronization to complete before proceeding to the next round of computation, network congestion can greatly increase the waiting time for GPUs. To address these challenges, this paper introduces a communication-driven solution, namely C4. The key insights of C4 are twofold. First, in parallel training, collective communication exhibits periodic and homogeneous characteristics, so any anomalies are certainly due to some form of hardware malfunction. By leveraging this feature, C4 can rapidly identify the faulty components, swiftly isolate the anomaly, and restart the task, thereby avoiding resource wastage caused by delays in anomaly detection. Second, the predictable communication model of collective communication, involving few large flows, allows C4 to efficiently execute traffic planning, substantially reducing network congestion. C4 has been extensively implemented across our production systems, cutting error-induced overhead by roughly 30% and enhancing runtime performance by about 15% for certain applications with moderate communication costs.
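
The first insight, that homogeneous collective communication makes outliers suspicious, can be illustrated with a small sketch. This is not C4's detector, which the abstract does not specify. It assumes a per-rank timing for one collective operation is available as a dictionary, and `find_slow_ranks` and `tolerance` are hypothetical names introduced here.

```python
from statistics import median


def find_slow_ranks(durations_ms, tolerance=1.5):
    """Return ranks whose timing exceeds tolerance * the median timing.

    durations_ms maps rank id -> measured duration of one collective op.
    The median serves as the "homogeneous" baseline; the 1.5x tolerance
    is an assumed value, not one taken from the paper.
    """
    if not durations_ms:
        return []
    baseline = median(durations_ms.values())
    return sorted(
        rank for rank, d in durations_ms.items() if d > tolerance * baseline
    )
```

A caller would collect timings for the same operation across ranks and treat any flagged rank as a candidate for isolation, for example `find_slow_ranks({0: 10.0, 1: 10.5, 2: 9.8, 3: 31.0})`.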

Pelvic floor MRI segmentation based on semi-supervised deep learning

Nov 06, 2023

Jianwei Zuo, Fei Feng, Zhuhui Wang, James A. Ashton-Miller, John O. L. Delancey, Jiajia Luo

Abstract:The semantic segmentation of pelvic organs via MRI has important clinical significance. Recently, deep learning-enabled semantic segmentation has facilitated the three-dimensional geometric reconstruction of pelvic floor organs, providing clinicians with accurate and intuitive diagnostic results. However, the task of labeling pelvic floor MRI segmentation, typically performed by clinicians, is labor-intensive and costly, leading to a scarcity of labels. Insufficient segmentation labels limit the precise segmentation and reconstruction of pelvic floor organs. To address these issues, we propose a semi-supervised framework for pelvic organ segmentation. The implementation of this framework comprises two stages. In the first stage, it performs self-supervised pre-training using image restoration tasks. Subsequently, fine-tuning of the self-supervised model is performed, using labeled data to train the segmentation model. In the second stage, the self-supervised segmentation model is used to generate pseudo labels for unlabeled data. Ultimately, both labeled and unlabeled data are utilized in semi-supervised training. Upon evaluation, our method significantly enhances performance in the semantic segmentation and geometric reconstruction of pelvic organs: the Dice coefficient increases by 2.65% on average. For organs that are difficult to segment, such as the uterus, the accuracy of semantic segmentation improves by up to 3.70%.
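
For readers unfamiliar with the metric, the Dice coefficient measures the overlap between a predicted mask A and a reference mask B as 2|A ∩ B| / (|A| + |B|). The sketch below assumes binary masks flattened to equal-length lists of 0/1. The function name and the convention for two empty masks are illustrative choices, not taken from the paper.

```python
def dice_coefficient(pred, target):
    """Dice = 2*|A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count voxels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total
```

A value of 1.0 means the masks coincide exactly and 0.0 means they share no foreground voxel.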

Neuro-Dynamic State Estimation for Networked Microgrids

Aug 25, 2022

Fei Feng, Yifan Zhou, Peng Zhang

Abstract:We devise neuro-dynamic state estimation (Neuro-DSE), a learning-based dynamic state estimation (DSE) algorithm for networked microgrids (NMs) under unknown subsystems. Our contributions include: 1) a data-driven Neuro-DSE algorithm for DSE of NMs with partially unidentified dynamic models, which incorporates neural ordinary differential equations (ODE-Net) into Kalman filters; 2) a self-refining Neuro-DSE algorithm (Neuro-DSE+) which enables data-driven DSE under limited and noisy measurements by establishing an automatic filtering, augmenting and correcting framework; 3) a Neuro-KalmanNet-DSE algorithm which further integrates KalmanNet with Neuro-DSE to relieve the model mismatch of both neural- and physics-based dynamic models; and 4) an augmented Neuro-DSE for joint estimation of NMs states and unknown parameters (e.g., inertia). Extensive case studies demonstrate the efficacy of Neuro-DSE and its variants under different noise levels, control modes, power sources, observabilities and model knowledge.
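
To make the idea of placing a learned dynamics model inside a Kalman filter concrete, here is a minimal scalar sketch. It is not the Neuro-DSE algorithm. The callable `f` merely stands in for a trained ODE-Net's one-step prediction, the measurement is assumed to observe the state directly, and `ekf_step`, `q`, `r`, and `eps` are hypothetical names chosen for illustration.

```python
def ekf_step(x, p, z, f, q, r, eps=1e-6):
    """One scalar extended-Kalman-filter step with measurement z = x + noise.

    x, p : previous state estimate and its variance
    z    : new measurement
    f    : one-step state transition (placeholder for a learned model)
    q, r : process and measurement noise variances (assumed known)
    """
    # Predict: propagate the estimate through the (possibly non-linear) model.
    x_pred = f(x)
    # Linearize f around x with a central finite difference.
    a = (f(x + eps) - f(x - eps)) / (2.0 * eps)
    p_pred = a * p * a + q
    # Update: blend prediction and measurement using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new
```

Swapping `f` for a physics-based model or a data-driven one leaves the filter structure unchanged. That is the property this sketch is meant to show.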

Image enhancement in acoustic-resolution photoacoustic microscopy enabled by a novel directional algorithm

Nov 19, 2021

Fei Feng, Siqi Liang, Sung-Liang Chen

Abstract:Acoustic-resolution photoacoustic microscopy (AR-PAM) is a promising tool for microvascular imaging. In the focal region, resolution of AR-PAM is determined by the ultrasound transducer and ultimately limited by acoustic diffraction. In the out-of-focus region, resolution deteriorates with increasing distance from the focal plane, which restricts depth of focus (DOF). Besides, a trade-off exists between resolution and DOF. Previously, synthetic aperture focusing technique (SAFT) and/or deconvolution methods have been demonstrated to enhance AR-PAM images. However, they suffer from low resolution, low signal-to-noise ratio (SNR), and/or poor image fidelity. Here, we propose a novel algorithm for AR-PAM to enhance image resolution, SNR, and fidelity. The algorithm consists of a Fourier accumulation SAFT (FA-SAFT) and a directional model-based (D-MB) deconvolution method. Inspired by the Fourier denoising technique and directional SAFT, FA-SAFT mainly compensates for the defocusing effect. Besides, D-MB deconvolution enhances the resolution while preserving image fidelity, especially for objects with line patterns such as microvasculature. A full width at half maximum of 26-31 µm over a DOF of 1.8 mm and a minimum resolvable distance of 46-49 µm are experimentally achieved by imaging a tungsten wire phantom. Moreover, imaging of a leaf skeleton phantom and in vivo imaging of mouse blood vessels also show that our algorithm is capable of providing high-resolution, high-SNR, and good-fidelity results for complex structures and for in vivo applications.
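
The full width at half maximum (FWHM) quoted above is a standard resolution measure: the width of a line profile at half its peak amplitude. The sketch below shows one common way to estimate it from sampled data, using linear interpolation at the two half-maximum crossings. It assumes a single-peaked profile with a positive peak and a zero baseline. The function name is hypothetical and this is not the authors' measurement code.

```python
def fwhm(xs, ys):
    """Full width at half maximum of a single-peaked sampled profile."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need at least three matching samples")
    peak = max(range(len(ys)), key=ys.__getitem__)
    if ys[peak] <= 0:
        raise ValueError("peak must be positive")
    half = ys[peak] / 2.0

    # Walk outward from the peak until the profile drops to half maximum.
    left = peak
    while left > 0 and ys[left] > half:
        left -= 1
    right = peak
    while right < len(ys) - 1 and ys[right] > half:
        right += 1
    if ys[left] > half or ys[right] > half:
        raise ValueError("profile does not fall to half maximum on both sides")

    def crossing(i, j):
        # Linear interpolation of the x position where y equals `half`.
        return xs[i] + (half - ys[i]) * (xs[j] - xs[i]) / (ys[j] - ys[i])

    return crossing(right - 1, right) - crossing(left, left + 1)
```

For a triangular profile sampled at 10 µm steps, `fwhm([0, 10, 20, 30, 40], [0, 2, 4, 2, 0])`, the half-maximum crossings fall at 10 µm and 30 µm.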

* 34 pages (including 16 pages of supplementary materials)

Provably Correct Optimization and Exploration with Non-linear Policies

Mar 22, 2021

Fei Feng, Wotao Yin, Alekh Agarwal, Lin F. Yang

Abstract:Policy optimization methods remain a powerful workhorse in empirical Reinforcement Learning (RL), with a focus on neural policies that can easily reason over complex and continuous state and/or action spaces. Theoretical understanding of strategic exploration in policy-based methods with non-linear function approximation, however, is largely missing. In this paper, we address this question by designing ENIAC, an actor-critic method that allows non-linear function approximation in the critic. We show that under certain assumptions, e.g., a bounded eluder dimension $d$ for the critic class, the learner finds a near-optimal policy in $O(\mathrm{poly}(d))$ exploration rounds. The method is robust to model misspecification and strictly extends existing works on linear function approximation. We also develop some computational optimizations of our approach with slightly worse statistical guarantees and an empirical adaptation building on existing deep RL tools. We empirically evaluate this adaptation and show that it outperforms prior heuristics inspired by linear methods, establishing the value via correctly reasoning about the agent's uncertainty under non-linear function approximation.

Provably Efficient Exploration for RL with Unsupervised Learning

Mar 15, 2020

Fei Feng, Ruosong Wang, Wotao Yin, Simon S. Du, Lin F. Yang

Abstract:We study how to use unsupervised learning for efficient exploration in reinforcement learning with rich observations generated from a small number of latent states. We present a novel algorithmic framework that is built upon two components: an unsupervised learning algorithm and a no-regret reinforcement learning algorithm. We show that our algorithm provably finds a near-optimal policy with sample complexity polynomial in the number of latent states, which is significantly smaller than the number of possible observations. Our result gives theoretical justification to the prevailing paradigm of using unsupervised learning for efficient exploration [tang2017exploration,bellemare2016unifying].

Adaptive Distraction Context Aware Tracking Based on Correlation Filter

Dec 24, 2019

Fei Feng, Xiao-Jun Wu, Tianyang Xu, Josef Kittler, Xue-Feng Zhu

Abstract:The Discriminative Correlation Filter (CF) uses a circulant convolution operation to provide several training samples for the design of a classifier that can distinguish the target from the background. The filter design may be disturbed by objects close to the target during the tracking process, resulting in tracking failure. This paper proposes an adaptive distraction context aware tracking algorithm to solve this problem. In the response map obtained for the previous frame by the CF algorithm, we adaptively find the image blocks that are similar to the target and use them as negative samples. This diminishes the influence of similar image blocks on the classifier during tracking and improves its accuracy. The tracking results on video sequences show that the algorithm can cope with rapid changes such as occlusion and rotation, and can adaptively use the distractive objects around the target as negative samples to improve the accuracy of target tracking.
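
The step of locating target-like blocks in the previous frame's response map can be pictured as a search for secondary peaks. The sketch below is only one plausible reading and does not reproduce the paper's adaptive criterion, which the abstract does not give. It takes the global maximum as the target and returns other strict local maxima above an assumed fraction `ratio` of the peak. `find_distractor_peaks` and `ratio` are hypothetical names, and the wrap-around at the borders implied by circulant shifts is ignored.

```python
def find_distractor_peaks(response, ratio=0.5):
    """Return (target, distractors) from a 2D response map (list of rows).

    target      : (row, col) of the global maximum
    distractors : other strict local maxima with value >= ratio * peak
    """
    rows, cols = len(response), len(response[0])
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    target = max(cells, key=lambda rc: response[rc[0]][rc[1]])
    peak = response[target[0]][target[1]]

    distractors = []
    for r, c in cells:
        if (r, c) == target or response[r][c] < ratio * peak:
            continue
        # Compare against the 8-connected neighbourhood, clipped at borders.
        neighbours = [
            response[rr][cc]
            for rr in range(max(0, r - 1), min(rows, r + 2))
            for cc in range(max(0, c - 1), min(cols, c + 2))
            if (rr, cc) != (r, c)
        ]
        if all(response[r][c] > v for v in neighbours):
            distractors.append((r, c))
    return target, distractors
```

Image patches centred on the returned distractor positions would then serve as the negative samples described in the abstract.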

Does Knowledge Transfer Always Help to Learn a Better Policy?

Dec 06, 2019

Fei Feng, Wotao Yin, Lin F. Yang

Abstract:One of the key approaches to saving samples when learning a policy for a reinforcement learning problem is to use knowledge from an approximate model such as its simulator. However, does knowledge transfer from approximate models always help to learn a better policy? Despite numerous empirical studies of transfer reinforcement learning, an answer to this question is still elusive. In this paper, we provide a strong negative result, showing that even the full knowledge of an approximate model may not help reduce the number of samples for learning an accurate policy of the true model. We construct an example of reinforcement learning models and show that the complexity with or without knowledge transfer has the same order. On the bright side, effective knowledge transfer is still possible under additional assumptions. In particular, we demonstrate that knowing the (linear) bases of the true model significantly reduces the number of samples for learning an accurate policy.

CSSegNet: Fine-Grained Cardiac Structures Segmentation Using Dilated Pyramid Pooling in U-net

Jul 02, 2019

Fei Feng, Jiajia Luo

Abstract:Cardiac structure segmentation plays an important role in medical analysis procedures. Blurred boundaries in the images often limit segmentation performance. To address this difficult problem, we present a novel network structure that embeds a dilated pyramid pooling block in the skip connections between the network's encoding and decoding stages. A dilated pyramid pooling block is made up of convolutions and pooling operations with different vision scopes. Equipped with this module, the model gains multi-scale vision ability. It is combined with other techniques, including multi-scale initial feature extraction and a multi-resolution prediction aggregation module. For the backbone feature extraction network, we follow the basic idea of the Xception network, which benefits from separable convolutions. Evaluated on the Post 2017 MICCAI-ACDC challenge phase data, our proposed model achieves state-of-the-art performance in left ventricle cavity (LVC) and right ventricle cavity (RVC) segmentation tasks. Results show that our method has advantages in both geometrical (Dice coefficient, Hausdorff distance) and clinical (ejection fraction, volume) evaluation, which indicate closer boundaries and greater statistical significance, respectively.
