Abstract:Multi-object tracking (MOT) is crucial for various multi-agent analyses, such as evaluating team sports tactics, player movements, and performance. While pedestrian tracking has advanced with Tracking-by-Detection MOT, team sports like basketball pose unique challenges. These challenges include players' unpredictable movements, frequent close interactions, and visual similarities that complicate pose labeling and lead to significant occlusions, frequent ID switches, and high manual annotation costs. To address these challenges, we propose a novel pose-based virtual marker (VM) MOT method for team sports, named Sports-vmTracking. This method builds on the vmTracking approach developed for multi-animal tracking with active learning. First, we constructed a 3x3 basketball pose dataset for VMs and applied active learning to enhance model performance in generating VMs. Then, we overlaid the VMs on video to identify players, extract their poses with unique IDs, and convert these into bounding boxes for comparison with automated MOT methods. Using our 3x3 basketball dataset, we demonstrated that our VM configuration is highly effective, reducing the need for manual corrections and labeling during pose model training while maintaining high accuracy. Our approach achieved an average HOTA score of 72.3%, over 10 points higher than other state-of-the-art methods without VM, and resulted in 0 ID switches. Beyond improving performance in handling occlusions and minimizing ID switches, our framework could substantially improve time and cost efficiency compared with traditional manual annotation.
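The step of converting ID-tagged poses into bounding boxes can be illustrated with a minimal sketch. The abstract does not specify how the conversion is done, so the function name `pose_to_bbox`, the confidence cutoff `min_conf`, and the padding `margin` below are all assumptions chosen for illustration, not the authors' procedure.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical keypoint layout: (x, y, confidence) in pixel coordinates.
Keypoint = Tuple[float, float, float]
BBox = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def pose_to_bbox(keypoints: Sequence[Keypoint],
                 min_conf: float = 0.3,
                 margin: float = 0.1) -> Optional[BBox]:
    """Enclose the confident keypoints of one player in a padded box.

    Returns None when no keypoint reaches min_conf (assumed cutoff).
    """
    points = [(x, y) for x, y, conf in keypoints if conf >= min_conf]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Pad by a fraction of the box size, since joints lie inside the body outline.
    pad_x = margin * (x_max - x_min)
    pad_y = margin * (y_max - y_min)
    return (x_min - pad_x, y_min - pad_y, x_max + pad_x, y_max + pad_y)
```

Applying such a function per player ID and per frame would yield boxes in the format that box-based MOT metrics expect.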
Abstract:Out-of-distribution (OOD) detection is an important topic for real-world machine learning systems, but settings with limited in-distribution samples have been underexplored. Such few-shot OOD settings are challenging, as models have scarce opportunities to learn the data distribution before being tasked with identifying OOD samples. Indeed, we demonstrate that recent state-of-the-art OOD methods fail to outperform simple baselines in the few-shot setting. We thus propose a hypernetwork framework called HyperMix, using Mixup on the generated classifier parameters, as well as a natural out-of-episode outlier exposure technique that does not require an additional outlier dataset. We conduct experiments on CIFAR-FS and MiniImageNet, significantly outperforming other OOD methods in the few-shot regime.
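As a rough illustration of applying Mixup to generated classifier parameters, the sketch below interpolates two weight vectors with a coefficient drawn from a Beta distribution, as in standard Mixup. The abstract does not describe how HyperMix pairs or uses the mixed parameters, so the function name `mixup_parameters`, the flat-vector representation, and the `alpha` default are assumptions.

```python
import random
from typing import List, Optional, Sequence, Tuple


def mixup_parameters(w_a: Sequence[float],
                     w_b: Sequence[float],
                     alpha: float = 1.0,
                     rng: Optional[random.Random] = None) -> Tuple[List[float], float]:
    """Convex combination of two classifier weight vectors.

    w_a and w_b stand in for parameters that a hypernetwork would generate
    for two classes (assumption). Returns the mixed vector and lambda.
    """
    if len(w_a) != len(w_b):
        raise ValueError("parameter vectors must have the same length")
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # lambda ~ Beta(alpha, alpha)
    mixed = [lam * a + (1.0 - lam) * b for a, b in zip(w_a, w_b)]
    return mixed, lam
```

In standard Mixup the same coefficient is also used to interpolate the targets; whether HyperMix does so is not stated in the abstract.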
Abstract:Objective: Bleeding from gastroesophageal varices (GEV) is a medical emergency associated with high mortality. We aim to construct an artificial intelligence-based model of two-dimensional shear wave elastography (2D-SWE) of the liver and spleen to precisely assess the risk of GEV and high-risk gastroesophageal varices (HRV). Design: A prospective multicenter study was conducted in patients with compensated advanced chronic liver disease. A total of 305 patients were enrolled from 12 hospitals, and 265 patients were finally included, with 1136 liver stiffness measurement (LSM) images and 1042 spleen stiffness measurement (SSM) images generated by 2D-SWE. We leveraged deep learning methods to uncover associations between image features and patient risk, and thus constructed models to predict GEV and HRV. Results: A multi-modality Deep Learning Risk Prediction model (DLRP) was constructed to assess GEV and HRV, based on LSM and SSM images and clinical information. Validation analysis revealed that the AUCs of DLRP were 0.91 for GEV (95% CI 0.90 to 0.93, p < 0.05) and 0.88 for HRV (95% CI 0.86 to 0.89, p < 0.01), which were significantly and robustly better than canonical risk indicators, including the values of LSM and SSM. Moreover, DLRP was better than models using individual parameters, including LSM and SSM images. In HRV prediction, the 2D-SWE images of SSM outperformed those of LSM (p < 0.01). Conclusion: DLRP shows excellent performance in predicting GEV and HRV compared with the canonical risk indicators LSM and SSM. Additionally, the 2D-SWE images of SSM provided more information for better accuracy in predicting HRV than those of LSM.
Abstract:We study the challenging incremental few-shot object detection (iFSD) setting. Recently, hypernetwork-based approaches have been studied in the context of continuous and finetune-free iFSD with limited success. We take a closer look at important design choices of such methods, leading to several key improvements and resulting in a more accurate and flexible framework, which we call Sylph. In particular, we demonstrate the effectiveness of decoupling object classification from localization by leveraging a base detector that is pretrained for class-agnostic localization on a large-scale dataset. Contrary to what previous results have suggested, we show that with a carefully designed class-conditional hypernetwork, finetune-free iFSD can be highly effective, especially when a large number of base categories with abundant data are available for meta-training, almost approaching alternatives that undergo test-time-training. This result is even more significant considering its many practical advantages: (1) incrementally learning new classes in sequence without additional training, (2) detecting both novel and seen classes in a single pass, and (3) no forgetting of previously seen classes. We benchmark our model on both COCO and LVIS, reporting as high as 17% AP on the long-tail rare classes on LVIS, indicating the promise of hypernetwork-based iFSD.
Abstract:In many applications, such as autonomous driving, hand manipulation, or robot navigation, object detection methods must be able to detect objects unseen in the training set. Open World Detection (OWD) seeks to tackle this problem by generalizing detection performance to seen and unseen class categories. Recent works have seen success in the generation of class-agnostic proposals, which we call Open-World Proposals (OWP), but this comes at the cost of a large drop on the classification task when both tasks are considered in the detection model. These works have investigated two-stage Region Proposal Networks (RPN) by taking advantage of objectness scoring cues; however, for its simplicity, run-time, and decoupling of localization and classification, we investigate OWP through the lens of a fully convolutional one-stage detection network, such as FCOS. We show that our architectural and sampling optimizations on FCOS can increase OWP performance by as much as 6% in recall on novel classes, marking the first proposal-free one-stage detection network to achieve performance comparable to RPN-based two-stage networks. Furthermore, we show that the inherently decoupled architecture of FCOS helps retain classification performance. While two-stage methods worsen by 6% in recall on novel classes, we show that FCOS only drops 2% when jointly optimizing for OWP and classification.
Abstract:This letter considers a multi-access mobile edge computing (MEC) network consisting of multiple users, multiple base stations, and a malicious eavesdropper. Specifically, the users adopt the partial offloading strategy by partitioning the computation task into several parts. One part is executed locally, and the others are securely offloaded to multiple MEC servers integrated into the base stations by leveraging physical layer security to combat eavesdropping. We jointly optimize power allocation, task partition, subcarrier allocation, and computation resources to maximize the secrecy offloading rate of the users, subject to communication and computation resource constraints. Numerical results demonstrate that our proposed scheme can improve the secrecy offloading rate by 1.11%--1.39% and 15.05%--17.35% (as the tasks' latency requirements increase), and by 1.30%--1.75% and 6.08%--9.22% (as the maximum transmit power increases), compared with the two benchmarks, respectively. Moreover, the results further emphasize the necessity of conducting computation offloading over multiple MEC servers.
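For readers unfamiliar with physical layer security, the quantity being maximized can be illustrated with the textbook secrecy rate: the legitimate link's capacity minus the eavesdropper's, floored at zero. The letter's exact definition of the secrecy offloading rate is not given in the abstract, so the formula, the function name `secrecy_rate`, and its parameters below are a generic sketch under that assumption.

```python
import math


def secrecy_rate(snr_legit: float, snr_eve: float, bandwidth_hz: float = 1.0) -> float:
    """Textbook secrecy rate in bits/s: B * max(0, log2(1 + SNR_l) - log2(1 + SNR_e)).

    snr_legit: linear SNR at the intended base station (assumed input).
    snr_eve:   linear SNR at the eavesdropper (assumed input).
    """
    gap = math.log2(1.0 + snr_legit) - math.log2(1.0 + snr_eve)
    return bandwidth_hz * max(0.0, gap)
```

Under this definition the rate is zero whenever the eavesdropper's channel is at least as good as the legitimate one. That suggests one reason spreading offloaded parts over several servers and subcarriers can help.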
Abstract:On social media, there are a large number of potential zombie accounts, which may have a negative impact on public opinion. Traditionally, the PageRank algorithm is used to detect zombie accounts. However, it has two problems: it requires a large amount of RAM to store the adjacency matrix or adjacency list, and the importance values may approach zero for large graphs. To solve the first problem, since the structure of social media makes the graph divisible, we applied the Louvain community detection algorithm to decompose the whole graph into 1,002 subgraphs. The modularity of 0.58 indicates that the decomposition is effective. To solve the second problem, we applied the uneven assignation PageRank algorithm to calculate the importance of each node within its community. Then, a threshold is set to distinguish zombie accounts from normal accounts. The results show that about 20% of the accounts in the dataset are zombie accounts and that they are concentrated in tier-one cities in China such as Beijing, Shanghai, and Guangzhou. In the future, a classification algorithm with semi-supervised learning could be used to detect zombie accounts.
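The per-community scoring and thresholding step can be sketched as follows. The abstract does not define the uneven assignation variant, so this sketch uses standard PageRank by power iteration instead. The edge direction (follower to followee), the function names, and the threshold value are also assumptions for illustration.

```python
from typing import Dict, List, Sequence


def pagerank(graph: Dict[str, Sequence[str]], damping: float = 0.85,
             max_iter: int = 200, tol: float = 1e-12) -> Dict[str, float]:
    """Standard PageRank by power iteration on an adjacency list."""
    # Collect every node, including ones that appear only as edge targets.
    nodes = sorted(set(graph) | {u for outs in graph.values() for u in outs})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not graph.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            outs = list(graph.get(v, ()))
            for u in outs:
                new[u] += damping * rank[v] / len(outs)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


def flag_low_rank(rank: Dict[str, float], threshold: float) -> List[str]:
    """Return nodes whose importance falls below the (assumed) threshold."""
    return sorted(v for v, r in rank.items() if r < threshold)


# One small community: "c" follows "a" but nobody follows "c".
community = {"a": ["b"], "b": ["a"], "c": ["a"]}
candidates = flag_low_rank(pagerank(community), threshold=0.1)
```

Running PageRank separately on each Louvain community keeps the adjacency list for any one run small. Each score is then normalized over the community rather than the whole graph, which keeps individual values from shrinking toward zero.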