Abstract: Multi-modal visual understanding of images with prompts uses visual and textual cues to enhance the semantic understanding of images. This approach combines vision and language processing to produce more accurate image prediction and recognition. With prompt-based techniques, models can learn to focus on specific features of an image and extract information that is useful for downstream tasks. Multi-modal understanding can also improve on single-modality models by providing more robust image representations. Overall, combining visual and textual information is a promising research direction for advancing image recognition and understanding. In this paper, we evaluate a range of prompt design methods and propose a new method for better extraction of semantic information.
Abstract: With the growing amount of information on the Internet, recommender systems are becoming increasingly crucial in helping people find and explore relevant content. This is also true in the online recruitment space, where websites such as LinkedIn, Indeed.com, and Monster.com all use recommender systems. In online recruitment, companies often struggle to find suitable candidates with appropriate skills because of the huge volume of user profiles available. Identifying users who satisfy a range of different employer needs is also difficult. Effective matching of user profiles and jobs is therefore becoming crucial for companies. This research project applies a wide range of recommendation techniques to the task of user profile recommendation. Extensive experiments are conducted on a large-scale real-world LinkedIn dataset to evaluate their performance, with the aim of identifying the most suitable approach for this particular recommendation scenario.
Abstract: The SportsMOT competition targets multiple object tracking (MOT) of athletes in different sports scenes such as basketball or soccer. The competition is challenging because of unstable camera views, the athletes' complex trajectories, and complicated backgrounds. Previous MOT methods cannot match enough high-quality athlete tracks. To pursue higher MOT performance in sports scenes, we introduce a tracker named SportsTrack, which follows the tracking-by-detection paradigm. We then introduce a three-stage matching process to handle the motion blur and body overlap found in sports scenes. As a second contribution, we present a one-to-many correspondence between detection bounding boxes and crowded tracks to handle the overlap of athletes' bodies during sports competitions. Unlike other trackers such as BOT-SORT and ByteTrack, we carefully restore edge-lost tracks that those trackers ignore. Finally, we achieved the top tracking score (76.264 HOTA) in the ECCV 2022 DeepAction SportsMOT competition.
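
For readers unfamiliar with tracking by detection, the association step it relies on can be sketched as below. This is a generic, minimal illustration and not the SportsTrack method: it performs a single greedy one-to-one matching pass on IoU, whereas the abstract describes a three-stage process with one-to-many correspondence whose details are not given here. The names `iou` and `associate`, the box format `(x1, y1, x2, y2)`, and the `min_iou` threshold of 0.3 are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    min_iou: float = 0.3,  # hypothetical threshold, chosen for illustration
) -> Tuple[Dict[int, int], List[int], List[int]]:
    """Greedily match tracks to detections, highest IoU first.

    Returns (matches, unmatched_track_ids, unmatched_detection_indices),
    where matches maps a track id to a detection index.
    """
    # Score every (track, detection) pair and visit the best pairs first.
    pairs = sorted(
        (
            (iou(box, det), tid, j)
            for tid, box in tracks.items()
            for j, det in enumerate(detections)
        ),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, j in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap too little
        if tid in matches or j in used:
            continue  # one-to-one: each track and detection is used once
        matches[tid] = j
        used.add(j)
    unmatched_tracks = [tid for tid in tracks if tid not in matches]
    unmatched_dets = [j for j in range(len(detections)) if j not in used]
    return matches, unmatched_tracks, unmatched_dets
```

In a full tracker, unmatched detections would typically start new tracks and unmatched tracks would be kept for a few frames before being dropped. A multi-stage scheme would rerun a matching step on the leftovers with different cues or thresholds.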