Vivien Nguyen

Cleaning and Structuring the Label Space of the iMet Collection 2020

Jun 01, 2021

Vivien Nguyen, Sunnie S. Y. Kim

Abstract: The iMet 2020 dataset is a valuable resource in the space of fine-grained art attribution recognition, but we believe it has yet to reach its true potential. We document the unique properties of the dataset and observe that many of the attribute labels are noisy, more than is implied by the dataset description. Oftentimes, there are also semantic relationships between the labels (e.g., identical, mutual exclusion, subsumption, overlap with uncertainty) which we believe are underutilized. We propose an approach to cleaning and structuring the iMet 2020 labels, and discuss the implications and value of doing so. Further, we demonstrate the benefits of our proposed approach through several experiments. Our code and cleaned labels are available at https://github.com/sunniesuhyoung/iMet2020cleaned.

* A shorter version of this work was accepted to the CVPR 2021 FGVC Workshop
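
The label relationships named in the abstract (identical, subsumption) can be illustrated with a minimal sketch. This is not the authors' implementation: the label names, the two relation tables, and the `clean_labels` helper are all hypothetical and chosen only to show how such relations might be applied to one image's label set.

```python
from typing import Set

# Hypothetical relation tables; these label names are illustrative only
# and are not taken from the iMet 2020 dataset or the authors' code.
IDENTICAL = {"tags::flower": "tags::flowers"}   # alias -> canonical label
SUBSUMED_BY = {"tags::roses": "tags::flowers"}  # child -> parent label


def clean_labels(labels: Set[str]) -> Set[str]:
    """Merge identical labels, then add every parent implied by subsumption."""
    # Step 1: map each alias to its canonical label.
    cleaned = {IDENTICAL.get(label, label) for label in labels}
    # Step 2: walk up the subsumption chain from every label present.
    frontier = list(cleaned)
    while frontier:
        parent = SUBSUMED_BY.get(frontier.pop())
        if parent is None:
            continue
        parent = IDENTICAL.get(parent, parent)
        if parent not in cleaned:
            cleaned.add(parent)
            frontier.append(parent)
    return cleaned
```

Under these assumed tables, an image labeled only with the child label would also receive its parent label, and two spellings of the same label would collapse to one.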

Synthetic Defocus and Look-Ahead Autofocus for Casual Videography

May 21, 2019

Xuaner Zhang, Kevin Matzen, Vivien Nguyen, Dillon Yao, You Zhang, Ren Ng

Abstract: In cinema, large camera lenses create beautiful shallow depth of field (DOF), but make focusing difficult and expensive. Accurate cinema focus usually relies on a script and a person to control focus in realtime. Casual videographers often crave cinematic focus, but fail to achieve it. We either sacrifice shallow DOF, as in smartphone videos; or we struggle to deliver accurate focus, as in videos from larger cameras. This paper is about a new approach in the pursuit of cinematic focus for casual videography. We present a system that synthetically renders refocusable video from a deep DOF video shot with a smartphone, and analyzes future video frames to deliver context-aware autofocus for the current frame. To create refocusable video, we extend recent machine learning methods designed for still photography, contributing a new dataset for machine training, a rendering model better suited to cinema focus, and a filtering solution for temporal coherence. To choose focus accurately for each frame, we demonstrate autofocus that looks at upcoming video frames and applies AI-assist modules such as motion, face, audio and saliency detection. We also show that autofocus benefits from machine learning and a large-scale video dataset with focus annotation, where we use our RVR-LAAF GUI to create this sizable dataset efficiently. We deliver, for example, a shallow DOF video where the autofocus transitions onto each person before she begins to speak. This is impossible for conventional camera autofocus because it would require seeing into the future.

* (V2 author name corrected) SIGGRAPH 2019; project website: https://ceciliavision.github.io/vid-auto-focus/