Alan Yu

Learning Visual Parkour from Generated Images

Oct 31, 2024

Alan Yu, Ge Yang, Ran Choi, Yajvan Ravan, John Leonard, Phillip Isola

Abstract: Fast and accurate physics simulation is an essential component of robot learning, where robots can explore failure scenarios that are difficult to produce in the real world and learn from unlimited on-policy data. Yet, it remains challenging to incorporate RGB-color perception into the sim-to-real pipeline that matches the real world in its richness and realism. In this work, we train a robot dog in simulation for visual parkour. We propose a way to use generative models to synthesize diverse and physically accurate image sequences of the scene from the robot's ego-centric perspective. We present demonstrations of zero-shot transfer to the RGB-only observations of the real world on a robot equipped with a low-cost, off-the-shelf color camera. Website: https://lucidsim.github.io

* 17 pages, 19 figures

Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation

Jul 27, 2023

William Shen, Ge Yang, Alan Yu, Jansen Wong, Leslie Pack Kaelbling, Phillip Isola

Abstract: Self-supervised and language-supervised image models contain rich knowledge of the world that is important for generalization. Many robotic tasks, however, require a detailed understanding of 3D geometry, which is often lacking in 2D image features. This work bridges this 2D-to-3D gap for robotic manipulation by leveraging distilled feature fields to combine accurate 3D geometry with rich semantics from 2D foundation models. We present a few-shot learning method for 6-DOF grasping and placing that harnesses these strong spatial and semantic priors to achieve in-the-wild generalization to unseen objects. Using features distilled from a vision-language model, CLIP, we present a way to designate novel objects for manipulation via free-text natural language, and demonstrate its ability to generalize to unseen expressions and novel categories of objects.

* Project website at https://f3rm.csail.mit.edu

Virtual impactor-based label-free bio-aerosol detection using holography and deep learning

Aug 30, 2022

Yi Luo, Yijie Zhang, Tairan Liu, Alan Yu, Yichen Wu, Aydogan Ozcan

Abstract: Exposure to bio-aerosols such as mold spores and pollen can lead to adverse health effects. There is a need for a portable and cost-effective device for long-term monitoring and quantification of various bio-aerosols. To address this need, we present a mobile and cost-effective label-free bio-aerosol sensor that takes holographic images of flowing particulate matter concentrated by a virtual impactor, which selectively slows down and guides particles larger than ~6 microns to fly through an imaging window. The flowing particles are illuminated by a pulsed laser diode, casting their inline holograms on a CMOS image sensor in a lens-free mobile imaging device. The illumination contains three short pulses with a negligible shift of the flowing particle within one pulse, and triplicate holograms of the same particle are recorded in a single frame before it exits the imaging field-of-view, revealing different perspectives of each particle. The particles within the virtual impactor are localized through a differential detection scheme, and a deep neural network classifies the aerosol type in a label-free manner, based on the acquired holographic images. We demonstrated the success of this mobile bio-aerosol detector with a virtual impactor using different types of pollen (i.e., bermuda, elm, oak, pine, sycamore, and wheat) and achieved a blind classification accuracy of 92.91%. This mobile and cost-effective device weighs ~700 g and can be used for label-free sensing and quantification of various bio-aerosols over extended periods since it is based on a cartridge-free virtual impactor that does not capture or immobilize particulate matter.

* 23 Pages, 5 Figures, 1 Table

Modeling Stated Preference for Mobility-on-Demand Transit: A Comparison of Machine Learning and Logit Models

Nov 04, 2018

Xilei Zhao, Xiang Yan, Alan Yu, Pascal Van Hentenryck

Abstract: Logit models are usually applied when studying individual travel behavior, i.e., to predict travel mode choice and to gain behavioral insights on traveler preferences. Recently, some studies have applied machine learning to model travel mode choice and reported higher out-of-sample prediction accuracy than conventional logit models (e.g., multinomial logit). However, there has not been a comprehensive comparison between logit models and machine learning that covers both prediction and behavioral analysis. This paper aims to address this gap by examining the key differences in model development, evaluation, and behavioral interpretation between logit and machine-learning models for travel-mode choice modeling. To complement the theoretical discussions, we also empirically evaluated the two approaches on stated-preference survey data for a new type of transit system integrating high-frequency fixed routes and micro-transit. The results show that machine-learning models can produce significantly higher predictive accuracy than logit models and are better at capturing the nonlinear relationships between trip attributes and mode-choice outcomes. On the other hand, compared to the multinomial logit model, the best-performing machine-learning model, the random forest model, produces less reasonable behavioral outputs (i.e., marginal effects and elasticities) when these are computed with a standard approach. By introducing some behavioral constraints into the computation of behavioral outputs from a random forest model, however, we obtained better results that are somewhat comparable with those of the multinomial logit model. We believe that there is great potential in merging ideas from machine learning and conventional statistical methods to develop refined models for travel-behavior research, and we suggest some possible research directions.

* 32 pages, 4 figures
