Matt Franchi

Privacy of Groups in Dense Street Imagery

May 11, 2025

Matt Franchi, Hauke Sandhaus, Madiha Zahrah Choksi, Severin Engelmann, Wendy Ju, Helen Nissenbaum

Abstract: Spatially and temporally dense street imagery (DSI) datasets have grown unbounded. In 2024, individual companies possessed around 3 trillion unique images of public streets. DSI data streams are only set to grow as companies like Lyft and Waymo use DSI to train autonomous vehicle algorithms and analyze collisions. Academic researchers leverage DSI to explore novel approaches to urban analysis. Despite good-faith efforts by DSI providers to protect individual privacy through blurring faces and license plates, these measures fail to address broader privacy concerns. In this work, we find that increased data density and advancements in artificial intelligence enable harmful group membership inferences from supposedly anonymized data. We perform a penetration test to demonstrate how easily sensitive group affiliations can be inferred from obfuscated pedestrians in 25,232,608 dashcam images taken in New York City. We develop a typology of identifiable groups within DSI and analyze privacy implications through the lens of contextual integrity. Finally, we discuss actionable recommendations for researchers working with data from DSI providers.

* To appear in ACM Conference on Fairness, Accountability, and Transparency (FAccT) '25

The Robotability Score: Enabling Harmonious Robot Navigation on Urban Streets

Apr 15, 2025

Matt Franchi, Maria Teresa Parreira, Fanjun Bu, Wendy Ju

Abstract: This paper introduces the Robotability Score ($R$), a novel metric that quantifies the suitability of urban environments for autonomous robot navigation. Through expert interviews and surveys, we identify and weigh key features contributing to $R$ for wheeled robots on urban streets. Our findings reveal that pedestrian density, crowd dynamics and pedestrian flow are the most critical factors, collectively accounting for 28% of the total score. Computing robotability across New York City yields significant variation; the area of highest $R$ is 3.0 times more "robotable" than the area of lowest $R$. Deployments of a physical robot on high and low robotability areas show the adequacy of the score in anticipating the ease of robot navigation. This new framework for evaluating urban landscapes aims to reduce uncertainty in robot deployment while respecting established mobility patterns and urban planning principles, contributing to the discourse on harmonious human-robot environments.

* Accepted to CHI '25
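
As a rough illustration of how a weighted feature score of this kind could be computed, here is a minimal Python sketch. It assumes the score is a weighted sum of features normalized to [0, 1], which the abstract does not confirm. Only the 28% combined share of the three pedestrian features comes from the abstract; the individual weights, the other two feature names (`sidewalk_width`, `surface_quality`), and the `robotability` function are hypothetical.

```python
# Hypothetical weights. Only the 28% combined share of the three pedestrian
# features is taken from the abstract; every individual value is invented.
WEIGHTS = {
    "pedestrian_density": 0.10,
    "crowd_dynamics": 0.09,
    "pedestrian_flow": 0.09,
    "sidewalk_width": 0.30,   # hypothetical feature
    "surface_quality": 0.42,  # hypothetical feature
}


def robotability(features):
    """Weighted sum of features normalized to [0, 1], where 1 is most favorable."""
    missing = set(WEIGHTS) - set(features)
    if missing:
        raise KeyError(f"missing features: {sorted(missing)}")
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


# Two made-up street segments: identical except for the pedestrian features.
quiet_street = {name: 0.9 for name in WEIGHTS}
busy_street = dict(
    quiet_street,
    pedestrian_density=0.2,
    crowd_dynamics=0.2,
    pedestrian_flow=0.2,
)

print(robotability(quiet_street), robotability(busy_street))
```

In this sketch, lowering only the three pedestrian features lowers the score by at most their combined weight, which is one way to read the "28% of the total score" statement.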

Bayesian Modeling of Zero-Shot Classifications for Urban Flood Detection

Mar 18, 2025

Matt Franchi, Nikhil Garg, Wendy Ju, Emma Pierson

Abstract: Street scene datasets, collected from Street View or dashboard cameras, offer a promising means of detecting urban objects and incidents like street flooding. However, a major challenge in using these datasets is their lack of reliable labels: there are myriad types of incidents, many types occur rarely, and ground-truth measures of where incidents occur are lacking. Here, we propose BayFlood, a two-stage approach which circumvents this difficulty. First, we perform zero-shot classification of where incidents occur using a pretrained vision-language model (VLM). Second, we fit a spatial Bayesian model on the VLM classifications. The zero-shot approach avoids the need to annotate large training sets, and the Bayesian model provides frequent desiderata in urban settings: principled measures of uncertainty, smoothing across locations, and incorporation of external data like stormwater accumulation zones. We comprehensively validate this two-stage approach, showing that VLMs provide strong zero-shot signal for floods across multiple cities and time periods, the Bayesian model improves out-of-sample prediction relative to baseline methods, and our inferred flood risk correlates with known external predictors of risk. Having validated our approach, we show it can be used to improve urban flood detection: our analysis reveals 113,738 people who are at high risk of flooding overlooked by current methods, identifies demographic biases in existing methods, and suggests locations for new flood sensors. More broadly, our results showcase how Bayesian modeling of zero-shot LM annotations represents a promising paradigm because it avoids the need to collect large labeled datasets and leverages the power of foundation models while providing the expressiveness and uncertainty quantification of Bayesian models.

* In review
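
To make the second stage more concrete, the sketch below shows how per-area counts of zero-shot "flooded" classifications could be turned into a posterior estimate with uncertainty. This is not the BayFlood model: it swaps in a much simpler, non-spatial Beta-Binomial update with no smoothing across locations and no external covariates. The function name `beta_posterior`, the uniform prior, and the example counts are all assumptions.

```python
def beta_posterior(positives, total, prior_a=1.0, prior_b=1.0):
    """Posterior mean and variance of an area's flood rate.

    Assumes a Beta(prior_a, prior_b) prior and treats each image's
    zero-shot label as an independent Bernoulli draw. This is a
    simplification and is not the spatial model from the paper.
    """
    if not 0 <= positives <= total:
        raise ValueError("positives must be between 0 and total")
    a = prior_a + positives
    b = prior_b + (total - positives)
    mean = a / (a + b)
    variance = (a * b) / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


# Made-up counts: images classified as flooded out of all images in an area.
few_images = beta_posterior(positives=3, total=8)
many_images = beta_posterior(positives=30, total=80)
print(few_images, many_images)
```

Both areas have the same raw positive rate, but the area with more images ends up with a smaller posterior variance. This illustrates the kind of principled uncertainty measure the abstract refers to, in a far simpler setting.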

Fingerprinting New York City's Scaffolding Problem with Longitudinal Dashcam Data

Feb 09, 2024

Dorin Shapira, Matt Franchi, Wendy Ju

Abstract: Scaffolds, also called sidewalk sheds, are intended to be temporary structures to protect pedestrians from construction and repair hazards. However, some sidewalk sheds are left up for years. Long-term scaffolds become eyesores, create accessibility issues on sidewalks, and give cover to illicit activity. Today, there are over 8,000 active permits for scaffolds in NYC; the more problematic scaffolds are likely expired or unpermitted. This research uses computer vision on street-level imagery to develop a longitudinal map of scaffolding throughout the city. Using a dataset of 29,156,833 dashcam images taken between August 2023 and January 2024, we develop an algorithm to track the presence of scaffolding over time. We also design and implement methods to match detected scaffolds to reported locations of active scaffolding permits, enabling the identification of sidewalk sheds without corresponding permits. We identify 850,766 images of scaffolding, tagging 5,156 active sidewalk sheds and estimating 529 unpermitted sheds. We discuss the implications of an in-the-wild scaffolding classifier for urban tech, innovations to governmental inspection processes, and out-of-distribution evaluations outside of New York City.
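
One simple way to picture the permit-matching step is a nearest-permit check by distance. The sketch below is an assumption, not the authors' method: it flags a detected shed as unpermitted when no active permit lies within a fixed radius. The 30 m radius, the `(lat, lon)` tuple format, and the function names are hypothetical.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def unpermitted(detections, permits, radius_m=30.0):
    """Return detections with no permit location within radius_m (hypothetical threshold)."""
    return [
        d for d in detections
        if all(haversine_m(d[0], d[1], p[0], p[1]) > radius_m for p in permits)
    ]


# Made-up coordinates: the first detection sits about 11 m from a permit.
detections = [(40.7580, -73.9855), (40.7000, -74.0000)]
permits = [(40.7581, -73.9855)]
print(unpermitted(detections, permits))
```

A brute-force comparison like this is quadratic in the number of points; at the scale described in the abstract, a spatial index would be the more practical choice.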

Webots.HPC: A Parallel Robotics Simulation Pipeline for Autonomous Vehicles on High Performance Computing

Aug 01, 2021

Matt Franchi

Abstract: In the rapidly evolving and maturing field of robotics, computer simulation has become an invaluable tool in the design process. Webots, a state-of-the-art robotics simulator, is often the software of choice for robotics research. Even so, Webots simulations are often run on personal and lab computers. For projects that would benefit from an aggregated output dataset from thousands of simulation runs, there is no standard recourse; this project sets out to mitigate this by developing a formalized parallel pipeline for running sequences of Webots simulations on powerful HPC resources. Such a pipeline would allow researchers to generate massive datasets from their simulations, opening the door for potential machine learning applications and decision tool development. We have developed a pipeline capable of running Webots simulations both headlessly and in GUI-enabled mode over an SSH X11 server, with simulation execution occurring remotely on HPC compute nodes. Additionally, simulations can be run in sequence, with a batch job being distributed across an arbitrary number of computing nodes and each node having multiple instances running in parallel. The implemented distribution and parallelization are extremely effective, with a 100% simulation completion rate after 12 hours of runs. Overall, this pipeline is very capable and can be used to extend existing projects or serve as a platform for new robotics simulation endeavors.

* 34 pages, 4 figures
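
The two-level layout described in the abstract (a batch split across nodes, with several instances per node) can be sketched as follows. This is an illustrative assumption, not the Webots.HPC implementation: `partition`, `run_simulation`, and `run_on_node` are hypothetical names, and `run_simulation` is a stand-in that does not launch Webots.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(run_ids, n_nodes):
    """Split run IDs round-robin into one list per compute node."""
    return [run_ids[i::n_nodes] for i in range(n_nodes)]


def run_simulation(run_id):
    """Stand-in for launching one simulator instance (e.g. through subprocess)."""
    return f"run-{run_id}: done"


def run_on_node(run_ids, instances_per_node):
    """Run one node's share of the batch with a fixed number of parallel instances."""
    with ThreadPoolExecutor(max_workers=instances_per_node) as pool:
        # pool.map returns results in the same order as run_ids.
        return list(pool.map(run_simulation, run_ids))


# Ten made-up runs spread over three nodes, two instances per node.
chunks = partition(list(range(10)), n_nodes=3)
print(chunks)
print(run_on_node(chunks[0], instances_per_node=2))
```

On a real cluster, each chunk would typically be handed to a separate scheduler job or array task rather than processed in one Python process. The thread pool here only models the per-node parallelism.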
