Nicolas Perony

Collaborative Control for Geometry-Conditioned PBR Image Generation

Feb 20, 2024

Shimon Vainer, Mark Boss, Mathias Parger, Konstantin Kutsy, Dante De Nigris, Ciara Rowles, Nicolas Perony, Simon Donné

Abstract: Current 3D content generation approaches build on diffusion models that output RGB images. Modern graphics pipelines, however, require physically-based rendering (PBR) material properties. We propose to model the PBR image distribution directly, avoiding photometric inaccuracies in RGB generation and the inherent ambiguity in extracting PBR from RGB. Existing paradigms for cross-modal fine-tuning are not suited for PBR generation due to both a lack of data and the high dimensionality of the output modalities: we overcome both challenges by retaining a frozen RGB model and tightly linking a newly trained PBR model using a novel cross-network communication paradigm. As the base RGB model is fully frozen, the proposed method does not risk catastrophic forgetting during fine-tuning and remains compatible with techniques such as IPAdapter pretrained for the base RGB model. We validate our design choices and robustness to data sparsity, and compare against existing paradigms, in an extensive experimental section.

* 19 pages, 10 figures; Project page: https://unity-research.github.io/holo-gen/

Chimpanzee voice prints? Insights from transfer learning experiments from human voices

Dec 15, 2021

Mael Leroux, Orestes Gutierrez Al-Khudhairy, Nicolas Perony, Simon W. Townsend

Abstract: Individual vocal differences are ubiquitous in the animal kingdom. In humans, these differences pervade the entire vocal repertoire and constitute a "voice print". Apes, our closest living relatives, possess individual signatures within specific call types, but the potential for a unique voice print has been little investigated. This is partially attributed to the limitations associated with extracting meaningful features from small data sets. Advances in machine learning have highlighted an alternative to traditional acoustic features, namely pre-trained learnt extractors. Here, we present an approach building on these developments: leveraging a feature extractor based on a deep neural network trained on over 10,000 human voice prints to provide an informative space over which we identify chimpanzee voice prints. We compare our results with those obtained by using traditional acoustic features and discuss the benefits of our methodology and the significance of our findings for the identification of "voice prints" in non-human animals.
