Miroslav Fil

Beta-VAE Reproducibility: Challenges and Extensions

Dec 30, 2021

Miroslav Fil, Munib Mesinovic, Matthew Morris, Jonas Wildberger

Abstract: $\beta$-VAE is a follow-up technique to variational autoencoders that proposes a special weighting of the KL divergence term in the VAE loss to obtain disentangled representations. Unsupervised learning is known to be brittle even on toy datasets, and a meaningful, mathematically precise definition of disentanglement remains difficult to find. Here we investigate the original $\beta$-VAE paper and add to the existing evidence indicating its lack of reproducibility. We also extend the experiments to additional, more complex datasets. We implement an FID scoring metric for the $\beta$-VAE model and provide a qualitative analysis of the results obtained. We end with a brief discussion of possible future investigations that could make the claims more robust.
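
The weighting of the KL term mentioned in the abstract can be illustrated with a minimal sketch. It assumes a diagonal Gaussian posterior, a standard normal prior, and a squared-error reconstruction term; the function name `beta_vae_loss` and the default `beta=4.0` are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Mean per-sample beta-VAE objective (to be minimised).

    x, x_recon: arrays of shape (batch, features)
    mu, logvar: posterior mean and log-variance, shape (batch, latent_dim)
    beta:       weight on the KL term (beta=1 gives the plain VAE loss)
    """
    # Reconstruction term: summed squared error per sample
    # (an assumed likelihood; other choices such as Bernoulli are possible).
    recon = np.sum((x - x_recon) ** 2, axis=1)
    # Closed-form KL( N(mu, diag(exp(logvar))) || N(0, I) ) per sample.
    kl = -0.5 * np.sum(1.0 + logvar - mu ** 2 - np.exp(logvar), axis=1)
    # The only change relative to a standard VAE is the factor beta on the KL term.
    return float(np.mean(recon + beta * kl))
```

Setting `beta` above 1 penalises the KL term more heavily than in a standard VAE, which is the mechanism the abstract credits with encouraging disentangled representations.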

DARTS without a Validation Set: Optimizing the Marginal Likelihood

Dec 24, 2021

Miroslav Fil, Binxin Ru, Clare Lyle, Yarin Gal

Abstract: The success of neural architecture search (NAS) has historically been limited by excessive compute requirements. While modern weight-sharing NAS methods such as DARTS are able to finish the search in single-digit GPU days, extracting the final best architecture from the shared weights is notoriously unreliable. Training-Speed-Estimate (TSE), a recently developed generalization estimator with a Bayesian marginal likelihood interpretation, has previously been used in place of the validation loss for gradient-based optimization in DARTS. This prevents the DARTS skip connection collapse, which significantly improves performance on NASBench-201 and the original DARTS search space. We extend those results by applying various DARTS diagnostics and show several unusual behaviors arising from not using a validation set. Furthermore, our experiments yield concrete examples of the depth gap and topology selection in DARTS having a strongly negative impact on the search performance, despite generally receiving limited attention in the literature compared to operation selection.

* Presented at the 5th Workshop on Meta-Learning at NeurIPS 2021
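
As a rough illustration of the estimator named in the abstract above, the sketch below assumes TSE is computed as the sum, over training epochs, of the mean mini-batch training loss. That definition is an assumption here, since the abstract does not spell it out, and the function name `training_speed_estimate` is hypothetical.

```python
def training_speed_estimate(epoch_batch_losses):
    """Sum over epochs of the mean mini-batch training loss.

    epoch_batch_losses: list of epochs, each a non-empty list of
    per-mini-batch training losses (floats).
    Assumed definition; a lower value means the model trained faster.
    """
    return sum(sum(losses) / len(losses) for losses in epoch_batch_losses)
```

Under this assumed definition the quantity depends only on training losses, which is why it can stand in for the validation loss when no validation set is held out.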
