Arslan Ali

Cosmos World Foundation Model Platform for Physical AI

Jan 07, 2025

NVIDIA: Niket Agarwal, Arslan Ali, Maciej Bala, Yogesh Balaji, Erik Barker, Tiffany Cai, Prithvijit Chattopadhyay, Yongxin Chen (+69 more)

Abstract: Physical AI needs to be trained digitally first. It needs a digital twin of itself, the policy model, and a digital twin of the world, the world model. In this paper, we present the Cosmos World Foundation Model Platform to help developers build customized world models for their Physical AI setups. We position a world foundation model as a general-purpose world model that can be fine-tuned into customized world models for downstream applications. Our platform covers a video curation pipeline, pre-trained world foundation models, examples of post-training of pre-trained world foundation models, and video tokenizers. To help Physical AI builders solve the most critical problems of our society, we make our platform open-source and our models open-weight with permissive licenses available via https://github.com/NVIDIA/Cosmos.

Rapid Detection of Aircrafts in Satellite Imagery based on Deep Neural Networks

Apr 21, 2021

Arsalan Tahir, Muhammad Adil, Arslan Ali

Abstract: Object detection is one of the fundamental objectives in applied computer vision. In some applications, such as satellite image processing, object detection becomes very challenging. Satellite image processing has remained a focus of researchers in domains such as precision agriculture, climate change, and disaster management. Therefore, object detection in satellite imagery is one of the most researched problems in this domain. This paper focuses on aircraft detection in satellite imagery using deep learning techniques, specifically the YOLO deep learning framework. The method uses satellite images collected from different sources as training data for the detection model. Object detection in satellite images is complex because objects vary widely in type, pose, and size, and appear against complex, dense backgrounds. YOLO has some limitations for small objects (less than $\sim$32 pixels per object), so we upsample the prediction grid to reduce the coarseness of the model and to accurately detect densely clustered objects. The improved model shows good accuracy and real-time performance on unseen images containing small, rotated, and densely packed objects.

Beyond cross-entropy: learning highly separable feature distributions for robust and accurate classification

Oct 29, 2020

Arslan Ali, Andrea Migliorati, Tiziano Bianchi, Enrico Magli

Abstract: Deep learning has shown outstanding performance in several applications including image classification. However, deep classifiers are known to be highly vulnerable to adversarial attacks, in that a minor perturbation of the input can easily lead to an error. Providing robustness to adversarial attacks is a very challenging task especially in problems involving a large number of classes, as it typically comes at the expense of an accuracy decrease. In this work, we propose the Gaussian class-conditional simplex (GCCS) loss: a novel approach for training deep robust multiclass classifiers that provides adversarial robustness while at the same time achieving or even surpassing the classification accuracy of state-of-the-art methods. Differently from other frameworks, the proposed method learns a mapping of the input classes onto target distributions in a latent space such that the classes are linearly separable. Instead of maximizing the likelihood of target labels for individual samples, our objective function pushes the network to produce feature distributions yielding high inter-class separation. The mean values of the distributions are centered on the vertices of a simplex such that each class is at the same distance from every other class. We show that the regularization of the latent space based on our approach yields excellent classification accuracy and inherently provides robustness to multiple adversarial attacks, both targeted and untargeted, outperforming state-of-the-art approaches over challenging datasets.
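
The equidistance property of the simplex vertices mentioned in the abstract can be illustrated with a small sketch. It uses one convenient construction (scaled standard basis vectors shifted to a zero centroid); the function names and the `scale` parameter are assumptions made for illustration, and the paper may place the class means differently.

```python
import itertools
import math


def simplex_vertices(num_classes, scale=1.0):
    """Return num_classes pairwise-equidistant points in R^num_classes.

    Illustrative construction only: scaled standard basis vectors,
    shifted so that their centroid sits at the origin.
    """
    centroid = scale / num_classes
    return [
        [scale * (1.0 if i == j else 0.0) - centroid for j in range(num_classes)]
        for i in range(num_classes)
    ]


def pairwise_distances(points):
    """Euclidean distance between every unordered pair of points."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]


# Hypothetical setting: 4 classes, class means placed with scale 10.
vertices = simplex_vertices(4, scale=10.0)
distances = pairwise_distances(vertices)
```

Every entry of `distances` equals `scale * sqrt(2)`, so no class mean is closer to one class than to another.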

BioMetricNet: deep unconstrained face verification through learning of metrics regularized onto Gaussian distributions

Aug 13, 2020

Arslan Ali, Matteo Testa, Tiziano Bianchi, Enrico Magli

Abstract: We present BioMetricNet: a novel framework for deep unconstrained face verification which learns a regularized metric to compare facial features. Differently from popular methods such as FaceNet, the proposed approach does not impose any specific metric on facial features; instead, it shapes the decision space by learning a latent representation in which matching and non-matching pairs are mapped onto clearly separated and well-behaved target distributions. In particular, the network jointly learns the best feature representation, and the best metric that follows the target distributions, to be used to discriminate face images. In this paper we present this general framework, the first of its kind for facial verification, and tailor it to Gaussian distributions. This choice enables the use of a simple linear decision boundary that can be tuned to achieve the desired trade-off between false alarm and genuine acceptance rate, and leads to a loss function that can be written in closed form. Extensive analysis and experimentation on publicly available datasets such as Labeled Faces in the Wild (LFW), YouTube Faces (YTF), Celebrities in Frontal-Profile in the Wild (CFP), and challenging datasets like cross-age LFW (CALFW), cross-pose LFW (CPLFW), and In-the-wild Age Dataset (AgeDB) show a significant performance improvement and confirm the effectiveness and superiority of BioMetricNet over existing state-of-the-art methods.

* Accepted at ECCV20
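
The trade-off between false alarm and genuine acceptance rate described in the BioMetricNet abstract can be sketched for a one-dimensional case. The sketch assumes matching and non-matching pairs follow two Gaussians with a shared standard deviation, and that a pair is accepted when its latent value falls below a threshold; the means, standard deviation, and function names are hypothetical and are not taken from the paper.

```python
import math


def normal_cdf(x, mean, std):
    """Cumulative distribution function of a Gaussian N(mean, std^2)."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def rates_at_threshold(threshold, match_mean, nonmatch_mean, std):
    """Return (false alarm rate, genuine acceptance rate).

    Assumes match_mean < nonmatch_mean and that a pair is declared
    matching when its latent value is below the threshold.
    """
    genuine_acceptance = normal_cdf(threshold, match_mean, std)
    false_alarm = normal_cdf(threshold, nonmatch_mean, std)
    return false_alarm, genuine_acceptance


# Hypothetical target distributions: N(-2, 1) for matching pairs,
# N(+2, 1) for non-matching pairs.
far_mid, gar_mid = rates_at_threshold(0.0, -2.0, 2.0, 1.0)
far_strict, gar_strict = rates_at_threshold(-1.0, -2.0, 2.0, 1.0)
```

Moving the threshold toward the matching mean lowers the false alarm rate at the cost of a lower genuine acceptance rate, which is the tuning knob the abstract refers to.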

MagNet: Discovering Multi-agent Interaction Dynamics using Neural Network

Mar 03, 2020

Priyabrata Saha, Arslan Ali, Burhan A. Mudassar, Yun Long, Saibal Mukhopadhyay

Abstract: We present MagNet, a neural network-based multi-agent interaction model to discover the governing dynamics and predict the evolution of a complex multi-agent system from observations. We formulate a multi-agent system as a coupled non-linear network with a generic ordinary differential equation (ODE) based state evolution, and develop a neural network-based realization of its time-discretized model. MagNet is trained to discover the core dynamics of a multi-agent system from observations, and tuned on-line to learn agent-specific parameters of the dynamics to ensure accurate prediction even when physical or relational attributes of agents, or the number of agents, change. We evaluate MagNet on a point-mass system in two-dimensional space, Kuramoto phase synchronization dynamics, and predator-swarm interaction dynamics, demonstrating orders of magnitude improvement in prediction accuracy over traditional deep learning models.

* Accepted manuscript by ICRA 2020
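
To make "time-discretized ODE-based state evolution" concrete, the following sketch applies a forward Euler step to the classical Kuramoto model, one of the benchmark systems named in the abstract. It is the textbook dynamics, not MagNet itself; the step size, coupling strength, and function names are assumptions chosen for illustration.

```python
import cmath
import math


def kuramoto_step(phases, natural_freqs, coupling, dt):
    """One forward Euler step of the Kuramoto model.

    d(theta_i)/dt = omega_i + (K / N) * sum_j sin(theta_j - theta_i)
    """
    n = len(phases)
    new_phases = []
    for i in range(n):
        interaction = sum(math.sin(phases[j] - phases[i]) for j in range(n))
        dtheta = natural_freqs[i] + (coupling / n) * interaction
        new_phases.append(phases[i] + dt * dtheta)
    return new_phases


def order_parameter(phases):
    """Magnitude of the mean phasor; 1.0 means full synchronization."""
    return abs(sum(cmath.exp(1j * p) for p in phases)) / len(phases)


# Hypothetical setup: four oscillators with identical natural frequency.
phases = [0.0, 0.5, 1.0, 1.5]
freqs = [1.0, 1.0, 1.0, 1.0]
r_initial = order_parameter(phases)
for _ in range(2000):
    phases = kuramoto_step(phases, freqs, coupling=2.0, dt=0.01)
r_final = order_parameter(phases)
```

With identical natural frequencies and positive coupling, the order parameter rises toward 1 as the phases lock together.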

Learning mappings onto regularized latent spaces for biometric authentication

Nov 20, 2019

Matteo Testa, Arslan Ali, Tiziano Bianchi, Enrico Magli

Abstract: We propose a novel architecture for generic biometric authentication based on deep neural networks: RegNet. Differently from other methods, RegNet learns a mapping of the input biometric traits onto a target distribution in a well-behaved space in which users can be separated by means of simple and tunable boundaries. More specifically, authorized and unauthorized users are mapped onto two different and well-behaved Gaussian distributions. The novel approach of learning the mapping instead of the boundaries further avoids the problem encountered in typical classifiers for which the learnt boundaries may be complex and difficult to analyze. RegNet achieves high performance in terms of security metrics such as Equal Error Rate (EER), False Acceptance Rate (FAR) and Genuine Acceptance Rate (GAR). The experiments we conducted on publicly available datasets of face and fingerprint confirm the effectiveness of the proposed system.

* Accepted at IEEE MMSP 2019
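
The security metrics named in the RegNet abstract (FAR, GAR, EER) can be computed empirically from two lists of scores. The sketch below assumes a higher score means "more likely authorized" and that a sample is accepted when its score reaches the threshold; the score values and function names are hypothetical and do not come from the paper.

```python
def far_gar(genuine_scores, impostor_scores, threshold):
    """Return (FAR, GAR) when samples with score >= threshold are accepted."""
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    gar = sum(s >= threshold for s in genuine_scores) / len(genuine_scores)
    return far, gar


def approximate_eer(genuine_scores, impostor_scores):
    """Scan observed scores as thresholds and return (EER estimate, threshold).

    The EER is where FAR equals the false rejection rate (1 - GAR); on a
    finite sample we take the threshold that minimizes the gap between them.
    """
    best = None
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        far, gar = far_gar(genuine_scores, impostor_scores, t)
        frr = 1.0 - gar
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2.0, t)
    return best[1], best[2]


# Hypothetical scores for authorized (genuine) and unauthorized (impostor) users.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2, 0.3, 0.5]
eer, eer_threshold = approximate_eer(genuine, impostor)
```

On this toy data one genuine and one impostor score are misplaced, so FAR and the false rejection rate meet at 25% for a threshold of 0.5.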
