Abstract: Multi-agent systems (MAS) play a significant role in exploring machine intelligence and advanced applications. To investigate the complicated interactions within MAS scenarios, we propose the "GNN for MBRL" model, which combines a state-space Graph Neural Network (GNN) with Model-Based Reinforcement Learning (MBRL) to address specific MAS tasks (e.g., Billiard-Avoidance, Autonomous Driving Cars). Specifically, we first use a GNN model to predict the future states and trajectories of multiple agents, then apply Model Predictive Control optimized with the Cross-Entropy Method (CEM) to help the ego-agent plan its actions and successfully accomplish these MAS tasks.
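As a rough illustration of the CEM planning step mentioned in the abstract above, the following is a minimal sketch, not the authors' implementation. The toy one-dimensional dynamics in `rollout_cost` stand in for the learned GNN predictor, and every name and parameter here (`rollout_cost`, `cem_plan`, `horizon`, `samples`, `elites`, `iters`) is an assumption made for illustration.

```python
import random
import statistics


def rollout_cost(actions, start=0.0, goal=1.0):
    """Toy stand-in for a learned model (an assumption, not the paper's GNN).

    Position advances by each action; cost is the squared distance to the
    goal plus a small penalty on action magnitude.
    """
    pos = start
    cost = 0.0
    for a in actions:
        pos += a
        cost += (pos - goal) ** 2 + 0.01 * a * a
    return cost


def cem_plan(cost_fn, horizon=5, samples=200, elites=20, iters=10, seed=0):
    """Cross-Entropy Method over action sequences of length `horizon`."""
    rng = random.Random(seed)
    mean = [0.0] * horizon
    std = [1.0] * horizon
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(samples)
        ]
        # Keep the lowest-cost sequences and refit the Gaussian to them.
        candidates.sort(key=cost_fn)
        elite = candidates[:elites]
        mean = [statistics.fmean(col) for col in zip(*elite)]
        std = [statistics.pstdev(col) + 1e-6 for col in zip(*elite)]
    return mean


plan = cem_plan(rollout_cost)
first_action = plan[0]  # in MPC, only this action is executed before replanning
```

In a Model Predictive Control loop, the planner would be called again at every time step from the newly observed state, so only the first action of each plan is ever applied.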
Abstract: Most computer vision applications aim to identify pixels in a scene and use them for diverse purposes. One intriguing application is car damage detection for insurance carriers, which aims to detect all car damage by comparing pre-trip and post-trip images and therefore requires two components: (i) car damage detection; (ii) image alignment. First, we implement a Mask R-CNN model to detect car damage on custom images. For image alignment, we propose a novel self-supervised, SimCLR-inspired Patch-to-Patch alignment approach that finds perspective transformations between custom pre- and post-rental car images, going beyond traditional computer vision methods.
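To make the notion of a perspective transformation in the abstract above concrete, here is a minimal sketch of how a 3x3 homography matrix maps a pixel coordinate from one image to another. It does not reproduce the Patch-to-Patch alignment method itself; the function name `apply_homography` and the matrix `H_example` are hypothetical and chosen only for illustration.

```python
def apply_homography(H, point):
    """Map a 2D point through a 3x3 perspective (homography) matrix."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to pixel coordinates.
    return (u / w, v / w)


# Hypothetical matrix: a shift by (5, -2) plus a mild perspective term.
H_example = [
    [1.0, 0.0, 5.0],
    [0.0, 1.0, -2.0],
    [0.001, 0.0, 1.0],
]

mapped = apply_homography(H_example, (100.0, 0.0))
```

An alignment method of any kind ultimately has to estimate the nine entries of such a matrix (eight degrees of freedom, since scale is arbitrary) from corresponding regions of the two images.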
Abstract: This work addresses semi-supervised image classification by integrating several effective self-supervised pretext tasks. Unlike the consistency regularization widely used in semi-supervised learning, we explore a novel self-supervised semi-supervised learning framework (Color-$S^{4}L$) built around an image colorization proxy task, and thoroughly evaluate the performance of various network architectures in this pipeline. We also demonstrate its effectiveness and optimal performance on the CIFAR-10, SVHN and CIFAR-100 datasets in comparison with the previous best supervised and semi-supervised methods.
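A colorization proxy task needs no human labels because the training target is the image itself. The sketch below shows one common way such input/target pairs can be built, assuming images stored as rows of `(r, g, b)` tuples and the standard BT.601 luma weights; it is not the Color-$S^{4}L$ pipeline, and the helper names are hypothetical.

```python
def to_grayscale(image):
    """Convert rows of (r, g, b) tuples to rows of luma values (BT.601 weights)."""
    return [
        [0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
        for row in image
    ]


def make_colorization_pair(image):
    """Return (network_input, target): grayscale goes in, original colors come out."""
    return to_grayscale(image), image


# A tiny 2x2 example image (illustrative values only).
example_image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
gray_input, color_target = make_colorization_pair(example_image)
```

Because every unlabeled image yields such a pair for free, the colorization loss can be computed on unlabeled data alongside the classification loss on the labeled subset.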
Abstract: Mixed Reality (MR) is constantly evolving to inspire new patterns of robot manipulation for more advanced Human-Robot Interaction under the 4th Industrial Revolution paradigm. Given that Mixed Reality aims to connect the physical and digital worlds to provide immersive experiences, it is necessary to establish an information exchange platform and robot control systems within the developed MR scenarios. In this work, we present multiple effective motion control methods applied to different interactive robotic arms (e.g., UR5, UR5e, myCobot) for the Unity-based development of MR applications, including a GUI control panel, a text input control panel, dynamic tracking of an object by the end-effector, and a ROS-Unity digital-twin connection.
Abstract: To facilitate recent advances in robotics and AI for delicate collaboration between humans and machines, we propose Kinova Gemini, an original robotic system that integrates conversational AI dialogue and visual reasoning so that the Kinova Gen3 lite robot can help people retrieve objects or complete perception-based pick-and-place tasks. When a person walks up to the Kinova Gen3 lite, Kinova Gemini is able to fulfill the user's requests in three different applications: (1) It can start a natural dialogue with the user, help retrieve objects, and hand them to the user one by one. (2) It detects diverse objects with YOLO v3 and recognizes the color attributes of each item, then asks through the dialogue whether the user wants it grasped or lets the user choose which specific item is required. (3) It applies YOLO v3 to recognize multiple objects and lets the user choose two items for perception-based pick-and-place tasks such as "Put the banana into the bowl" through visual reasoning and conversational interaction.
Abstract: The problem of hyperparameter optimization arises widely in real life, and many common tasks can be transformed into it, such as neural architecture search and feature subset selection. When no constraints are considered, existing hyperparameter tuning techniques can solve these problems effectively by traversing as many hyperparameter configurations as possible. However, because of limited resources and budget, it is not feasible to evaluate so many configurations, which requires us to design effective algorithms that find the best possible hyperparameter configuration with a finite number of configuration evaluations. In this paper, we simulate human thinking processes and combine the merits of existing techniques to propose a new algorithm called ExperienceThinking, which aims to solve this constrained hyperparameter optimization problem. In addition, we analyze the performance of 3 classical hyperparameter optimization algorithms with a finite number of configuration evaluations and compare it with that of ExperienceThinking. The experimental results show that our proposed algorithm achieves superior results.
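The constrained setting described in the abstract above, finding a good configuration with a fixed number of evaluations, can be illustrated with a generic budgeted random search. This is a minimal sketch and is not ExperienceThinking; the abstract does not name the three classical algorithms it compares against, so random search is used here only as a familiar example. The objective, the search space, and all names (`random_search`, `toy_objective`, `example_space`) are hypothetical.

```python
import random


def random_search(objective, space, budget, seed=0):
    """Evaluate exactly `budget` random configurations and keep the best one."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(budget):
        # Draw one value per hyperparameter from its candidate list.
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = objective(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score


def toy_objective(cfg):
    """Hypothetical score that peaks at lr=0.01, depth=4 (higher is better)."""
    return -abs(cfg["lr"] - 0.01) * 10 - abs(cfg["depth"] - 4)


example_space = {"lr": [0.001, 0.01, 0.1], "depth": [2, 4, 6, 8]}

best_cfg, best_score = random_search(toy_objective, example_space, budget=5)
```

The `budget` argument is the only stopping criterion, which mirrors the finite-evaluation constraint: any smarter method has to decide how to spend those same evaluations more effectively.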