Abstract:Recent advancements in deep learning for 3D models have propelled breakthroughs in generation, detection, and scene understanding. However, the effectiveness of these algorithms hinges on large training datasets. We address the challenge by introducing Efficient 3D Seam Carving (E3SC), a novel 3D model augmentation method based on seam carving, which progressively deforms only part of the input model while ensuring the overall semantics are unchanged. Experiments show that our approach is capable of producing diverse and high-quality augmented 3D shapes across various types and styles of input models, achieving considerable improvements over previous methods. Quantitative evaluations demonstrate that our method effectively enhances the novelty and quality of shapes generated by other subsequent 3D generation algorithms.
Abstract:In this paper, we present a GNN-based Line Segment Parser (GLSP), which uses a junction heatmap to predict line segments' endpoints, and graph neural networks to extract line segments and their categories. Different from previous floor plan recognition methods, which rely on semantic segmentation, our proposed method is able to output vectorized line segment and requires less post-processing steps to be put into practical use. Our experiments show that the methods outperform state-of-the-art line segment detection models on multi-class line segment detection tasks with floor plan images. In the paper, we use our floor plan dataset named Large-scale Residential Floor Plan data (LRFP). The dataset contains a total of 271,035 floor plan images. The label corresponding to each picture contains the scale information, the categories and outlines of rooms, and the endpoint positions of line segments such as doors, windows, and walls. Our augmentation method makes the dataset adaptable to the drawing styles of as many countries and regions as possible.
Abstract:Most of the achievements in artificial intelligence so far were accomplished by supervised learning which requires numerous annotated training data and thus costs innumerable manpower for labeling. Unsupervised learning is one of the effective solutions to overcome such difficulties. In our work, we propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures. We develop a method to construct the similarities between pictures as distance metrics in the embedding space by leveraging the inter-correlation between augmented versions of samples. Our experiments demonstrate that the method is able to represent the image in low dimensional space and performs competitively in downstream tasks such as image classification and image similarity comparison. Specifically, we achieved over 60% and 27% accuracy on the STL10 and CIFAR100 datasets with unsupervised clustering, respectively. Moreover, unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets to train the feature extractor, but still shows comparable or even better feature representation ability and easy-to-use characteristics. In our evaluations, the method outperforms all the state-of-the-art image retrieval algorithms on some out-of-domain image datasets. The code for the model implementation is available at https://github.com/chenmingxiang110/AugNet.
Abstract:In this paper, we propose a novel approach to generate images (or other artworks) by using neural cellular automatas (NCAs). Rather than training NCAs based on single images one by one, we combined the idea with variational autoencoders (VAEs), and hence explored some applications, such as image restoration and style fusion. The code for model implementation is available online.
Abstract:This paper presents a novel framework using neural cellular automata (NCA) to regenerate and predict geographic information. The model extends the idea of using NCA to generate/regenerate a specific image by training the model with various geographic data, and thus, taking the traffic condition map as an example, the model is able to predict traffic conditions by giving certain induction information. Our research verified the analogy between NCA and gene in biology, while the innovation of the model significantly widens the boundary of possible applications based on NCAs. From our experimental results, the model shows great potentials in its usability and versatility which are not available in previous studies. The code for model implementation is available at https://redacted.
Abstract:This paper presents a novel neural network design that learns the heuristic for Large Neighborhood Search (LNS). LNS consists of a destroy operator and a repair operator that specify a way to carry out the neighborhood search to solve the Combinatorial Optimization problems. The proposed approach in this paper applies a Hierarchical Recurrent Graph Convolutional Network (HRGCN) as a LNS heuristic, namely Dynamic Partial Removal, with the advantage of adaptive destruction and the potential to search across a large scale, as well as the context-awareness in both spatial and temporal perspective. This model is generalized as an efficient heuristic approach to different combinatorial optimization problems, especially to the problems with relatively tight constraints. We apply this model to vehicle routing problem (VRP) in this paper as an example. The experimental results show that this approach outperforms the traditional LNS heuristics on the same problem as well. The source code is available at \href{https://github.com/water-mirror/DPR}{https://github.com/water-mirror/DPR}.
Abstract:This paper presents an approach to learn the local-search heuristics that iteratively improves the solution of Vehicle Routing Problem (VRP). A local-search heuristics is composed of a destroy operator that destructs a candidate solution, and a following repair operator that rebuilds the destructed one into a new one. The proposed neural network, as trained through actor-critic framework, consists of an encoder in form of a modified version of Graph Attention Network where node embeddings and edge embeddings are integrated, and a GRU-based decoder rendering a pair of destroy and repair operators. Experiment results show that it outperforms both the traditional heuristics algorithms and the existing neural combinatorial optimization for VRP on medium-scale data set, and is able to tackle the large-scale data set (e.g., over 400 nodes) which is a considerable challenge in this area. Moreover, the need for expertise and handcrafted heuristics design is eliminated due to the fact that the proposed network learns to design the heuristics with a better performance. Our implementation is available online.