Abstract:The introduction of cheap RGB-D cameras, stereo cameras, and LIDAR devices has given the computer vision community 3D information that conventional RGB cameras cannot provide. This data is often stored as a point cloud. In this paper, we present a novel method to apply the concept of convolutional neural networks to this type of data. By creating a mapping of nearest neighbors in a dataset, and individually applying weights to spatial relationships between points, we achieve an architecture that works directly with point clouds, but closely resembles a convolutional neural net in both design and behavior. Such a method bypasses the need for extensive feature engineering, while proving to be computationally efficient and requiring few parameters.