Abstract: We present GO-Surf, a direct feature grid optimization method for accurate and fast surface reconstruction from RGB-D sequences. We model the underlying scene with a learned hierarchical feature voxel grid that encapsulates multi-level local geometric and appearance information. Feature vectors are directly optimized so that, after being tri-linearly interpolated, decoded by two shallow MLPs into signed distance and radiance values, and rendered via surface volume rendering, the discrepancy between synthesized and observed RGB/depth values is minimized. Our supervision signals -- RGB, depth and approximate SDF -- can be obtained directly from input images without any need for fusion or post-processing. We formulate a novel SDF gradient regularization term that encourages surface smoothness and hole filling while maintaining high-frequency details. GO-Surf can optimize sequences of $1$-$2$K frames in $15$-$45$ minutes, a speedup of $60\times$ over NeuralRGB-D, the most closely related approach based on an MLP representation, while maintaining comparable performance on standard benchmarks. Project page: https://jingwenwang95.github.io/go_surf/
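
To make the tri-linear interpolation step concrete, the following is a minimal sketch of querying a single-level feature voxel grid at a continuous 3D point. It is an illustration only and not the GO-Surf implementation: the function name `trilinear_interpolate`, the nested-list grid layout, and the use of voxel-unit coordinates are all assumptions, and the hierarchical (multi-level) grid, the MLP decoders and the rendering step are omitted.

```python
from math import floor


def trilinear_interpolate(grid, point):
    """Interpolate a feature vector at a continuous point in a voxel grid.

    Assumed layout (hypothetical, for illustration): grid[i][j][k] is a list
    of floats holding the feature stored at integer vertex (i, j, k), and
    point is (x, y, z) expressed in voxel units. Each grid dimension must
    contain at least two vertices.
    """
    dims = (len(grid), len(grid[0]), len(grid[0][0]))
    base, frac = [], []
    for coord, size in zip(point, dims):
        # Clamp the query so its 2x2x2 vertex neighbourhood stays in the grid.
        c = min(max(coord, 0.0), size - 1.0)
        b = min(int(floor(c)), size - 2)
        base.append(b)
        frac.append(c - b)

    channels = len(grid[0][0][0])
    out = [0.0] * channels
    # Accumulate the eight corner features, each weighted by the product of
    # its per-axis linear weights.
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                w = ((frac[0] if di else 1.0 - frac[0])
                     * (frac[1] if dj else 1.0 - frac[1])
                     * (frac[2] if dk else 1.0 - frac[2]))
                feat = grid[base[0] + di][base[1] + dj][base[2] + dk]
                for ch in range(channels):
                    out[ch] += w * feat[ch]
    return out


# Usage: a 2x2x2 grid whose single feature channel is the linear function
# i + 2j + 3k, which tri-linear interpolation reproduces exactly.
demo_grid = [[[[float(i + 2 * j + 3 * k)] for k in range(2)]
              for j in range(2)] for i in range(2)]
demo_value = trilinear_interpolate(demo_grid, (0.5, 0.25, 0.75))
```

Because every output is a weighted sum of the eight surrounding vertex features, the interpolated value is differentiable with respect to those features, which is what allows the grid entries themselves to be optimized directly from a rendering loss.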