Robotics research is often built on approaches motivated by insights from self-examination of how we interface with the world. However, given current theories about human cognition and sensory processing, it is reasonable to assume that the internal workings of the brain are separate from how we interface with the world and ourselves. To correct some of the misconceptions that arise from self-examination, this article reviews human visual understanding for cognition and action, specifically manipulation. Our focus is on identifying overarching principles, such as the separation of visual processing into processing for action and processing for cognition, the hierarchical processing of visual input, and the contextual and anticipatory nature of visual processing for action. We also provide a rudimentary exposition of earlier theories of visual understanding, which shows how self-examination can lead down the wrong path. We hope the article will give robotics researchers insights that help them navigate the path of self-examination, an overview of current theories of human visual processing, and a source of further relevant reading.