Language is contextual and sheaf theory provides a high level mathematical framework to model contextuality. We show how sheaf theory can model the contextual nature of natural language and how gluing can be used to provide a global semantics for a discourse by putting together the local logical semantics of each sentence within the discourse. We introduce a presheaf structure corresponding to a basic form of Discourse Representation Structures. Within this setting, we formulate a notion of semantic unification --- gluing meanings of parts of a discourse into a coherent whole --- as a form of sheaf-theoretic gluing. We illustrate this idea with a number of examples where it can used to represent resolutions of anaphoric references. We also discuss multivalued gluing, described using a distributions functor, which can be used to represent situations where multiple gluings are possible, and where we may need to rank them using quantitative measures. Dedicated to Jim Lambek on the occasion of his 90th birthday.