Abstract:The text-critical practice of grouping witnesses into families or texttypes often faces two obstacles: Contamination in the manuscript tradition, and co-dependence in identifying characteristic readings and manuscripts. We introduce non-negative matrix factorization (NMF) as a simple, unsupervised, and efficient way to cluster large numbers of manuscripts and readings simultaneously while summarizing contamination using an easy-to-interpret mixture model. We apply this method to an extensive collation of the New Testament epistle of Jude and show that the resulting clusters correspond to human-identified textual families from existing research.