Condition monitoring is central to the efficient operation of wind farms due to the challenging operating conditions, rapid technology development and large number of aging wind turbines. In particular, predictive maintenance planning requires early detection of faults with few false positives. This is a challenging problem because some faults have complex and weak signatures, in particular faults occurring in some of the drivetrain bearings. Here, we investigate recently proposed condition monitoring methods based on unsupervised dictionary learning using vibration data recorded over 46 months under typical industrial operations, thereby contributing novel test results and real-world data that are made publicly available. Results of former studies addressing condition monitoring tasks using dictionary learning indicate that unsupervised feature learning is useful for diagnosis and anomaly detection purposes. However, these studies are based on small sets of labeled data from test rigs operating under controlled conditions and focus on classification tasks, which are useful for quantitative method comparisons but give little information about how useful these approaches are in practice. In this study, dictionaries are learned from gearbox vibrations in six different turbines, and the dictionaries are subsequently propagated over a few years of monitoring data in which faults are known to occur. We perform the experiment using two different sparse coding algorithms to investigate whether the choice of algorithm affects the features of abnormal conditions. We calculate the dictionary distance between the initial and propagated dictionaries and find time periods of abnormal dictionary adaptation starting six months before a drivetrain bearing replacement and one year before the resulting gearbox replacement. We also investigate the distance between dictionaries learned from geographically nearby