Alyssa Herbst

Active Learning by Greedy Split and Label Exploration

Jun 17, 2019

Alyssa Herbst, Bert Huang

Abstract: Annotating large unlabeled datasets can be a major bottleneck for machine learning applications. We introduce a scheme for inferring labels of unlabeled data at a fraction of the cost of labeling the entire dataset. We refer to the scheme as greedy split and label exploration (GSAL). GSAL greedily queries an oracle (or human labeler) and partitions a dataset to find data subsets that have mostly the same label. GSAL can then infer labels by majority vote of the known labels in each subset. GSAL decides whether to split a subset or query a label from it by maximizing a lower bound on the expected number of correctly labeled examples. GSAL improves upon existing hierarchical labeling schemes by using supervised models to partition the data, thereby avoiding reliance on unsupervised clustering methods that may not accurately group data by label. We design GSAL with strategies to avoid bias that could be introduced through this adaptive partitioning. We evaluate GSAL on labeling of three datasets and find that it outperforms existing strategies for adaptive labeling.
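The majority-vote inference step and the idea of scoring a subset by a lower bound on correctly labeled examples can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the dictionary layout of a subset, and the Hoeffding-style bound with confidence parameter `delta` are all assumptions made for illustration, since the abstract does not state the exact bound GSAL uses. The sketch also ignores the bias-avoidance strategies the abstract mentions.

```python
import math
from collections import Counter


def majority_label(known_labels):
    """Return the most common label among the queried labels of a subset."""
    return Counter(known_labels).most_common(1)[0][0]


def lower_bound_correct(known_labels, subset_size, delta=0.05):
    """Hoeffding-style lower bound on how many examples in a subset the
    majority label would get right. The bound is an assumption for this
    sketch, not the bound used in the paper."""
    n = len(known_labels)
    if n == 0:
        return 0.0
    top_count = Counter(known_labels).most_common(1)[0][1]
    p_hat = top_count / n  # empirical fraction agreeing with the majority
    slack = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return subset_size * max(0.0, p_hat - slack)


def infer_labels(subsets):
    """Infer a label for every example by majority vote within its subset.

    Each subset is a dict (hypothetical layout) with:
      'indices': all example indices in the subset
      'known':   mapping from queried index to its oracle label
    Queried examples keep their oracle label; subsets with no queried
    labels are skipped.
    """
    inferred = {}
    for subset in subsets:
        if not subset["known"]:
            continue
        vote = majority_label(list(subset["known"].values()))
        for i in subset["indices"]:
            inferred[i] = subset["known"].get(i, vote)
    return inferred


subsets = [
    {"indices": [0, 1, 2, 3], "known": {0: "a", 1: "a", 2: "b"}},
    {"indices": [4, 5], "known": {4: "b"}},
]
labels = infer_labels(subsets)
```

In a scheme of this kind, a greedy loop would compare such a score for "query one more label here" against the score after "split this subset with a supervised model" and take whichever is larger; how GSAL computes and compares those quantities is specified in the paper, not in this sketch.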
