Abstract:Background: One of the key FDA-approved medications for Opioid Use Disorder (OUD) is buprenorphine. Despite its popularity, individuals often report various information needs regarding buprenorphine treatment on social media platforms like Reddit. However, the key challenge is to characterize these needs. In this study, we propose a theme-based framework to curate and analyze large-scale data from social media to characterize self-reported treatment information needs (TINs). Methods: We collected 15,253 posts from r/Suboxone, one of the largest Reddit sub-community for buprenorphine products. Following the standard protocol, we first identified and defined five main themes from the data and then coded 6,000 posts based on these themes, where one post can be labeled with applicable one to three themes. Finally, we determined the most frequently appearing sub-themes (topics) for each theme by analyzing samples from each group. Results: Among the 6,000 posts, 40.3% contained a single theme, 36% two themes, and 13.9% three themes. The most frequent topics for each theme or theme combination came with several key findings - prevalent reporting of psychological and physical effects during recovery, complexities in accessing buprenorphine, and significant information gaps regarding medication administration, tapering, and usage of substances during different stages of recovery. Moreover, self-treatment strategies and peer-driven advice reveal valuable insights and potential misconceptions. Conclusions: The findings obtained using our proposed framework can inform better patient education and patient-provider communication, design systematic interventions to address treatment-related misconceptions and rumors, and streamline the generation of hypotheses for future research.