Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Hyunbyung Park

Dataverse: Open-Source ETL Pipeline for Large Language Models

Mar 28, 2024

Hyunbyung Park, Sukyung Lee, Gyoungjin Gim, Yungi Kim, Dahyun Kim, Chanjun Park

Abstract:To address the challenges associated with data processing at scale, we propose Dataverse, a unified open-source Extract-Transform-Load (ETL) pipeline for large language models (LLMs) with a user-friendly design at its core. Easy addition of custom processors with block-based interface in Dataverse allows users to readily and efficiently use Dataverse to build their own ETL pipeline. We hope that Dataverse will serve as a vital tool for LLM development and open source the entire library to welcome community contribution. Additionally, we provide a concise, two-minute video demonstration of our system, illustrating its capabilities and implementation.

Via

Access Paper or Ask Questions

SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Dec 29, 2023

Dahyun Kim, Chanjun Park, Sanghoon Kim, Wonsung Lee, Wonho Song, Yunsu Kim, Hyeonwoo Kim, Yungi Kim, Hyeonju Lee, Jihoo Kim(+8 more)

Figure 1 for SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Figure 2 for SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Figure 3 for SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Figure 4 for SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling

Abstract:We introduce SOLAR 10.7B, a large language model (LLM) with 10.7 billion parameters, demonstrating superior performance in various natural language processing (NLP) tasks. Inspired by recent efforts to efficiently up-scale LLMs, we present a method for scaling LLMs called depth up-scaling (DUS), which encompasses depthwise scaling and continued pretraining. In contrast to other LLM up-scaling methods that use mixture-of-experts, DUS does not require complex changes to train and inference efficiently. We show experimentally that DUS is simple yet effective in scaling up high-performance LLMs from small ones. Building on the DUS model, we additionally present SOLAR 10.7B-Instruct, a variant fine-tuned for instruction-following capabilities, surpassing Mixtral-8x7B-Instruct. SOLAR 10.7B is publicly available under the Apache 2.0 license, promoting broad access and application in the LLM field.

* 13 pages

Via

Access Paper or Ask Questions