Department of Artificial Intelligence, Eötvös Loránd University
Abstract: We present Racka, a lightweight, continually pretrained large language model designed to bridge the resource gap between Hungarian and high-resource languages such as English and German. Racka employs parameter-efficient continual pretraining via Low-Rank Adaptation (LoRA) on a Qwen-3 4B backbone, making the recipe practical on A100 (40GB)-based HPC clusters with low inter-node bandwidth. To better match the training distribution, we replace and adapt the tokenizer, achieving substantially improved tokenization fertility for Hungarian while maintaining competitive performance in English and German. The model is trained on 160B subword tokens drawn from a mixture of internet and high-quality curated sources, with a composition of 44% Hungarian, 24% English, 21% German, and 11% code. This data mix is chosen to mitigate catastrophic forgetting and preserve high-resource language capabilities during continual pretraining. Our preliminary results indicate modest but stable gains in language adaptation.
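To make the recipe concrete, the following is a minimal sketch of LoRA-based continual pretraining on a Qwen-3 4B backbone using the Hugging Face transformers and peft libraries; the tokenizer path, LoRA rank, target modules, and sampling weights are illustrative assumptions, not the exact Racka configuration.

```python
# Sketch: parameter-efficient continual pretraining with LoRA on a 4B backbone.
# Hyperparameters and paths below are hypothetical placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-4B")            # 4B backbone
tokenizer = AutoTokenizer.from_pretrained("path/to/adapted-hu-tokenizer")  # hypothetical adapted tokenizer
base.resize_token_embeddings(len(tokenizer))  # align embeddings with the new vocabulary

lora_cfg = LoraConfig(
    r=64,                       # illustrative rank
    lora_alpha=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only the low-rank adapters are updated

# Corpus sampling weights mirroring the stated data mix
mix = {"hungarian": 0.44, "english": 0.24, "german": 0.21, "code": 0.11}
```

In this setup only the low-rank adapter matrices receive gradients, which keeps optimizer state and inter-node communication small enough for A100 (40GB) nodes with limited bandwidth.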
Abstract: Regulatory frameworks for the use of AI are emerging. However, they trail behind fast-evolving malicious AI technologies that can quickly cause lasting societal damage. In response, we introduce a pioneering Assistive AI framework designed to enhance human decision-making capabilities. This framework aims to establish a trust network across various fields, especially within legal contexts, serving as a proactive complement to ongoing regulatory efforts. Central to our framework are the principles of privacy, accountability, and credibility. In our methodology, the reliability of information and information sources rests on the ability to uphold accountability, enhance security, and protect privacy. This approach supports, filters, and potentially guides communication, thereby empowering individuals and communities to make well-informed decisions based on cutting-edge advancements in AI. Our framework uses the concept of Boards as proxies to collectively ensure that AI-assisted decisions are reliable, accountable, and aligned with societal values and legal standards. Through a detailed exploration of our framework, including its main components, operations, and sample use cases, the paper shows how AI can assist in the complex process of decision-making while maintaining human oversight. The proposed framework not only extends the regulatory landscape but also highlights the synergy between AI technology and human judgement, underscoring the potential of AI to serve as a vital instrument in discerning reality from fiction and thus enhancing the decision-making process. Furthermore, we provide domain-specific use cases to highlight the applicability of our framework.