Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tomaž Martinčič

Generative Model for Less-Resourced Language with 1 billion parameters

Oct 09, 2024

Domen Vreš, Martin Božič, Aljaž Potočnik, Tomaž Martinčič, Marko Robnik-Šikonja

Figure 1 for Generative Model for Less-Resourced Language with 1 billion parameters

Figure 2 for Generative Model for Less-Resourced Language with 1 billion parameters

Figure 3 for Generative Model for Less-Resourced Language with 1 billion parameters

Figure 4 for Generative Model for Less-Resourced Language with 1 billion parameters

Abstract:Large language models (LLMs) are a basic infrastructure for modern natural language processing. Many commercial and open-source LLMs exist for English, e.g., ChatGPT, Llama, Falcon, and Mistral. As these models are trained on mostly English texts, their fluency and knowledge of low-resource languages and societies are superficial. We present the development of large generative language models for a less-resourced language. GaMS 1B - Generative Model for Slovene with 1 billion parameters was created by continuing pretraining of the existing English OPT model. We developed a new tokenizer adapted to Slovene, Croatian, and English languages and used embedding initialization methods FOCUS and WECHSEL to transfer the embeddings from the English OPT model. We evaluate our models on several classification datasets from the Slovene suite of benchmarks and generative sentence simplification task SENTA. We only used a few-shot in-context learning of our models, which are not yet instruction-tuned. For classification tasks, in this mode, the generative models lag behind the existing Slovene BERT-type models fine-tuned for specific tasks. On a sentence simplification task, the GaMS models achieve comparable or better performance than the GPT-3.5-Turbo model.

Via

Access Paper or Ask Questions

Why is the winner the best?

Mar 30, 2023

Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Sharib Ali, Vincent Andrearczyk, Marc Aubreville, Ujjwal Baid(+115 more)

Figure 1 for Why is the winner the best?

Figure 2 for Why is the winner the best?

Figure 3 for Why is the winner the best?

Figure 4 for Why is the winner the best?

Abstract:International benchmarking competitions have become fundamental for the comparative performance assessment of image analysis methods. However, little attention has been given to investigating what can be learnt from these competitions. Do they really generate scientific progress? What are common and successful participation strategies? What makes a solution superior to a competing method? To address this gap in the literature, we performed a multi-center study with all 80 competitions that were conducted in the scope of IEEE ISBI 2021 and MICCAI 2021. Statistical analyses performed based on comprehensive descriptions of the submitted algorithms linked to their rank as well as the underlying participation strategies revealed common characteristics of winning solutions. These typically include the use of multi-task learning (63%) and/or multi-stage pipelines (61%), and a focus on augmentation (100%), image preprocessing (97%), data curation (79%), and postprocessing (66%). The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning. Two core general development strategies stood out for highly-ranked teams: the reflection of the metrics in the method design and the focus on analyzing and handling failure cases. According to the organizers, 43% of the winning algorithms exceeded the state of the art but only 11% completely solved the respective domain problem. The insights of our study could help researchers (1) improve algorithm development strategies when approaching new problems, and (2) focus on open research questions revealed by this work.

* accepted to CVPR 2023

Via

Access Paper or Ask Questions

Segmentation of Multiple Myeloma Plasma Cells in Microscopy Images with Noisy Labels

Nov 08, 2021

Álvaro García Faura, Dejan Štepec, Tomaž Martinčič, Danijel Skočaj

Figure 1 for Segmentation of Multiple Myeloma Plasma Cells in Microscopy Images with Noisy Labels

Figure 2 for Segmentation of Multiple Myeloma Plasma Cells in Microscopy Images with Noisy Labels

Figure 3 for Segmentation of Multiple Myeloma Plasma Cells in Microscopy Images with Noisy Labels

Figure 4 for Segmentation of Multiple Myeloma Plasma Cells in Microscopy Images with Noisy Labels

Abstract:A key component towards an improved and fast cancer diagnosis is the development of computer-assisted tools. In this article, we present the solution that won the SegPC-2021 competition for the segmentation of multiple myeloma plasma cells in microscopy images. The labels used in the competition dataset were generated semi-automatically and presented noise. To deal with it, a heavy image augmentation procedure was carried out and predictions from several models were combined using a custom ensemble strategy. State-of-the-art feature extractors and instance segmentation architectures were used, resulting in a mean Intersection-over-Union of 0.9389 on the SegPC-2021 final test set.

* Accepted to SPIE Medical Imaging conference

Via

Access Paper or Ask Questions