Eren Dogan

Cosmos-LLaVA: Chatting with the Visual (Cosmos-LLaVA: Görselle Sohbet Etmek)

Dec 03, 2024

Ahmed Zeer, Eren Dogan, Yusuf Erdem, Elif Ince, Osama Shbib, M. Egemen Uzun, Atahan Uz, M. Kaan Yuce, H. Toprak Kesgin, M. Fatih Amasyali

Abstract: In this study, a Turkish visual instruction model was developed, and various model architectures and dataset combinations were analysed to improve its performance. The Cosmos-LLaVA model, built by combining different large language models and image encoders, is designed to address shortcomings in Turkish-language support. The experiments analyse in detail how fine-tuning with various datasets affects model performance. The results show that model architecture and dataset selection have a significant impact on performance.

* In Turkish. 2024 8th International Artificial Intelligence and Data Processing Symposium (IDAP)

Optimizing Large Language Models for Turkish: New Methodologies in Corpus Selection and Training

Dec 03, 2024

H. Toprak Kesgin, M. Kaan Yuce, Eren Dogan, M. Egemen Uzun, Atahan Uz, Elif Ince, Yusuf Erdem, Osama Shbib, Ahmed Zeer, M. Fatih Amasyali

Abstract: In this study, we develop and assess new corpus selection and training methodologies to improve the effectiveness of Turkish language models. Specifically, we adapted datasets generated by large language models and translated English datasets into Turkish, then integrated these resources into the training process. This approach substantially improved model accuracy in both few-shot and zero-shot settings. Merging the adapted models markedly improved performance further. Human evaluation, including task-specific performance assessments, showed that the adapted models are better at comprehending Turkish and answering logic-based queries. This research underscores the importance of refining corpus selection strategies to optimize the performance of multilingual models, particularly for under-resourced languages like Turkish.

* 2024 Innovations in Intelligent Systems and Applications Conference (ASYU) published in IEEE Xplore

VLSI Hypergraph Partitioning with Deep Learning

Sep 02, 2024

Muhammad Hadir Khan, Bugra Onal, Eren Dogan, Matthew R. Guthaus

Abstract: Partitioning is a well-known problem in computer science and is critical in chip design workflows, where advances can significantly influence design quality and efficiency. Deep Learning (DL) techniques, particularly those involving Graph Neural Networks (GNNs), have demonstrated strong performance in various node, edge, and graph prediction tasks using both inductive and transductive learning methods. A notable area of recent interest within GNNs is pooling layers and their application to graph partitioning. While these methods have yielded promising results across social, computational, and other random graphs, their effectiveness has not yet been explored in the context of VLSI hypergraph netlists. In this study, we introduce a new set of synthetic partitioning benchmarks that emulate real-world netlist characteristics and possess a known upper bound for solution cut quality. We distinguish these benchmarks from prior work and evaluate existing state-of-the-art partitioning algorithms alongside GNN-based approaches, highlighting their respective advantages and disadvantages.
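
To make "solution cut quality" concrete, the following is a minimal sketch of two metrics commonly used for hypergraph partitions: the number of cut nets and the connectivity-minus-one sum. The function names and the toy netlist are illustrative assumptions; the abstract does not say which cut metric the benchmarks use.

```python
from typing import Dict, Sequence


def cut_size(nets: Sequence[Sequence[int]], part: Dict[int, int]) -> int:
    """Count nets (hyperedges) whose pins fall in more than one block."""
    return sum(1 for net in nets if len({part[v] for v in net}) > 1)


def connectivity_minus_one(nets: Sequence[Sequence[int]], part: Dict[int, int]) -> int:
    """Sum over nets of (number of distinct blocks touched - 1)."""
    return sum(len({part[v] for v in net}) - 1 for net in nets)


# Hypothetical toy netlist: each net lists the cell ids it connects.
nets = [[0, 1, 2], [2, 3], [3, 4, 5], [0, 5]]

# A two-way partition: cells 0-2 in block 0, cells 3-5 in block 1.
two_way = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

print(cut_size(nets, two_way))                # 2 (nets [2, 3] and [0, 5] are cut)
print(connectivity_minus_one(nets, two_way))  # 2
```

For two-way partitions the two metrics coincide; they differ only when a single net spans three or more blocks.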

GAT-Steiner: Rectilinear Steiner Minimal Tree Prediction Using GNNs

Jul 01, 2024

Bugra Onal, Eren Dogan, Muhammad Hadir Khan, Matthew R. Guthaus

Abstract: The Rectilinear Steiner Minimum Tree (RSMT) problem is a fundamental problem in VLSI placement and routing and is known to be NP-hard. Traditional RSMT algorithms either spend a significant amount of time finding Steiner points that reduce the total wire length, or use heuristic approximations that produce sub-optimal results. We show that Graph Neural Networks (GNNs) can be used to predict optimal Steiner points in RSMTs with high accuracy and can be parallelized on GPUs. In this paper, we propose GAT-Steiner, a graph attention network model that correctly predicts 99.846% of the nets in the ISPD19 benchmark with an average increase in wire length of only 0.480% on suboptimal wire length nets. On randomly generated benchmarks, GAT-Steiner correctly predicts 99.942% with an average increase in wire length of only 0.420% on suboptimal wire length nets.

* Preprint for The 2024 IEEE/ACM International Conference on Computer-Aided Design (ICCAD 2024)
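
To illustrate what a Steiner point buys in wire length, here is a minimal brute-force sketch, not the GAT-Steiner model. It relies on Hanan's theorem (some optimal RSMT uses only Steiner points on the grid formed by the terminals' x and y coordinates) and tries each grid point as a single extra node in a Manhattan-distance spanning tree. All function names and the example net are illustrative assumptions; the abstract does not describe the model's candidate set.

```python
from itertools import product
from typing import List, Optional, Tuple

Point = Tuple[int, int]


def manhattan(a: Point, b: Point) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mst_length(points: List[Point]) -> int:
    """Prim's algorithm on the complete graph with Manhattan edge weights."""
    if not points:
        return 0
    # best[i] = cheapest connection from point i to the tree built so far.
    best = {i: manhattan(points[0], points[i]) for i in range(1, len(points))}
    total = 0
    while best:
        i = min(best, key=best.get)
        total += best.pop(i)
        for j in best:
            best[j] = min(best[j], manhattan(points[i], points[j]))
    return total


def hanan_candidates(terminals: List[Point]) -> List[Point]:
    """Hanan grid points that are not already terminals."""
    xs = sorted({x for x, _ in terminals})
    ys = sorted({y for _, y in terminals})
    taken = set(terminals)
    return [p for p in product(xs, ys) if p not in taken]


def best_single_steiner_point(terminals: List[Point]) -> Optional[Point]:
    """Hanan candidate that minimizes the spanning-tree length, if any exists."""
    return min(
        hanan_candidates(terminals),
        key=lambda p: mst_length(terminals + [p]),
        default=None,
    )


# Hypothetical three-pin net.
net = [(0, 0), (4, 0), (2, 3)]
steiner = best_single_steiner_point(net)
print(mst_length(net))              # 9: spanning tree without Steiner points
print(steiner)                      # (2, 0)
print(mst_length(net + [steiner]))  # 7: matches the bounding-box half-perimeter
```

Exhaustive search like this scales poorly with pin count, which is the cost that a learned predictor of Steiner points aims to avoid.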

Introducing cosmosGPT: Monolingual Training for Turkish Language Models

Apr 26, 2024

H. Toprak Kesgin, M. Kaan Yuce, Eren Dogan, M. Egemen Uzun, Atahan Uz, H. Emre Seyrek, Ahmed Zeer, M. Fatih Amasyali

Abstract: The number of open-source language models that can produce Turkish is increasing day by day, as in other languages. To create the base versions of such models, the training of multilingual models is usually continued with Turkish corpora. The alternative is to train the model with only Turkish corpora. In this study, we first introduce the cosmosGPT models that we created with this alternative method. Then, we introduce new fine-tuning datasets that enable base language models to fulfill user requests, and new evaluation datasets for measuring the capabilities of Turkish language models. Finally, a comprehensive comparison of the adapted Turkish language models on different capabilities is presented. The results show that the language models we built with the monolingual corpus have promising performance despite being about 10 times smaller than the others.

Performance Comparison of Turkish Language Models (Türkçe Dil Modellerinin Performans Karşılaştırması)

Apr 25, 2024

Eren Dogan, M. Egemen Uzun, Atahan Uz, H. Emre Seyrek, Ahmed Zeer, Ezgi Sevi, H. Toprak Kesgin, M. Kaan Yuce, M. Fatih Amasyali

Abstract: The progress that language models have made in fulfilling almost all kinds of tasks has attracted the attention of not only researchers but also society at large, and has enabled the models to become products. There are commercially successful language models available. However, users may prefer open-source language models due to cost, data privacy, or regulations. Yet, despite the increasing number of these models, there is no comprehensive comparison of their performance for Turkish. This study aims to fill this gap in the literature. Seven selected language models are compared on their in-context learning and question-answering abilities. Turkish datasets for in-context learning and question-answering were prepared, and both automatic and human evaluations were conducted. The results show that, for question-answering, continuing pretraining before fine-tuning with instructional datasets is more successful in adapting multilingual models to Turkish, and that in-context learning performance is not strongly related to question-answering performance.

* In Turkish. The paper was not accepted by a conference because of a reviewer comment stating that it omitted some studies; however, the studies the reviewer mentioned had not yet been published at the paper submission deadline.

Deep Compression for PyTorch Model Deployment on Microcontrollers

Mar 29, 2021

Eren Dogan, H. Fatih Ugurdag, Hasan Unlu

Abstract: Neural network deployment on low-cost embedded systems, hence on microcontrollers (MCUs), has recently been attracting more attention than ever. Since MCUs have limited memory capacity as well as limited compute speed, it is critical that we employ model compression, which reduces both memory and compute requirements. In this paper, we add model compression, specifically Deep Compression, and further optimize Unlu's earlier work on arXiv, which efficiently deploys PyTorch models on MCUs. First, we prune the weights in convolutional and fully connected layers. Second, the remaining weights and activations are quantized from 32-bit floating-point to 8-bit integers. Finally, forward pass functions are compressed using special data structures for sparse matrices, which store only nonzero weights (without impacting performance and accuracy). In the case of the LeNet-5 model, the memory footprint was reduced by 12.45x, and the inference speed was boosted by 2.57x.

* 7 pages, 1 figure
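
The three steps named in the abstract (pruning, 8-bit quantization, sparse storage of nonzero weights) can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation: magnitude-threshold pruning, symmetric per-tensor quantization, and a CSR (compressed sparse row) layout are choices made here for the example, and all function names are hypothetical.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def prune(weights: Matrix, threshold: float) -> Matrix:
    """Magnitude pruning: zero out weights with |w| below the threshold."""
    return [[w if abs(w) >= threshold else 0.0 for w in row] for row in weights]


def quantize_int8(weights: Matrix) -> Tuple[List[List[int]], float]:
    """Symmetric per-tensor quantization (an assumption): w is approx. q * scale."""
    max_abs = max((abs(w) for row in weights for w in row), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [[max(-127, min(127, round(w / scale))) for w in row] for row in weights]
    return q, scale


def to_csr(matrix: List[List[int]]) -> Tuple[List[int], List[int], List[int]]:
    """Store only nonzero entries: values, their column indices, and row offsets."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_matvec(values: List[int], col_idx: List[int], row_ptr: List[int],
               x: List[int]) -> List[int]:
    """Integer matrix-vector product that skips pruned (zero) weights."""
    out = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        out.append(acc)
    return out


# Hypothetical 2x3 fully connected layer.
weights = [[0.4, -0.01, 0.0], [0.02, 1.0, -0.25]]
q, scale = quantize_int8(prune(weights, threshold=0.05))
values, col_idx, row_ptr = to_csr(q)
print(values, col_idx, row_ptr)  # [51, 127, -32] [0, 1, 2] [0, 1, 3]
print(csr_matvec(values, col_idx, row_ptr, [1, 2, 3]))  # [51, 158]
```

Multiplying the integer result by `scale` (and by the activation scale, if activations are quantized as well) recovers an approximation of the floating-point output.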
