Enhancing Bengali Language Processing: LoRA-Driven Adaptation in BLOOM

dc.contributor.advisor: Dr. Nabeel Mohammed (NbM)
dc.contributor.author: Md Hasibullah Hasib
dc.contributor.author: Samira Ali
dc.contributor.author: Irfanuddin Ahmed
dc.contributor.id: 1811451042
dc.contributor.id: 2011423042
dc.contributor.id: 1813476642
dc.coverage.department: Electrical and Computer Engineering
dc.date.accessioned: 2025-07-15T06:08:07Z
dc.date.available: 2025-07-15T06:08:07Z
dc.date.issued: 2023
dc.description.abstract: This paper presents a method for optimizing a multilingual language model, BLOOM, specifically for Bangla using LoRA (Low-Rank Adaptation), sharply reducing the number of trainable parameters without compromising performance. The process begins by quantizing the original model and then applies LoRA configurations, which inject trainable rank-decomposition matrices into the Transformer architecture while keeping the pre-trained weights frozen. This optimization reduces the trainable parameters from 3 billion to 4.9 million, just 0.16% of the original model, yet performance remains comparable to the larger, non-optimized version. Fine-tuning the LoRA-configured model on several Bangla datasets further demonstrates its flexibility: the Bangla2B+ dataset, a proprietary conversational dataset, and a corpus of over 300,000 articles were used for next-word prediction, question answering, and conversational tasks. Evaluation metrics showed a marked improvement, with perplexity scores decreasing below those of the original model. These results demonstrate LoRA's effectiveness in lowering computational complexity while improving the model's comprehension of Bangla, and they present a paradigm for efficient and proficient language modeling in low-resource languages: LoRA can simplify models for better performance without sacrificing computational efficiency.
Given the scarcity of multilingual models designed specifically for Bengali language processing, we aim not only to close a significant gap in language processing but also to spur the development of customized NLP solutions in vital industries, improving accuracy and accessibility in domains centered on the Bengali language.
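The parameter reduction reported in the abstract follows from LoRA's rank-decomposition arithmetic: each frozen d_in x d_out weight matrix gains only a small pair of low-rank factors. The sketch below illustrates that arithmetic using the abstract's figures; the layer dimension and rank are illustrative assumptions (2560 is BLOOM-3b's hidden size, but the thesis's actual rank and target modules are not stated here).

```python
# Illustrative LoRA parameter arithmetic -- dimensions and rank are
# assumed for demonstration, not taken from the thesis.

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one frozen weight matrix:
    a (d_in x rank) A factor plus a (rank x d_out) B factor."""
    return rank * (d_in + d_out)

# One assumed 2560x2560 attention projection adapted at rank 8:
per_matrix = lora_params(2560, 2560, 8)
print(per_matrix)  # 40960 trainable parameters for this matrix

# Figures reported in the abstract:
base_params = 3_000_000_000   # full BLOOM model
trainable = 4_900_000         # LoRA-adapted trainable subset
print(f"{trainable / base_params:.2%}")  # -> 0.16%
```

Because only these low-rank factors receive gradients, memory and compute during fine-tuning scale with the 4.9 million adapter parameters rather than the 3 billion frozen ones, which is what makes adaptation feasible for a low-resource setting.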
dc.description.degree: Undergraduate
dc.identifier.cd: 600000178
dc.identifier.print-thesis: To be assigned
dc.identifier.uri: https://repository.northsouth.edu/handle/123456789/1264
dc.language.iso: en
dc.publisher: North South University
dc.rights: ©NSU Library
dc.title: Enhancing Bengali Language Processing: LoRA-Driven Adaptation in BLOOM
oaire.citation.endPage: 27
oaire.citation.startPage: 1
Files
Original bundle (2 files)
- 600000178.Abstract.pdf (986.36 KB, Adobe Portable Document Format)
- 600000178.pdf (9.29 MB, Adobe Portable Document Format)
License bundle (1 file)
- license.txt (1.93 KB, Item-specific license agreed to upon submission)