Enhancing Bengali Language Processing: LoRA-Driven Adaptation in BLOOM

dc.contributor.advisor: Dr. Nabeel Mohammed (NbM)
dc.contributor.author: Md Hasibullah Hasib
dc.contributor.author: Samira Ali
dc.contributor.author: Irfanuddin Ahmed
dc.contributor.id: 1811451042
dc.contributor.id: 2011423042
dc.contributor.id: 1813476642
dc.coverage.department: Electrical and Computer Engineering
dc.date.accessioned: 2025-07-15T06:08:07Z
dc.date.available: 2025-07-15T06:08:07Z
dc.date.issued: 2023
dc.description.abstract: This paper presents a method for optimizing a multilingual language model, BLOOM, specifically for Bangla using LoRA (Low-Rank Adaptation), sharply reducing the number of trainable parameters without compromising performance. The process begins by quantizing the original model and then applies LoRA configurations, which inject trainable rank-decomposition matrices into the Transformer architecture while keeping the pre-trained weights frozen. This optimization reduces the trainable parameters from 3 billion to 4.9 million, just 0.16% of the original model, yet performance remains comparable to the larger, non-optimized version. Fine-tuning the LoRA-configured model on several Bangla datasets further demonstrates its flexibility: the Bangla2B+ dataset, a proprietary conversational dataset, and a corpus of over 300,000 articles were used for next-word prediction, question answering, and conversational tasks. Evaluation metrics showed a marked improvement, with perplexity scores decreasing below those of the original model. These results demonstrate LoRA's effectiveness in lowering computational complexity while improving the model's comprehension of Bangla, and they present a paradigm for efficient and proficient language modeling in low-resource languages: LoRA can simplify models for better performance without sacrificing computational efficiency.
Given the scarcity of multilingual models designed specifically for Bengali language processing, we aim not only to close a significant gap in language processing but also to spur the development of customized NLP solutions in vital industries, improving accuracy and accessibility in domains centered on the Bengali language.
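The parameter reduction reported in the abstract follows from LoRA's rank-decomposition arithmetic: each frozen d_in x d_out weight matrix gains only a small pair of low-rank factors. The sketch below illustrates that arithmetic using the abstract's figures; the layer dimension and rank are illustrative assumptions (2560 is BLOOM-3b's hidden size, but the thesis's actual rank and target modules are not stated here).

```python
# Illustrative LoRA parameter arithmetic -- dimensions and rank are
# assumed for demonstration, not taken from the thesis.

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one frozen weight matrix:
    a (d_in x rank) A factor plus a (rank x d_out) B factor."""
    return rank * (d_in + d_out)

# One assumed 2560x2560 attention projection adapted at rank 8:
per_matrix = lora_params(2560, 2560, 8)
print(per_matrix)  # 40960 trainable parameters for this matrix

# Figures reported in the abstract:
base_params = 3_000_000_000   # full BLOOM model
trainable = 4_900_000         # LoRA-adapted trainable subset
print(f"{trainable / base_params:.2%}")  # -> 0.16%
```

Because only these low-rank factors receive gradients, memory and compute during fine-tuning scale with the 4.9 million adapter parameters rather than the 3 billion frozen ones, which is what makes adaptation feasible for a low-resource setting.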
dc.description.degree: Undergraduate
dc.identifier.cd: 600000178
dc.identifier.print-thesis: To be assigned
dc.identifier.uri: https://repository.northsouth.edu/handle/123456789/1264
dc.language.iso: en
dc.publisher: North South University
dc.rights: ©NSU Library
dc.title: Enhancing Bengali Language Processing: LoRA-Driven Adaptation in BLOOM
oaire.citation.endPage: 27
oaire.citation.startPage: 1
Files
Original bundle (2 files)
- 600000178.Abstract.pdf (986.36 KB, Adobe Portable Document Format)
- 600000178.pdf (9.29 MB, Adobe Portable Document Format)
License bundle (1 file)
- license.txt (1.93 KB, Item-specific license agreed to upon submission)