Browsing by Author "Md Shahriar Karim"
- Item (Open Access): A Multichannel Localization Method for Camouflaged Object Detection (North South University, 2022). Mohammad Rakibur Rahman; Md Mafri Chowdhury; Md Shohanur Rahaman Sarker; Md Shahriar Karim; 1812964042; 1813393402; 1812457402. Camouflaged objects can be difficult to detect because they blend in with their surroundings. Numerous studies address camouflaged object detection, and many of the proposed approaches are recognized as effective. This paper presents a new algorithm for detecting camouflaged objects by focusing on identifying the region of interest, which is crucial for detecting these objects. The algorithm uses Phase Fourier Transformation to create a filtered image, and entropy to generate a feature map from the filtered image. The feature map is then used to determine the region of interest. This paper proposes a multichannel method for discriminative region localization in Camouflaged Object Detection (COD) tasks. In one channel, processing the phase and amplitude of a 2-D Fourier spectrum generates a modified form of the original image, which is later used for a pixel-wise optimal local entropy analysis. The other channel implements a class activation map (CAM) and Global Average Pooling (GAP) for object localization. We combine the channels linearly to form the final localized version of the COD images. Experiments on multiple COD datasets demonstrate that the proposed method successfully localizes regions containing more than 80% of the camouflaged objects. Our proposed method does not require memory-intensive devices or prior training on particular image features, so it can be integrated easily into a resource-constrained environment. The proposed approach is also applicable to non-COD images.
- Item (Open Access): Bangla Text to Speech With Emotion (North South University, 2022). Shaif Hossain Emon; Maruf Mustar Moon; Md Farhad Gazi; Md. Riazat Kabir Shuvo; Md Shahriar Karim; 1811603642; 1811009642; 1621760042; 1621535042. Spoken language technology has improved considerably. Many text-to-speech models, such as Tacotron, Tacotron 2, Deep Voice, FastSpeech, and wavelet, are used for synthesizing speech. Tacotron 2 has a mean opinion score of 4.58, meaning its computer-generated speech is mostly human-like. No work has been done on emotional Bangla text-to-speech. This paper proposes a transfer learning approach for generating emotional speech, with a separate Tacotron 2 model for each emotion. We have created a Bangla Emotional Text To Speech web app. It generates Bangla speech with a specific emotion from a text or audio input. Initially, it generates speech for three emotions and for poems. Users choose which emotion they want in the speech, and their text is then sent to the corresponding Tacotron 2 model. We have three models for creating sad, neutral, and happy speech, and another model for reading poems. To train the sad and happy models, we built our own dataset.
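
The first abstract above (A Multichannel Localization Method for Camouflaged Object Detection) describes a two-channel pipeline: a Fourier-based filtered image followed by local entropy, combined linearly with a CAM-based map. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes a phase-only reconstruction (amplitude flattened to 1), a fixed entropy window rather than the "optimal" one the abstract mentions, and a CAM supplied as a precomputed array. All function names and the `alpha`, `window`, and `bins` parameters are hypothetical.

```python
import numpy as np


def phase_only_reconstruction(gray):
    """Keep the Fourier phase, set the amplitude to 1, and invert (assumed filter)."""
    spectrum = np.fft.fft2(gray)
    phase = np.angle(spectrum)
    recon = np.fft.ifft2(np.exp(1j * phase))
    return np.abs(recon)


def local_entropy(img, window=9, bins=16):
    """Shannon entropy (in bits) of the quantized intensities around each pixel."""
    lo, hi = img.min(), img.max()
    scaled = (img - lo) / (hi - lo + 1e-12)
    # Quantize intensities into `bins` integer levels.
    q = np.minimum((scaled * bins).astype(int), bins - 1)
    pad = window // 2
    padded = np.pad(q, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=float)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            patch = padded[r:r + window, c:c + window]
            counts = np.bincount(patch.ravel(), minlength=bins)
            p = counts[counts > 0] / patch.size
            out[r, c] = -np.sum(p * np.log2(p))
    return out


def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    span = m.max() - m.min()
    if span == 0:
        return np.zeros(m.shape, dtype=float)
    return (m - m.min()) / span


def combine_channels(entropy_map, cam_map, alpha=0.5):
    """Linear combination of the two channels; alpha is a hypothetical weight."""
    return alpha * normalize(entropy_map) + (1 - alpha) * normalize(cam_map)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gray = rng.random((32, 32))      # stand-in for a grayscale COD image
    cam = rng.random((32, 32))       # stand-in for a CAM from a pretrained CNN
    filtered = phase_only_reconstruction(gray)
    entropy_map = local_entropy(filtered, window=9)
    localization = combine_channels(entropy_map, cam, alpha=0.5)
    print(localization.shape)
```

In practice the CAM channel would come from a pretrained classification network with Global Average Pooling, and the region of interest could be obtained by thresholding the combined map; both steps are left out here because the abstract does not specify them.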