Open Access
Generate A Textual Description Based on An Image
(North South University, 2022) Name: Shah Alvi Hossain; Md. Tahmidur Rahman; Dr. Mohammad Ashrafuzzaman Khan; 1721427042; 1721370042
The goal of this project is to have a computer detect what is happening in an image and produce a general description of it. Our approach is to build a dataset of images paired with in-depth descriptions and use it to train a model that identifies the objects in a picture and describes how they relate to one another. The dataset is built from images on the web: we collected images and near-accurate descriptions from authentic news websites and stored them in Google Sheets and on GitHub. Grammar-checking tools were used to check each description's grammar and to suggest better wording. We used an encoder-decoder architecture: a pre-trained convolutional neural network (VGG16) encodes the image into a hidden state, and an LSTM then decodes this hidden state to generate a caption.
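The data flow of the encoder-decoder described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the encoder here is a toy stand-in for VGG16 (it only averages pixel chunks), the LSTM weights are random and untrained, and the vocabulary, class name, and function names are all hypothetical. It shows only how an image feature vector can seed an LSTM's hidden state and how a caption can then be decoded greedily, one token at a time.

```python
import math
import random

# Hypothetical toy vocabulary; a real system would build one from the dataset.
VOCAB = ["<start>", "<end>", "a", "man", "rides", "horse"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def encode_image(pixels, hidden_size):
    """Toy stand-in for the VGG16 encoder (assumption, not the real network):
    averages chunks of a flat 0..255 pixel list into a fixed-length vector."""
    chunk = max(1, len(pixels) // hidden_size)
    features = []
    for k in range(hidden_size):
        part = pixels[k * chunk:(k + 1) * chunk] or [0]
        features.append(sum(part) / (255.0 * len(part)))
    return features


class TinyLSTMDecoder:
    """A single LSTM cell with random, untrained weights (illustration only)."""

    def __init__(self, vocab_size, hidden_size, seed=0):
        rng = random.Random(seed)
        in_size = vocab_size + hidden_size  # one-hot token + previous hidden state
        self.vocab_size = vocab_size
        # One weight matrix per gate: input, forget, output, candidate.
        self.gates = {g: rand_matrix(hidden_size, in_size, rng) for g in "ifog"}
        self.w_out = rand_matrix(vocab_size, hidden_size, rng)

    def step(self, token_id, h, c):
        x = [0.0] * self.vocab_size
        x[token_id] = 1.0
        z = x + h
        i = [sigmoid(v) for v in matvec(self.gates["i"], z)]
        f = [sigmoid(v) for v in matvec(self.gates["f"], z)]
        o = [sigmoid(v) for v in matvec(self.gates["o"], z)]
        g = [math.tanh(v) for v in matvec(self.gates["g"], z)]
        c_new = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h_new = [ok * math.tanh(ck) for ok, ck in zip(o, c_new)]
        logits = matvec(self.w_out, h_new)  # unnormalized score per vocabulary word
        return logits, h_new, c_new


def greedy_caption(features, decoder, max_len=10):
    """Seed the hidden state with the image features, then pick the
    highest-scoring word at each step until <end> or max_len."""
    h = list(features)
    c = [0.0] * len(features)
    token = VOCAB.index("<start>")
    words = []
    for _ in range(max_len):
        logits, h, c = decoder.step(token, h, c)
        token = max(range(len(logits)), key=logits.__getitem__)
        if VOCAB[token] == "<end>":
            break
        words.append(VOCAB[token])
    return words


# Usage: a fake 8x8 grayscale "image" as a flat list of pixel values.
fake_pixels = [(7 * k) % 256 for k in range(64)]
image_features = encode_image(fake_pixels, hidden_size=8)
decoder = TinyLSTMDecoder(len(VOCAB), hidden_size=8, seed=0)
print(greedy_caption(image_features, decoder, max_len=5))
```

Because the weights are untrained, the printed words are arbitrary; a trained model would learn the gate and output weights from image and description pairs. In practice the encoder would be a pre-trained VGG16 and the decoder an LSTM layer from a deep learning framework.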