Generate A Textual Description Based on An Image

Date
2022
Editor
Journal Title
Volume
Issue
Journal ISSN
Volume Title
Abstract
The goal of this project is to have a computer detect what is happening in an image and produce a general description of it. Our approach is to build a dataset of images with in-depth descriptions and use it to train a model that identifies the objects in a picture and captures the relations between them. The dataset is built from images collected on the web, each paired with a near-accurate description. We gathered the images and descriptions from reputable news websites and stored them in Google Sheets and on GitHub. Grammar-checking tools were used to verify the descriptions' grammar and suggest better wording. We used an encoder-decoder architecture: a pre-trained convolutional neural network (VGG16) encodes the image into a hidden state, and an LSTM decodes this hidden state to generate a caption.
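The snippet below is a minimal Keras/TensorFlow sketch of the kind of VGG16 + LSTM encoder-decoder captioning model the abstract describes. It is not the project's actual code: the vocabulary size, maximum caption length, embedding size, and the "merge" layout that combines image and text features are illustrative assumptions.

```python
# Minimal sketch (not the authors' code) of a VGG16 + LSTM captioning model.
# VOCAB_SIZE, MAX_LEN, and EMBED_DIM are assumed values for illustration.
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

VOCAB_SIZE = 5000   # assumed vocabulary size
MAX_LEN = 30        # assumed maximum caption length (in tokens)
EMBED_DIM = 256     # assumed embedding/hidden size

# Encoder: VGG16 pre-trained on ImageNet; the fc2 layer yields a 4096-d image vector.
vgg = VGG16(weights="imagenet")
feature_extractor = Model(inputs=vgg.input, outputs=vgg.get_layer("fc2").output)

# Decoder: image features plus the partial caption predict the next word.
img_input = Input(shape=(4096,))
img_dense = Dense(EMBED_DIM, activation="relu")(Dropout(0.5)(img_input))

cap_input = Input(shape=(MAX_LEN,))
cap_embed = Embedding(VOCAB_SIZE, EMBED_DIM, mask_zero=True)(cap_input)
cap_lstm = LSTM(EMBED_DIM)(Dropout(0.5)(cap_embed))

merged = add([img_dense, cap_lstm])
hidden = Dense(EMBED_DIM, activation="relu")(merged)
output = Dense(VOCAB_SIZE, activation="softmax")(hidden)

caption_model = Model(inputs=[img_input, cap_input], outputs=output)
caption_model.compile(loss="categorical_crossentropy", optimizer="adam")
```

At inference time, a caption would be generated one word at a time: the vector from feature_extractor and the tokens predicted so far are fed back into caption_model until an end-of-sequence token is produced or MAX_LEN is reached.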
Description
Keywords
TECHNOLOGY::Electrical engineering, electronics and photonics::Electrical engineering
Citation
Department Name
Electrical and Computer Engineering
Publisher
North South University
Printed Thesis
DOI
ISSN
ISBN