This project works on Flicker8k dataset to generate caption to given image using pretrained encoder is used.(InceptionV3)
It is a model that generates sequences that uses an encoder to encode input
image to fixed compressed form and then decoder decodes it, word by word into
a sequence.