When it comes to applying deep learning to natural language processing, contextual information matters a great deal! In recent years, improvements to the neural network architectures used for NLP have delivered impressive performance at extracting features from textual data and powering everyday applications. One such model, which we will discuss in this article, is the encoder-decoder architecture combined with an attention model.
The encoder-decoder architecture for recurrent neural networks has proven powerful for sequence-to-sequence prediction problems in natural language processing, such as neural machine translation and image caption generation. The encoder reads the input sequence and compresses it into a fixed-size context vector, and the decoder generates the output sequence from that context.
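To make the idea concrete, here is a minimal sketch of such an encoder-decoder pair in PyTorch. The vocabulary and hidden sizes are toy values chosen purely for illustration, and this is only one of many ways to wire the two halves together:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)

    def forward(self, src):
        # src: (batch, src_len) token ids
        embedded = self.embed(src)
        outputs, hidden = self.gru(embedded)
        # hidden is the fixed-size summary the decoder starts from
        return outputs, hidden

class Decoder(nn.Module):
    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, tgt, hidden):
        # tgt: (batch, tgt_len) token ids; hidden: encoder's final state
        embedded = self.embed(tgt)
        outputs, hidden = self.gru(embedded, hidden)
        return self.out(outputs), hidden  # per-step vocabulary logits

# Toy usage with hypothetical sizes (illustration only)
encoder = Encoder(vocab_size=1000, hidden_size=128)
decoder = Decoder(vocab_size=1000, hidden_size=128)
src = torch.randint(0, 1000, (2, 7))   # batch of 2 source sequences
tgt = torch.randint(0, 1000, (2, 5))   # batch of 2 target sequences
_, context = encoder(src)
logits, _ = decoder(tgt, context)
print(logits.shape)  # torch.Size([2, 5, 1000])
```

Notice that everything the decoder knows about the input must pass through the single `context` vector; that bottleneck is exactly the drawback that attention, discussed next, is designed to fix.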
We will discuss the drawbacks of the basic encoder-decoder model and then build a small version of the encoder-decoder with an attention model to understand why it matters so much for modern NLP applications!