RNNs are powerful and widely used. Often, however, we only need to look at recent information to perform the present task, rather than at information stored long ago. This situation is common in NLP, particularly in language modeling. Let's look at a common example:
![](/api/v2/epubs/9781788479042/files/assets/907bff1c-57b0-4927-9a45-0a2922d87fea.png)
Suppose a language model is trying to predict the next word based on the previous words. As human beings, if we try to predict the last word in "the sky is blue" without any further context, the next word is most likely ...
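To make the idea concrete, here is a minimal sketch (not code from this book) of how a vanilla RNN language model consumes the previous words one at a time and then produces a probability distribution over the next word. The toy vocabulary and the random, untrained weights are assumptions purely for illustration; the point is the data flow, not the prediction itself:

```python
import math
import random

random.seed(0)

# Toy vocabulary (an assumption for illustration)
vocab = ["the", "sky", "is", "blue", "grey"]
V, H = len(vocab), 8  # vocabulary size, hidden-state size

def rand_matrix(rows, cols):
    # Small random weights; a real model would learn these by training
    return [[random.gauss(0, 0.1) for _ in range(cols)] for _ in range(rows)]

W_xh = rand_matrix(H, V)  # input-to-hidden weights
W_hh = rand_matrix(H, H)  # hidden-to-hidden (recurrent) weights
W_hy = rand_matrix(V, H)  # hidden-to-output weights

def step(h, word):
    # One RNN step: combine the current word with the previous hidden state
    i = vocab.index(word)  # one-hot input, so W_xh @ x is just column i
    return [math.tanh(W_xh[j][i] + sum(W_hh[j][k] * h[k] for k in range(H)))
            for j in range(H)]

# Feed the previous words; the hidden state carries the recent context
h = [0.0] * H
for word in ["the", "sky", "is"]:
    h = step(h, word)

# Project the hidden state to scores over the vocabulary, then softmax
logits = [sum(W_hy[j][k] * h[k] for k in range(H)) for j in range(V)]
z = sum(math.exp(l) for l in logits)
probs = [math.exp(l) / z for l in logits]

prediction = vocab[max(range(V), key=probs.__getitem__)]
print(prediction)  # untrained, so the guess is arbitrary here
```

Because the relevant clue ("sky") appears only a couple of steps before the word being predicted, even a plain RNN's hidden state can carry it; the difficulty we discuss next arises when the gap between the clue and the prediction grows much larger.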