Why attention: seq2seq the bottleneck problem

A more general definition of attention:
Given a set of vector values, and a vector query, attention is a technique to compute a weighted sum of the values, dependent on the query.

alt text alt text alt text alt text

Reference

Updated: