Flying code monkey
A detailed explanation of the Transformer architecture and the self-attention mechanism in deep learning.