Attention is the key mechanism in the [[transformer model]]. It mimics cognitive attention: it enhances the important parts of the input data and fades out the rest. A minimal sketch of this idea is shown below.

[[positional encoding]] < [[Hands-on LLMs]]/[[2 LLMs and Transformers]] > [[self-attention]]
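
A minimal sketch of scaled dot-product attention, assuming NumPy and self-attention (queries, keys, and values all come from the same input); the names `q`, `k`, `v`, and the `softmax` helper are illustrative, not from this note:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (seq_len, d_k) matrices of queries, keys, and values.
    d_k = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_k)      # similarity of each query to each key
    weights = softmax(scores, axis=-1)   # enhance the important parts, fade out the rest
    return weights @ v                   # weighted sum of the values

# Tiny usage example: 3 tokens with 4-dimensional embeddings (hypothetical data).
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)  # self-attention: q = k = v = x
print(out.shape)  # (3, 4)
```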