Keys are derived from the input embeddings of the sequence elements (e.g., the words in a sentence) via a learned linear projection. Each key projects a word's representation into a space where it can be compared against queries: the dot product between a query and a key measures how relevant that word is to the word currently attending. In this way, keys help the model capture relationships between different words in the input sequence.

[[query]] < [[Hands-on LLMs]]/[[2 LLMs and Transformers]] > [[value]]
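A minimal NumPy sketch of the key projection and its role in scoring (the dimensions, random inputs, and weight matrices here are toy assumptions; in a real model `W_K` and `W_Q` are trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, d_k, seq_len = 8, 4, 3  # toy sizes, chosen for illustration

# Input embeddings: one row per sequence element (e.g., per word)
X = rng.normal(size=(seq_len, d_model))

# Learned projection matrices (random here; learned during training)
W_K = rng.normal(size=(d_model, d_k))
W_Q = rng.normal(size=(d_model, d_k))

K = X @ W_K  # keys: each embedding projected into the comparison space
Q = X @ W_Q  # queries: projected into the same space so dot products are meaningful

# Each query is compared against every key via scaled dot products
scores = Q @ K.T / np.sqrt(d_k)          # shape (seq_len, seq_len)

# Softmax turns scores into attention weights over the sequence
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)

print(K.shape)        # (3, 4)
print(weights.shape)  # (3, 3)
```

Each row of `weights` sums to 1 and tells us how much the corresponding word attends to every word in the sequence, with the keys determining how "findable" each word is.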