This section provides a primer on the fundamental concepts underlying transformers and large language models. The linked notes below introduce the key notions needed for a working understanding of these models; a short illustrative sketch of scaled dot-product attention follows the list.
## Concepts
- [[large language model]]
- [[transformer model]]
- [[token]]
- [[input embedding]]
- [[positional encoding]]
- [[learnius/llms/2 LLMs and Transformers/attention|attention]]
- [[self-attention]]
- [[query]]
- [[key]]
- [[value]]
- [[attention score]]
- [[dot product]]
- [[softmax function]]
- [[logit]]
- [[attention head]]
- [[multi-head self-attention]]
- [[residual connection]]
- [[transformer encoder]]
- [[transformer decoder]]
- [[masked multi-head self-attention]]
- [[encoder-decoder attention]]
- [[transformer output layer]]
- [[learnius/llms/2 LLMs and Transformers/prompt|prompt]]
- [[hallucination]]
- [[base LLM]]
- [[instruction-tuned LLM]]
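
To make the query, key, value, dot product, and softmax notions above concrete, here is a minimal sketch of scaled dot-product attention in NumPy. It is illustrative only: the shapes, the toy random data, and the names (`d_k`, `W_q`, `W_k`, `W_v`) are assumptions made for this note, not part of any particular library or model.

```python
# Minimal single-head scaled dot-product attention sketch (illustrative, not a reference implementation).
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Attention scores: dot products between queries and keys,
    # scaled by the square root of the key dimension.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns the scores (logits) into weights that sum to 1 per query.
    weights = softmax(scores, axis=-1)
    # Each output is a weighted sum of the value vectors.
    return weights @ V, weights

# Toy example: 4 tokens, model dimension 8 (assumed values).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # token embeddings (plus positional encoding in practice)
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
Q, K, V = x @ W_q, x @ W_k, x @ W_v  # project embeddings into queries, keys, values
out, weights = scaled_dot_product_attention(Q, K, V)
print(out.shape, weights.shape)      # (4, 8) (4, 4)
```

In a full transformer layer this computation is repeated across several attention heads (multi-head self-attention), the scores are masked in the decoder so tokens cannot attend to later positions, and the result flows through residual connections into the rest of the network.
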
[[1 Machine Learning Basics]] < [[Hands-on LLMs]] > [[3 Instruction-Tuned LLM Systems]]