An instruction-tuned LLM is a [[machine learning model]] that has been fine-tuned to follow natural-language instructions rather than simply continue the input text.
If you ask an instruction-tuned LLM "What is the capital of France?", it is much more likely to answer directly with "The capital of France is Paris", whereas a [[base LLM]] might just continue the text, for example with more quiz-style questions.
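A minimal sketch of this contrast, assuming the Hugging Face `transformers` library; the two checkpoints here are illustrative choices, not the models used in the course, so swap in whatever base and instruction-tuned models you actually have access to.

```python
# Contrast a base LLM with an instruction-tuned one on the same prompt.
from transformers import pipeline

prompt = "What is the capital of France?"

# Base LLM: trained only on next-word prediction, so it may just continue
# the text, e.g. with more quiz-style questions.
base_lm = pipeline("text-generation", model="gpt2")
print(base_lm(prompt, max_new_tokens=20)[0]["generated_text"])

# Instruction-tuned LLM (example checkpoint, assumed for illustration):
# fine-tuned to treat the prompt as an instruction, so it is far more
# likely to answer directly ("The capital of France is Paris.").
chat_lm = pipeline("text-generation", model="HuggingFaceH4/zephyr-7b-beta")
print(chat_lm(prompt, max_new_tokens=20)[0]["generated_text"])
```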
Instruction-tuned LLMs typically start from a base model trained on a massive [[dataset]] of text, which is then fine-tuned on examples of instructions and good attempts at following them, often further refined with [[reinforcement learning from human feedback]]. This makes them better at following instructions and more helpful.
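A rough sketch of the supervised fine-tuning step that sits between pre-training and RLHF, assuming PyTorch, `transformers`, a small base checkpoint ("gpt2" here purely for illustration), and a single made-up (instruction, response) pair; a real run would loop over a dataset with an optimizer.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

instruction = "What is the capital of France?"
response = " The capital of France is Paris."

# Tokenize prompt and response separately so the prompt tokens can be masked:
# the model is only trained to reproduce the response, i.e. the good attempt
# at following the instruction.
prompt_ids = tok(instruction, return_tensors="pt").input_ids
response_ids = tok(response, return_tensors="pt").input_ids
input_ids = torch.cat([prompt_ids, response_ids], dim=1)

labels = input_ids.clone()
labels[:, : prompt_ids.shape[1]] = -100  # -100 = ignored by the loss

loss = model(input_ids, labels=labels).loss  # next-token loss on the response only
loss.backward()  # one gradient step of instruction tuning (optimizer omitted)
```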
The human feedback also makes the system less likely to output problematic or toxic text that may have been present in the original training data.
## References
[[ChatGPT Prompt Engineering for Developers|Fulford&Ng-2023]]
[[base LLM]] < [[Hands-on LLMs]]/[[2 LLMs and Transformers]] >