Testing is the final evaluation step, performed after the [[machine learning model]]'s architecture has been selected and [[hyperparameter tuning]] has been completed on the [[validation]] set. The testing set provides an unbiased estimate of the model's performance on new, unseen data. Here's how the testing process works:

1. **Model Evaluation:** The model is evaluated on the testing set, which it has never seen during training or validation.
2. **Performance Metrics:** Performance metrics are computed to assess how well the model generalizes to new data. Common metrics include accuracy, precision, recall, F1 score, and mean squared error (see the sketch below).
3. **Generalization Assessment:** The testing results indicate how well the model can be expected to perform in real-world scenarios. If the testing performance is consistent with the validation performance, the model has likely generalized well; a noticeably worse test score suggests the model has overfit to the training or validation data.

In short, testing is the final check of the model's generalization performance on data it has never seen.
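
As a rough illustration of steps 1–3, here is a minimal sketch using scikit-learn; the dataset, model choice, and split ratios are placeholders for illustration, not part of this note:

```python
# Minimal sketch: train, then evaluate on a held-out test set.
# Dataset, model, and split sizes are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)

# Split into train / validation / test; the test set stays untouched until the very end.
X_train, X_temp, y_train, y_temp = train_test_split(X, y, test_size=0.4, random_state=42)
X_val, X_test, y_val, y_test = train_test_split(X_temp, y_temp, test_size=0.5, random_state=42)

# Fit on the training set (hyperparameters would already have been tuned on the validation set).
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Step 1–2: final, unbiased evaluation on the unseen test set with several metrics.
y_pred = model.predict(X_test)
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("f1 score :", f1_score(y_test, y_pred))

# Step 3: compare with validation performance; a large gap hints at overfitting.
print("val accuracy:", accuracy_score(y_val, model.predict(X_val)))
```

For a regression model, the same pattern applies with metrics such as mean squared error instead of the classification scores above.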