Machine learning systems require data to learn from. A dataset is a collection of related data used to train [[machine learning algorithm|machine-learning algorithms]]. It can be tabular, where each column corresponds to a variable and each row to a record, or it can simply be a collection of documents or files. The first kind is usually called a structured dataset (organized in tables, databases, etc.), while the second is an unstructured dataset (text, images, audio).
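
The difference between the two kinds of dataset can be sketched in a few lines of plain Python. This is only an illustration: the variable names (`age`, `income`, `owns_home`) and all values are invented for the example and do not come from any real dataset.

```python
# Structured dataset: every record (row) carries the same named variables (columns).
# All names and values here are made up for illustration.
structured = [
    {"age": 34, "income": 52000, "owns_home": True},
    {"age": 27, "income": 38000, "owns_home": False},
    {"age": 45, "income": 61000, "owns_home": True},
]

# Reading one column means collecting the same variable from every row.
ages = [row["age"] for row in structured]

# A structured dataset has a fixed schema: each row exposes the same columns.
columns = sorted(structured[0].keys())
same_schema = all(sorted(row.keys()) == columns for row in structured)

# Unstructured dataset: a plain collection of documents with no fixed columns.
unstructured = [
    "The movie was far better than I expected.",
    "Delivery took two weeks and the box arrived damaged.",
]

print(columns)       # column names shared by every row
print(ages)          # one variable across all records
print(same_schema)   # whether every row follows the same schema
print(len(unstructured), "free-text documents")
```

In practice, tabular data like this is often loaded into a dedicated table structure (for example a dataframe library), while unstructured data usually needs a preprocessing step before an algorithm can use it.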