In [[natural language processing|NLP]], a word embedding is a representation of a word as a high-dimensional, real-valued vector that encodes the meaning of the word in such a way that words with similar meanings lie close to each other in the vector space. The concept generalizes to any type of element in the input sequence of a neural network.
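One quick way to see this property, assuming the gensim library and its downloadable GloVe vectors are available, is to query similarities and nearest neighbours of a word (the model name and example words are just for illustration):

```python
import gensim.downloader as api

# Download a small set of pre-trained GloVe vectors (50 dimensions).
wv = api.load("glove-wiki-gigaword-50")

# Words with related meanings get a higher cosine similarity.
print(wv.similarity("king", "queen"))      # relatively high
print(wv.similarity("king", "cabbage"))    # much lower

# Nearest neighbours in the vector space are semantically related words.
print(wv.most_similar("king", topn=3))
```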
In the case of [[natural language processing|NLP]], and for practical reasons, the sequence of words is first converted into a sequence of [[token]]s drawn from a fixed-size vocabulary. Each token is then mapped to a high-dimensional vector by an embedding layer, which is essentially a learned lookup table with one row per token in the vocabulary.
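A minimal PyTorch sketch of this lookup (the vocabulary size, embedding dimension, and token ids below are made-up values for illustration):

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 32_000   # assumed size of the tokenizer's vocabulary
EMBED_DIM = 512       # assumed embedding dimension

# The embedding layer is a learned lookup table of shape (VOCAB_SIZE, EMBED_DIM).
embedding = nn.Embedding(num_embeddings=VOCAB_SIZE, embedding_dim=EMBED_DIM)

# A toy batch of token ids, as produced by a tokenizer.
token_ids = torch.tensor([[15, 2047, 7, 981]])   # shape: (batch, sequence length)

vectors = embedding(token_ids)                   # shape: (batch, sequence length, EMBED_DIM)
print(vectors.shape)                             # torch.Size([1, 4, 512])
```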
The input embeddings can be pre-trained with methods such as Word2Vec or GloVe, or learned from scratch as part of the model during training.
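The two options, sketched in PyTorch; the pre-trained weight matrix here is a random stand-in, whereas in practice it would be loaded from Word2Vec or GloVe files:

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 32_000
EMBED_DIM = 300

# Option 1: embeddings learned as part of the model (random init, updated by backprop).
learned = nn.Embedding(num_embeddings=VOCAB_SIZE, embedding_dim=EMBED_DIM)

# Option 2: embeddings initialized from pre-trained vectors.
# `pretrained_weights` stands in for a (VOCAB_SIZE, EMBED_DIM) matrix of Word2Vec/GloVe vectors.
pretrained_weights = torch.randn(VOCAB_SIZE, EMBED_DIM)
frozen = nn.Embedding.from_pretrained(pretrained_weights, freeze=True)      # keep fixed
finetuned = nn.Embedding.from_pretrained(pretrained_weights, freeze=False)  # fine-tune with the model
```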
[[token]] < [[Hands-on LLMs]]/[[2 LLMs and Transformers]] > [[positional encoding]]