A numerical representation of a word in a multi-dimensional space, capturing its meaning, context, and relationships to other words. It’s typically expressed as a dense, fixed-size array of real numbers.
Key Characteristics:
- Also known as a word embedding
- Encodes semantic and syntactic properties
- Similar words have similar vectors (e.g., king and queen)
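The "similar words have similar vectors" property is usually measured with cosine similarity. A minimal sketch, using hand-made toy vectors (illustrative values, not taken from any trained model):

```python
import numpy as np

# Toy 4-dimensional word vectors; the values are invented for illustration,
# not produced by a real embedding model.
vectors = {
    "king":  np.array([0.9, 0.8, 0.1, 0.2]),
    "queen": np.array([0.9, 0.7, 0.2, 0.9]),
    "apple": np.array([0.1, 0.1, 0.9, 0.5]),
}

def cosine(a, b):
    # Cosine similarity: close to 1.0 means the vectors point the same way.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(vectors["king"], vectors["queen"]))  # high: related words
print(cosine(vectors["king"], vectors["apple"]))  # lower: unrelated words
```

With real embeddings the same comparison holds: semantically related words score noticeably higher than unrelated ones.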
Role in LLMs:
- Each token (word or subword) is mapped to a word vector.
- These vectors are the model’s input and are adjusted during training to capture language patterns.
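The token-to-vector mapping is just a lookup into a trainable table with one row per vocabulary entry. A minimal sketch, assuming a hypothetical three-word vocabulary and a randomly initialized table standing in for learned weights:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary: token -> row index in the embedding table.
vocab = {"the": 0, "cat": 1, "sat": 2}
embed_dim = 8

# One row per token; in a real LLM these rows are learned during training.
embedding_table = rng.normal(size=(len(vocab), embed_dim))

def embed(tokens):
    # Look up each token's row — this matrix is the model's input.
    ids = [vocab[t] for t in tokens]
    return embedding_table[ids]

x = embed(["the", "cat", "sat"])
print(x.shape)  # (3, 8): one vector per token
```

In practice the vocabulary holds tens of thousands of subword tokens and the vectors have hundreds or thousands of dimensions, but the lookup is the same.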
Purpose:
Word vectors let models:
- Understand relationships (e.g., Paris is to France as Tokyo is to Japan)
- Perform analogy arithmetic on vectors (e.g., king – man + woman ≈ queen)
- Generalize across similar terms
A word vector is a mathematical encoding of a word’s meaning, enabling LLMs to process language in a way that captures nuance, similarity, and context.
