How LLMs Work

This page is a short, visual introduction to how large language models (LLMs) work, aimed at readers with no machine-learning background.

It covers three areas: the basic concepts LLMs are built on (tokens and embeddings), how models are trained, and how they generate text at inference time.

Tokenisation

LLMs do not read text character by character, and they do not operate on whole words either. Instead, a tokeniser splits text into tokens: common words become a single token, while rarer words are broken into subword pieces. This keeps the vocabulary at a fixed, manageable size (typically tens of thousands of entries) while still being able to represent any input, including misspellings, names, and words the model has never seen.
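To make the idea concrete, here is a minimal sketch of subword tokenisation using greedy longest-match lookup. The tiny vocabulary is invented for illustration and is not from any real model; real tokenisers (such as BPE) learn their vocabulary from large text corpora.

```python
# A toy subword tokeniser: repeatedly take the longest vocabulary
# entry that matches at the current position.
# This vocabulary is illustrative only, not from any real model.
VOCAB = {"un", "happi", "ness", "token", "is", "ation", "the", "a"}

def tokenise(word: str) -> list[str]:
    tokens = []
    i = 0
    while i < len(word):
        # Find the longest vocabulary entry starting at position i.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            # Unknown character: fall back to a single-character token.
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenise("unhappiness"))   # ['un', 'happi', 'ness']
print(tokenise("tokenisation"))  # ['token', 'is', 'ation']
```

Note how "unhappiness" never needs its own vocabulary entry: it is assembled from pieces that also appear in many other words, which is exactly why subword vocabularies stay small.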


Token Embeddings

Each word is represented as a point in a high-dimensional space — its position encodes meaning. Words with similar meanings end up nearby. Below we show a 3D projection of 50-dimensional GloVe vectors, so some distance relationships are approximate.

Try: cat, dog, fish — or king, queen, prince

These are static word embeddings (GloVe). Modern LLMs use contextual embeddings — the same word gets a different vector depending on surrounding words — but the core idea of meaning as position in space is the same.
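"Nearby in space" can be measured with cosine similarity, the standard way to compare embedding vectors. The sketch below uses made-up 3-dimensional vectors for illustration; real GloVe embeddings have 50 or more dimensions and are learned from text.

```python
import math

# Toy 3-dimensional "embeddings", invented for illustration.
# Real GloVe vectors are 50+ dimensional and learned from corpora.
EMBEDDINGS = {
    "cat":  [0.9, 0.8, 0.1],
    "dog":  [0.8, 0.9, 0.2],
    "fish": [0.6, 0.5, 0.4],
    "car":  [0.1, 0.2, 0.9],
}

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Related words score close to 1; unrelated words score lower.
print(cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["dog"]))
print(cosine_similarity(EMBEDDINGS["cat"], EMBEDDINGS["car"]))
```

With these toy vectors, cat/dog scores much higher than cat/car, mirroring what the 3D projection above shows visually: similar meanings cluster together.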

← Back to home