Tokenization

トークン化 (Japanese for "tokenization")

Intermediate · Models & Architecture

The process of breaking text into smaller units (tokens) that an AI model can process, like splitting words into subwords.

Why It Matters

How text is tokenized affects model performance, multilingual capabilities (languages underrepresented in the tokenizer's training data often split into more tokens), and token-based pricing, since API costs are typically billed per token rather than per character or word.
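A quick sketch of why token counts matter for pricing. The rate below is illustrative, not a real provider's price:

```python
# Hypothetical per-token pricing: cost scales with token count,
# not with the number of characters or words in the text.
PRICE_PER_1K_TOKENS = 0.002  # illustrative rate in USD (an assumption)

def estimate_cost(num_tokens: int) -> float:
    """Estimate the API cost for a request of num_tokens tokens."""
    return num_tokens / 1000 * PRICE_PER_1K_TOKENS

print(estimate_cost(1500))  # → 0.003
```

The same sentence can cost more in one language than another simply because it tokenizes into more pieces.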

Example in Practice

The word 'unhappiness' might be tokenized as ['un', 'happiness'] or ['un', 'hap', 'pi', 'ness'].
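The example above can be sketched with a greedy longest-match tokenizer. The vocabulary here is hypothetical and hand-picked for illustration; real subword tokenizers (e.g. BPE or WordPiece) learn their vocabularies from large corpora:

```python
# Minimal sketch of greedy longest-match subword tokenization.
# VOCAB is a toy, hand-picked vocabulary (an assumption), not one
# learned from data as real tokenizers do.
VOCAB = {"un", "happiness", "hap", "pi", "ness"}

def tokenize(word: str) -> list[str]:
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest possible substring starting at i first.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in VOCAB:
                tokens.append(piece)
                i = j
                break
        else:
            # Fall back to a single character if nothing matches.
            tokens.append(word[i])
            i += 1
    return tokens

print(tokenize("unhappiness"))  # → ['un', 'happiness']
```

Which split you get depends entirely on the vocabulary: drop 'happiness' from VOCAB and the same word falls back to smaller pieces like ['un', 'hap', 'pi', 'ness'].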
