Tokenization
Japanese: トークン化 (tōkun-ka)
Intermediate · Models & Architecture
The process of breaking text into smaller units (tokens) that an AI model can process, like splitting words into subwords.
Why It Matters
How text is tokenized affects model performance, multilingual capability, and cost, since API usage is typically billed per token.
Example in Practice
The word 'unhappiness' might be tokenized as ['un', 'happiness'] or ['un', 'hap', 'pi', 'ness'].
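The example above can be sketched with a toy greedy longest-match tokenizer. This is only an illustration: real tokenizers (e.g. BPE or WordPiece) learn their vocabularies from data, and the vocabulary below is invented by hand to show how the same word splits differently depending on which subwords the vocabulary contains.

```python
# Invented toy vocabulary (illustrative only, not from any real model).
VOCAB = {"un", "happiness", "hap", "pi", "ness",
         "h", "a", "p", "i", "n", "e", "s", "u"}

def tokenize(word, vocab):
    """Greedily match the longest vocabulary entry at each position."""
    tokens = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):  # try longest match first
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i]!r}")
    return tokens

print(tokenize("unhappiness", VOCAB))
# ['un', 'happiness']

# With 'happiness' missing from the vocabulary, the word breaks
# into smaller pieces instead:
print(tokenize("unhappiness", VOCAB - {"happiness"}))
# ['un', 'hap', 'pi', 'ness']
```

The single-character entries act as a fallback so every input can be covered, mirroring how real tokenizers guarantee coverage with byte- or character-level units.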