Tiny Stories BPE Tokenizer

An example BPE tokenizer trained on the Tiny Stories dataset.

2k:

  • vocabulary_size: 2000
  • model_max_length: 2048

8k:

  • vocabulary_size: 8000
  • model_max_length: 2048
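
To illustrate what the two settings above control, here is a minimal, self-contained sketch of byte-level BPE in plain Python. It is not the code used to train these tokenizers. The function names (`train_bpe`, `apply_merge`, `encode`), the 256-byte base vocabulary, the whitespace pre-tokenization, and the reading of `model_max_length` as a truncation cap are all assumptions made for illustration.

```python
from collections import Counter


def apply_merge(tokens, pair):
    """Replace every adjacent occurrence of `pair` in `tokens` with the merged token."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            out.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return tuple(out)


def train_bpe(corpus, vocabulary_size):
    """Toy trainer: learn merges until the vocabulary reaches `vocabulary_size`.

    Assumption: the base vocabulary is the 256 single-byte tokens, so at most
    `vocabulary_size - 256` merges are learned.
    """
    # Split on whitespace and represent each word as a tuple of 1-byte tokens.
    words = Counter(
        tuple(bytes([b]) for b in w.encode("utf-8")) for w in corpus.split()
    )
    merges = []
    while 256 + len(merges) < vocabulary_size:
        # Count adjacent token pairs, weighted by word frequency.
        pairs = Counter()
        for word, freq in words.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single token
        # Most frequent pair wins; ties are broken deterministically by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        merged_words = Counter()
        for word, freq in words.items():
            merged_words[apply_merge(word, best)] += freq
        words = merged_words
    return merges


def encode(text, merges, model_max_length=2048):
    """Tokenize `text` by replaying the merges in training order, then truncate."""
    tokens = []
    for w in text.split():
        word = tuple(bytes([b]) for b in w.encode("utf-8"))
        for pair in merges:
            word = apply_merge(word, pair)
        tokens.extend(word)
    # Assumption: model_max_length caps the number of tokens returned.
    return tokens[:model_max_length]


if __name__ == "__main__":
    corpus = "once upon a time there was a tiny tiny story"
    merges = train_bpe(corpus, vocabulary_size=260)
    print(merges)
    print(encode("tiny story", merges))
```

In this sketch, a larger `vocabulary_size` (8000 rather than 2000) allows more merges, so common words tend to be covered by fewer, longer tokens, at the cost of a larger embedding table in any model that uses the tokenizer.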