TinyStories: Tiny models are coherent and understand instructions
If their data is very simple
What is simple?
What 3-4 year old vocabularies allow (according to LLMs...)
There are three tracks. Two of them require you to use a small training corpus (which we provide) inspired by the language input children receive. The third loosens the restrictions: you can pre-train on a small natural language dataset of your choosing, and use unlimited non-linguistic data.
Interested? The training datasets are already out! The evaluation pipeline is coming soon!
Call for Papers: https://arxiv.org/abs/2301.11796
Website: http://babylm.github.io
#nlp #nlproc #pretraining #pretrain #babylm