Leshem Choshen · @LChoshen
1040 followers · 293 posts · Server sigmoid.social

TinyStories: Tiny models are coherent and can follow instructions
If their training data is very simple

What is simple?
What 3-4-year-old vocabularies allow (according to LLMs...)

arxiv.org/abs/2305.07759

#nlproc #llm #babylm

Last updated 1 year ago

Leshem Choshen · @LChoshen
957 followers · 214 posts · Server sigmoid.social

There are three tracks. Two of them require you to use a small training corpus (which we provide) inspired by the input children receive. The third loosens the restrictions: you can pre-train on a small natural-language dataset of your choosing and use unlimited non-linguistic data.

Interested? The training datasets are already out! The evaluation pipeline is coming soon!

Call for Papers: arxiv.org/abs/2301.11796

Website: babylm.github.io

#nlp #nlproc #pretraining #pretrain #babylm

Last updated 2 years ago