Leshem Choshen · @LChoshen
957 followers · 214 posts · Server sigmoid.social

There are three tracks. Two of them require you to use a small training corpus (which we provide) inspired by the linguistic input children receive. The third loosens the restrictions: you can pre-train on a small natural language dataset of your choosing and use unlimited non-linguistic data.

Interested? The training datasets are already out! Evaluation pipeline to come soon!

Call for Papers: arxiv.org/abs/2301.11796

Website: babylm.github.io

#nlp #nlproc #pretraining #pretrain #babylm


Leshem Choshen · @LChoshen
753 followers · 95 posts · Server sigmoid.social

We want to pretrain🤞
Instead we finetune🚮😔
Could we collaborate?🤗

ColD Fusion:
🔄Recycle finetuning to multitask
➡️evolve pretrained models forever

On 35 datasets
+2% improvement over RoBERTa
+7% in few-shot settings
🧵
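A rough sketch of the recycling idea described in this thread: several contributors each finetune the current shared model on their own dataset, their finetuned weights are fused back into a single model (here, by simple parameter averaging as an illustrative assumption, not the paper's actual fusion rule), and the fused model becomes the next round's starting point.

```python
import copy
import torch

def fuse_checkpoints(base_model, finetuned_models):
    """Average the parameters of several finetuned copies of one base model.

    All models are assumed to share the same architecture; plain averaging
    is a simplification of whatever fusion ColD Fusion actually uses.
    """
    fused = copy.deepcopy(base_model)
    fused_state = fused.state_dict()
    states = [m.state_dict() for m in finetuned_models]
    for name, tensor in fused_state.items():
        # Stack each contributor's version of this parameter and take the mean,
        # casting back to the original dtype (some buffers are integer-typed).
        stacked = torch.stack([s[name].float() for s in states])
        fused_state[name] = stacked.mean(dim=0).to(tensor.dtype)
    fused.load_state_dict(fused_state)
    return fused

# One collaborative round (outer loop sketched as pseudocode):
#   finetuned = [finetune(copy.deepcopy(base), ds) for ds in contributor_datasets]
#   base = fuse_checkpoints(base, finetuned)
# Repeating this is the "evolve pretrained models forever" loop from the post.
```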

#nlproc #machinelearning #nlp #ml #modelrecycling #CollaborativeAI #scientivism #pretrain
