Leshem Choshen · @LChoshen
1086 followers · 353 posts · Server sigmoid.social

This entity wasn't in the pretraining😢
Don't cry, little ML researcher

Take new term definitions
Continue them with follow-up sentences
Distill your model (KL divergence) to continue the same way, without the definition
You know the new terms now
Go prompt them, tiger🐯

arxiv.org/abs/2306.09306

#nlproc #modelrecycling #modelediting #machinelearning
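
The recipe in the post can be sketched on a toy scale. This is only an illustration, not the paper's implementation: the "teacher" is a fixed, made-up next-token distribution standing in for a model that has seen the definition, the "student" is a bare vector of logits standing in for the same model without the definition, and the KL direction (teacher to student) is an assumption. All names and numbers here are hypothetical.

```python
import math

def softmax(logits):
    """Turn logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distill(teacher_probs, student_logits, lr=0.5, steps=200):
    """Gradient descent on KL(teacher || softmax(student_logits))."""
    logits = list(student_logits)
    for _ in range(steps):
        q = softmax(logits)
        # The gradient of KL(p || softmax(z)) with respect to z is (q - p).
        logits = [z - lr * (qi - pi)
                  for z, qi, pi in zip(logits, q, teacher_probs)]
    return logits

# Hypothetical teacher: next-token distribution WITH the definition in context.
teacher_probs = [0.7, 0.2, 0.1]
# Hypothetical student: same vocabulary, NO definition, starts out uniform.
student_logits = [0.0, 0.0, 0.0]

kl_before = kl_divergence(teacher_probs, softmax(student_logits))
trained_logits = distill(teacher_probs, student_logits)
kl_after = kl_divergence(teacher_probs, softmax(trained_logits))
print(f"KL before: {kl_before:.4f}, KL after: {kl_after:.6f}")
```

In a real setup the teacher and student would be the same language model, and the loss would be summed over the tokens of each follow-up sentence rather than over a single distribution.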



Got multiple datasets and want to improve your own task?
Proposed: a multitask improvement to fuse-and-tune
But more importantly:

Weight the fused models without gradients!

arxiv.org/abs/2307.03506

#nlproc #fusing #modelrecycling #machinelearning
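
One way to picture "weighting fused models without gradients" is a black-box search over mixing coefficients. The sketch below uses plain random search as a stand-in; the linked paper may use a different derivative-free optimizer. The expert weight vectors, the `target_loss` function, and its hidden `ideal` weights are all hypothetical.

```python
import random

def fuse(models, coeffs):
    """Weighted average of several models' flattened weight vectors."""
    return [sum(c * m[i] for c, m in zip(coeffs, models))
            for i in range(len(models[0]))]

def target_loss(weights):
    """Hypothetical black-box score on the target task (lower is better)."""
    ideal = [1.0, -2.0, 0.5]  # unknown to the optimizer, for illustration only
    return sum((w - t) ** 2 for w, t in zip(weights, ideal))

def random_search(models, loss_fn, trials=500, seed=0):
    """Derivative-free search for mixing coefficients that sum to one."""
    rng = random.Random(seed)
    k = len(models)
    best_coeffs = [1.0 / k] * k  # start from the plain uniform average
    best_loss = loss_fn(fuse(models, best_coeffs))
    for _ in range(trials):
        raw = [rng.random() for _ in range(k)]
        total = sum(raw)
        coeffs = [r / total for r in raw]
        loss = loss_fn(fuse(models, coeffs))  # only loss values, no gradients
        if loss < best_loss:
            best_coeffs, best_loss = coeffs, loss
    return best_coeffs, best_loss

# Three hypothetical experts, each finetuned on a different dataset.
experts = [[2.0, -2.0, 0.0], [0.0, -2.0, 1.0], [1.0, 0.0, 0.5]]
uniform_loss = target_loss(fuse(experts, [1 / 3, 1 / 3, 1 / 3]))
best_coeffs, best_loss = random_search(experts, target_loss)
print(f"uniform average: {uniform_loss:.4f}, searched weighting: {best_loss:.4f}")
```

Because the search only needs loss evaluations, it works even when the target metric is not differentiable; the fused model can then be finetuned on the target task as usual.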
