Published papers at TMLR · @tmlrpub
520 followers · 442 posts · Server sigmoid.social

The Stack: 3 TB of permissively licensed source code

Denis Kocetkov, Raymond Li, Loubna Ben Allal et al.

Action editor: Swarat Chaudhuri.

openreview.net/forum?id=pxpbTd

#bigcode #text2code #dataset

Last updated 2 years ago

VentureBeat · @VentureBeat
69 followers · 53 posts · Server press.coop

A new report from @Sourcegraph warns the issue with #bigcode will hit crisis mode if companies don't get a handle on how their #developers use #ai at work. venturebeat.com/ai/developers-

#bigcode #developers #ai #press

Last updated 2 years ago

Vincent HETRU · @vh
142 followers · 249 posts · Server sigmoid.social

#huggingface just released the #santacoder models for the holiday season. Part of the #bigcode project, these 1.1B parameter models are trained on #Python, #java, and #javascript and use data-filtering techniques like near-deduplication and comment-to-code ratio.

huggingface.co/bigcode/santaco

🤗

#huggingface #santacoder #bigcode #Python #java #javascript #ai #deeplearning

Last updated 3 years ago

Matthias Stürmer · @maemst
467 followers · 84 posts · Server swiss.social

137 million repositories with 92 terabytes of #OpenSource source code data from #GitHub: very impressive how much code is being processed for the @huggingface #bigcode project! huggingface.co/bigcode Presented by #leandrovonwerra at #dinacon22 in Bern: dinacon.ch

#OpenSource #GitHub #bigcode #leandrovonwerra #dinacon22 #dinacon

Last updated 3 years ago