Some "open AI" is much more open than the industry-dominating offerings. There's #EleutherAI, a donor-supported nonprofit whose models come with documentation and code, licensed under #Apache2. There are also some smaller academic offerings: #Vicuna (UCSD/CMU/Berkeley), #Koala (Berkeley), and #Alpaca (Stanford).
These are indeed more open (though Alpaca, which could run on a laptop, had to be withdrawn because it "hallucinated" so profusely).
40/
#eleutherai #apache2 #vicuna #koala #alpaca
Gizmodo: Anti-Piracy Group Takes Massive AI Training Dataset 'Books3′ Offline https://gizmodo.com/anti-piracy-group-takes-ai-training-dataset-books3-off-1850743763 #generativepretrainedtransformer #artificialneuralnetworks #artificialintelligence #largelanguagemodels #technologyinternet #mariafredenslund #sarahsilverman #shawnpresser #deeplearning #eleutherai #microsoft #chatgpts #thepile #chatgpt #openai #llama #gpt3 #gpt4 #meta
New: »Algorithmische Affären und Binärcodebekenntnisse oder Wie schaffen wir gemeinsam Text?« ("Algorithmic Affairs and Binary-Code Confessions, or: How Do We Create Text Together?") by #ClaraCosimaWolff with #EleutherAI and #GPT3, with a cover illustration by #LukasGütnher (#AufklärungundKritik 530)
#claracosimawolff #eleutherai #gpt3 #lukasgutnher #aufklarungundkritik
The article points out that we'll see lawsuits against artificial intelligence models for years to come. Good for her & them; these companies should start over w/o copyrighted sources.
#AI #artificial #intelligence #copyright #law is a #thing | #dataset #large #language #models #illegal #sources #Bibliotik #flagrantly #illegal #programmers #artists #suing #similar #case #EleutherAI #ThePile #publishers #writers #songwriters #stolen #works #Meta #llama #ChatGPT #library #OpenAI https://www.theverge.com/2023/7/9/23788741/sarah-silverman-openai-meta-chatgpt-llama-copyright-infringement-chatbots-artificial-intelligence-ai
Ars Technica: “A really big deal”—Dolly is a free, open source, ChatGPT-style AI model https://arstechnica.com/?p=1931693 #Tech #arstechnica #IT #Technology #largelanguagemodels #machinelearning #textsynthesis #ApacheSpark #Databricks #EleutherAI #finetuning #Biz&IT #pythia #Dolly #LLaMA #meta #AI
“A really big deal”—Dolly is a free, open source, ChatGPT-style AI model
On Wednesday, Databricks released... - https://arstechnica.com/?p=1931693 #largelanguagemodels #machinelearning #textsynthesis #apachespark #databricks #eleutherai #finetuning #biz #pythia #dolly #llama #meta #ai
💻 We are ready to train with massive compute resources and state-of-the-art open source models from our partner community #EleutherAI 4/5
There are some really good papers that have sought to make the best of the current situation, but #EleutherAI had the compute to do it the right way and so we did.
https://arxiv.org/abs/2211.08411
https://arxiv.org/abs/2202.07646
https://arxiv.org/abs/2202.07206
https://arxiv.org/abs/2207.14251
We hope that this work will empower more people to work on questions in interpretability, especially the causal impact of training data on model behavior!
What do LLMs learn over the course of training? How do these patterns change as you scale? To help answer these questions, we are releasing Pythia, a suite of LLMs + checkpoints designed for research on interpretability and training dynamics!
The models range in size from 19M to 13B parameters, come with 143 intermediate checkpoints each, and were all trained on the exact same data in the exact same order.
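A minimal sketch of how one might iterate over those intermediate checkpoints for a training-dynamics study. It assumes the checkpoints are published as Hugging Face revisions named "step1000" through "step143000", one per 1000 training steps (143 in total, matching the count above) — the exact revision naming and model ID (`EleutherAI/pythia-70m`) are assumptions here, so check the model card before relying on them:

```python
# Sketch (assumptions noted above): enumerate Pythia checkpoint revisions and
# show where a per-checkpoint analysis would load each one.

def pythia_checkpoint_revisions(last_step=143_000, stride=1_000):
    """Return the assumed revision names for the 143 intermediate checkpoints."""
    return [f"step{s}" for s in range(stride, last_step + 1, stride)]

def load_checkpoint(model_id="EleutherAI/pythia-70m", revision="step1000"):
    """Load one intermediate checkpoint (network access required; model ID is
    an assumption for illustration)."""
    from transformers import AutoModelForCausalLM  # lazy import: heavy dependency
    return AutoModelForCausalLM.from_pretrained(model_id, revision=revision)

if __name__ == "__main__":
    for rev in pythia_checkpoint_revisions():
        # model = load_checkpoint(revision=rev)  # then probe the model here
        print(rev)
```

Because every model in the suite saw the same data in the same order, comparing the same checkpoint step across sizes isolates the effect of scale.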
#ml #ai #nlproc #interpretability #eleutherai
#AI #machinelearning #nlproc #openscience #opendata #ScienceMastodon #EleutherAI #academicmastodon #math
arxiv.org/abs/2210.06413