hlfshell · @hlfshell
74 followers · 324 posts · Server hachyderm.io

Due to travel I've had time to sit down and start working on my large research paper to-read list. The 2 I chose to start with were incredible:

First up is SayCan, which uses RL to guide LLMs in controlling robots in a complex kitchen environment
say-can.github.io/

...then we have Generative Agents: Interactive Simulacra of Human Behavior, where the authors used LLMs to control multiple agents and had them simulate a small town of unique personalities
arxiv.org/abs/2304.03442

#robotics #llm #rl #ai

Last updated 1 year ago

Rene Schulte · @rschu
594 followers · 287 posts · Server arvr.social

WOAH! 🤯 The first autonomous vision-based drone that beats human world champions in head-to-head races.

They use Reinforcement Learning to achieve this groundbreaking mobile robotics milestone. 🤖

Open access paper in Nature: nature.com/articles/s41586-023
Author post: twitter.com/davsca1/status/169
Full video: youtube.com/watch?v=fBiataDpGI

Cool stuff, but the implications are also terrifying when you see how FPV drones are being used in active conflicts right now.

#robotics #ai #rl #cv #deeplearning

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
205 followers · 743 posts · Server sigmoid.social

RLTF: Reinforcement Learning from Unit Test Feedback

openreview.net/forum?id=hjYmsV

#rl #code #reinforcement

Last updated 1 year ago

Andreas Preissig · @bahntiger
10 followers · 30 posts · Server zug.network

The S6 is currently running between […] and […] via the freight track of line 3401. In case anyone still wants to collect lines and switches...

@nordkommission, for example? Before you just sit sadly at the airport today...

#RheinNeckar #rl #rm

Last updated 1 year ago

hlfshell · @hlfshell
67 followers · 247 posts · Server hachyderm.io

I finally finished my writeup on using PPO to control a robotic arm in an attempt to solve a pick-and-place problem.

hlfshell.ai/posts/ppo-pick-and

In the post I discuss my successes, failures, how everything works, and how I debugged the problem.

It's my first attempt at an in-depth tech blog post.

#robotics #rl #reinforcementlearning #ai

Last updated 1 year ago

Thibaut Bulle Darchi 🫧 · @Thibulle
10 followers · 97 posts · Server ludosphere.fr

Ready to get steamrolled by the other streamers at the RTBF_Ixpé Opensub, with the incredible participation of Prof_Poncho 👌

👉Twitch.tv/thibulle

#stream #twitch #jeuxvideo #rl #belgium #Belgique

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
198 followers · 689 posts · Server sigmoid.social

Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies

openreview.net/forum?id=J3veZd

#offline #reinforcement #rl

Last updated 1 year ago

Todor Stoyanov · @todor
49 followers · 26 posts · Server sigmoid.social

Looking for a PhD position in ML and robotics? I have an open position focusing on learning full-body manipulation affordances. We will look into NeRFs and try both supervised and RL approaches for learning on a real mobile YuMi robot. To apply, please go through our online system here:
oru.se/english/career/availabl

Thanks for the re-toot 😊

#phd #ml #robotics #nerfs #rl

Last updated 1 year ago

Andreas Preissig · @bahntiger
1 followers · 2 posts · Server zug.network

Riding along on a few diversions today. First up, one of the last long-distance trains from […], today fittingly running as far as Esslingen...

#rl

Last updated 1 year ago

Aran Komatsuzaki · @aran
920 followers · 871 posts · Server sigmoid.social

Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX

arxiv.org/abs/2306.09884

RT @instadeepai@twitter.com

1/ Exciting news! Our team has just released a major update of Jumanji, our suite of diverse and challenging environments written in JAX 🔥 Check it out now and take your research to the next level 🚀

⭐ Github: tinyurl.com/code-jumanji
📚 Doc: tinyurl.com/doc-jumanji

🐦🔗: twitter.com/instadeepai/status

#rl #jax

Last updated 1 year ago

Brandon Rohrer · @brohrer
1230 followers · 1055 posts · Server recsys.social

Does one algorithm perform better than another? It is notoriously hard to answer this question well, even when your intentions are good. The authors here give recommendations that are clear, sensible, and practicable.

To me, this paper is the pinnacle of scholarship. Statistics-savvy researchers extend a hand to those that are less so, without talking down or getting aggressively technical. This is the work of bridge builders.

arxiv.org/pdf/2304.01315.pdf

#reinforcementlearning #rl

Last updated 1 year ago

652349 · @652349
126 followers · 4020 posts · Server ruhr.social

I may not be Robby W., but I still enjoy entertaining

#rl

Last updated 2 years ago

Ben Waber · @bwaber
534 followers · 1588 posts · Server hci.social

Next was a great talk by Anne Collins on bridging cognition, neuroscience, and computation in RL at the Learning Salon. After some bombastic claims that "RL is all you need" to explain cognition, Collins and the broader group dissect what's missing from this picture youtube.com/watch?v=YLbZh-bH8V (3/9)

#cognition #neuroscience #rl #reinforcementlearning

Last updated 2 years ago

5h15h · @shish
95 followers · 616 posts · Server techhub.social

OpenAI uses reinforcement learning from human feedback (RLHF), an established technique, to enhance the safety, usefulness, and alignment of its models openai.com/research/instructio

#openai #reinforcementlearning #rlhf #AI #genai #generativeAI #chatgpt #gpt #rl

Last updated 2 years ago

hlfshell · @hlfshell
40 followers · 132 posts · Server hachyderm.io

Work in progress reinforcement learning project. One block, no "blocker bar" blocking the goal. Each colored zone awards points for a shape being pushed into it, but a specific shape gets extra points in particular zones.

All trained w/ PPO, about 12 million timesteps.

#robotics #reinforcementlearning #deeplearning #rl

Last updated 2 years ago