Published papers at TMLR · @tmlrpub
564 followers · 594 posts · Server sigmoid.social

An Option-Dependent Analysis of Regret Minimization Algorithms in Finite-Horizon Semi-MDP

Gianluca Drappo, Alberto Maria Metelli, Marcello Restelli

Action editor: Matthieu Geist.

openreview.net/forum?id=VP9p4u

#reinforcement #planning #regret

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
206 followers · 764 posts · Server sigmoid.social

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

openreview.net/forum?id=m7p5O7

#reward #reinforcement #generative

Last updated 1 year ago

Published papers at TMLR · @tmlrpub
565 followers · 587 posts · Server sigmoid.social

Reinforcement Learning with Delayed, Composite, and Partially Anonymous Reward

Washim Uddin Mondal, Vaneet Aggarwal

Action editor: Jiantao Jiao.

openreview.net/forum?id=ubCoTA

#reinforcement #reward #Rewards

Last updated 1 year ago

WIST Quotations · @WISTquote
78 followers · 1177 posts · Server zirk.us

A quotation from Joubert, Joseph:

«
To teach is to learn twice.

[Enseigner, c’est apprendre deux fois.]
»

Full quote, sourcing, notes:
wist.info/joubert-joseph/2194/

#quote #quotes #quotation #education #learning #reinforcement #teaching

Last updated 1 year ago

Published papers at TMLR · @tmlrpub
564 followers · 573 posts · Server sigmoid.social

One-Step Distributional Reinforcement Learning

Mastane Achab, Reda Alami, Yasser Abdelaziz Dahou Djilali, Kirill Fedyanin, Eric Moulines

Action editor: Marc Lanctot.

openreview.net/forum?id=ZPMf53

#reinforcement #distributional #Agent

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
205 followers · 743 posts · Server sigmoid.social

RLTF: Reinforcement Learning from Unit Test Feedback

openreview.net/forum?id=hjYmsV

#rl #code #reinforcement

Last updated 1 year ago

Published papers at TMLR · @tmlrpub
564 followers · 573 posts · Server sigmoid.social

Cyclophobic Reinforcement Learning

Stefan Sylvius Wagner, Peter Arndt, Jan Robine, Stefan Harmeling

Action editor: Josh Merel.

openreview.net/forum?id=83rgSF

#reinforcement #exploration #reward

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
203 followers · 734 posts · Server sigmoid.social

Population-based Evaluation in Repeated Rock-Paper-Scissors as a Benchmark for Multiagent Reinforcement Learning

openreview.net/forum?id=gQnJ7O

#agents #Agent #reinforcement

Last updated 1 year ago

JMLR · @jmlr
704 followers · 294 posts · Server sigmoid.social

'Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity', by Kaiqing Zhang, Sham M. Kakade, Tamer Basar, Lin F. Yang.

jmlr.org/papers/v24/20-1131.ht

#complexity #reinforcement #planning

Last updated 1 year ago

Published papers at TMLR · @tmlrpub
557 followers · 557 posts · Server sigmoid.social

Using Confounded Data in Latent Model-Based Reinforcement Learning

Maxime Gasse, Damien Grasset, Guillaume Gaudron, Pierre-Yves Oudeyer

Action editor: Martha White.

openreview.net/forum?id=nFWRuJ

#causal #causality #reinforcement

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
200 followers · 713 posts · Server sigmoid.social

Detecting danger in gridworlds using Gromov’s Link Condition

openreview.net/forum?id=t4p612

#gridworlds #gridworld #reinforcement

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
199 followers · 704 posts · Server sigmoid.social

The Multiquadric Kernel for Moment-Matching Distributional Reinforcement Learning

openreview.net/forum?id=z49eaB

#distributional #reinforcement #distributions

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
199 followers · 699 posts · Server sigmoid.social

SkillS: Adaptive Skill Sequencing for Efficient Temporally-Extended Exploration

openreview.net/forum?id=JwGKVp

#learns #reinforcement #exploration

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
198 followers · 689 posts · Server sigmoid.social

Bridging the Gap Between Offline and Online Reinforcement Learning Evaluation Methodologies

openreview.net/forum?id=J3veZd

#offline #reinforcement #rl

Last updated 1 year ago

New Submissions to TMLR · @tmlrsub
198 followers · 681 posts · Server sigmoid.social

Offline Reinforcement Learning with Additional Covering Distributions

openreview.net/forum?id=AfXq3x

#coverage #sampling #reinforcement

Last updated 1 year ago