'q-Learning in Continuous Time', by Yanwei Jia, Xun Yu Zhou.
http://jmlr.org/papers/v24/22-0755.html
#reinforcement #martingale #critic
If you play an infinite number of rounds you'll have infinite payoff (#probability theory #ExpectedValue)
But if you pay $1k to play each round, your real-world expected value is negative rather than the #infinity $ that the math says it is.
Because...
As with the #Martingale betting strategy, you don't have an infinite bankroll...
Nor infinite time.
As you keep tossing tails, you have exponential payoff growth, BUT you also have linear cost growth
And you can't buy time with $
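A rough sketch of the finite-bankroll point, under assumptions the note doesn't spell out: a St. Petersburg-style game where the pot starts at $2, doubles on every tail, and pays out on the first head. The names `capped_expected_payout`, `ENTRY_FEE` and `BANKROLL_CAP` are made up for illustration, and the cap of 2^30 dollars (about $1.07 billion) is an arbitrary stand-in for "whoever pays you is not infinitely rich".

```python
# Assumed rules (not stated in the note): pot starts at $2, doubles on each
# tail, and is paid out when the first head shows up on toss k (prob. 1/2**k).
ENTRY_FEE = 1_000.0        # the $1k per game from the note
BANKROLL_CAP = 2.0 ** 30   # hypothetical maximum anyone can actually pay out


def capped_expected_payout(cap: float, max_tosses: int = 200) -> float:
    """Expected payout of one game when the payout can never exceed `cap`.

    The infinite sum is truncated at `max_tosses`; the neglected tail is at
    most cap / 2**max_tosses, which is negligible for the values used here.
    """
    total = 0.0
    for k in range(1, max_tosses + 1):
        prob = 0.5 ** k               # first head lands on toss k
        payout = min(2.0 ** k, cap)   # exponential growth, clipped by the cap
        total += prob * payout
    return total


expected_payout = capped_expected_payout(BANKROLL_CAP)
net_per_game = expected_payout - ENTRY_FEE

print(f"expected payout per game: ${expected_payout:.2f}")
print(f"net expected value per game: ${net_per_game:.2f}")
```

With these assumed numbers the capped expectation comes out at about $31 per game, so every $1k entry loses roughly $969 on average. Uncapped, each possible toss adds $1 to the expectation forever; with the cap, that stops after about 30 tosses.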
#martingale #infinity #ExpectedValue #Probability