3 papers - avg viability 2.3
Formalizes agent-bounded indistinguishability to create canonical abstractions for capacity-limited observers in POMDPs.
Develops theoretically optimal regret bounds for infinite-horizon reinforcement learning problems.
Develops advanced offline RL algorithms with extended theoretical guarantees for parameterized policies in large action spaces.