On-policy reverse KL divergence

On-policy reverse KL divergence is a research topic tracked in AI research papers.
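
As a rough illustration, assuming the term carries its usual meaning: the reverse KL divergence KL(policy || reference) is an expectation under the policy itself, so it can be estimated from samples the policy generates ("on-policy"). The sketch below compares the exact value with such a sample-based estimate for two small categorical distributions. The function names, the `reference` distribution, and the numbers are hypothetical and are not taken from any particular paper.

```python
import math
import random


def reverse_kl_exact(policy, reference):
    """Exact KL(policy || reference) for two categorical distributions."""
    return sum(p * math.log(p / r) for p, r in zip(policy, reference) if p > 0)


def reverse_kl_on_policy(policy, reference, num_samples=20000, seed=0):
    """Monte Carlo estimate of KL(policy || reference).

    Samples are drawn from `policy` itself (the on-policy part), and the
    log-ratio log(policy(x) / reference(x)) is averaged over those samples.
    """
    rng = random.Random(seed)
    outcomes = list(range(len(policy)))
    samples = rng.choices(outcomes, weights=policy, k=num_samples)
    return sum(math.log(policy[x] / reference[x]) for x in samples) / num_samples


# Hypothetical distributions over three outcomes (illustrative values only).
policy = [0.7, 0.2, 0.1]
reference = [0.5, 0.3, 0.2]

exact = reverse_kl_exact(policy, reference)
estimate = reverse_kl_on_policy(policy, reference)
print(f"exact reverse KL:   {exact:.4f}")
print(f"on-policy estimate: {estimate:.4f}")
```

The estimate needs only samples from the policy plus both distributions' probabilities for those samples, which is why this form is convenient when the policy is the model being trained.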