
BUILDER'S SANDBOX

Build This Paper

Use an AI coding agent to implement this research.

OpenAI Codex (AI Agent)

Lightweight coding agent in your terminal.

Claude Code (AI Agent)

Agentic coding tool for terminal workflows.

AntiGravity IDE (Scaffolding)

AI agent mindset installer and workflow scaffolder.

Cursor (IDE)

AI-first code editor built on VS Code.

VS Code (IDE)

Free, open-source editor by Microsoft.

Estimated cost: $9K-$13K over 6-10 weeks.

See exactly what it costs to build this, benchmarked against 3 comparable funded startups.

7-day free trial. Cancel anytime.

Discover the researchers behind this paper and find similar experts.



Founder's Pitch

"Improve offline RL performance using flexible $f$-divergence constraints for better dataset adaptation."

Reinforcement Learning · Score: 5
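The pitch above hinges on making the divergence constraint in offline RL flexible rather than fixed. A minimal sketch of that idea, assuming the standard sample-based form D_f(d_pi || d_D) = E_{d_D}[f(w)] over importance ratios w = d_pi/d_D; the function names and the particular f choices here are illustrative assumptions, not the paper's implementation:

```python
# Hedged sketch: a "flexible" f-divergence penalty where the convex
# generator f (with f(1) = 0) is swappable per dataset. This is an
# illustrative toy, not the paper's actual method.
import numpy as np

F_GENERATORS = {
    "kl": lambda w: w * np.log(w),           # KL generator: w log w
    "chi2": lambda w: 0.5 * (w - 1.0) ** 2,  # chi-square generator
}

def f_divergence_penalty(weights, kind="chi2"):
    """Estimate D_f(d_pi || d_D) from samples of w = d_pi/d_D drawn
    under the dataset distribution d_D."""
    w = np.asarray(weights, dtype=float)
    return float(np.mean(F_GENERATORS[kind](w)))

# A policy that matches the dataset (w == 1 everywhere) pays no penalty,
# under either choice of f.
print(f_divergence_penalty(np.ones(4), "chi2"))  # → 0.0
print(f_divergence_penalty(np.ones(4), "kl"))    # → 0.0
```

Swapping the generator changes how harshly out-of-distribution actions (large w) are penalized, which is the lever the pitch proposes tuning per dataset.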

Commercial Viability Breakdown

Scores on a 0-10 scale:

High Potential: 2.5 (1/4 signals)
Quick Build: 7.5 (3/4 signals)
Series A Potential: 2.5 (1/4 signals)

Sources used for this analysis

arXiv Paper

Full-text PDF analysis of the research paper

GitHub Repository

Code availability, stars, and contributor activity

Citation Network

Semantic Scholar citations and co-citation patterns

Community Predictions

Crowd-sourced unicorn probability assessments

Analysis model: GPT-4o · Last scored: 2/11/2026

Explore the full citation network and related research.


Understand the commercial significance and market impact.


Get detailed profiles of the research team.
