Public Dataset

Schema, update cadence, and download links. Exports are generated daily by our pipeline.

Schema

  • arxiv_id — string
  • title — string
  • abstract — string
  • published_date — ISO 8601 date string
  • viability_score — number (1–10)
  • cluster_label — string (research field)
  • has_code — boolean (true when a repo_url is present)
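
As a rough illustration of the schema above, here is a minimal Python sketch that parses one record into a typed object and checks the documented constraints. It assumes each record arrives as a JSON-style object keyed by the field names listed above and that published_date uses the YYYY-MM-DD form; neither is confirmed here. The names PaperRecord and parse_record are hypothetical and not part of any published client.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any


@dataclass(frozen=True)
class PaperRecord:
    """One dataset row, mirroring the documented schema (hypothetical type)."""
    arxiv_id: str
    title: str
    abstract: str
    published_date: date
    viability_score: float
    cluster_label: str
    has_code: bool


def parse_record(raw: dict[str, Any]) -> PaperRecord:
    """Convert a raw dict into a PaperRecord, raising ValueError on bad values."""
    score = raw["viability_score"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("viability_score must be a number")
    if not 1 <= score <= 10:
        raise ValueError("viability_score must be between 1 and 10")
    if not isinstance(raw["has_code"], bool):
        raise ValueError("has_code must be a boolean")
    return PaperRecord(
        arxiv_id=str(raw["arxiv_id"]),
        title=str(raw["title"]),
        abstract=str(raw["abstract"]),
        # Assumption: dates look like "2024-01-15".
        published_date=date.fromisoformat(raw["published_date"]),
        viability_score=float(score),
        cluster_label=str(raw["cluster_label"]),
        has_code=raw["has_code"],
    )


if __name__ == "__main__":
    # Placeholder values for illustration only, not real dataset content.
    sample = {
        "arxiv_id": "0000.00000",
        "title": "Placeholder title",
        "abstract": "Placeholder abstract.",
        "published_date": "2024-01-15",
        "viability_score": 7,
        "cluster_label": "placeholder-field",
        "has_code": True,
    }
    print(parse_record(sample))
```

Validating at the boundary like this keeps range and type errors out of downstream analysis; whether the actual exports can contain missing or null fields is not stated here, so the sketch treats every field as required.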

Data is served on demand from the API; no static file is committed. License: CC BY 4.0 (attribution required).