ScienceToStartup

113 Cherry St #92768

Seattle, WA 98104-2205

Copyright © 2026 ScienceToStartup. All rights reserved.

Papers: 250

With code: 195

Suggested Build: 152

Suggested Watch: 26

Preview from your Build/Watch decisions. Set up Scout for daily delivery.

HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models

Morning brief

High conviction build candidate

PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference

Morning brief

High conviction build candidate

PICon: A Multi-Turn Interrogation Framework for Evaluating Persona Agent Consistency

48h review

Needs sharper wedge before committing

Saved thesis

Find deployable AI papers with public code, a proof pass, and a wedge that can ship inside 6 weeks.

Novelty / saturation by cluster

Uses the current paper cohort to show whether a lane looks crowded or sparse, with named comparable papers from the same slice.

  • Computer Vision

    MegaFlow: Zero-Shot Large Displacement Optical Flow · AnyHand: A Large-Scale Synthetic Dataset for RGB(-D) Hand Pose Estimation

    12

    Crowded

  • Medical AI

    Longitudinal Digital Phenotyping for Early Cognitive-Motor Screening · DeepFAN, a transformer-based deep learning model for human-artificial intelligence collaborative assessment of incidental pulmonary nodules in CT scans: a multi-reader, multi-case trial

    12

    Crowded

  • Generative Video

    ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling · RefAlign: Representation Alignment for Reference-to-Video Generation

    6

    Balanced

  • Agents

    Natural-Language Agent Harnesses · From Intent to Evidence: A Categorical Approach for Structural Evaluation of Deep Research Agents

    4

    Balanced

  • Vision-Language Models

    No Hard Negatives Required: Concept Centric Learning Leads to Compositionality without Degrading Zero-shot Capabilities of Contrastive Models · CLIP-RD: Relational Distillation for Efficient CLIP Knowledge Distillation

    4

    Balanced

  • Federated Learning

    Social Hippocampus Memory Learning · Supercharging Federated Intelligence Retrieval

    4

    Balanced

  • Multimodal AI

    Multimodal Dataset Distillation via Phased Teacher Models · Probabilistic Concept Graph Reasoning for Multimodal Misinformation Detection

    4

    Balanced

  • LLM Training

    Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model · Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes

    4

    Balanced

  • LLM Evaluation

    Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory · Measuring What Matters -- or What's Convenient?: Robustness of LLM-Based Scoring Systems to Construct-Irrelevant Factors

    4

    Balanced

  • Robotics and Automation

    LaMP: Learning Vision-Language-Action Policies with 3D Scene Flow as Latent Motion Prior · LILAC: Language-Conditioned Object-Centric Optical Flow for Open-Loop Trajectory Generation

    3

    Rarer lane

  • LLM Security

    Prompt Attack Detection with LLM-as-a-Judge and Mixture-of-Models · Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation

    3

    Rarer lane

  • Multimodal LLMs

    Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs · Visual Attention Drifts, but Anchors Hold: Mitigating Hallucination in Multimodal Large Language Models via Cross-Layer Visual Anchors

    3

    Rarer lane
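
The page does not say how the Crowded / Balanced / Rarer lane labels are assigned. As a minimal sketch, assuming the label depends only on the number of papers in a cluster, the mapping could look like the following. The function name `saturation_label` and both cutoffs are hypothetical; the cutoffs were picked only because they agree with the counts listed above (3 is a rarer lane, 4 and 6 are balanced, 12 is crowded), and the real boundary for "Crowded" could sit anywhere between 7 and 12.

```python
# Hypothetical cutoffs, chosen only to match the counts shown on this page.
RARE_MAX = 3       # assumption: 3 or fewer papers -> "Rarer lane"
CROWDED_MIN = 12   # assumption: 12 or more papers -> "Crowded"


def saturation_label(paper_count: int) -> str:
    """Map a cluster's paper count to a saturation label (illustrative only)."""
    if paper_count < 0:
        raise ValueError("paper_count must be non-negative")
    if paper_count <= RARE_MAX:
        return "Rarer lane"
    if paper_count >= CROWDED_MIN:
        return "Crowded"
    return "Balanced"


# Counts copied from the cluster list above.
clusters = {
    "Computer Vision": 12,
    "Medical AI": 12,
    "Generative Video": 6,
    "Agents": 4,
    "Robotics and Automation": 3,
    "LLM Security": 3,
}

labels = {name: saturation_label(count) for name, count in clusters.items()}

for name, label in labels.items():
    print(f"{name}: {clusters[name]} papers, {label}")
```

Under these assumed cutoffs the sketch reproduces the labels shown for every cluster on this page, but it says nothing about how the site weighs the named comparable papers.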

HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models

3D Spatial Intelligence · 2026-03-26 · Build Now · No Code
Commercial: 100
Deployability: n/a
Reproducibility: 0
Novelty: 100

No dossier data.