A Multidisciplinary AI Board for Multimodal Dementia Characterization and Risk Assessment Build Now
An interactive multi-agent AI system that synthesizes patient data from EHR, notes, and imaging to provide clinicians with enhanced dementia characterization and risk assessment.
Medical AI Mar 23 Code High viability
SHARP: Spectrum-aware Highly-dynamic Adaptation for Resolution Promotion in Remote Sensing Synthesis Build Now
A novel training-free method for high-resolution remote sensing image synthesis that dynamically adapts positional embeddings during denoising, outperforming existing baselines.
Generative Image Mar 23 Pending High viability
Extending Precipitation Nowcasting Horizons via Spectral Fusion of Radar Observations and Foundation Model Priors Build Now
A novel frequency-domain fusion framework that integrates radar observations with weather foundation model forecasts to significantly extend precipitation nowcasting horizons.
Weather Forecasting Mar 23 Pending High viability
RefracGS: Novel View Synthesis Through Refractive Water Surfaces with 3D Gaussian Ray Tracing Build Now
A novel framework for high-fidelity novel view synthesis through refractive water surfaces by jointly modeling the water surface and the scene beneath using 3D Gaussian ray tracing.
Novel View Synthesis Mar 23 Code High viability
EnterpriseLab: A Full-Stack Platform for developing and deploying agents in Enterprises Build Now
EnterpriseLab is a full-stack platform enabling enterprises to develop and deploy specialized, cost-effective AI agents that match frontier model performance while ensuring data sovereignty.
Agents Mar 23 Code High viability
WorldCache: Content-Aware Caching for Accelerated Video World Models Build Now
WorldCache accelerates video generation by intelligently reusing intermediate model features, achieving significant speedups with minimal quality loss.
Video Generation Mar 23 Code High viability
VideoDetective: Clue Hunting via both Extrinsic Query and Intrinsic Relevance for Long Video Understanding Build Now
A framework for precise clue hunting in long videos by integrating query relevance with intrinsic video structure, improving MLLM performance.
Video Understanding Mar 23 Code High viability
End-to-End Training for Unified Tokenization and Latent Denoising Build Now
A unified training approach for latent diffusion models that simplifies the training pipeline and improves performance across modalities.
Generative Models Mar 23 Pending High viability
ThinkJEPA: Empowering Latent World Models with Large Vision-Language Reasoning Model Build Now
A VLM-guided latent world model framework that combines dense-frame dynamics with long-horizon semantic guidance for improved trajectory prediction.
World Models Mar 23 Code High viability
DualCoT-VLA: Visual-Linguistic Chain of Thought via Parallel Reasoning for Vision-Language-Action Models Build Now
DualCoT-VLA enhances robotic action planning by enabling parallel visual and linguistic reasoning for complex multi-step tasks, achieving state-of-the-art performance.
Vision-Language-Action Mar 23 Code High viability
3D-Layout-R1: Structured Reasoning for Language-Instructed Spatial Editing Build Now
A framework for precise, text-instructed 3D scene editing using scene-graph reasoning to improve spatial understanding and layout consistency.
Spatial Editing Mar 23 Code High viability
Scaling DoRA: High-Rank Adaptation via Factored Norms and Fused Kernels Build Now
Speeds up high-rank DoRA adaptation of large language models and reduces its memory usage through fused kernels and factored norm computations.
LLM Training Mar 23 Pending High viability
Repurposing Geometric Foundation Models for Multi-view Diffusion Build Now
Repurpose geometric foundation models to accelerate and improve multi-view image generation for novel view synthesis.
Generative Vision Mar 23 Code High viability
DUO-VSR: Dual-Stream Distillation for One-Step Video Super-Resolution Build Now
A novel three-stage framework for one-step video super-resolution that significantly improves visual quality and efficiency by unifying distribution matching and adversarial supervision.
Video Super-Resolution Mar 23 Code High viability
GenOpticalFlow: A Generative Approach to Unsupervised Optical Flow Learning Build Now
A generative framework for unsupervised optical flow learning that synthesizes perfectly aligned frame-flow data, eliminating the need for human annotations and achieving state-of-the-art results.
Computer Vision Mar 23 Code High viability
UniDex: A Robot Foundation Suite for Universal Dexterous Hand Control from Egocentric Human Videos Build Now
A foundation suite for universal dexterous hand control using egocentric human videos, enabling cross-hand generalization and reducing reliance on costly robot demonstrations.
Robotics Mar 23 Code High viability
DexDrummer: In-Hand, Contact-Rich, and Long-Horizon Dexterous Robot Drumming Build Now
A hierarchical policy for dexterous robotic drumming, trained in simulation with sim-to-real transfer, demonstrating complex in-hand, contact-rich, and long-horizon manipulation.
Robotics Mar 23 Code High viability
EgoGroups: A Benchmark For Detecting Social Groups of People in the Wild Build Now
A new dataset and benchmark for detecting social groups in real-world, first-person video, enabling more socially intelligent AI agents.
Computer Vision Mar 23 Code High viability
MemDLM: Memory-Enhanced DLM Training Build Now
MemDLM enhances Diffusion Language Models by embedding a simulated denoising process into training, leading to faster convergence, lower loss, and emergent in-weight retrieval capabilities for improved long-context understanding.
LLM Training Mar 23 Pending High viability
One Model, Two Markets: Bid-Aware Generative Recommendation Build Now
A generative recommendation system that optimizes for both user engagement and ad revenue by integrating bid awareness directly into the generation process.
Recommendation Systems Mar 23 Code High viability
Riverine Land Cover Mapping through Semantic Segmentation of Multispectral Point Clouds Build Now
Leveraging transformer-based semantic segmentation of multispectral point clouds for precise riverine land cover mapping and environmental monitoring.
Geospatial AI Mar 23 Code High viability
Benchmarking Deep Learning Models for Aerial LiDAR Point Cloud Semantic Segmentation under Real Acquisition Conditions: A Case Study in Navarre Build Now
Benchmarking state-of-the-art deep learning models for aerial LiDAR point cloud semantic segmentation to identify optimal solutions for real-world acquisition conditions.
3D Semantic Segmentation Mar 23 Code High viability
SpatialReward: Verifiable Spatial Reward Modeling for Fine-Grained Spatial Consistency in Text-to-Image Generation Build Now
A verifiable reward model that enforces fine-grained spatial consistency in text-to-image generation, improving accuracy and controllability.
Text-to-Image Generation Mar 23 Code High viability
Dyadic: A Scalable Platform for Human-Human and Human-AI Conversation Research Watch
Dyadic is a no-code web platform enabling scalable research into human-human and human-AI conversations with multi-modal support and real-time monitoring.
Conversation Research Tools Mar 23 High viability
Noise Titration: Exact Distributional Benchmarking for Probabilistic Time Series Forecasting Build Now
A novel benchmarking framework for probabilistic time series forecasting that rigorously evaluates model robustness to non-stationarity and noise; under it, a specialized probabilistic generative architecture outperforms foundation models.
Time Series Forecasting Mar 23 Code High viability
Gumbel Distillation for Parallel Text Generation Build Now
A novel distillation technique that significantly improves the generation quality of parallel language models, enabling faster and more efficient text generation.
LLM Training Mar 23 Pending High viability
Evaluating the Reliability and Fidelity of Automated Judgment Systems of Large Language Models Build Now
Automate LLM quality and security assessments using LLMs as judges, achieving high correlation with human evaluations.
LLM Evaluation Mar 23 Code High viability
SPA: A Simple but Tough-to-Beat Baseline for Knowledge Injection Build Now
A simple, prompt-engineered data augmentation method that significantly improves LLM knowledge in specialized domains, outperforming complex baselines.
LLM Knowledge Injection Mar 23 Pending High viability
Omni-WorldBench: Towards a Comprehensive Interaction-Centric Evaluation for World Models Build Now
A new benchmark and evaluation framework to assess the interactive response capabilities of 4D world models, addressing a critical gap in current AI research.
World Models Mar 23 Code High viability
Make Tracking Easy: Neural Motion Retargeting for Humanoid Whole-body Control Build Now
A neural framework that retargets human motion to humanoid robots, eliminating physical artifacts and accelerating control policy training.
Robotics Control Mar 23 Code High viability
Mixture of Mini Experts: Overcoming the Linear Layer Bottleneck in Multiple Instance Learning Build Now
A parameter-efficient module that significantly boosts the performance of computational pathology AI models by optimizing the transformation of patch features.
Computational Pathology AI Mar 23 Pending High viability
PAM: A Pose-Appearance-Motion Engine for Sim-to-Real HOI Video Generation Build Now
PAM is a unified engine for generating realistic hand-object interaction videos, improving existing methods and enabling sim-to-real applications.
Generative Video Mar 23 Pending High viability
Seeing is Improving: Visual Feedback for Iterative Text Layout Refinement Build Now
A framework that uses visual feedback to iteratively refine text layouts, ensuring aesthetic and readable designs.
Layout Generation Mar 23 Pending High viability
Revisiting Quantum Code Generation: Where Should Domain Knowledge Live? Build Now
Leverage advanced LLMs with RAG and agentic feedback to significantly improve quantum code generation, outperforming specialized models without costly fine-tuning.
Code Generation Mar 23 Code High viability
Cross-Modal Reinforcement Learning for Navigation with Degraded Depth Measurements Build Now
A cross-modal learning framework enhances robot navigation by inferring depth from grayscale images when depth sensors fail, ensuring robust performance in challenging environments.
Robotics Navigation Mar 23 Code High viability
Feasibility of Augmented Reality-Guided Robotic Ultrasound with Cone-Beam CT Integration for Spine Procedures Build Now
An augmented reality system that guides robotic ultrasound for spine procedures, improving accuracy and efficiency.
Medical AI Mar 23 Code High viability
Closed-Loop Verbal Reinforcement Learning for Task-Level Robotic Planning Watch
A closed-loop verbal reinforcement learning framework for interpretable and adaptive task-level robotic planning, leveraging LLMs and VLMs for symbolic policy refinement.
Robotics Planning Mar 23 High viability
ACPO: Counteracting Likelihood Displacement in Vision-Language Alignment with Asymmetric Constraints Build Now
ACPO is a novel alignment mechanism for vision-language models that prevents hallucinations by asymmetrically constraining preference optimization, leading to improved performance on benchmark tasks.
Vision-Language Alignment Mar 23 Code High viability
Causal Evidence that Language Models use Confidence to Drive Behavior Build Now
This research demonstrates that LLMs can be trained to actively use internal confidence signals to regulate their behavior, paving the way for more reliable and autonomous AI agents.
LLM Behavior Mar 23 Code High viability
Beyond Matching to Tiles: Bridging Unaligned Aerial and Satellite Views for Vision-Only UAV Navigation Build Now
A vision-only UAV navigation system that accurately predicts location and heading from aerial and satellite imagery, enabling GNSS-denied operation.
UAV Navigation Mar 23 Code High viability
OpenEarth-Agent: From Tool Calling to Tool Creation for Open-Environment Earth Observation Build Now
An agent framework that creates its own tools for Earth Observation tasks, outperforming specialized agents with fewer resources.
Agents Mar 23 Code High viability
ROBOGATE: Adaptive Failure Discovery for Safe Robot Policy Deployment via Two-Stage Boundary-Focused Sampling Build Now
A framework for adaptive failure discovery in robot policy deployment, enabling efficient risk management and safe industrial integration.
Robotics Safety Mar 23 Pending High viability
Biophysics-Enhanced Neural Representations for Patient-Specific Respiratory Motion Modeling Build Now
Develop patient-specific respiratory motion models using physics-regularized implicit neural representations for improved radiotherapy precision.
Medical AI Mar 23 Code High viability
Mamba-VMR: Multimodal Query Augmentation via Generated Videos for Precise Temporal Grounding Build Now
Generate short videos from text queries to precisely ground moments in long videos, improving retrieval accuracy and efficiency.
Multimodal Retrieval Mar 23 Code High viability
On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation Build Now
This research proposes a novel method to analyze and improve LLM reasoning by focusing on the direction of reinforcement learning updates, enabling test-time extrapolation and more efficient training.
LLM Reasoning Mar 23 Code High viability
Multiperspectivity as a Resource for Narrative Similarity Prediction Build Now
Leveraging diverse LLM personas to predict narrative similarity by embracing interpretive plurality, outperforming single-ground-truth approaches.
LLM Applications Mar 23 Code High viability
FreeArtGS: Articulated Gaussian Splatting Under Free-moving Scenario Build Now
Develop a real-time 3D motion capture tool that uses Articulated Gaussian Splatting for free-moving scenarios.
Computer Graphics / Motion Capture Mar 23 Code High viability
GSEM: Graph-based Self-Evolving Memory for Experience Augmented Clinical Reasoning Build Now
A graph-based memory system that enhances clinical reasoning agents by organizing and reusing past experiences, significantly improving accuracy.
Medical AI Mar 23 Pending High viability
Principled Steering via Null-space Projection for Jailbreak Defense in Vision-Language Models Build Now
A principled defense framework for vision-language models that enhances safety against jailbreak attacks without degrading performance on benign inputs.
Vision-Language Models Mar 23 Code High viability
P-Flow: Prompting Visual Effects Generation Build Now
P-Flow simplifies visual effects generation using cutting-edge prompting techniques.
Visual Effects & Media Production Mar 23 Pending High viability
A Context Engineering Framework for Improving Enterprise AI Agents based on Digital-Twin MDP Build Now
A framework for improving enterprise AI agents using offline reinforcement learning and digital twins to overcome data limitations and enhance reasoning.
Agents Mar 23 Code High viability
Do World Action Models Generalize Better than VLAs? A Robustness Study Build Now
This research compares world action models and vision-language-action models for robot control, demonstrating superior robustness of world action models in challenging scenarios and providing insights for future development.
Robotics Mar 23 Code High viability
Autoregressive vs. Masked Diffusion Language Models: A Controlled Comparison Build Now
This research provides a controlled comparison of autoregressive and masked diffusion language models, releasing code and checkpoints to enable the development of more diverse and fluent text generation systems.
LLM Training Mar 23 Pending High viability
MIHT: A Hoeffding Tree for Time Series Classification using Multiple Instance Learning Build Now
A novel, interpretable algorithm for classifying complex time series data that significantly outperforms existing state-of-the-art models.
Time Series Classification Mar 23 Code High viability
Adapting Point Cloud Analysis via Multimodal Bayesian Distribution Learning Build Now
A Bayesian framework for adapting 3D vision-language models to domain shifts using online distribution learning.
3D Vision-Language Mar 23 Code High viability
SpatialBoost: Enhancing Visual Representation through Language-Guided Reasoning Build Now
Enhance existing vision models with 3D spatial understanding by leveraging language-guided reasoning and LLMs.
Vision-Language Models Mar 23 Code High viability
MineRobot: A Unified Framework for Kinematics Modeling and Solving of Underground Mining Robots in Virtual Environments Build Now
A unified framework for modeling and solving the kinematics of underground mining robots in virtual environments, enabling real-time performance and robustness for training, planning, and digital-twin applications.
Robotics Mar 23 Code High viability
FontCrafter: High-Fidelity Element-Driven Artistic Font Creation with Visual In-Context Generation Build Now
FontCrafter enables high-fidelity artistic font creation by using visual elements as style references and offering fine-grained control over glyph shape and texture.
Generative Art Mar 23 Code High viability
AnimalCLAP: Taxonomy-Aware Language-Audio Pretraining for Species Recognition and Trait Inference Build Now
A taxonomy-aware language-audio model and dataset for species recognition and trait inference from animal vocalizations, outperforming existing methods.
Audio AI Mar 23 Code High viability
MAGPI: Multifidelity-Augmented Gaussian Process Inputs for Surrogate Modeling from Scarce Data Build Now
Develops a novel multifidelity approach for Gaussian Process Regression to create more accurate and cost-effective surrogate models from scarce high-fidelity data, augmented by cheaper low-fidelity data.
Surrogate Modeling Mar 23 Code High viability
DTVI: Dual-Stage Textual and Visual Intervention for Safe Text-to-Image Generation Build Now
A dual-stage defense framework for safe text-to-image generation that intervenes at both textual and visual stages to capture and attenuate unsafe content.
Generative AI Safety Mar 23 Code High viability
GTSR: Subsurface Scattering Awared 3D Gaussians for Translucent Surface Reconstruction Build Now
Reconstruct translucent 3D objects from images with a novel Gaussian-based pipeline that models subsurface scattering for improved detail and real-time rendering.
3D Reconstruction Mar 23 Code High viability
Future-Interactions-Aware Trajectory Prediction via Braid Theory Build Now
Leveraging braid theory for more accurate multi-agent trajectory prediction in autonomous vehicles, improving safety and reducing computational overhead.
Autonomous Vehicles Mar 23 Code High viability
MEVIUS2: Practical Open-Source Quadruped Robot with Sheet Metal Welding and Multimodal Perception Build Now
An open-source, large-scale, and durable quadruped robot with multimodal perception, built using readily available sheet metal welding and machining.
Robotics Mar 23 Pending High viability
Tuning Real-World Image Restoration at Inference: A Test-Time Scaling Paradigm for Flow Matching Models Build Now
A novel framework for real-world image restoration that leverages test-time scaling with flow matching models to achieve state-of-the-art performance.
Image Restoration Mar 23 Code High viability
Do Papers Match Code? A Benchmark and Framework for Paper-Code Consistency Detection in Bioinformatics Software Build Now
A framework and benchmark for automatically detecting consistency between scientific papers and their code implementations, starting with bioinformatics.
AI Reproducibility Tools Mar 23 Code High viability
AdditiveLLM2: A Multi-modal Large Language Model for Additive Manufacturing Build Now
A specialized multi-modal LLM for additive manufacturing, achieving over 90% accuracy on domain-specific knowledge tasks.
LLM Specialization Mar 23 Code High viability
ROM: Real-time Overthinking Mitigation via Streaming Detection and Intervention Build Now
ROM is a lightweight, real-time system that mitigates overthinking in large language models, reducing response length and improving efficiency without retraining the backbone.
LLM Optimization Mar 23 Pending High viability
Retrieving Climate Change Disinformation by Narrative Build Now
A framework that detects emerging climate change disinformation narratives by reframing detection as a retrieval task, outperforming traditional methods on high-variance narratives.
Disinformation Detection Mar 23 Code High viability
VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models Build Now
VP-VLA decouples high-level reasoning from low-level execution in robotics using structured visual prompts, improving spatial precision and robustness.
Robotics Mar 23 Code High viability
SegMaFormer: A Hybrid State-Space and Transformer Model for Efficient Segmentation Build Now
A lightweight hybrid Mamba-Transformer model for efficient and high-performing 3D medical image segmentation, reducing parameters and FLOPs significantly.
Medical AI Mar 23 Code High viability
CRPS-Optimal Binning for Conformal Regression Build Now
A novel method for non-parametric conditional distribution estimation that provides narrower prediction intervals with guaranteed coverage, outperforming existing conformal methods on benchmarks.
Conformal Prediction Mar 23 Code High viability
STENet: Superpixel Token Enhancing Network for RGB-D Salient Object Detection Build Now
A novel network for RGB-D salient object detection that uses superpixels to enhance global and local feature extraction, outperforming state-of-the-art methods.
Computer Vision Mar 23 Pending High viability
GeoFusion-CAD: Structure-Aware Diffusion with Geometric State Space for Parametric 3D Design Build Now
A diffusion model that generates complex 3D CAD designs by understanding hierarchical structure, overcoming limitations of existing Transformer-based methods for long command sequences.
Generative 3D Design Mar 23 Code High viability
BOOST-RPF: Boosted Sequential Trees for Radial Power Flow Build Now
A novel AI method using boosted decision trees to achieve state-of-the-art, scalable, and robust power flow analysis for distribution systems.
Power Systems AI Mar 23 Code High viability
Demystifying Reinforcement Learning for Long-Horizon Tool-Using Agents: A Comprehensive Recipe Build Now
A systematic recipe for scaling Reinforcement Learning to build advanced LLM agents for complex, long-horizon tasks.
Agents Mar 23 Pending High viability
Parameter-Efficient Fine-Tuning for Medical Text Summarization: A Comparative Study of LoRA, Prompt Tuning, and Full Fine-Tuning Build Now
Achieve state-of-the-art medical text summarization with significantly reduced computational cost using parameter-efficient fine-tuning techniques like LoRA.
Medical AI Mar 23 Pending High viability
Unified Spatiotemporal Token Compression for Video-LLMs at Ultra-Low Retention Build Now
A plug-and-play module for Video-LLMs that drastically reduces computational costs by intelligently compressing visual tokens, enabling faster inference and lower memory usage without retraining.
Video LLMs Mar 23 Code High viability
Group3D: MLLM-Driven Semantic Grouping for Open-Vocabulary 3D Object Detection Build Now
A new approach to 3D object detection using open-vocabulary models for semantic grouping in diverse environments.
3D Object Detection Mar 23 Code High viability
GeoFlow: Real-Time Fine-Grained Cross-View Geolocalization via Iterative Flow Prediction Build Now
A real-time, lightweight geolocalization system that iteratively refines location hypotheses for safe autonomous navigation in GPS-denied areas.
Geolocalization Mar 23 Code High viability
SLURP-TN: Resource for Tunisian Dialect Spoken Language Understanding Build Now
A new dataset and baseline models for Tunisian dialect spoken language understanding to unlock task-oriented dialogue systems for under-resourced languages.
Spoken Language Understanding Mar 23 Code High viability
FeatDistill: A Feature Distillation Enhanced Multi-Expert Ensemble Framework for Robust AI-generated Image Detection Build Now
A robust AI-generated image detection framework using feature distillation and a multi-expert ensemble to combat deepfakes in real-world conditions.
AI-generated Image Detection Mar 23 Code High viability
MultiBind: A Benchmark for Attribute Misbinding in Multi-Subject Generation Build Now
A new benchmark and evaluation protocol to diagnose and improve attribute binding in multi-subject image generation, addressing a key failure mode in current generative models.
Generative Image Editing Mar 23 Code High viability
Chronological Contrastive Learning: Few-Shot Progression Assessment in Irreversible Diseases Build Now
Leveraging chronological patient imaging data with contrastive learning to dramatically reduce expert annotation needs for disease severity assessment.
Medical AI Mar 23 Pending High viability
Camera-Agnostic Pruning of 3D Gaussian Splats via Descriptor-Based Beta Evidence Build Now
A camera-agnostic method for pruning 3D Gaussian splats using descriptor-based evidence, enabling efficient storage and transmission of 3D scene data.
3D Reconstruction Mar 23 Code High viability
SatGeo-NeRF: Geometrically Regularized NeRF for Satellite Imagery Build Now
A geometrically regularized NeRF model that significantly reduces overfitting artifacts in satellite imagery for more accurate 3D reconstructions.
3D Reconstruction Mar 23 Code High viability
The Golden Subspace: Where Efficiency Meets Generalization in Continual Test-Time Adaptation Build Now
A novel method for efficient and generalized online adaptation of AI models to changing data distributions, enabling robust performance in real-world scenarios.
Continual Learning Mar 23 Pending High viability
Guideline-grounded retrieval-augmented generation for ophthalmic clinical decision support Watch
A multimodal RAG system for ophthalmology that retrieves and reasons over guideline images to improve clinical decision support, outperforming existing models on challenging cases.
Medical AI Mar 23 High viability
A Latent Representation Learning Framework for Hyperspectral Image Emulation in Remote Sensing Build Now
A latent representation learning framework for generating synthetic hyperspectral images, enabling faster and more accurate remote sensing data simulation and analysis.
Remote Sensing AI Mar 23 Code High viability
SHAPE: Structure-aware Hierarchical Unsupervised Domain Adaptation with Plausibility Evaluation for Medical Image Segmentation Build Now
A medical image segmentation framework that ensures global anatomical plausibility through structure-aware adaptation and hypergraph-based validation, outperforming existing methods.
Medical AI Mar 23 Pending High viability
CLEAR: Context-Aware Learning with End-to-End Mask-Free Inference for Adaptive Video Subtitle Removal Build Now
A mask-free AI framework for adaptive video subtitle removal that achieves end-to-end inference and strong zero-shot generalization across multiple languages.
Video Editing AI Mar 23 Code High viability
Ara-Best-RQ: Multi Dialectal Arabic SSL Build Now
A family of self-supervised learning models for multi-dialectal Arabic speech processing that achieves state-of-the-art performance on dialect identification.
Speech AI Mar 23 Code High viability
IGV-RRT: Prior-Real-Time Observation Fusion for Active Object Search in Changing Environments Build Now
A probabilistic planning framework for real-time object search in dynamic indoor environments, leveraging scene priors and vision-language models to improve efficiency and success rates.
Robotics Mar 23 Code High viability
ADaFuSE: Adaptive Diffusion-generated Image and Text Fusion for Interactive Text-to-Image Retrieval Build Now
A lightweight fusion model that significantly improves interactive text-to-image retrieval by adaptively combining multi-modal feedback, outperforming existing methods with minimal parameter increase.
Interactive Text-to-Image Retrieval Mar 23 Code High viability
Not All Layers Are Created Equal: Adaptive LoRA Ranks for Personalized Image Generation Build Now
Adaptive LoRA ranks for personalized image generation that balance performance and memory by dynamically adjusting layer importance.
Generative Image Mar 23 Pending High viability
Deep S2P: Integrating Learning Based Stereo Matching Into the Satellite Stereo Pipeline Build Now
Integrate advanced learning-based stereo matching into satellite imagery pipelines to generate more accurate and detailed digital surface models.
Computer Vision Mar 23 Code High viability
SmaAT-QMix-UNet: A Parameter-Efficient Vector-Quantized UNet for Precipitation Nowcasting Build Now
A parameter-efficient UNet model using vector quantization and mixed kernel convolutions for improved precipitation nowcasting, with publicly available code.
Weather AI Mar 23 Pending High viability
P^2O: Joint Policy and Prompt Optimization Build Now
Optimize LLM reasoning by jointly evolving prompts and policies to efficiently learn from challenging examples.
LLM Reasoning Mar 23 Code High viability
Thermal Topology Collapse: Universal Physical Patch Attacks on Infrared Vision Systems Build Now
Develops a universal physical patch attack for infrared vision systems that bypasses adversarial defenses with no online computation.
Adversarial Attacks Mar 23 Code High viability
Manifold-Aware Exploration for Reinforcement Learning in Video Generation Build Now
A novel reinforcement learning approach for video generation that stabilizes alignment by constraining exploration to the data manifold, achieving superior quality and reward maximization.
Generative Video Mar 23 Code High viability
Adversarial Camouflage Build Now
Develops an adversarial camouflage technique to protect user privacy against facial recognition systems by maximizing recognition error across multiple architectures.
Adversarial Attacks Mar 23 Code High viability
Adaptive Video Distillation: Mitigating Oversaturation and Temporal Collapse in Few-Step Generation Build Now
A novel distillation framework for efficient and high-fidelity few-step video generation, overcoming oversaturation and temporal collapse.
Generative Video Mar 23 Code High viability
Sim-to-Real of Humanoid Locomotion Policies via Joint Torque Space Perturbation Injection Build Now
A novel sim-to-real method for humanoid locomotion that injects state-dependent joint torque perturbations to improve policy robustness in real-world deployments.
Robotics Mar 23 Code High viability
Agentic Personas for Adaptive Scientific Explanations with Knowledge Graphs Build Now
Develop adaptive AI explanations for complex domains using agentic personas trained with significantly reduced feedback.
Explainable AI Mar 23 Code High viability
Select, Label, Evaluate: Active Testing in NLP Build Now
Reduce NLP model evaluation costs by up to 95% by intelligently selecting the most informative test samples for annotation.
NLP Evaluation Mar 23 Code High viability
Deriving Health Metrics from the Photoplethysmogram: Benchmarks and Insights from MIMIC-III-Ext-PPG Build Now
A comprehensive benchmark for PPG-based clinical prediction, establishing baselines for multi-class heart rhythm classification and physiological parameter estimation, with strong generalizability and insights into subgroup performance.
Medical AI Mar 23 Code High viability
CoRA: Boosting Time Series Foundation Models for Multivariate Forecasting through Correlation-aware Adapter Build Now
A lightweight adapter that significantly boosts multivariate time series forecasting performance by capturing complex channel correlations, designed for easy integration with existing foundation models.
Time Series Forecasting Mar 23 Code High viability
SteelDefectX: A Coarse-to-Fine Vision-Language Dataset and Benchmark for Generalizable Steel Surface Defect Detection Build Now
A vision-language dataset and benchmark for generalizable steel surface defect detection, enabling explainable and transferable AI models.
Industrial AI Mar 23 Pending High viability
Beyond Strict Pairing: Arbitrarily Paired Training for High-Performance Infrared and Visible Image Fusion Build Now
A framework for infrared and visible image fusion that drastically reduces data acquisition costs by enabling training on unaligned image pairs, achieving performance comparable to methods requiring 100x more data.
Image Fusion Mar 23 Pending High viability
Ctrl-A: Control-Driven Online Data Augmentation Build Now
Automated image data augmentation that dynamically adjusts augmentation strength using control theory, eliminating manual policy engineering for new vision tasks.
Computer Vision Augmentation Mar 23 Code High viability
Clinical Graph-Mediated Distillation for Unpaired MRI-to-CFI Hypertension Prediction Build Now
A framework that transfers hypertension prediction knowledge from expensive MRI scans to low-cost retinal fundus images using a clinical similarity graph, enabling better screening with unpaired data.
Medical AI Mar 23 Pending High viability
Cascade-Free Mandarin Visual Speech Recognition via Semantic-Guided Cross-Representation Alignment Build Now
A cascade-free Mandarin visual speech recognition system that improves accuracy and efficiency by jointly aligning multiple intermediate representations.
Speech Recognition Mar 23 Code High viability
Anatomical Token Uncertainty for Transformer-Guided Active MRI Acquisition Build Now
An active MRI acquisition framework using anatomical token uncertainty to significantly accelerate scans and improve image quality.
Medical AI Mar 23 Pending High viability
The Universal Normal Embedding Build Now
Unlock controllable image editing and semantic understanding by unifying generative models and vision encoders through a shared Gaussian latent space.
Generative Models Mar 23 Code High viability
Dynamic Exposure Burst Image Restoration Build Now
Dynamically predict optimal exposure times for burst photography to achieve state-of-the-art image restoration quality, validated on real-world camera systems.
Image Restoration Mar 23 Code High viability
Show Me What You Don't Know: Efficient Sampling from Invariant Sets for Model Validation Build Now
A training-free method to efficiently sample from model feature invariances using pretrained diffusion models, enabling rapid validation of model behavior.
Model Validation Mar 23 Code High viability
Memory-Efficient Boundary Map for Large-Scale Occupancy Grid Mapping Build Now
A novel memory-efficient boundary map representation for large-scale 3D occupancy grid mapping in robotics, with open-source code available.
Robotics AI Mar 23 Pending High viability
Let's Think with Images Efficiently! An Interleaved-Modal Chain-of-Thought Reasoning Framework with Dynamic and Precise Visual Thoughts Build Now
A framework for more efficient and effective multimodal reasoning by dynamically integrating and precisely representing visual information, significantly reducing token consumption.
Multimodal Reasoning Mar 23 Code High viability
EvoIdeator: Evolving Scientific Ideas through Checklist-Grounded Reinforcement Learning Build Now
A framework that uses checklist-grounded reinforcement learning to enable LLMs to systematically evolve and refine scientific ideas based on fine-grained feedback.
LLM Agents Mar 23 Code High viability
FISformer: Replacing Self-Attention with a Fuzzy Inference System in Transformer Models for Time Series Forecasting Build Now
FISformer replaces self-attention in Transformers with a fuzzy inference system for more accurate, robust, and interpretable time series forecasting.
Time Series Forecasting Mar 23 Code High viability
Can a Robot Walk the Robotic Dog: Triple-Zero Collaborative Navigation for Heterogeneous Multi-Agent Systems Build Now
A collaborative navigation framework for heterogeneous robots that requires no training or simulation, enabling real-world deployment of robot cooperation.
Robotics Mar 23 Pending High viability
SemEval-2026 Task 12: Abductive Event Reasoning: Towards Real-World Event Causal Inference for Large Language Models Build Now
A new benchmark and dataset for abductive event reasoning to enable LLMs to infer direct causes of real-world events from evidence.
Causal Inference Mar 23 Pending High viability
Probing How Scalable Table Data Enhances General Long-Context Reasoning Build Now
Synthesize structured table data to significantly boost LLM long-context reasoning capabilities, improving performance on both in-domain and out-of-domain tasks.
LLM Training Mar 23 Code High viability
Compensating Visual Insufficiency with Stratified Language Guidance for Long-Tail Class Incremental Learning Build Now
Leveraging large language models to guide incremental learning for imbalanced datasets, improving performance on rare classes.
Incremental Learning Mar 23 Code High viability
Data-Free Layer-Adaptive Merging via Fisher Information for Long-to-Short Reasoning LLMs Build Now
A novel data-free model merging technique that leverages Fisher Information to significantly improve long-to-short reasoning in LLMs, outperforming existing methods without calibration data.
LLM Merging Mar 23 Code High viability
Rethinking Token Reduction for Large Vision-Language Models Build Now
A novel learning-based method to efficiently reduce visual tokens in large vision-language models for multi-turn question answering, lowering inference costs without sacrificing accuracy.
Vision-Language Models Mar 23 Pending High viability
MIND: Multi-agent inference for negotiation dialogue in travel planning Build Now
A framework for multi-agent negotiation that infers user preferences to achieve consensus in complex planning scenarios.
Agents Mar 23 Code High viability
Deterministic Hallucination Detection in Medical VQA via Confidence-Evidence Bayesian Gain Build Now
A deterministic method to detect hallucinations in medical AI by analyzing token-level confidence and visual evidence, offering a computationally efficient and self-contained solution.
Medical AI Mar 23 Code High viability
Mirage: The Illusion of Visual Understanding Build Now
Exposes fundamental vulnerabilities in vision-language AI by demonstrating 'mirage reasoning', where models hallucinate visual understanding, and introduces a solution for robust, vision-grounded evaluation.
Multimodal AI Evaluation Mar 23 Code High viability
BiPreManip: Learning Affordance-Based Bimanual Preparatory Manipulation through Anticipatory Collaboration Build Now
A visual affordance-based framework for bimanual robotic manipulation that anticipates and facilitates complex preparatory actions for goal-directed tasks.
Robotics Mar 23 Code High viability
CoNBONet: Conformalized Neuroscience-inspired Bayesian Operator Network for Reliability Analysis Build Now
A neuroscience-inspired Bayesian network for fast, energy-efficient, and uncertainty-aware reliability analysis of complex engineering systems.
Reliability Analysis Mar 23 Code High viability
Optimizing Multi-Agent Weather Captioning via Text Gradient Descent: A Training-Free Approach with Consensus-Aware Gradient Fusion Build Now
A training-free multi-agent LLM framework that generates interpretable, domain-specific weather captions by fusing textual gradients from specialized agents.
Multi-Agent LLM Mar 23 Code High viability
PRM-as-a-Judge: A Dense Evaluation Paradigm for Fine-Grained Robotic Auditing Build Now
A new dense evaluation paradigm for robotic auditing using process reward models to provide fine-grained insights beyond binary success rates.
Robotics Evaluation Mar 23 Code High viability
HumanOmni-Speaker: Identifying Who said What and When Build Now
A novel multimodal AI system that accurately identifies who said what and when in complex conversations by analyzing high-frequency visual cues, overcoming limitations of current models.
Multimodal AI Mar 23 Code High viability
TAMTRL: Teacher-Aligned Reward Reshaping for Multi-Turn Reinforcement Learning in Long-Context Compression Build Now
TAMTRL improves long-context LLM performance by providing fine-grained, teacher-aligned rewards during multi-turn memory updates, overcoming temporal credit assignment challenges.
LLM Context Management Mar 23 Code High viability
Cross-Scenario Deraining Adaptation with Unpaired Data: Superpixel Structural Priors and Multi-Stage Pseudo-Rain Synthesis Build Now
A plug-and-play module that adapts image deraining models to unseen real-world conditions using unpaired data and synthesized pseudo-rain, significantly improving performance and training speed.
Computer Vision Mar 23 Code High viability
OmniFM: Toward Modality-Robust and Task-Agnostic Federated Learning for Heterogeneous Medical Imaging Build Now
A modality- and task-agnostic federated learning framework for heterogeneous medical imaging that unifies diverse downstream tasks by leveraging frequency-domain insights.
Medical AI Mar 23 Code High viability
TrustFed: Enabling Trustworthy Medical AI under Data Privacy Constraints Build Now
A federated learning framework for trustworthy medical AI that ensures privacy and reliable uncertainty quantification across diverse healthcare data.
Medical AI Mar 23 Code High viability
MISApp: Multi-Hop Intent-Aware Session Graph Learning for Next App Prediction Build Now
A profile-free framework for next app prediction using multi-hop session graph learning to capture evolving user intent, outperforming baselines in real-world scenarios.
Mobile AI Mar 23 Code High viability
FedCVU: Federated Learning for Cross-View Video Understanding Build Now
A federated learning framework that enables privacy-preserving cross-view video understanding by aligning representations and reducing communication overhead.
Federated Learning for Video Understanding Mar 23 Code High viability
No Dense Tensors Needed: Fully Sparse Object Detection on Event-Camera Voxel Grids Build Now
A fully sparse object detection system for event cameras that dramatically reduces memory and storage requirements while maintaining high accuracy for detecting fast-moving objects.
Event Camera Object Detection Mar 23 Code High viability
Dual-level Adaptation for Multi-Object Tracking: Building Test-Time Calibration from Experience and Intuition Build Now
A test-time adaptation framework for multi-object tracking that leverages memory and experience to improve performance under distribution shifts.
Multi-Object Tracking Mar 23 Pending High viability
PGR-Net: Prior-Guided ROI Reasoning Network for Brain Tumor MRI Segmentation Build Now
A novel MRI segmentation network that leverages data-driven spatial priors to precisely identify brain tumors, outperforming existing methods with a compact model.
Medical AI Mar 23 Pending High viability
Efficient Zero-Shot AI-Generated Image Detection Build Now
A computationally efficient, training-free method for detecting AI-generated images with significantly improved accuracy over state-of-the-art.
AI-Generated Content Detection Mar 23 Code High viability
4DGS360: 360° Gaussian Reconstruction of Dynamic Objects from a Single Video Build Now
Reconstruct dynamic 360° objects from single videos with improved 3D geometry and occluded region handling.
3D Reconstruction Mar 23 Code High viability
AdaEdit: Adaptive Temporal and Channel Modulation for Flow-Based Image Editing Build Now
AdaEdit offers a plug-and-play framework for training-free, text-guided image editing that adaptively balances source feature preservation with synthesized content generation, outperforming existing methods on key metrics.
Generative Image Editing Mar 23 Pending High viability
AgenticRec: End-to-End Tool-Integrated Policy Optimization for Ranking-Oriented Recommender Agents Build Now
AgenticRec optimizes end-to-end recommender agent decision-making for improved ranking accuracy by integrating tool use and refining user preferences.
Recommender Agents Mar 23 Code High viability
Towards Multimodal Time Series Anomaly Detection with Semantic Alignment and Condensed Interaction Build Now
A multimodal time series anomaly detection model that aligns semantic information across time and text to identify critical system failures.
Multimodal Time Series Anomaly Detection Mar 23 Pending High viability
SARe: Structure-Aware Large-Scale 3D Fragment Reassembly Build Now
A generative framework for robustly reassembling large numbers of 3D fragments into complete shapes, outperforming existing methods in challenging scenarios.
3D Reconstruction Mar 23 Code High viability
Rule-State Inference (RSI): A Bayesian Framework for Compliance Monitoring in Rule-Governed Domains Build Now
A Bayesian framework for compliance monitoring that infers rule activation and drift from noisy data, offering significant speedups over traditional methods.
Compliance AI Mar 23 Code High viability
INTRYGUE: Induction-Aware Entropy Gating for Reliable RAG Uncertainty Estimation Build Now
A novel method to improve the reliability of LLMs in retrieval-augmented generation by accurately detecting hallucinations through induction-aware uncertainty estimation.
RAG Mar 23 Code High viability
mSFT: Addressing Dataset Mixtures Overfitting Heterogeneously in Multi-task SFT Build Now
An overfitting-aware algorithm that optimizes multi-task SFT by dynamically adjusting data mixtures, improving model performance and efficiency.
LLM Training Mar 23 Code High viability
Chimera: Latency- and Performance-Aware Multi-agent Serving for Heterogeneous LLMs Watch
A predictive scheduling system for heterogeneous LLM clusters that optimizes end-to-end latency and task performance in multi-agent workflows.
LLM Serving Mar 23
MARCUS: An agentic, multimodal vision-language model for cardiac diagnosis and management Watch
Develop a multimodal AI tool for cardiac diagnosis, leveraging state-of-the-art vision-language models.
Healthcare AI Mar 23 Code
LRC-WeatherNet: LiDAR, RADAR, and Camera Fusion Network for Real-time Weather-type Classification in Autonomous Driving Watch
Real-time weather classification using a fusion of LiDAR, RADAR, and camera data for autonomous vehicles.
Autonomous Driving & Data Fusion Mar 23 Pending
Speed by Simplicity: A Single-Stream Architecture for Fast Audio-Video Generative Foundation Model Watch
A streamlined architecture that speeds up audio-video generative models with state-of-the-art performance.
Audio-Video AI Mar 23
TiCo: Time-Controllable Training for Spoken Dialogue Models Watch
A post-training method to enable spoken dialogue models to control response duration for improved voice assistant interaction.
Spoken Dialogue Models Mar 23
Greater accessibility can amplify discrimination in generative AI Watch
This research reveals gender discrimination in voice-enabled LLMs and proposes pitch manipulation as a mitigation strategy to ensure equitable AI accessibility.
LLM Bias and Accessibility Mar 23 Code
Adapting Self-Supervised Speech Representations for Cross-lingual Dysarthria Detection in Parkinson's Disease Watch
Adapting self-supervised speech models to detect speech impairments across languages for Parkinson's disease patients.
Medical AI Mar 23 Code
Identification of physiological shock in intensive care units via Bayesian regime switching models Watch
A Bayesian regime switching model for early detection of internal bleeding in ICU patients using vital signs and lab trends.
Medical AI Mar 23 Code
Multimodal Survival Analysis with Locally Deployable Large Language Models Watch
A multimodal survival analysis model that generates evidence-based prognoses using locally deployable LLMs, addressing privacy and computational constraints in healthcare.
Medical AI Mar 23
Computationally lightweight classifiers with frequentist bounds on predictions Watch
A computationally efficient classifier for safety-critical applications that provides actionable uncertainty bounds, suitable for real-time medical monitoring.
Medical AI Mar 23 Code
DA-VAE: Plug-in Latent Compression for Diffusion via Detail Alignment Watch
A novel VAE adaptation technique that enables higher resolution image generation for diffusion models with significantly reduced token counts and faster inference.
LLM Training Mar 23
StreamingClaw Technical Report Watch
A unified agent framework for real-time streaming video understanding and embodied intelligence with long-term multimodal memory and proactive interaction.
Embodied Intelligence Mar 23
Programming Manufacturing Robots with Imperfect AI: LLMs as Tuning Experts for FDM Print Configuration Selection Watch
Leveraging LLMs as tuning experts within a Bayesian optimization loop to significantly improve FDM 3D print configuration selection and reduce failure rates.
Robotics Mar 23
SpecTM: Spectral Targeted Masking for Trustworthy Foundation Models Watch
A physics-informed masking technique for Earth observation foundation models that significantly improves predictive accuracy and label efficiency.
Earth Observation AI Mar 23
Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models Watch
Enhancing accuracy in hyperbolic vision-language models through uncertainty-guided alignment.
Vision-Language Models Mar 23 Pending
On the Challenges and Opportunities of Learned Sparse Retrieval for Code Watch
A new family of sparse retrieval models for codebases that aims to improve LLM-based software engineering systems.
Code Retrieval Mar 23
SecureBreak -- A dataset towards safe and secure models Watch
A new safety-oriented dataset to detect and block harmful LLM outputs, improving model security and robustness.
LLM Security Mar 23 Code
Cross-Instance Gaussian Splatting Registration via Geometry-Aware Feature-Guided Alignment Watch
Develop an advanced registration tool for aligning 3D models using Gaussian splatting techniques.
3D Computer Vision Mar 23 Code
Riding Brainwaves in LLM Space: Understanding Activation Patterns Using Individual Neural Signatures Watch
This research explores the potential for language models to encode individual neural responses, suggesting a path towards personalized brain-computer interfaces.
Brain-Computer Interface Mar 23
BadminSense: Enabling Fine-Grained Badminton Stroke Evaluation on a Single Smartwatch Watch
A smartwatch system for fine-grained badminton stroke evaluation and quality prediction.
Sports Analytics Mar 23 Code
Benchmarking Recurrent Event-Based Object Detection for Industrial Multi-Class Recognition on MTEvent Watch
Benchmarking recurrent event-based object detection for industrial multi-class recognition to improve performance over non-recurrent baselines.
Industrial Computer Vision Mar 23 Code
Image-Conditioned Adaptive Parameter Tuning for Visual Odometry Frontends Watch
An AI system that dynamically tunes visual odometry parameters for robots based on real-time image analysis, improving tracking and reducing computational cost.
Robotics AI Mar 23
CellFluxRL: Biologically-Constrained Virtual Cell Modeling via Reinforcement Learning Watch
A reinforcement learning framework for generating biologically plausible virtual cells to accelerate drug discovery.
Biotech AI Mar 23
CurvZO: Adaptive Curvature-Guided Sparse Zeroth-Order Optimization for Efficient LLM Fine-Tuning Watch
A novel optimization method for memory-efficient LLM fine-tuning that reduces variance and speeds up convergence using adaptive curvature guidance.
LLM Fine-Tuning Mar 23
When Exploration Comes for Free with Mixture-Greedy: Do we need UCB in Diversity-Aware Multi-Armed Bandits? Watch
A new 'Mixture-Greedy' strategy for generative AI model selection outperforms traditional UCB methods by leveraging intrinsic exploration, leading to faster convergence and better performance without explicit exploration bonuses.
Generative AI Model Selection Mar 23 Code
PPGL-Swarm: Integrated Multimodal Risk Stratification and Hereditary Syndrome Detection in Pheochromocytoma and Paraganglioma Watch
An agentic diagnostic system for pheochromocytomas and paragangliomas that automates risk stratification and detects hereditary syndromes by integrating multimodal data and providing auditable reasoning.
Medical AI Mar 23
LipsAM: Lipschitz-Continuous Amplitude Modifier for Audio Signal Processing and its Application to Plug-and-Play Dereverberation Watch
Develop Lipschitz-continuous amplitude modifiers for robust audio signal processing, enhancing applications like speech dereverberation.
Audio AI Mar 23 Code
Are AI-assisted Development Tools Immune to Prompt Injection? Watch
This research analyzes prompt injection vulnerabilities in AI-assisted development tools, identifying security gaps and providing guidance for building safer AI workflows.
AI Security Mar 23
Engineering Distributed Governance for Regional Prosperity: A Socio-Technical Framework for Mitigating Under-Vibrancy via Human Data Engines Watch
An AI-driven socio-technical framework to optimize regional economic flow and mitigate under-vibrancy by analyzing spending and sentiment data.
Regional Economic Optimization Mar 23 Code
In-network Attack Detection with Federated Deep Learning in IoT Networks: Real Implementation and Analysis Watch
A federated learning framework for real-time, privacy-preserving in-network attack detection on resource-constrained IoT devices.
IoT Security Mar 23 Code
UniMotion: A Unified Framework for Motion-Text-Vision Understanding and Generation Ignore
A unified framework for understanding and generating human motion, text, and images by treating motion as a continuous modality.
Multimodal AI Mar 23
ShapDBM: Exploring Decision Boundary Maps in Shapley Space Ignore
A novel method for visualizing machine learning decision boundaries by transforming data into Shapley space, leading to more compact and interpretable decision zones.
Explainable AI Mar 23 Code
A Backbone Benchmarking Study on Self-supervised Learning as an Auxiliary Task with Texture-based Local Descriptors for Face Analysis Ignore
This research benchmarks self-supervised learning with texture descriptors for face analysis, finding backbone performance is task-dependent.
Face Analysis Mar 23 Code
Enhancing Document-Level Machine Translation via Filtered Synthetic Corpora and Two-Stage LLM Adaptation Ignore
Leveraging LLMs for document-level machine translation by creating and filtering synthetic parallel data, overcoming data scarcity and generation issues.
Machine Translation Mar 23
Data Curation for Machine Learning Interatomic Potentials by Determinantal Point Processes Ignore
Leveraging determinantal point processes to intelligently curate training data for machine learning interatomic potentials, reducing computational costs and improving model accuracy.
ML for Materials Science Mar 23 Code
More Isn't Always Better: Balancing Decision Accuracy and Conformity Pressures in Multi-AI Advice Ignore
Design systems that present multi-AI advice to improve human decision-making without increasing conformity pressure.
Human-AI Interaction Mar 23
TALUS: Threshold ML-DSA with One-Round Online Signing via Boundary Clearance and Carry Elimination Ignore
A novel threshold ML-DSA construction enabling one-round online signing with high success rates, overcoming theoretical limitations in cryptographic schemes.
Cryptography Mar 23
Dual-Space Knowledge Distillation with Key-Query Matching for Large Language Models with Vocabulary Mismatch Ignore
A novel generative adversarial approach to improve knowledge distillation between large language models with different tokenizers, showing modest gains in text generation quality.
LLM Training Mar 23
RAFL: Generalizable Sim-to-Real of Soft Robots with Residual Acceleration Field Learning Ignore
A framework that improves the accuracy of soft robot simulations across different shapes by learning a transferable corrective dynamics field.
Robotics Simulation Mar 23
6D Robotic OCT Scanning of Curved Tissue Surfaces Ignore
Enables precise 6D robotic scanning of curved tissue surfaces for improved medical imaging by eliminating reliance on image registration.
Robotics Mar 23 Code
λ-GELU: Learning Gating Hardness for Controlled ReLU-ization in Deep Networks Ignore
A novel activation function that bridges smooth neural network training with ReLU-compatible deployment pipelines by learning a controllable 'hardness' parameter.
LLM Training Mar 23 Code
TREX: Trajectory Explanations for Multi-Objective Reinforcement Learning Ignore
A framework for explaining the decision-making process of multi-objective reinforcement learning agents by attributing trajectory segments to specific objectives.
Multi-Objective Reinforcement Learning Mar 23 Code
BHDD: A Burmese Handwritten Digit Dataset Ignore
A new dataset of Burmese handwritten digits with code and benchmark results to enable research and development in optical character recognition for underrepresented scripts.
Computer Vision Datasets Mar 23 Pending
SparseDVFS: Sparse-Aware DVFS for Energy-Efficient Edge Inference Ignore
A framework for optimizing energy efficiency in edge device inference by dynamically scaling voltage and frequency based on operator sparsity.
Edge AI Optimization Mar 23
Optimal Solutions for the Moving Target Vehicle Routing Problem with Obstacles via Lazy Branch and Price Ignore
An optimization algorithm for efficient agent routing in dynamic environments with moving targets and obstacles.
Operations Research / Logistics Mar 23 Code
Climate Prompting: Generating the Madden-Julian Oscillation using Video Diffusion and Low-Dimensional Conditioning Ignore
A video diffusion model generates climate simulations conditioned on key metrics to bridge theoretical frameworks and improve tropical atmosphere prediction.
Climate AI Mar 23
On the Number of Conditional Independence Tests in Constraint-based Causal Discovery Ignore
A new causal discovery algorithm significantly reduces the number of conditional independence tests required, improving efficiency for learning causal relationships from data.
Causal Discovery Mar 23 Code
Multi-View Deformable Convolution Meets Visual Mamba for Coronary Artery Segmentation Ignore
A novel two-stage framework for coronary artery segmentation in CTA images, combining multi-view deformable convolution with visual Mamba to improve accuracy and efficiency.
Medical AI Mar 23
Timing In stand-up Comedy: Text, Audio, Laughter, Kinesics (TIC-TALK): Pipeline and Database for the Multimodal Study of Comedic Timing Ignore
A multimodal dataset and pipeline for analyzing comedic timing using language, gesture, and audience laughter.
Multimodal AI Mar 23
Getting to the Point: Why Pointing Improves LVLMs Ignore
This research explores how explicit object pointing in Large Vision-Language Models improves zero-shot counting accuracy and generalization by encoding spatial information.
Large Vision-Language Models Mar 23
Uncertainty Quantification for Distribution-to-Distribution Flow Matching in Scientific Imaging Ignore
A unified framework for uncertainty quantification in generative models for scientific imaging, improving reliability and detecting out-of-distribution cases.
Generative Models Mar 23
A Blueprint for Self-Evolving Coding Agents in Vehicle Aerodynamic Drag Prediction Ignore
Develops a framework for self-evolving coding agents to automate and accelerate vehicle aerodynamic drag prediction through surrogate model discovery.
AI for Engineering Design Mar 23
Structured Visual Narratives Undermine Safety Alignment in Multimodal Large Language Models Ignore
This research identifies a novel vulnerability in multimodal LLMs using comic-based jailbreaks, highlighting the need for more robust safety alignment.
LLM Safety Mar 23 Code
Reasoning Provenance for Autonomous AI Agents: Structured Behavioral Analytics Beyond State Checkpoints and Execution Traces Ignore
A new primitive for structured reasoning provenance in AI agents enables population-level behavioral analytics beyond traditional debugging tools.
AI Agents Mar 23
A Comparative Analysis of LLM Memorization at Statistical and Internal Levels: Cross-Model Commonalities and Model-Specific Signatures Ignore
This research analyzes LLM memorization across multiple model families to uncover universal patterns and model-specific behaviors, aiming for a fundamental understanding of how LLMs retain information.
LLM Internals Mar 23 Code
Towards Secure Retrieval-Augmented Generation: A Comprehensive Review of Threats, Defenses and Benchmarks Ignore
This paper provides a comprehensive review of security threats and defenses for Retrieval-Augmented Generation (RAG) systems, aiming to foster the development of robust and trustworthy RAG applications.
RAG Security Mar 23 Code
TLS Certificate and Domain Feature Analysis of Phishing Domains in the Danish .dk Namespace Ignore
Leveraging TLS certificate and domain features to detect phishing domains in the Danish .dk namespace.
Cybersecurity AI Mar 23 Code
Auditing MCP Servers for Over-Privileged Tool Capabilities Ignore
A security auditing toolkit for LLM tool integration protocols to detect and mitigate over-privileged capabilities.
LLM Security Mar 23
Silicon Bureaucracy and AI Test-Oriented Education: Contamination Sensitivity and Score Confidence in LLM Benchmarks Ignore
A framework to audit LLM benchmarks for contamination sensitivity, providing confidence scores for model evaluations.
LLM Evaluation Mar 23 Code
Rateless DeepJSCC for Broadcast Channels: a Rate-Distortion-Complexity Tradeoff Ignore
A novel deep learning framework for adaptive wireless broadcasting that optimizes image quality, transmission rate, and processing complexity for heterogeneous edge devices.
Wireless Communications AI Mar 23 Code
The Dual Mechanisms of Spatial Reasoning in Vision-Language Models Ignore
This research clarifies how vision-language models process spatial relationships, identifying two key mechanisms within their architecture.
Vision-Language Models Mar 23
Decoupling Exploration and Policy Optimization: Uncertainty Guided Tree Search for Hard Exploration Ignore
Develop an Uncertainty Guided Tree Search tool for enhancing exploration in reinforcement learning tasks.
Reinforcement Learning Mar 23 Code
Confidence-Based Decoding is Provably Efficient for Diffusion Language Models Ignore
This paper provides a theoretical analysis of confidence-based decoding strategies for diffusion language models to improve sampling efficiency.
LLM Training Mar 23
Calibeating Made Simple Ignore
A theoretical framework for improving external forecasts through online post-processing to minimize cumulative losses.
Online Learning Mar 23 Code
RAMPAGE: RAndomized Mid-Point for debiAsed Gradient Extrapolation Ignore
A new randomized method for unbiased gradient extrapolation in variational inequalities.
Optimization Algorithms Mar 23 Code
dynActivation: A Trainable Activation Family for Adaptive Nonlinearity Ignore
A novel trainable activation function that interpolates between non-linearity and linearity to improve training efficiency and robustness in deep neural networks.
Model Architecture Mar 23
The Semantic Ladder: A Framework for Progressive Formalization of Natural Language Content for Knowledge Graphs and AI Systems Ignore
A framework for progressively formalizing natural language into machine-actionable knowledge graphs.
Knowledge Graphs Mar 23
On the Failure of Topic-Matched Contrast Baselines in Multi-Directional Refusal Abliteration Ignore
This research investigates the failure of topic-matched contrast baselines in removing refusal behavior from language models, suggesting a need for revised methodologies in abliteration research.
LLM Safety & Alignment Mar 23
On the Interplay of Priors and Overparametrization in Bayesian Neural Network Posteriors Ignore
This research investigates how overparameterization and priors reshape Bayesian neural network posteriors, offering theoretical insights into their geometric properties.
Bayesian Neural Networks Mar 23 Code
Disengagement Analysis and Field Tests of a Prototypical Open-Source Level 4 Autonomous Driving System Ignore
This research analyzes disengagements in an open-source Level 4 autonomous driving system to identify robustness issues missed by standard metrics.
Autonomous Driving Mar 23
Structural Concentration in Weighted Networks: A Class of Topology-Aware Indices Ignore
A new framework for measuring concentration in weighted networks that accounts for both weight distribution and network topology.
Network Analysis Mar 23 Code
Collision-Free Velocity Scheduling for Multi-Agent Systems on Predefined Routes via Inexact-Projection ADMM Ignore
Optimizing waypoint passage times for multi-agent systems on predefined routes to improve efficiency and avoid collisions.
Robotics Mar 23
Albank -- a case study on the use of ethereum blockchain technology and smart contracts for secure decentralized bank application Ignore
A theoretical proposal for a decentralized banking application using Ethereum smart contracts to enhance security and transparency in traditional banking.
Decentralized Finance (DeFi) Mar 23
Holistic Scaling Laws for Optimal Mixture-of-Experts Architecture Optimization Ignore
A framework for optimizing Mixture-of-Experts LLM architectures by establishing new scaling laws based on compute and parameter constraints.
LLM Training Mar 23
Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models Ignore
This research empirically analyzes LLM responses to moral dilemmas to determine whether they exhibit genuine moral reasoning or merely mimic it through alignment training, identifying a 'moral ventriloquism' phenomenon.
LLM Behavior Analysis Mar 23
All elementary functions from a single binary operator Ignore
A novel binary operator and constant can generate all scientific calculator functions, enabling gradient-based symbolic regression for exact function recovery from data.
Mathematical AI Mar 23 Code
Publicly Understandable Electronic Voting: A Non-Cryptographic, End-to-End Verifiable Scheme Ignore
A non-cryptographic voting system that lets voters verify election integrity using only basic arithmetic and physical receipts, with the aim of restoring public trust in democratic processes.
E-Voting Security Mar 23 Code
Directional Mollification for Controlled Smooth Path Generation Ignore
A novel theoretical framework for generating smooth, waypoint-interpolating paths for autonomous and industrial robots.
Robotics Path Generation Mar 23
Politics of Questions in News: A Mixed-Methods Study of Interrogative Stances as Markers of Voice and Power Ignore
This paper analyzes the function and distribution of interrogative sentences in French news articles to understand how they structure discourse and foreground specific actors.
NLP Research Mar 23
Connecting Distributed Ledgers: Surveying Novel Interoperability Solutions in On-chain Finance Ignore
This paper surveys novel interoperability solutions for on-chain finance, proposing metrics and models for empirical research.
Blockchain Interoperability Mar 23
Identifiability and amortized inference limitations in Kuramoto models Ignore
A novel amortized Bayesian inference approach for fast and scalable parameter estimation in complex dynamical systems like Kuramoto models.
Bayesian Inference Mar 23
Cybersecurity Guidance for Smart Homes: A Cross-National Review of Government Sources Ignore
A review of government guidance for smart home cybersecurity incidents reveals a lack of structured incident response support for users.
Cybersecurity Mar 23
AI Token Futures Market: Commoditization of Compute and Derivatives Contract Design Ignore
This paper proposes a financial market for AI tokens to hedge against compute cost volatility for enterprises.
AI Compute Markets Mar 23
Thinking Deeper, Not Longer: Depth-Recurrent Transformers for Compositional Generalization Ignore
A depth-recurrent Transformer architecture designed for improved compositional generalization in tasks requiring variable-depth reasoning.
LLM Architecture Mar 23
RTD-RAX: Fast, Safe Trajectory Planning for Systems under Unknown Disturbances Ignore
A framework for safe, real-time trajectory planning that accounts for unknown disturbances.
Robotics Mar 23
Neyman-Pearson multiclass classification under label noise via empirical likelihood Ignore
A statistical method for multiclass classification that accounts for noisy labels using empirical likelihood, offering theoretical guarantees.
Statistical Learning Mar 23
Proximal Policy Optimization in Path Space: A Schrödinger Bridge Perspective Ignore
A theoretical framework for optimizing generative reinforcement learning policies by reformulating proximal policy optimization in path space.
Reinforcement Learning Mar 23
Framework for Risk-Based IoT Cybersecurity Audit Engagements Ignore
A framework for auditing the cybersecurity risks of Internet of Things devices in corporate environments.
IoT Cybersecurity Mar 23
Asymptotically Ideal Hierarchical Secret Sharing Based on CRT for Integer Ring Ignore
A theoretical cryptographic scheme for hierarchical secret sharing using the Chinese Remainder Theorem.
Cryptography Mar 23
Asymptotically Ideal Conjunctive Hierarchical Secret Sharing Scheme Based on CRT for Polynomial Ring Ignore
A theoretical cryptographic scheme for secure secret sharing with improved information rates.
Cryptography Mar 23
Deep Reinforcement Learning and The Tale of Two Temporal Difference Errors Ignore
This paper theoretically analyzes the nuances of temporal difference errors in deep reinforcement learning, revealing potential performance impacts in deep differential RL methods.
Deep Reinforcement Learning Mar 23
Tacit Knowledge Management with Generative AI: Proposal of the GenAI SECI Model Ignore
A new model for managing tacit and explicit knowledge using generative AI, introducing the concept of 'Digital Fragmented Knowledge'.
Knowledge Management AI Mar 23
Instruction Set and Language for Symbolic Regression Ignore
A novel representation framework for symbolic regression that encodes expression DAGs as strings and computes a pruned canonical string to collapse equivalent representations.
Symbolic Regression Mar 23
The Reasoning Error About Reasoning: Why Different Types of Reasoning Require Different Representational Structures Ignore
A theoretical framework analyzing the structural demands of different reasoning types to understand limitations of current AI approaches.
AI Theory Mar 23
Cognitive Agency Surrender: Defending Epistemic Sovereignty via Scaffolded AI Friction Ignore
This paper theorizes and quantifies the risk of cognitive agency surrender due to frictionless AI interfaces, proposing 'Scaffolded Cognitive Friction' as a technical prerequisite for AI governance and societal cognitive resilience.
AI Governance Mar 23
Riemannian Geometry Speaks Louder Than Words: From Graph Foundation Model to Next-Generation Graph Intelligence Ignore
A theoretical framework for graph foundation models using Riemannian geometry to capture complex structural patterns.
Graph Intelligence Mar 23
The Presupposition Problem in Representation Genesis Ignore
This paper analyzes the philosophical underpinnings of representation genesis in AI, identifying structural limitations in current theories.
AI Theory Mar 23
Bridges connecting Encryption Schemes Ignore
This paper explores theoretical bridges between encryption schemes, inspired by homomorphic encryption, with security guarantees based on existing schemes.
Cryptography Mar 23