Memory-Augmented Vision-Language Agents for Persistent and Semantically Consistent Object Captioning Build Now
A memory-augmented vision-language model ensuring consistent multi-view object captioning for better embodied agent navigation.
Vision-Language Systems Mar 25 Pending High viability
OneSearch-V2: The Latent Reasoning Enhanced Self-distillation Generative Search Framework Build Now
OneSearch-V2 enhances e-commerce search with reasoning and self-distillation, boosting conversion rates and reducing search biases.
Search Retrieval Mar 25 Pending High viability
When AI Meets Early Childhood Education: Large Language Models as Assessment Teammates in Chinese Preschools Build Now
AI tool automating teacher-child interaction quality assessments in Chinese preschools for scalable, continuous monitoring.
EdTech AI Mar 25 Code High viability
LightSplat: Fast and Memory-Efficient Open-Vocabulary 3D Scene Understanding in Five Seconds Build Now
LightSplat dramatically speeds up and optimizes 3D scene understanding with a lightweight indexing framework, making real-time open-vocabulary segmentation feasible.
3D Scene Understanding Mar 25 Code High viability
Claudini: Autoresearch Discovers State-of-the-Art Adversarial Attack Algorithms for LLMs Build Now
Claudini autonomously discovers advanced adversarial attacks on LLMs, offering cutting-edge cybersecurity solutions.
Cybersecurity AI Mar 25 Pending High viability
Toward Physically Consistent Driving Video World Models under Challenging Trajectories Build Now
A world model for autonomous driving that generates physically consistent videos even from challenging or invalid trajectories, improving simulation realism.
Autonomous Driving Simulation Mar 25 Code High viability
Boosting Document Parsing Efficiency and Performance with Coarse-to-Fine Visual Processing Build Now
PaddleOCR-VL enhances document parsing efficiency by focusing on semantically relevant regions with a coarse-to-fine processing framework.
Document Parsing AI Mar 25 Pending High viability
MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination Build Now
A multi-agent framework that uses deliberate information asymmetry and reinforcement learning to significantly reduce LLM hallucinations in RAG systems.
LLM Hallucination Mitigation Mar 25 Pending High viability
LLMpedia: A Transparent Framework to Materialize an LLM's Encyclopedic Knowledge at Scale Build Now
LLMpedia empowers enterprises with auditable AI-generated encyclopedic content across diverse topics, enhancing knowledge bases.
AI Knowledge Generation Mar 25 Code High viability
VFIG: Vectorizing Complex Figures in SVG with Vision-Language Models Build Now
VFIG converts rasterized images of complex figures into editable SVGs using a novel vision-language model and a large-scale dataset, bridging the gap for designers and researchers.
Vision-Language Models Mar 25 Code High viability
CliPPER: Contextual Video-Language Pretraining on Long-form Intraoperative Surgical Procedures for Event Recognition Build Now
A pretraining framework for understanding long-form surgical videos, enabling zero-shot recognition of surgical events and improving multimodal alignment.
Medical AI Mar 25 Pending High viability
Decentralized End-to-End Multi-AAV Pursuit Using Predictive Spatio-Temporal Observation via Deep Reinforcement Learning Build Now
Develop an advanced aerial swarm pursuit system using deep reinforcement learning for autonomous navigation in cluttered environments.
AI for Autonomous Systems Mar 25 Code High viability
OmniWeaving: Towards Unified Video Generation with Free-form Composition and Reasoning Build Now
OmniWeaving offers a state-of-the-art open-source framework for unified video generation with advanced multimodal composition and reasoning capabilities.
Video Synthesis Mar 25 Code High viability
SOMA: Strategic Orchestration and Memory-Augmented System for Vision-Language-Action Model Robustness via In-Context Adaptation Build Now
SOMA enhances existing robotic vision-language-action models for robust performance in challenging, out-of-distribution environments without retraining.
Robotics AI Mar 25 Pending High viability
A^3: Towards Advertising Aesthetic Assessment Build Now
A framework and multimodal LLM for objective, scalable, and interpretable assessment of advertising image aesthetics to improve commercial conversion rates.
Advertising AI Mar 25 Pending High viability
Schema on the Inside: A Two-Phase Fine-Tuning Method for High-Efficiency Text-to-SQL at Scale Build Now
A self-hosted, specialized text-to-SQL model that drastically cuts API costs and latency for conversational data querying in large-scale applications.
Text-to-SQL Mar 25 Code High viability
QuadFM: Foundational Text-Driven Quadruped Motion Dataset for Generation and Control Build Now
A foundational dataset and unified framework for text-driven quadruped motion generation and control, enabling real-time, expressive robot movements.
Robotics Motion Generation Mar 25 Pending High viability
Thinking with Tables: Enhancing Multi-Modal Tabular Understanding via Neuro-Symbolic Reasoning Build Now
A neuro-symbolic reasoning system that significantly enhances multi-modal understanding of tabular data, outperforming existing baselines and rivaling commercial LLMs.
Multimodal AI Mar 25 Pending High viability
Leave No Stone Unturned: Uncovering Holistic Audio-Visual Intrinsic Coherence for Deepfake Detection Build Now
A deepfake detection system that leverages intrinsic audio-visual coherence and outperforms state-of-the-art methods, accompanied by a new high-fidelity dataset and available code.
Deepfake Detection Mar 25 Pending High viability
Uncertainty-Aware Vision-based Risk Object Identification via Conformal Risk Tube Prediction Build Now
A novel AI system for hazard detection in intelligent driving that quantifies risk uncertainty to improve safety and reduce false alarms.
Autonomous Driving AI Mar 25 Code High viability
Polynomial Speedup in Diffusion Models with the Multilevel Euler-Maruyama Method Build Now
Accelerate diffusion model sampling by orders of magnitude using a novel multilevel approximation method for SDEs.
Diffusion Models Mar 25 Code High viability
DreamerAD: Efficient Reinforcement Learning via Latent World Model for Autonomous Driving Build Now
Accelerate autonomous driving reinforcement learning by 80x using a latent world model that compresses diffusion sampling and enables efficient exploration.
Autonomous Driving RL Mar 25 Code High viability
TAG: Target-Agnostic Guidance for Stable Object-Centric Inference in Vision-Language-Action Models Build Now
Enhance the reliability of robotic vision-language-action policies by reducing distractors and improving object instance grounding at inference time.
Robotics Mar 25 Code High viability
Latent-WAM: Latent World Action Modeling for End-to-End Autonomous Driving Build Now
An efficient end-to-end autonomous driving framework that achieves state-of-the-art trajectory planning using spatially-aware and dynamics-informed latent world representations.
Autonomous Driving Mar 25 Code High viability
Vision-Language Models vs Human: Perceptual Image Quality Assessment Build Now
Leveraging Vision-Language Models to automate perceptual image quality assessment, outperforming traditional methods in specific attributes and offering a scalable alternative to costly psychophysical experiments.
Computer Vision Mar 25 Code High viability
EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction Build Now
A geometry-centric framework using graph neural networks to achieve highly accurate 3D reconstruction of deformable surgical tissues, outperforming state-of-the-art methods and generalizing to unseen domains.
Medical AI Mar 25 Code High viability
Chameleon: Episodic Memory for Long-Horizon Robotic Manipulation Build Now
Chameleon provides robots with geometry-grounded episodic memory to overcome perceptual aliasing and improve long-horizon manipulation reliability.
Robotics AI Mar 25 Pending High viability
Towards Training-Free Scene Text Editing Build Now
A training-free framework for high-fidelity scene text editing that achieves state-of-the-art results without requiring task-specific data.
Generative Image Editing Mar 25 Pending High viability
Anti-I2V: Safeguarding your photos from malicious image-to-video generation Build Now
A defense system that safeguards personal photos from malicious image-to-video generation by operating in both the Lab color space and the frequency domain.
AI Safety Mar 25 Code High viability
Scaling Recurrence-aware Foundation Models for Clinical Records via Next-Visit Prediction Build Now
A generative pretraining strategy for electronic health records that predicts future patient visits, outperforming existing methods and generalizing to new cohorts.
Medical AI Mar 25 Code High viability
LensWalk: Agentic Video Understanding by Planning How You See in Videos Build Now
LensWalk is an agentic framework that allows LLMs to actively control their visual observation of videos, improving understanding accuracy by dynamically planning how they see.
Agentic Video Understanding Mar 25 Code High viability
SEGAR: Selective Enhancement for Generative Augmented Reality Build Now
SEGAR enables real-time, temporally coherent augmented reality experiences by selectively generating and correcting future visual frames.
Generative AR Mar 25 Code High viability
UI-Voyager: A Self-Evolving GUI Agent Learning via Failed Experience Build Now
A self-evolving GUI agent that learns from failures to achieve high success rates in mobile automation tasks.
Agents Mar 25 Pending High viability
Cross-Modal Prototype Alignment and Mixing for Training-Free Few-Shot Classification Build Now
A novel approach to few-shot image classification by mixing and aligning text and image prototypes, outperforming existing methods on benchmarks.
Few-Shot Image Classification Mar 25 Code High viability
TuneShift-KD: Knowledge Distillation and Transfer for Fine-tuned Models Build Now
A novel method to distill specialized knowledge from fine-tuned LLMs to new models without access to original training data, using perplexity differences to generate synthetic training examples.
LLM Knowledge Transfer Mar 25 Code High viability
AVO: Agentic Variation Operators for Autonomous Evolutionary Search Watch
Autonomous agents discover and implement performance-critical AI kernel optimizations, outperforming state-of-the-art expert-engineered solutions.
AI Optimization Mar 25 High viability
Project and Generate: Divergence-Free Neural Operators for Incompressible Flows Build Now
A novel framework for neural operators that enforces physical constraints for stable and accurate fluid dynamics simulations.
Physics-Informed AI Mar 25 Code High viability
Video-Only ToM: Enhancing Theory of Mind in Multimodal Large Language Models Build Now
A framework to enhance the Theory of Mind capabilities of multimodal LLMs by aligning visual representations with semantic targets, improving reasoning and explanations in video-based scenarios.
Multimodal AI Mar 25 Code High viability
Multi-Agent Reasoning with Consistency Verification Improves Uncertainty Calibration in Medical MCQA Build Now
A multi-agent AI system that significantly improves the reliability of medical diagnosis confidence scores, enabling safer deployment in clinical settings.
Medical AI Mar 25 Code High viability
Positive-First Most Ambiguous: A Simple Active Learning Criterion for Interactive Retrieval of Rare Categories Build Now
A novel active learning method for rapidly discovering rare visual categories in large unlabeled datasets with minimal user feedback.
Interactive Retrieval Mar 25 Code High viability
Composer 2 Technical Report Build Now
A specialized AI model for agentic software engineering that demonstrates strong long-term planning and coding intelligence, with benchmark results comparable to state-of-the-art systems.
Agentic Software Engineering Mar 25 Code High viability
Counting Without Numbers & Finding Without Words Build Now
A multimodal AI system that reunites lost pets with their families using both visual and acoustic biometrics, inspired by animal communication.
Animal Reunification AI Mar 25 Code High viability
Mechanic: Sorrifier-Driven Formal Decomposition Workflow for Automated Theorem Proving Build Now
A novel agent system that uses a sorry-driven formal decomposition strategy to efficiently solve complex mathematical reasoning problems in automated theorem proving.
Automated Theorem Proving Mar 25 Code High viability
Unleashing Vision-Language Semantics for Deepfake Video Detection Build Now
A novel deepfake video detection framework that leverages rich vision-language semantics and identity-aware prompting to significantly outperform state-of-the-art methods.
Deepfake Video Detection Mar 25 Pending High viability
CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents Build Now
A large-scale dataset and benchmark for training and evaluating computer-use agents that automate complex desktop workflows.
Computer-Use Agents Mar 25 Code High viability
The Gait Signature of Frailty: Transfer Learning based Deep Gait Models for Scalable Frailty Assessment Build Now
Leveraging transfer learning on a new gait dataset to create a scalable, non-invasive frailty assessment tool for aging medicine.
Medical AI Mar 25 Code High viability
Marchuk: Efficient Global Weather Forecasting from Mid-Range to Sub-Seasonal Scales via Flow Matching Build Now
A highly efficient generative model for accurate global weather forecasting up to 30 days, outperforming larger models with faster inference.
Weather Forecasting Mar 25 Code High viability
IPsec based on Quantum Key Distribution: Adapting non-3GPP access to 5G Networks to the Quantum Era Build Now
This paper proposes and experimentally validates a quantum-resistant security mechanism for 5G non-3GPP access by integrating Quantum Key Distribution with IPsec, offering faster and more secure connections.
Quantum Security for 5G Mar 25 Code High viability
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Build Now
ClawKeeper provides comprehensive, real-time security for autonomous agents by integrating skill, plugin, and watcher-based protection mechanisms.
Agent Security Mar 25 Code High viability
Enhancing Drone Light Shows Performances: Optimal Allocation and Trajectories for Swarm Drone Formations Build Now
A real-time framework for optimally assigning and generating collision-free trajectories for large drone swarms, enabling complex aerial light show choreography.
Drone Swarm Coordination Mar 25 Code High viability
AI-Supervisor: Autonomous AI Research Supervision via a Persistent Research World Model Build Now
An autonomous AI research supervision framework that uses a persistent knowledge graph and multi-agent system to discover, develop, and validate AI research.
AI Agents Mar 25 Code High viability
3D-Mix for VLA: A Plug-and-Play Module for Integrating VGGT-based 3D Information into Vision-Language-Action Models Build Now
A plug-and-play module that significantly enhances the 3D spatial understanding of Vision-Language-Action models for improved robotic control.
Robotics AI Mar 25 Code High viability
ViHOI: Human-Object Interaction Synthesis with Visual Priors Build Now
Generate realistic 3D human-object interactions by extracting visual priors from 2D images using a diffusion-based framework.
3D Motion Generation Mar 25 Code High viability
MolEvolve: LLM-Guided Evolutionary Search for Interpretable Molecular Optimization Build Now
An LLM-guided evolutionary search framework that autonomously discovers interpretable molecular optimizations by planning chemical symbolic operations.
Molecular Optimization Mar 25 Code High viability
GeoRouter: Dynamic Paradigm Routing for Worldwide Image Geolocalization Build Now
A dynamic routing framework for image geolocalization that adaptively selects the best paradigm (retrieval or generation) for precise GPS coordinate prediction.
Image Geolocalization Mar 25 Code High viability
PP-OCRv5: A Specialized 5M-Parameter Model Rivaling Billion-Parameter Vision-Language Models on OCR Tasks Build Now
A highly efficient 5M-parameter OCR system that rivals billion-parameter models by focusing on data quality and diversity, offering superior localization and reduced hallucinations.
OCR Mar 25 Pending High viability
CoordLight: Learning Decentralized Coordination for Network-Wide Traffic Signal Control Build Now
A decentralized AI framework for optimizing city-wide traffic signals using novel state representation and neighbor-aware policy optimization.
Traffic Signal Control Mar 25 Pending High viability
LATS: Large Language Model Assisted Teacher-Student Framework for Multi-Agent Reinforcement Learning in Traffic Signal Control Build Now
A novel framework that uses LLMs to distill knowledge into simpler RL agents for more efficient and generalizable traffic signal control.
Multi-Agent Reinforcement Learning Mar 25 Code High viability
Language-Guided Structure-Aware Network for Camouflaged Object Detection Build Now
A language-guided network that uses text prompts to improve the detection of camouflaged objects in images.
Computer Vision Mar 25 Code High viability
Enhancing Efficiency and Performance in Deepfake Audio Detection through Neuron-level dropin & Neuroplasticity Mechanisms Build Now
A novel approach inspired by brain plasticity to significantly improve the efficiency and accuracy of deepfake audio detection models.
Audio Deepfake Detection Mar 25 Code High viability
GameplayQA: A Benchmarking Framework for Decision-Dense POV-Synced Multi-Video Understanding of 3D Virtual Agents Build Now
A new benchmark and dataset for evaluating multimodal LLMs in 3D virtual agents, revealing significant gaps in current AI capabilities for embodied perception and reasoning.
Embodied AI Mar 25 Code High viability
Le MuMo JEPA: Multi-Modal Self-Supervised Representation Learning with Learnable Fusion Tokens Build Now
A multi-modal self-supervised learning framework that unifies RGB and LiDAR depth representations for improved performance on downstream driving tasks.
Multi-Modal Representation Learning Mar 25 Code High viability
Heuristic Self-Paced Learning for Domain Adaptive Semantic Segmentation under Adverse Conditions Build Now
An AI-powered scheduler that autonomously optimizes the learning order of semantic classes for improved image segmentation under challenging conditions.
Domain Adaptive Semantic Segmentation Mar 25 Code High viability
Refining time-space traffic diagrams: A neighborhood-adaptive linear regression method Build Now
A neighborhood-adaptive linear regression method to refine low-resolution time-space traffic diagrams, improving accuracy and capturing complex traffic dynamics.
Traffic Analysis Mar 25 Code High viability
Samasāmayik: A Parallel Dataset for Hindi-Sanskrit Machine Translation Build Now
A new parallel dataset for Hindi-Sanskrit translation, enabling significant performance gains for contemporary language models.
Machine Translation Mar 25 Code High viability
RS-SSM: Refining Forgotten Specifics in State Space Model for Video Semantic Segmentation Build Now
A novel state space model that refines forgotten specifics for state-of-the-art video semantic segmentation with high computational efficiency.
Video Semantic Segmentation Mar 25 Pending High viability
VERIA: Verification-Centric Multimodal Instance Augmentation for Long-Tailed 3D Object Detection Build Now
VERIA enhances 3D object detection for rare classes in autonomous driving by synthesizing diverse and contextually relevant multimodal instances using off-the-shelf foundation models.
3D Object Detection Mar 25 Code High viability
TopoMesh: High-Fidelity Mesh Autoencoding via Topological Unification Build Now
TopoMesh enables high-fidelity 3D mesh generation by unifying topological representations between ground truth and predicted meshes, leading to superior preservation of sharp features and geometric details.
3D Generation Mar 25 Code High viability
Language-Assisted Image Clustering Guided by Discriminative Relational Signals and Adaptive Semantic Centers Build Now
A novel framework for image clustering that leverages language to create more discriminative features and adaptive semantic centers, outperforming state-of-the-art methods.
Image Clustering Mar 25 Code High viability
ScrollScape: Unlocking 32K Image Generation With Video Diffusion Priors Build Now
ScrollScape enables ultra-high-resolution (32K) image generation at extreme aspect ratios by leveraging video diffusion priors for structural integrity.
Generative Image Mar 25 Code High viability
DeepDTF: Dual-Branch Transformer Fusion for Multi-Omics Anticancer Drug Response Prediction Build Now
A dual-branch Transformer fusion framework predicts anticancer drug response from multi-omics data, outperforming baselines and providing biological explanations.
Medical AI Mar 25 Code High viability
Forecasting with Guidance: Representation-Level Supervision for Time Series Forecasting Build Now
A plug-in method that enhances time series forecasting accuracy by aligning intermediate representations with pretrained foundation models.
Time Series Forecasting Mar 25 Code High viability
B-MoE: A Body-Part-Aware Mixture-of-Experts "All Parts Matter" Approach to Micro-Action Recognition Build Now
A body-part-aware Mixture-of-Experts framework for highly accurate micro-action recognition, outperforming state-of-the-art methods on challenging benchmarks.
Action Recognition Mar 25 Code High viability
InstanceRSR: Real-World Super-Resolution via Instance-Aware Representation Alignment Build Now
A super-resolution framework that preserves fine-grained details and semantic consistency at the instance level for real-world images.
Image Super-Resolution Mar 25 Code High viability
Attack Assessment and Augmented Identity Recognition for Human Skeleton Data Build Now
A novel framework to defend machine learning models against adversarial attacks using GAN-generated synthetic data, improving robustness without sacrificing performance.
Adversarial Attack Defense Mar 25 Code High viability
UniScale: Synergistic Entire Space Data and Model Scaling for Search Ranking Build Now
UniScale co-designs data and model architecture to unlock superior performance in search ranking by synergistically scaling both elements.
Search Ranking Mar 25 Code High viability
RVLM: Recursive Vision-Language Models with Adaptive Depth Build Now
An adaptive, auditable vision-language model for medical diagnostics that generates and executes code for iterative reasoning.
Medical AI Mar 25 Pending High viability
Environment-Grounded Multi-Agent Workflow for Autonomous Penetration Testing Build Now
An AI-powered multi-agent system for automated penetration testing of robotic systems, offering high reliability and traceability.
AI Security Mar 25 Code High viability
Who Benefits from RAG? The Role of Exposure, Utility and Attribution Bias Build Now
This research quantifies and addresses fairness disparities in Retrieval-Augmented Generation (RAG) systems, offering a path to more equitable LLM applications.
LLM Fairness Mar 25 Code High viability
Uncovering Memorization in Timeseries Imputation models: LBRM Membership Inference and its link to attribute Leakage Build Now
Develops novel attacks to uncover privacy vulnerabilities in time series imputation models, enabling more secure AI deployments.
AI Security Mar 25 Code High viability
HEART-PFL: Stable Personalized Federated Learning under Heterogeneity with Hierarchical Directional Alignment and Adversarial Knowledge Transfer Build Now
A dual-sided framework for stable personalized federated learning that enhances client specificity and global model accuracy through hierarchical alignment and adversarial knowledge transfer.
Federated Learning Mar 25 Pending High viability
Powerful Teachers Matter: Text-Guided Multi-view Knowledge Distillation with Visual Prior Enhancement Build Now
Enhance knowledge distillation by using dual-modality teachers with visual prior enhancement and text-guided adaptive fusion to significantly improve student model performance.
Knowledge Distillation Mar 25 Code High viability
IPatch: A Multi-Resolution Transformer Architecture for Robust Time-Series Forecasting Build Now
A multi-resolution Transformer architecture that improves time-series forecasting accuracy and robustness by integrating both point-wise and patch-wise temporal representations.
Time-Series Forecasting Mar 25 Code High viability
RefReward-SR: LR-Conditioned Reward Modeling for Preference-Aligned Super-Resolution Build Now
A novel reward modeling framework for super-resolution that aligns with human perceptual preferences by using the low-resolution input as a semantic anchor, supported by a new large-scale dataset and code release.
Generative Image Mar 25 Code High viability
TsetlinWiSARD: On-Chip Training of Weightless Neural Networks using Tsetlin Automata on FPGAs Build Now
Enable on-chip training of weightless neural networks for efficient edge AI with TsetlinWiSARD, offering significant hardware efficiency improvements.
Edge AI Training Mar 25 Code High viability
Unlocking Few-Shot Capabilities in LVLMs via Prompt Conditioning and Head Selection Build Now
Unlock state-of-the-art few-shot and zero-shot image classification for Large Vision Language Models by intelligently combining their internal representations.
LVLM Classification Mar 25 Code High viability
Walma: Learning to See Memory Corruption in WebAssembly Build Now
Walma uses machine learning to detect memory corruption and tampering in WebAssembly applications, offering a practical solution for runtime integrity verification.
WebAssembly Security Mar 25 Code High viability
CarePilot: A Multi-Agent Framework for Long-Horizon Computer Task Automation in Healthcare Build Now
A multi-agent framework for automating complex, long-horizon computer tasks in healthcare, outperforming existing models.
Healthcare AI Agents Mar 25 Code High viability
Goal-Oriented Reactive Simulation for Closed-Loop Trajectory Prediction Build Now
Develops a closed-loop simulation for trajectory prediction that trains autonomous agents to react and recover from their own errors, significantly improving collision avoidance.
Autonomous Driving Prediction Mar 25 Code High viability
Linear-Nonlinear Fusion Neural Operator for Partial Differential Equations Build Now
A novel neural operator that significantly accelerates PDE solving by decoupling linear and nonlinear effects, offering faster training and comparable or better accuracy.
Scientific Computing AI Mar 25 Code High viability
Tutor-Student Reinforcement Learning: A Dynamic Curriculum for Robust Deepfake Detection Build Now
A reinforcement learning framework that dynamically optimizes deepfake detection training by prioritizing challenging samples for improved generalization.
Deepfake Detection Mar 25 Pending High viability
Spectral Scalpel: Amplifying Adjacent Action Discrepancy via Frequency-Selective Filtering for Skeleton-Based Action Segmentation Build Now
A frequency-selective filtering framework that amplifies action-specific frequencies to improve skeleton-based action segmentation accuracy and boundary definition.
Skeleton-based Action Segmentation Mar 25 Pending High viability
Accelerated Spline-Based Time-Optimal Motion Planning with Continuous Safety Guarantees for Non-Differentially Flat Systems Build Now
A novel motion planning method that significantly reduces trajectory computation time for autonomous robots by decoupling safety constraints, enabling faster and safer navigation in complex environments.
Robotics Motion Planning Mar 25 Code High viability
MedAidDialog: A Multilingual Multi-Turn Medical Dialogue Dataset for Accessible Healthcare Build Now
A multilingual, multi-turn medical dialogue dataset and model for accessible preliminary healthcare consultations, leveraging parameter-efficient fine-tuning for broad deployment.
Medical AI Mar 25 Code High viability
Reservoir-Based Graph Convolutional Networks Build Now
A novel graph convolutional network integrating reservoir computing for improved graph classification and generation, with faster convergence and reduced over-smoothing.
Graph Neural Networks Mar 25 Pending High viability
Alignment Reduces Expressed but Not Encoded Gender Bias: A Unified Framework and Study Build Now
A unified framework to analyze and mitigate gender bias in LLMs by correlating internal representations with expressed outputs, demonstrating that current debiasing methods don't fully remove latent bias.
LLM Bias Mitigation Mar 25 Code High viability
The Alignment Tax: Response Homogenization in Aligned LLMs and Its Implications for Uncertainty Estimation Build Now
This research identifies and quantifies an 'alignment tax' causing response homogenization in LLMs, proposing a novel uncertainty estimation method to improve selective prediction accuracy and reduce computational costs.
LLM Alignment & Uncertainty Mar 25 Pending High viability
Retinal Layer Segmentation in OCT Images With 2.5D Cross-slice Feature Fusion Module for Glaucoma Assessment Build Now
A 2.5D framework for accurate retinal layer segmentation in OCT images, improving glaucoma assessment by fusing cross-slice features for better consistency and robustness.
Medical AI Mar 25 Code High viability
KCLNet: Electrically Equivalence-Oriented Graph Representation Learning for Analog Circuits Build Now
KCLNet provides an electrically-aware graph representation learning framework for analog circuits, enabling significant performance improvements in critical EDA tasks like classification and subcircuit detection.
Analog Circuit AI Mar 25 Code High viability
LaDy: Lagrangian-Dynamic Informed Network for Skeleton-based Action Segmentation via Spatial-Temporal Modulation Build Now
A physics-informed neural network that leverages Lagrangian dynamics for highly accurate and discriminative skeleton-based action segmentation.
Skeleton-based Action Segmentation Mar 25 Pending High viability
LGTM: Training-Free Light-Guided Text-to-Image Diffusion Model via Initial Noise Manipulation Build Now
A training-free method to control lighting in text-to-image generation by manipulating initial noise, offering dynamic user guidance.
Generative Image Mar 25 Code High viability
Knowledge-Guided Manipulation Using Multi-Task Reinforcement Learning Build Now
A multi-task reinforcement learning framework for robotic manipulation that leverages real-time 3D scene graphs and knowledge to achieve robust generalization and sample efficiency.
Robotics Mar 25 Code High viability
When Understanding Becomes a Risk: Authenticity and Safety Risks in the Emerging Image Generation Paradigm Build Now
This research identifies and quantifies new safety risks in multimodal large language models for image generation, showing that they are more prone to generating unsafe content than diffusion models and that their outputs are harder to detect.
AI Safety Mar 25 Code High viability
PosterIQ: A Design Perspective Benchmark for Poster Understanding and Generation Build Now
A benchmark and diagnostic tool for building generative AI that understands and creates visually compelling posters with human-centered design principles.
Generative Vision-Language Mar 25 Pending High viability
AD-Reasoning: Multimodal Guideline-Guided Reasoning for Alzheimer's Disease Diagnosis Build Now
A multimodal AI framework for Alzheimer's diagnosis that integrates neuroimaging and clinical data with explicit guideline adherence, offering transparent and accurate diagnostic reasoning.
Medical AI Mar 25 Code High viability
Mitigating Object Hallucinations in LVLMs via Attention Imbalance Rectification Build Now
A lightweight decoding-time intervention method to significantly reduce object hallucinations in Large Vision-Language Models, improving their reliability for critical applications.
LVLM Object Hallucination Mitigation Mar 25 Code High viability
Beyond Semantic Priors: Mitigating Optimization Collapse for Generalizable Visual Forensics Build Now
A novel transformer architecture that mitigates optimization collapse in visual forensics, achieving state-of-the-art generalization for deepfake detection.
Visual Forensics Mar 25 Code High viability
Hierarchical Spatial-Temporal Graph-Enhanced Model for Map-Matching Build Now
A novel hierarchical graph-enhanced model for improved map-matching performance using self-supervised and supervised learning.
Map Matching Mar 25 Pending High viability
FinToolSyn: A forward synthesis Framework for Financial Tool-Use Dialogue Data with Dynamic Tool Retrieval Build Now
A framework for generating realistic financial tool-use dialogue data to improve LLM capabilities in complex financial scenarios.
LLM Tool Use Mar 25 Code High viability
PCHC: Enabling Preference Conditioned Humanoid Control via Multi-Objective Reinforcement Learning Build Now
A reinforcement learning framework for humanoid robots that allows real-time adaptation of control objectives based on user preferences, enabling diverse and optimized behaviors.
Robotics Control Mar 25 Code High viability
LGEST: Dynamic Spatial-Spectral Expert Routing for Hyperspectral Image Classification Build Now
A novel framework for hyperspectral image classification that dynamically routes spatial-spectral features through expert networks to overcome limitations of existing methods.
Hyperspectral Image Classification Mar 25 Code High viability
HAM: A Training-Free Style Transfer Approach via Heterogeneous Attention Modulation for Diffusion Models Build Now
A training-free method for diffusion model style transfer that preserves content identity and captures complex styles, outperforming existing approaches.
Diffusion Models Mar 25 Code High viability
Minimal Sufficient Representations for Self-interpretable Deep Neural Networks Build Now
A self-interpretable neural network framework that identifies minimal representations to improve prediction accuracy and uncover human-interpretable patterns.
Interpretable AI Mar 25 Code High viability
SemLayer: Semantic-aware Generative Segmentation and Layer Construction for Abstract Icons Build Now
SemLayer reconstructs editable semantic layers from flattened vector icons, enabling advanced design workflows.
Generative Design Tools Mar 25 Code High viability
SpectralSplats: Robust Differentiable Tracking via Spectral Moment Supervision Build Now
SpectralSplats offers robust 3D object tracking by using frequency domain supervision to overcome vanishing gradients in differentiable rendering, enabling recovery from severe misalignments.
3D Vision Mar 25 Code High viability
From Oracle to Noisy Context: Mitigating Contextual Exposure Bias in Speech-LLMs Build Now
This research introduces a novel training framework to significantly improve the robustness of speech-to-LLM models against noisy and error-prone contextual information during inference, leading to more reliable real-world performance.
Speech AI Mar 25 Pending High viability
Lagrangian Relaxation Score-based Generation for Mixed Integer linear Programming Build Now
A generative framework using Lagrangian relaxation and SDEs to produce diverse, high-quality solution candidates for mixed-integer linear programming, outperforming existing ML baselines and achieving competitive optimality with exact solvers at reduced computational cost.
Optimization AI Mar 25 Code High viability
ELITE: Experiential Learning and Intent-Aware Transfer for Self-improving Embodied Agents Build Now
An embodied agent framework that continuously learns from environment interaction to improve task execution and generalize to new tasks.
Embodied Agents Mar 25 Code High viability
COVTrack++: Learning Open-Vocabulary Multi-Object Tracking from Continuous Videos via a Synergistic Paradigm Build Now
A novel open-vocabulary multi-object tracking system that learns from continuous video data and synergistically combines detection and association for improved performance on unseen objects.
Multi-Object Tracking Mar 25 Code High viability
Language-Grounded Multi-Agent Planning for Personalized and Fair Participatory Urban Sensing Build Now
A multi-agent LLM framework that personalizes urban sensing tasks for participants, improving satisfaction and fairness.
Agents Mar 25 Code High viability
CVPD at QIAS 2026: RAG-Guided LLM Reasoning for Al-Mawarith Share Computation and Heir Allocation Build Now
A RAG-powered AI system for accurate and reliable Islamic inheritance calculations and heir allocation, outperforming the state of the art.
Legal AI Mar 25 Code High viability
UW-VOS: A Large-Scale Dataset for Underwater Video Object Segmentation Build Now
A new large-scale dataset and parameter-efficient framework for underwater video object segmentation that significantly outperforms existing methods.
Computer Vision Mar 25 Code High viability
PAC-DP: Personalized Adaptive Clipping for Differentially Private Federated Learning Build Now
A personalized adaptive clipping framework for federated learning that significantly improves accuracy and convergence speed while maintaining differential privacy.
Federated Learning Mar 25 Code High viability
Stochastic Dimension-Free Zeroth-Order Estimator for High-Dimensional and High-Order PINNs Build Now
A novel zeroth-order optimization framework for training high-dimensional Physics-Informed Neural Networks with significantly reduced memory and computational complexity.
Scientific Computing / PINNs Mar 25 Code High viability
Sparse Growing Transformer: Training-Time Sparse Depth Allocation via Progressive Attention Looping Build Now
A novel training framework that dynamically allocates computational depth in Transformers to reduce training FLOPs by up to 19% while improving performance.
LLM Training Mar 25 Code High viability
HGGT: Robust and Flexible 3D Hand Mesh Reconstruction from Uncalibrated Images Build Now
A feed-forward architecture that reconstructs 3D hand meshes and camera poses from uncalibrated images, outperforming the state of the art and generalizing to real-world scenarios.
3D Reconstruction Mar 25 Code High viability
MIRROR: Visual Motion Imitation via Real-time Retargeting and Teleoperation with Parallel Differential Inverse Kinematics Watch
Enabling real-time, safe humanoid robot teleoperation through advanced inverse kinematics and visual pose estimation.
Robotics Mar 25 High viability
From Untamed Black Box to Interpretable Pedagogical Orchestration: The Ensemble of Specialized LLMs Architecture for Adaptive Tutoring Watch
An ensemble of specialized LLMs with a rule-based orchestrator and interpretable student model for reliable, controllable, and efficient adaptive tutoring.
Adaptive Tutoring Mar 25 High viability
CoCR-RAG: Enhancing Retrieval-Augmented Generation in Web Q&A via Concept-oriented Context Reconstruction Build Now
A RAG framework that reconstructs multi-source web documents at a concept level to improve factual consistency and answer quality in Q&A systems.
Retrieval-Augmented Generation (RAG) Mar 25 Code High viability
CAKE: Real-time Action Detection via Motion Distillation and Background-aware Contrastive Learning Build Now
A real-time action detection system that distills motion knowledge into RGB models, achieving state-of-the-art performance on a single CPU.
Computer Vision Mar 25 Code High viability
Can we generate portable representations for clinical time series data using LLMs? Build Now
Leverage LLMs to create portable patient embeddings from clinical time series data, enabling scalable deployment of predictive models across hospitals with reduced retraining.
Medical AI Mar 25 Code High viability
Diet Your LLM: Dimension-wise Global Pruning of LLMs via Merging Task-specific Importance Score Build Now
A training-free method to efficiently prune LLMs by merging task-specific importance scores, improving accuracy without retraining.
LLM Optimization Mar 25 Pending High viability
SafeFlow: Real-Time Text-Driven Humanoid Whole-Body Control via Physics-Guided Rectified Flow and Selective Safety Gating Build Now
A real-time text-driven humanoid control system that generates physically feasible and safe motion trajectories by integrating physics-guided generation with a multi-stage safety gate.
Humanoid Robotics Mar 25 Code High viability
SilLang: Improving Gait Recognition with Silhouette Language Encoding Build Now
Leveraging Large Language Models to improve pedestrian gait recognition by encoding binary silhouettes into a language-like representation.
Biometric AI Mar 25 Code High viability
HyDRA: Hybrid Domain-Aware Robust Architecture for Heterogeneous Collaborative Perception Build Now
A novel architecture for collaborative perception that robustly handles agent heterogeneity without retraining, enabling scalable and cost-effective multi-agent systems.
Collaborative Perception Mar 25 Code High viability
SLAT-Phys: Fast Material Property Field Prediction from Structured 3D Latents Build Now
Predict material properties of 3D assets from a single image 120x faster than existing methods, enabling real-time physics simulation and digital twins.
3D Asset Material Property Prediction Mar 25 Code High viability
Grounding Arabic LLMs in the Doha Historical Dictionary: Retrieval-Augmented Understanding of Quran and Hadith Build Now
A retrieval-augmented generation framework that grounds Arabic LLMs in historical lexicographic data to significantly improve understanding of religious texts.
LLM Grounding Mar 25 Pending High viability
The Price Reversal Phenomenon: When Cheaper Reasoning Models End Up Costing More Build Now
A tool that predicts and optimizes LLM inference costs by analyzing token consumption, addressing the 'price reversal phenomenon' where cheaper listed models can cost more.
LLM Cost Optimization Mar 25 Code High viability
Policy-Guided Threat Hunting: An LLM enabled Framework with Splunk SOC Triage Build Now
An LLM-powered framework integrated with Splunk to automate and prioritize threat hunting for overwhelmed SOC analysts.
Cybersecurity AI Mar 25 Code High viability
MonoSIM: An open source SIL framework for Ackermann Vehicular Systems with Monocular Vision Build Now
An open-source simulation platform for developing and testing autonomous vehicle control systems using monocular vision.
Autonomous Driving Simulation Mar 25 Pending High viability
PointRFT: Explicit Reinforcement Fine-tuning for Point Cloud Few-shot Learning Build Now
A novel reinforcement fine-tuning method for 3D point cloud models that significantly improves performance in data-scarce scenarios.
3D Point Cloud Learning Mar 25 Code High viability
SynMVCrowd: A Large Synthetic Benchmark for Multi-view Crowd Counting and Localization Build Now
A large synthetic benchmark and baseline methods for more practical multi-view crowd counting and localization, advancing research towards real-world applications.
Computer Vision Mar 25 Pending High viability
From AI Assistant to AI Scientist: Autonomous Discovery of LLM-RL Algorithms with LLM Agents Build Now
Automate the discovery of improved policy optimization algorithms for language models, significantly boosting performance on complex reasoning tasks.
LLM Algorithm Discovery Mar 25 Code High viability
Argument Mining as a Text-to-Text Generation Task Build Now
A text-to-text generation approach for argument mining that simplifies the process and achieves state-of-the-art results.
Argument Mining Mar 25 Code High viability
Variable-Length Audio Fingerprinting Build Now
A novel deep learning method for audio fingerprinting that handles variable-length audio, outperforming the existing state of the art on real-world datasets.
Audio AI Mar 25 Code High viability
OmniACBench: A Benchmark for Evaluating Context-Grounded Acoustic Control in Omni-Modal Models Build Now
A benchmark and analysis tool to evaluate and improve the context-aware speech generation capabilities of omni-modal AI models.
Omni-modal AI Mar 25 Code High viability
Dialogue to Question Generation for Evidence-based Medical Guideline Agent Development Build Now
An AI assistant that generates targeted, evidence-based questions during physician-patient encounters to improve the implementation of medical guidelines.
Medical AI Mar 25 Code High viability
Revealing Multi-View Hallucination in Large Vision-Language Models Build Now
A novel decoding technique that significantly reduces multi-view hallucination in vision-language models by suppressing visual interference.
Vision-Language Models Mar 25 Code High viability
DP^2-VL: Private Photo Dataset Protection by Data Poisoning for Vision-Language Models Build Now
A framework to protect private photos from being exploited by vision-language models through data poisoning, preventing identity-affiliation leakage.
Vision-Language Models Mar 25 Code High viability
DepthArb: Training-Free Depth-Arbitrated Generation for Occlusion-Robust Image Synthesis Build Now
A training-free framework that improves occlusion accuracy in text-to-image generation by arbitrating object attention, enhancing compositional capabilities of existing diffusion models.
Generative Image Mar 25 Code High viability
DecepGPT: Schema-Driven Deception Detection with Multicultural Datasets and Robust Multimodal Learning Build Now
A multimodal deception detection system that provides auditable reports and robust cross-cultural generalization, powered by novel reasoning datasets and advanced representation learning.
Deception Detection Mar 25 Code High viability
Attention-aware Inference Optimizations for Large Vision-Language Models with Memory-efficient Decoding Build Now
Optimize large vision-language models for faster and more memory-efficient inference, enabling longer contexts and higher throughput.
LLM Inference Optimization Mar 25 Code High viability
Self-Distillation for Multi-Token Prediction Build Now
Accelerate LLM inference by up to 220% with a self-distillation method that improves multi-token prediction accuracy and efficiency.
LLM Inference Optimization Mar 25 Code High viability
Robust Multilingual Text-to-Pictogram Mapping for Scalable Reading Rehabilitation Watch
An AI-powered multilingual interface that automatically enhances text with contextually relevant pictograms to support reading comprehension for children with special educational needs.
Educational AI Mar 25
VOLMO: Versatile and Open Large Models for Ophthalmology Watch
A framework for building and deploying ophthalmology-specific multimodal large language models that outperform existing general and medical models.
Medical AI Mar 25
The Stochastic Gap: A Markovian Framework for Pre-Deployment Reliability and Oversight-Cost Auditing in Agentic Artificial Intelligence Watch
A Markovian framework to audit the reliability and oversight costs of agentic AI in enterprise workflows.
Agentic AI Mar 25 Code
Evaluating Chunking Strategies For Retrieval-Augmented Generation in Oil and Gas Enterprise Documents Watch
Optimizing document chunking for Retrieval-Augmented Generation in specialized enterprise domains like oil and gas to improve information retrieval accuracy and reduce computational costs.
RAG Optimization Mar 25 Code
Analysing the Safety Pitfalls of Steering Vectors Watch
This research uncovers how steering LLM behavior can inadvertently increase vulnerability to jailbreak attacks, highlighting a critical trade-off between controllability and safety.
LLM Safety Mar 25 Code
Representation Learning to Study Temporal Dynamics in Tutorial Scaffolding Watch
A new method to analyze and measure adaptive scaffolding in tutoring dialogues by aligning semantic content, applicable to both human and AI tutors.
Educational AI Mar 25 Code
Towards Safe Learning-Based Non-Linear Model Predictive Control through Recurrent Neural Network Modeling Watch
A novel sequential neural policy for safer and more efficient non-linear model predictive control, reducing computational burden and improving feasibility.
Robotics Control Mar 25 Code
Conformalized Transfer Learning for Li-ion Battery State of Health Forecasting under Manufacturing and Usage Variability Watch
An uncertainty-aware transfer learning framework for accurate and reliable lithium-ion battery state-of-health forecasting.
Battery AI Mar 25 Code
Learning Response-Statistic Shifts and Parametric Roll Episodes from Wave–Vessel Time Series via LSTM Functional Models Watch
A data-driven surrogate using LSTMs to predict and analyze critical ship roll instabilities from wave data.
Maritime AI Mar 25 Code
PINGALA: Prosody-Aware Decoding for Sanskrit Poetry Generation Watch
A decoding approach for Sanskrit poetry generation that improves semantic coherence and metrical adherence by segmenting verses and using phonetically aware transliteration.
LLM Fine-tuning Mar 25
Real Talk, Virtual Faces: A Formal Concept Analysis of Personality and Sentiment in Influencer Audiences Watch
A framework for analyzing the co-occurrence of sentiment, personality, and topics in online discourse to understand audience reactions to virtual influencers.
Social Media Analysis Mar 25 Code
Exploring How Fair Model Representations Relate to Fair Recommendations Watch
Develops novel methods to measure and improve fairness in recommender systems by analyzing recommendation parity directly, rather than relying on representation-level proxies.
Recommender Systems Mar 25 Code
Causal Transfer in Medical Image Analysis Watch
A survey proposing Causal Transfer Learning to build more robust and generalizable AI models for medical image analysis, addressing domain shift issues.
Medical AI Mar 25 Code
Towards Reward Modeling for AI Tutors in Math Mistake Remediation Watch
Develops a specialized reward model for AI math tutors to improve pedagogical quality and mistake remediation.
AI Tutors Mar 25
A Sensorless, Inherently Compliant Anthropomorphic Musculoskeletal Hand Driven by Electrohydraulic Actuators Watch
A novel musculoskeletal robotic hand design that uses electrohydraulic actuators for inherent compliance and self-sensing, eliminating the need for external sensors for safe manipulation.
Robotics Mar 25 Code
Large Language Model Guided Incentive Aware Reward Design for Cooperative Multi-Agent Reinforcement Learning Watch
Automated reward design for cooperative multi-agent systems using LLMs to improve coordination and task performance.
Multi-Agent RL Mar 25 Code
Bridging Biological Hearing and Neuromorphic Computing: End-to-End Time-Domain Audio Signal Processing with Reservoir Computing Watch
A novel reservoir computing approach simplifies audio signal processing for real-time, energy-efficient speech analysis in embedded systems.
Neuromorphic Audio Processing Mar 25 Code
Accelerating Diffusion-based Video Editing via Heterogeneous Caching: Beyond Full Computing at Sampled Denoising Timestep Watch
A training-free framework to accelerate diffusion-based video editing by intelligently caching context tokens, reducing redundant computations without sacrificing quality.
Diffusion Video Editing Acceleration Mar 25
Embracing Heteroscedasticity for Probabilistic Time Series Forecasting Watch
A probabilistic time series forecasting framework that explicitly models time-varying variance for more robust predictions and uncertainty quantification.
Time Series Forecasting Mar 25 Code
Variation is the Norm: Embracing Sociolinguistics in NLP Watch
A framework to integrate sociolinguistic variation into NLP models, improving robustness and performance on evolving languages.
Sociolinguistics in NLP Mar 25 Code
Towards Remote Attestation of Microarchitectural Attacks: The Case of Rowhammer Watch
A remote attestation protocol to detect Rowhammer attacks using commodity hardware features and TPMs.
Hardware Security Mar 25 Code
Heuristic-inspired Reasoning Priors Facilitate Data-Efficient Referring Object Detection Watch
A framework that injects heuristic reasoning priors to improve data efficiency in referring object detection for low-data environments.
Referring Object Detection Mar 25 Code
A convergent Plug-and-Play Majorization-Minimization algorithm for Poisson inverse problems Watch
A plug-and-play algorithm for Poisson inverse problems that leverages pre-trained denoisers for improved performance in medical imaging applications.
Medical AI Mar 25 Code
Efficient Controller Learning from Human Preferences and Numerical Data Via Multi-Modal Surrogate Models Watch
A framework for efficiently tuning control policies by combining numerical data with human preferences using multi-modal surrogate models.
Control Systems Optimization Mar 25
Likelihood hacking in probabilistic program synthesis Watch
A language-level safety constraint for probabilistic programming languages that prevents models from generating invalid programs, demonstrated with a modification to Stan.
Probabilistic Programming Mar 25
Granular Ball Guided Stable Latent Domain Discovery for Domain-General Crowd Counting Watch
A novel framework for more stable and accurate crowd counting across different environments by discovering and leveraging latent domains.
Computer Vision Mar 25 Code
Causality-Driven Disentangled Representation Learning in Multiplex Graphs Watch
A causal inference framework to disentangle shared and private information in multiplex graphs for improved representation learning.
Graph Representation Learning Mar 25 Code
ConceptKT: A Benchmark for Concept-Level Deficiency Prediction in Knowledge Tracing Watch
A new benchmark and method for diagnosing specific conceptual misunderstandings in students, going beyond simple correctness prediction in educational AI.
Educational AI Mar 25 Code
i-IF-Learn: Iterative Feature Selection and Unsupervised Learning for High-Dimensional Complex Data Watch
An iterative unsupervised framework that jointly performs feature selection and clustering for high-dimensional complex data, outperforming classical and deep baselines.
Unsupervised Feature Selection Mar 25 Code
DB SwinT: A Dual-Branch Swin Transformer Network for Road Extraction in Optical Remote Sensing Imagery Watch
A dual-branch Swin Transformer network for more accurate road extraction in optical remote sensing imagery.
Computer Vision Mar 25 Code
Robust Distributed Cooperative Path-Following and Local Replanning for Multi-UAVs Under Differentiated Low-Altitude Paths Watch
Enables multiple drones to cooperatively follow complex 3D paths in low-altitude airspace, overcoming disturbances and obstacles with real-time replanning.
Robotics Mar 25 Code
From Pixels to Digital Agents: An Empirical Study on the Taxonomy and Technological Trends of Reinforcement Learning Environments Watch
This research provides a data-driven taxonomy and analysis of reinforcement learning environments, identifying a paradigm shift towards LLM-driven agents and domain-specific generalization to guide the design of next-generation simulators.
Reinforcement Learning Environments Mar 25 Code
GRMLR: Knowledge-Enhanced Small-Data Learning for Deep-Sea Cold Seep Stage Inference Watch
A knowledge-enhanced machine learning model for accurate deep-sea cold seep stage inference using microbial data, reducing reliance on costly visual surveys.
Bioinformatics AI Mar 25 Code
Event-Driven Proactive Assistive Manipulation with Grounded Vision-Language Planning Watch
A robot system that proactively assists human manipulation by inferring goals from observed state changes, rather than waiting for explicit instructions.
Robotics Mar 25 Code
ORACLE: Orchestrate NPC Daily Activities using Contrastive Learning with Transformer-CVAE Watch
Generate realistic and varied NPC daily activity plans for immersive digital environments using a novel generative model.
NPC Behavior Generation Mar 25 Code
Comparing Developer and LLM Biases in Code Evaluation Ignore
A framework to evaluate LLM judges for code applications by identifying systematic biases and misalignment with human preferences.
LLM Evaluation Mar 25
Retrieval Improvements Do Not Guarantee Better Answers: A Study of RAG for AI Policy QA Ignore
This research investigates the limitations of RAG systems in policy analysis, finding that component improvements don't guarantee better answers.
RAG Systems Mar 25
POLY-SIM: Polyglot Speaker Identification with Missing Modality Grand Challenge 2026 Evaluation Plan Ignore
Developing robust multimodal speaker identification systems that perform well even with missing visual data and across different languages.
Multimodal AI Mar 25 Code
Infrastructure for Valuable, Tradable, and Verifiable Agent Memory Ignore
A system that makes autonomous agent memories tradable assets by binding them to verifiable computational provenance and creating a market layer for their exchange.
Agent Infrastructure Mar 25
The role of spatial context and multitask learning in the detection of organic and conventional farming systems based on Sentinel-2 time series Ignore
A Vision Transformer model leverages Sentinel-2 time series and spatial context to discriminate between organic and conventional farming systems, with varying accuracy across crop types.
Agricultural AI Mar 25 Code
A Sociolinguistic Analysis of Automatic Speech Recognition Bias in Newcastle English Ignore
This research analyzes bias in Automatic Speech Recognition systems for Newcastle English, revealing socially patterned errors and advocating for dialect-aware development.
ASR Bias Analysis Mar 25 Code
No Single Metric Tells the Whole Story: A Multi-Dimensional Evaluation Framework for Uncertainty Attributions Ignore
A new framework for evaluating AI uncertainty attribution methods to enable more reliable and comparable XAI development.
Explainable AI (XAI) Mar 25 Code
Design, Modelling and Characterisation of a Miniature Fibre-Reinforced Soft Bending Actuator for Endoluminal Interventions Ignore
Develops a miniature, fiber-reinforced soft bending actuator for endoluminal robotic interventions, demonstrating significant bending angles and structural integrity.
Robotics Mar 25 Code
Integrating Causal Machine Learning into Clinical Decision Support Systems: Insights from Literature and Practice Ignore
Develops design principles for causal machine learning-powered clinical decision support systems to enhance trust and collaboration.
Medical AI Mar 25
Enes Causal Discovery Ignore
A mixture-of-experts architecture for causal discovery from observational data, aiming to overcome limitations of existing methods.
Causal Discovery Mar 25 Code
What and When to Learn: CURriculum Ranking Loss for Large-Scale Speaker Verification Ignore
A novel curriculum learning loss function for large-scale speaker verification that significantly improves accuracy by adaptively ranking sample difficulty.
Speaker Verification Mar 25
Continuous-Time Learning of Probability Distributions: A Case Study in a Digital Trial of Young Children with Type 1 Diabetes Ignore
A probabilistic framework models continuous-time evolution of glucose distributions for improved diabetes monitoring and treatment analysis.
Medical AI Mar 25
Teacher-Student Diffusion Model for Text-Driven 3D Hand Motion Generation Ignore
A teacher-student diffusion model for generating realistic 3D hand motions from text, improving VR and robotics applications.
3D Motion Generation Mar 25
Federated fairness-aware classification under differential privacy Ignore
A novel algorithm for federated classification that simultaneously ensures privacy and fairness, with theoretical guarantees and experimental validation.
Privacy-Preserving ML Mar 25 Code
Improving Lean4 Autoformalization via Cycle Consistency Fine-tuning Ignore
This research explores a novel reinforcement learning approach for translating natural language mathematics into formal proof language, demonstrating improved performance over supervised methods.
Formalization AI Mar 25
A Neuro-Symbolic System for Interpretable Multimodal Physiological Signals Integration in Human Fatigue Detection Ignore
A neuro-symbolic system integrates physiological signals for interpretable human fatigue detection, offering insights into model reasoning.
Human Fatigue Detection Mar 25
Toward Generalist Neural Motion Planners for Robotic Manipulators: Challenges and Opportunities Ignore
Developing generalist neural motion planners to overcome limitations in cluttered robotic manipulation environments.
Robotics Motion Planning Mar 25 Code
CGRL: Causal-Guided Representation Learning for Graph Out-of-Distribution Generalization Ignore
A novel approach to improve Graph Neural Network generalization on out-of-distribution data by integrating causal representation learning and loss replacement.
Graph Neural Networks Mar 25 Code
A Large-Scale Study of Telegram Bots Ignore
A large-scale study and dataset of Telegram bots to identify and combat illicit activities, providing insights for content moderators and researchers.
Bot Analysis Mar 25 Code
AMIF: Authorizable Medical Image Fusion Model with Built-in Authentication Ignore
A medical image fusion model that embeds copyright identifiers and requires authentication for high-quality outputs, protecting intellectual property.
Medical AI Mar 25
Cost-Sensitive Neighborhood Aggregation for Heterophilous Graphs: When Does Per-Edge Routing Help? Ignore
A GNN layer that intelligently routes messages based on edge type to improve performance on specific types of heterophilous graphs.
Graph Neural Networks Mar 25 Pending
Software Supply Chain Smells: Lightweight Analysis for Secure Dependency Management Ignore
A tool to detect structural indicators of security risks in software dependencies.
Software Supply Chain Security Mar 25
Semantic Alignment across Ancient Egyptian Language Stages via Normalization-Aware Multitask Learning Ignore
A novel multitask learning approach for semantic alignment across historical language stages, providing a baseline and guidance for modeling under data constraints.
NLP - Historical Languages Mar 25 Code
Semantic Centroids and Hierarchical Density-Based Clustering for Cross-Document Software Coreference Resolution Ignore
A system for resolving inconsistent software mentions across scientific documents using semantic embeddings and density-based clustering.
NLP Coreference Resolution Mar 25
DVM: Real-Time Kernel Generation for Dynamic AI Models Ignore
A real-time compiler for dynamic AI models that significantly speeds up compilation and improves model efficiency.
AI Model Compilation Mar 25
Where Do Your Citations Come From? Citation-Constellation: A Free, Open-Source, No-Code, and Auditable Tool for Citation Network Decomposition with Complementary BARON and HEROCON Scores Ignore
A no-code tool for analyzing citation networks to understand the origin of scholarly influence.
Bibliometric Analysis Mar 25
Invisible Threats from Model Context Protocol: Generating Stealthy Injection Payload via Tree-based Adaptive Search Ignore
A novel black-box attack method that generates stealthy injection payloads to compromise LLM agents by treating payload generation as a tree-structured search problem.
LLM Security Mar 25
A Deep Dive into Scaling RL for Code Generation with Synthetic Data and Curricula Ignore
A novel synthetic data generation pipeline for Reinforcement Learning to improve large language models for code and math tasks.
LLM Training Mar 25
Mixed-signal implementation of feedback-control optimizer for single-layer Spiking Neural Networks Ignore
Enabling on-chip learning for neuromorphic systems with a feedback-control optimizer.
Neuromorphic Computing Mar 25 Code
Bridging the Evaluation Gap: Standardized Benchmarks for Multi-Objective Search Ignore
A standardized benchmark suite for multi-objective search to enable robust and reproducible cross-study comparisons.
Search Algorithms Mar 25 Code
The impact of sensor placement on graph-neural-network-based leakage detection Ignore
Optimize sensor placement for graph neural networks to significantly improve leakage detection in water distribution networks.
Water Leakage Detection Mar 25 Code
Enhanced Mycelium of Thought (EMoT): A Bio-Inspired Hierarchical Reasoning Architecture with Strategic Dormancy and Mnemonic Encoding Ignore
A bio-inspired hierarchical reasoning framework for LLMs that uses strategic dormancy and mnemonic encoding to improve complex, multi-domain problem-solving.
LLM Reasoning Architectures Mar 25 Code
MoE-Sieve: Routing-Guided LoRA for Efficient MoE Fine-Tuning Ignore
A routing-guided framework for LoRA fine-tuning of Mixture-of-Experts models that significantly reduces trainable parameters and training time by focusing on the most activated experts.
LLM Fine-tuning Mar 25
Decompose and Transfer: CoT-Prompting Enhanced Alignment for Open-Vocabulary Temporal Action Detection Ignore
A framework for open-vocabulary temporal action detection that uses LLM reasoning to decompose actions into phases for better generalization to unseen actions.
Computer Vision Mar 25 Code
Forensic Implications of Localized AI: Artifact Analysis of Ollama, LM Studio, and llama.cpp Ignore
This research provides forensic investigators with the tools and methodologies to uncover digital evidence from local LLM runners, addressing a critical blind spot in digital investigations.
Digital Forensics AI Mar 25
Kirchhoff-Inspired Neural Networks for Evolving High-Order Perception Ignore
A novel neural network architecture inspired by Kirchhoff's laws for improved data representation and physical consistency in tasks like PDE solving and image classification.
Physics-Informed Neural Networks Mar 25
High-Fidelity Face Content Recovery via Tamper-Resilient Versatile Watermarking Ignore
A watermarking framework for copyright protection, manipulation localization, and high-fidelity face content recovery to combat deepfakes.
Digital Watermarking Mar 25
An Empirical Analysis of Google Play Data Safety Disclosures: A Consistency Study of Privacy Indicators in Mobile Gaming Apps Ignore
This research analyzes the consistency of Google Play's Data Safety disclosures against actual app behavior, revealing significant inconsistencies in privacy reporting that necessitate improved validation mechanisms.
App Privacy Analysis Mar 25 Code
Completeness of Unbounded Best-First Minimax and Descent Minimax Ignore
This paper theoretically analyzes and experimentally validates improvements to minimax search algorithms for two-player perfect information games, enhancing their ability to find winning strategies.
Game AI Mar 25
Trust Region Constrained Bayesian Optimization with Penalized Constraint Handling Ignore
A theoretical Bayesian optimization method for high-dimensional constrained problems.
Optimization Mar 25
The Free-Market Algorithm: Self-Organizing Optimization for Open-Ended Complex Systems Ignore
A novel metaheuristic inspired by free-market economics for open-ended complex system optimization.
Optimization Algorithms Mar 25
Why Does Self-Distillation (Sometimes) Degrade the Reasoning Capability of LLMs? Ignore
This paper investigates why self-distillation can harm LLM reasoning, particularly in mathematical tasks, by suppressing uncertainty expression.
LLM Reasoning Mar 25
Neural Network Models for Contextual Regression Ignore
A novel neural network architecture for contextual regression that improves efficiency and interpretability by separating context identification from context-specific regression.
Regression Models Mar 25
On the Use of Bagging for Local Intrinsic Dimensionality Estimation Ignore
A theoretical framework for improving local intrinsic dimensionality estimation using bagging to reduce variance and mean squared error.
Data Mining Mar 25
Evidence of an Emergent "Self" in Continual Robot Learning Ignore
This research explores a theoretical framework for identifying emergent 'self' concepts in continually learning robots by analyzing invariant subnetworks.
Robotics AI Mar 25
The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents Ignore
This research explores the coordination challenges faced by multiple LLM-based code agents when implementing shared code components, highlighting the critical role of detailed specifications in achieving integration accuracy.
Agents Mar 25
Optimizing Multilingual LLMs via Federated Learning: A Study of Client Language Composition Ignore
This paper explores optimizing multilingual LLMs using federated learning by analyzing the impact of client language composition on model performance and fairness.
LLM Training Mar 25
Stance Labels Fail When They Matter Most: The Projection Problem in Stance Detection Ignore
This research identifies a fundamental flaw in how stance detection is currently performed, showing that existing methods fail when attitudes are complex and multi-dimensional.
NLP - Stance Detection Mar 25
Identification of NMF by choosing maximum-volume basis vectors Ignore
A new NMF framework that makes basis vectors as distinct as possible to improve interpretability and handle highly mixed data.
Matrix Factorization Mar 25 Code
A visual observation on the geometry of UMAP projections of the difference vectors of antonym and synonym word pair embeddings Ignore
Analyzing the geometric properties of word embeddings to understand antonym relationships.
LLM Embeddings Analysis Mar 25 Pending
Equivariant Filter Transformations for Consistent and Efficient Visual–Inertial Navigation Ignore
A theoretical framework for improving visual-inertial navigation through equivariant filter transformations.
Robotics and Navigation Mar 25
Combi-CAM: A Novel Multi-Layer Approach for Explainable Image Geolocalization Ignore
Enhancing the explainability of image geolocalization models by combining activation maps from multiple network layers.
Computer Vision Mar 25
Toward a Multi-Layer ML-Based Security Framework for Industrial IoT Ignore
A lightweight, ML-based security framework for Industrial IoT that predicts and mitigates network condition impacts on trust convergence.
Industrial IoT Security Mar 25
Towards Effective Experiential Learning: Dual Guidance for Utilization and Internalization Ignore
A new framework for LLM training that aims to improve reasoning by better utilizing and internalizing experience during reinforcement learning.
LLM Training Mar 25
Understanding the Challenges in Iterative Generative Optimization with LLMs Ignore
This paper identifies key challenges in setting up iterative generative optimization loops with LLMs, offering guidance for practitioners but not presenting a ready-to-deploy solution.
LLM Agents Mar 25
Transcending Classical Neural Network Boundaries: A Quantum-Classical Synergistic Paradigm for Seismic Data Processing Ignore
A quantum-classical generative adversarial network for seismic data processing that leverages quantum mechanics to overcome the representational limitations of traditional neural networks.
Quantum AI Mar 25 Code
Wireless communication empowers online scheduling of partially-observable transportation multi-robot systems in a smart factory Ignore
A novel framework integrates wireless communication with multi-robot task assignment and route scheduling for enhanced efficiency in smart factories.
Robotics & Logistics Mar 25
On Gossip Algorithms for Machine Learning with Pairwise Objectives Ignore
Develops theoretical convergence guarantees for gossip algorithms applied to pairwise machine learning objectives in distributed sensor networks.
Distributed Learning Mar 25
Optimal Variance-Dependent Regret Bounds for Infinite-Horizon MDPs Ignore
Develops theoretical optimal regret bounds for infinite-horizon reinforcement learning problems.
Reinforcement Learning Theory Mar 25
Elements of Conformal Prediction for Statisticians Ignore
A theoretical overview of conformal prediction methods for statistical inference.
Statistical Inference Mar 25
From Liar Paradox to Incongruent Sets: A Normal Form for Self-Reference Ignore
A formal framework for understanding self-reference in language, offering a structural basis for semantic knowledge.
Formal Semantics Mar 25
Uniform Laws of Large Numbers in Product Spaces Ignore
This paper theoretically investigates uniform convergence phenomena in product spaces, extending Vapnik–Chervonenkis theory with a focus on linear VC dimension.
Theoretical ML Mar 25