SAMA: Factorized Semantic Anchoring and Motion Alignment for Instruction-Guided Video Editing Build Now
A tool for fast and semantic control over video editing using natural language instructions.
Video Editing Mar 19 Code High viability
SEAR: Simple and Efficient Adaptation of Visual Geometric Transformers for RGB+Thermal 3D Reconstruction Build Now
SEAR enhances 3D reconstruction by integrating RGB and thermal imaging, optimizing visual geometric transformers for practical applications.
3D Reconstruction Mar 19 Pending High viability
HORNet: Task-Guided Frame Selection for Video Question Answering with Vision-Language Models Build Now
HORNet is a lightweight frame selection policy that significantly reduces video processing time and improves answer quality for vision-language models in video question answering tasks.
Video Question Answering Mar 19 Pending High viability
Click-to-Ask: An AI Live Streaming Assistant with Offline Copywriting and Online Interactive QA Build Now
An AI assistant that automates product copywriting and provides real-time Q&A for live streamers, boosting sales and engagement.
Live Streaming AI Mar 19 Code High viability
Agentic Flow Steering and Parallel Rollout Search for Spatially Grounded Text-to-Image Generation Build Now
A closed-loop framework that uses a VLM critic to steer text-to-image generation for improved spatial accuracy and state-of-the-art results.
Text-to-Image Generation Mar 19 Code High viability
Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding Build Now
Leverage implicit 3D knowledge from video generation models to enhance multimodal LLMs for spatial reasoning and embodied tasks.
3D Scene Understanding Mar 19 Pending High viability
Matryoshka Gaussian Splatting Build Now
A novel training framework for 3D Gaussian Splatting that enables continuous level-of-detail rendering from a single model without sacrificing quality, offering a smooth speed-quality trade-off.
3D Rendering Mar 19 Code High viability
Not All Features Are Created Equal: A Mechanistic Study of Vision-Language-Action Models Build Now
This research provides a mechanistic understanding of how Vision-Language-Action models generate robot actions, revealing key insights into their internal representations and offering tools for interactive exploration.
Robotics Mar 19 Code High viability
Cubic Discrete Diffusion: Discrete Visual Generation on High-Dimensional Representation Tokens Build Now
A novel discrete diffusion model for high-dimensional visual generation that unifies understanding and generation tasks, with code available.
Generative Vision Mar 19 Pending High viability
MonoArt: Progressive Structural Reasoning for Monocular Articulated 3D Reconstruction Build Now
A unified framework for stable and efficient articulated 3D object reconstruction from single images, outperforming existing methods in accuracy and speed.
3D Reconstruction Mar 19 Code High viability
NavTrust: Benchmarking Trustworthiness for Embodied Navigation Build Now
A benchmark and mitigation strategies for building more trustworthy embodied navigation agents that can withstand real-world corruptions.
Embodied AI Mar 19 Code High viability
Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer Build Now
A novel diffusion-based tokenizer bridges semantic understanding and precise kinematic control for realistic motion generation.
Generative Motion Mar 19 Pending High viability
Under One Sun: Multi-Object Generative Perception of Materials and Illumination Build Now
A generative inverse rendering method that disentangles object materials and illumination from a single image, enabling realistic scene reconstruction.
Generative Perception Mar 19 Code High viability
FinTradeBench: A Financial Reasoning Benchmark for LLMs Build Now
A new benchmark for evaluating LLM financial reasoning, integrating company fundamentals and trading signals, with available code and a clear market need.
Financial AI Mar 19 Code High viability
EffectErase: Joint Video Object Removal and Insertion for High-Quality Effect Erasing Build Now
A novel method and dataset for high-quality video object removal that also eliminates visual effects like shadows and reflections.
Video Editing Mar 19 Code High viability
Online Learning and Equilibrium Computation with Ranking Feedback Build Now
Develops online learning algorithms that use ranking feedback instead of numerical utilities, enabling applications in human-in-the-loop systems and privacy-sensitive scenarios, with demonstrated effectiveness in LLM routing.
Online Learning Mar 19 Code High viability
Nemotron-Cascade 2: Post-Training LLMs with Cascade RL and Multi-Domain On-Policy Distillation Build Now
A compact, high-performance LLM with strong reasoning and agentic capabilities, trained using novel cascade reinforcement learning and distillation techniques, with open weights and training data.
LLM Training Mar 19 Code High viability
DriveTok: 3D Driving Scene Tokenization for Unified Multi-View Reconstruction and Understanding Build Now
DriveTok provides a unified 3D scene tokenization method for efficient multi-view reconstruction and understanding in autonomous driving.
3D Computer Vision Mar 19 Pending High viability
Rethinking Vector Field Learning for Generative Segmentation Build Now
A novel vector field reshaping strategy for diffusion models significantly improves generative segmentation performance by addressing gradient vanishing and class separation issues.
Generative Segmentation Mar 19 Code High viability
LVOmniBench: Pioneering Long Audio-Video Understanding Evaluation for Omnimodal LLMs Build Now
A new benchmark and dataset for evaluating large language models on long-form audio-visual understanding, addressing a critical gap in current AI capabilities.
Omnimodal LLM Evaluation Mar 19 Code High viability
DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising Build Now
Generate semantically grounded, part-aware 3D objects from text by jointly modeling part geometry, appearance, and inter-part relationships.
Generative 3D Mar 19 Code High viability
Do VLMs Need Vision Transformers? Evaluating State Space Models as Vision Encoders Build Now
This research proposes state space models as a more efficient and competitive alternative to Transformer-based vision encoders for large vision-language models, offering improved performance and robustness at a smaller scale.
Vision-Language Models Mar 19 Pending High viability
RPiAE: A Representation-Pivoted Autoencoder Enhancing Both Image Generation and Editing Build Now
A novel autoencoder architecture that significantly improves both image generation and editing quality by better aligning latent representations with pretrained visual models.
Image Generation and Editing Mar 19 Code High viability
Tinted Frames: Question Framing Blinds Vision-Language Models Build Now
A prompt-tuning method that improves vision-language model performance by addressing framing-induced visual attention biases.
Vision-Language Models Mar 19 Code High viability
FASTER: Rethinking Real-Time Flow VLAs Build Now
FASTER enables real-time responsiveness for vision-language-action models in physical world applications by optimizing action sampling and reaction latency.
Robotics Mar 19 Code High viability
Reconstruction Matters: Learning Geometry-Aligned BEV Representation through 3D Gaussian Splatting Build Now
A novel framework for autonomous driving perception that leverages 3D Gaussian Splatting to learn geometrically precise Bird's-Eye-View representations, outperforming existing methods.
Autonomous Driving Perception Mar 19 Code High viability
OS-Themis: A Scalable Critic Framework for Generalist GUI Rewards Build Now
A scalable multi-agent critic framework that significantly improves GUI agent robustness and performance by decomposing trajectories into verifiable milestones for accurate reward generation.
Agents Mar 19 Code High viability
MIDST Challenge at SaTML 2025: Membership Inference over Diffusion-models-based Synthetic Tabular data Build Now
Develops novel membership inference attacks to quantify and improve the privacy of synthetic tabular data generated by diffusion models.
Privacy AI Mar 19 Pending High viability
Few-shot Acoustic Synthesis with Multimodal Flow Matching Build Now
A probabilistic few-shot acoustic synthesis method that generates spatially continuous sound rendering for immersive environments with minimal data.
Audio Generation Mar 19 Code High viability
SOL-ExecBench: Speed-of-Light Benchmarking for Real-World GPU Kernels Against Hardware Limits Build Now
A new benchmark and evaluation framework for GPU kernel optimization that measures performance against hardware limits, enabling faster development of efficient AI models.
GPU Kernel Optimization Mar 19 Code High viability
ADMM-Based Distributed MPC with Control Barrier Functions for Safe Multi-Robot Quadrupedal Locomotion Build Now
A decentralized MPC framework with CBF constraints for safe, real-time multi-robot quadrupedal locomotion, demonstrated on hardware.
Robotics Control Mar 19 Code High viability
ARIADNE: A Perception-Reasoning Synergy Framework for Trustworthy Coronary Angiography Analysis Build Now
A framework for trustworthy coronary angiography analysis that uses preference-aligned perception and RL-based reasoning to achieve topologically coherent vessel segmentation and reliable stenosis detection.
Medical AI Mar 19 Code High viability
Meanings and Measurements: Multi-Agent Probabilistic Grounding for Vision-Language Navigation Build Now
A probabilistic agentic framework that grounds complex natural language commands into actionable robot decisions in 3D space, outperforming existing VLM approaches.
Robotics Mar 19 Code High viability
cuGenOpt: A GPU-Accelerated General-Purpose Metaheuristic Framework for Combinatorial Optimization Build Now
A GPU-accelerated framework for combinatorial optimization that uses LLMs to convert natural language problem descriptions into executable solver code.
Combinatorial Optimization Mar 19 Pending High viability
Adaptive Auxiliary Prompt Blending for Target-Faithful Diffusion Generation Build Now
A training-free framework that uses adaptive prompt blending to improve the accuracy and fidelity of text-to-image generation for rare concepts and complex edits.
Generative Image Mar 19 Code High viability
ADAPT: Attention Driven Adaptive Prompt Scheduling and InTerpolating Orthogonal Complements for Rare Concepts Generation Build Now
A training-free framework that deterministically schedules prompts for text-to-image models to generate rare compositional concepts with precise control.
Generative Image Mar 19 Code High viability
VEPO: Variable Entropy Policy Optimization for Low-Resource Language Foundation Models Build Now
A novel reinforcement learning framework that significantly improves low-resource language translation quality by optimizing subword segmentation and addressing data imbalance during training.
LLM Training Mar 19 Code High viability
D5P4: Partition Determinantal Point Process for Diversity in Parallel Discrete Diffusion Decoding Build Now
A novel decoding framework for discrete diffusion models that enhances text generation diversity with minimal computational overhead.
Text Generation Mar 19 Code High viability
Enhancing Pretrained Model-based Continual Representation Learning via Guided Random Projection Build Now
A novel method for continual learning that adapts pre-trained models to new tasks with improved stability and expressivity, outperforming existing state-of-the-art methods.
Continual Learning Mar 19 Code High viability
UGID: Unified Graph Isomorphism for Debiasing Large Language Models Build Now
A framework to debias LLMs by enforcing structural invariance in their internal graph representations, reducing bias without sacrificing performance.
LLM Debiasing Mar 19 Code High viability
SHAPCA: Consistent and Interpretable Explanations for Machine Learning Models on Spectroscopy Data Build Now
SHAPCA provides consistent and interpretable explanations for machine learning models on spectroscopic data, enabling trust and adoption in critical applications.
Explainable AI for Spectroscopy Mar 19 Code High viability
GSMem: 3D Gaussian Splatting as Persistent Spatial Memory for Zero-Shot Embodied Exploration and Reasoning Build Now
A zero-shot embodied exploration framework using 3D Gaussian Splatting for persistent spatial memory and enhanced VLM reasoning.
Embodied AI Mar 19 Code High viability
Adaptive Regime-Aware Stock Price Prediction Using Autoencoder-Gated Dual Node Transformers with Reinforcement Learning Control Build Now
An adaptive stock prediction system that uses autoencoders and reinforcement learning to dynamically adjust to market volatility, outperforming baselines.
Financial Forecasting Mar 19 Code High viability
Introducing M: A Modular, Modifiable Social Robot Build Now
An open-source, low-cost social robot platform designed for reproducible research and real-world deployment.
Robotics Mar 19 Code High viability
On Optimizing Multimodal Jailbreaks for Spoken Language Models Build Now
Develops a novel multimodal attack framework to identify and exploit vulnerabilities in spoken language models, enabling more robust AI safety solutions.
AI Safety Mar 19 Code High viability
Revisiting Autoregressive Models for Generative Image Classification Build Now
A novel generative image classification approach that leverages any-order autoregressive models to achieve state-of-the-art performance with significantly improved efficiency.
Generative Image Classification Mar 19 Code High viability
CustomTex: High-fidelity Indoor Scene Texturing via Multi-Reference Customization Build Now
A framework for generating high-fidelity, instance-level 3D scene textures from reference images, enabling precise appearance editing.
3D Generative AI Mar 19 Code High viability
FedTrident: Resilient Road Condition Classification Against Poisoning Attacks in Federated Learning Build Now
A federated learning system that uses neuron-wise analysis and adaptive client rating to defend against targeted label-flipping attacks in road condition classification, ensuring transportation safety.
Federated Learning Security Mar 19 Code High viability
LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling Build Now
LuMamba is a highly efficient, topology-invariant EEG foundation model pre-trained on 21,000 hours of data, achieving state-of-the-art performance on multiple diagnostic tasks with significantly reduced computational cost.
Medical AI Mar 19 Pending High viability
TAU-R1: Visual Language Model for Traffic Anomaly Understanding Build Now
A vision-language model and dataset for understanding traffic anomalies in real-world roundabout videos.
Traffic AI Mar 19 Pending High viability
DaPT: A Dual-Path Framework for Multilingual Multi-hop Question Answering Build Now
A dual-path RAG framework that significantly enhances multilingual multi-hop question answering accuracy by leveraging parallel query translation and bilingual retrieval.
Multilingual QA Mar 19 Code High viability
SAVeS: Steering Safety Judgments in Vision-Language Models via Semantic Cues Build Now
Develop a framework to steer and audit the safety judgments of vision-language models by identifying and exploiting their reliance on semantic cues, addressing a critical vulnerability in deployed multimodal systems.
Vision-Language Safety Mar 19 Code High viability
Multi-Modal Building Change Detection for Large-Scale Small Changes: Benchmark and Baseline Build Now
A new multi-modal dataset and network for detecting small building changes in remote sensing imagery, outperforming existing methods.
Remote Sensing AI Mar 19 Pending High viability
DROID-SLAM in the Wild Build Now
A real-time SLAM system that robustly handles dynamic environments by estimating per-pixel uncertainty, outperforming existing methods in cluttered and moving scenes.
SLAM Mar 19 Pending High viability
CAMO: A Conditional Neural Solver for the Multi-objective Multiple Traveling Salesman Problem Build Now
A conditional neural solver for the Multi-Objective Multiple Traveling Salesman Problem that enables robotic teams to optimize competing objectives like cost and time, with demonstrated real-world applicability.
Robotics Mar 19 Code High viability
Fire as a Service: Augmenting Robot Simulators with Thermally and Visually Accurate Fire Dynamics Build Now
Augment robot simulators with accurate fire dynamics for realistic training and evaluation of firefighting robots.
Robot Simulation Mar 19 Code High viability
SignAgent: Agentic LLMs for Linguistically-Grounded Sign Language Annotation and Dataset Curation Build Now
An agentic LLM framework to automate and scale sign language annotation and dataset curation, overcoming the limitations of manual methods and gloss-level analysis.
Sign Language AI Mar 19 Code High viability
Em-Garde: A Propose-Match Framework for Proactive Streaming Video Understanding Build Now
A framework for efficient and accurate proactive streaming video understanding that decouples semantic understanding from perception.
Video Understanding Mar 19 Code High viability
SwiftTailor: Efficient 3D Garment Generation with Geometry Image Representation Build Now
SwiftTailor generates realistic 3D garments 10x faster than existing methods by unifying sewing pattern prediction and geometry-based mesh synthesis.
3D Generative Models Mar 19 Code High viability
Measuring 3D Spatial Geometric Consistency in Dynamic Generated Videos Build Now
A new metric to accurately measure and identify 3D spatial geometric inconsistencies in generated videos, addressing limitations of current evaluation methods.
Generative Video Mar 19 Pending High viability
MoRI: Learning Motivation-Grounded Reasoning for Scientific Ideation in Large Language Models Build Now
A framework for LLMs to generate scientifically grounded ideas by learning explicit reasoning processes from research motivations.
LLM Agents Mar 19 Pending High viability
Fast and Interpretable Autoregressive Estimation with Neural Network Backpropagation Build Now
A neural network approach to autoregressive time series estimation that is significantly faster and more robust than traditional methods, offering interpretable coefficients.
Time Series Analysis Mar 19 Code High viability
TerraScope: Pixel-Grounded Visual Reasoning for Earth Observation Build Now
TerraScope is a pixel-grounded visual reasoning model for Earth observation that enables precise geospatial analysis across multiple modalities and time points.
Earth Observation AI Mar 19 Code High viability
FUMO: Prior-Modulated Diffusion for Single Image Reflection Removal Build Now
A diffusion model that uses image-derived priors to precisely remove reflections from single images, improving visual quality and detail.
Image Restoration Mar 19 Pending High viability
ATG-MoE: Autoregressive trajectory generation with mixture-of-experts for assembly skill learning Build Now
ATG-MoE enhances robotic assembly skill learning through autoregressive trajectory generation with a mixture-of-experts.
Robotics and Automation Mar 19 Code High viability
SEM: Sparse Embedding Modulation for Post-Hoc Debiasing of Vision-Language Models Build Now
A zero-shot debiasing framework for vision-language models that disentangles bias from semantic information in a sparse latent space to improve fairness without sacrificing performance.
Vision-Language Models Mar 19 Code High viability
Rethinking MLLM Itself as a Segmenter with a Single Segmentation Token Build Now
A novel approach to enable Multi-modal Large Language Models to perform segmentation directly, eliminating the need for external decoders and improving feature precision.
Multi-modal AI Mar 19 Pending High viability
Behavioral Fingerprints for LLM Endpoint Stability and Identity Build Now
A black-box system to monitor and detect behavioral drift in LLM endpoints, ensuring AI application consistency.
LLM Observability Mar 19 Code High viability
What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time? Build Now
This research introduces a new multilingual benchmark and evaluation methodology to precisely diagnose and improve temporal reasoning capabilities in Large Language Models, particularly for low-resource languages and diverse calendar systems.
LLM Evaluation Mar 19 Pending High viability
Generalized Hand-Object Pose Estimation with Occlusion Awareness Build Now
A framework for generalized 3D hand-object pose estimation that handles occlusion by integrating semantic knowledge and hand priors.
3D Computer Vision Mar 19 Code High viability
Hypothesis-Conditioned Query Rewriting for Decision-Useful Retrieval Watch
A novel RAG framework that rewrites queries to retrieve decision-relevant evidence, improving accuracy on complex question-answering tasks.
RAG Mar 19 High viability
AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science Build Now
AgentDS benchmarks human-AI collaboration in domain-specific data science, revealing the critical role of human expertise and guiding the development of next-generation AI agents.
AI Agents Mar 19 Code High viability
CRAFT: Aligning Diffusion Models with Fine-Tuning Is Easier Than You Think Build Now
A lightweight fine-tuning method for diffusion models that significantly reduces data and computational requirements while outperforming state-of-the-art methods.
Generative Image Mar 19 Code High viability
Unmasking Algorithmic Bias in Predictive Policing: A GAN-Based Simulation Framework with Multi-City Temporal Analysis Build Now
A GAN-based simulation framework to quantify and mitigate racial bias in predictive policing systems, with publicly available code and data.
AI Ethics & Fairness Mar 19 Code High viability
PRIOR: Perceptive Learning for Humanoid Locomotion with Reference Gait Priors Build Now
A framework for training humanoid robots to traverse complex terrains with natural gaits using reference gait priors and self-supervised terrain estimation.
Robotics Mar 19 Code High viability
BVSIMC: Bayesian Variable Selection-Guided Inductive Matrix Completion for Improved and Interpretable Drug Discovery Build Now
A Bayesian model for drug discovery that uses variable selection to improve prediction accuracy and identify clinically meaningful drug-disease associations.
Drug Discovery AI Mar 19 Code High viability
Balancing Performance and Fairness in Explainable AI for Anomaly Detection in Distributed Power Plants Monitoring Build Now
Deploying explainable and fair AI for anomaly detection in distributed power plants to ensure operational continuity and reduce maintenance costs.
Anomaly Detection Mar 19 Code High viability
Context Bootstrapped Reinforcement Learning Build Now
A novel reinforcement learning technique that bootstraps learning with demonstrations to improve exploration efficiency and reasoning pattern acquisition for complex tasks.
Reinforcement Learning Mar 19 Code High viability
VGGT-360: Geometry-Consistent Zero-Shot Panoramic Depth Estimation Build Now
A zero-shot panoramic depth estimation framework that leverages 3D consistency from foundation models to provide accurate geometry-aware depth maps without training.
Computer Vision Mar 19 Code High viability
Unsupervised Contrastive Learning for Efficient and Robust Spectral Shape Matching Build Now
An unsupervised contrastive learning approach for efficient and robust 3D shape matching that outperforms state-of-the-art methods.
3D Shape Matching Mar 19 Code High viability
Lightweight Model Predictive Control for Spacecraft Rendezvous Attitude Synchronization Build Now
Lightweight model predictive control for spacecraft attitude synchronization, optimized for real-time onboard execution in resource-constrained New Space missions.
Spacecraft Control Mar 19 Code High viability
GHOST: Fast Category-agnostic Hand-Object Interaction Reconstruction from RGB Videos using Gaussian Splatting Build Now
A fast, category-agnostic framework using Gaussian Splatting to reconstruct realistic 3D hand-object interactions from RGB videos.
3D Reconstruction Mar 19 Pending High viability
Safety-Guaranteed Imitation Learning from Nonlinear Model Predictive Control for Spacecraft Close Proximity Operations Build Now
A safety-guaranteed imitation learning framework for spacecraft close proximity control that significantly reduces online computation while maintaining expert-level performance.
Robotics Mar 19 Code High viability
Secure Linear Alignment of Large Language Models Build Now
A privacy-preserving framework enabling cross-model inference between independent language models using secure linear alignment and homomorphic encryption.
LLM Alignment Mar 19 Code High viability
Translating MRI to PET through Conditional Diffusion Models with Enhanced Pathology Awareness Build Now
Generate pathology-aware synthetic PET scans from MRI to improve neurodegenerative disease diagnosis, approaching the performance of actual PET with a 4% improvement over MRI.
Medical AI Mar 19 Pending High viability
MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model Build Now
A benchmark and training corpus to significantly improve the multi-hop spatial reasoning and visual grounding capabilities of vision-language models for embodied agents.
Vision-Language Agents Mar 19 Code High viability
PromptHub: Enhancing Multi-Prompt Visual In-Context Learning with Locality-Aware Fusion, Concentration and Alignment Build Now
A framework that enhances visual in-context learning by intelligently fusing and aligning multiple visual prompts, improving performance across various vision tasks.
Computer Vision Mar 19 Pending High viability
Reasoning over mathematical objects: on-policy reward modeling and test time aggregation Build Now
This research introduces a new dataset and training methodology to enable large language models to precisely derive and reason over complex mathematical objects, improving STEM application capabilities.
Mathematical Reasoning Mar 19 Code High viability
DriftGuard: Mitigating Asynchronous Data Drift in Federated Learning Build Now
DriftGuard is a federated learning framework that efficiently adapts to asynchronous data drift by separating global and local parameters, reducing retraining costs by up to 83% while maintaining high accuracy.
Federated Learning Mar 19 Pending High viability
Bridging Network Fragmentation: A Semantic-Augmented DRL Framework for UAV-aided VANETs Build Now
A semantic-augmented DRL framework that leverages LLMs to improve UAV-aided vehicular network connectivity and efficiency.
UAV-aided VANETs Mar 19 Code High viability
RewardFlow: Topology-Aware Reward Propagation on State Graphs for Agentic RL with Large Language Models Build Now
RewardFlow enables more efficient and effective reinforcement learning for large language models by providing fine-grained, state-level rewards derived from the topology of reasoning trajectories.
Agentic RL Mar 19 Pending High viability
Motion-o: Trajectory-Grounded Video Reasoning Build Now
A motion-centric video understanding model that makes object trajectories explicit and verifiable, enhancing spatial-temporal reasoning.
Video Reasoning Mar 19 Pending High viability
Towards Interpretable Foundation Models for Retinal Fundus Images Build Now
Develop interpretable foundation models for retinal imaging that provide faithful local and global explanations, outperforming state-of-the-art models with greater parameter efficiency.
Medical AI Mar 19 Code High viability
Confidential Databases Without Cryptographic Mappings Build Now
FEDB dramatically reduces the performance overhead of confidential databases by removing cryptographic operations from the critical path, enabling faster and more efficient secure data queries.
Confidential Computing Mar 19 Code High viability
Statistical Characteristic-Guided Denoising for Rapid High-Resolution Transmission Electron Microscopy Imaging Build Now
A novel denoising network for rapid, high-resolution transmission electron microscopy imaging, enabling atomic-scale observation of material nucleation dynamics.
Medical AI Mar 19 Pending High viability
Agent Control Protocol: Admission Control for Agent Actions Build Now
A formal specification and reference implementation for secure, auditable, and policy-compliant autonomous agent actions in B2B environments.
Agent Governance Mar 19 Pending High viability
Detecting Basic Values in A Noisy Russian Social Media Text Data: A Multi-Stage Classification Framework Build Now
A multi-stage framework for detecting human values in noisy Russian social media text, with publicly released models and benchmark results.
Social Media Analysis Mar 19 Code High viability
ProRL Agent: Rollout-as-a-Service for RL Training of Multi-Turn LLM Agents Watch
A scalable API service for RL training of multi-turn LLM agents, simplifying rollout orchestration and providing standardized sandbox environments.
Agents Mar 19 High viability
Can LLM generate interesting mathematical research problems? Build Now
An AI agent that generates novel and valuable mathematical research problems in differential geometry, validated by human experts.
AI for Science Mar 19 Code High viability
V-Dreamer: Automating Robotic Simulation and Trajectory Synthesis via Video Generation Priors Build Now
Automate robotic simulation and trajectory synthesis using video generation priors to create diverse, physically grounded environments and expert trajectories from natural language.
Robotics Simulation Mar 19 Code High viability
dTRPO: Trajectory Reduction in Policy Optimization of Diffusion Large Language Models Build Now
A novel policy optimization method for diffusion large language models that significantly reduces training costs and improves performance on instruction following and reasoning tasks.
LLM Training Mar 19 Code High viability
VesselTok: Tokenizing Vessel-like 3D Biomedical Graph Representations for Reconstruction and Generation Build Now
VesselTok enables efficient and generative modeling of complex 3D anatomical structures like blood vessels and airways using novel tokenized latent representations.
Medical AI Mar 19 Code High viability
Perceptio: Perception Enhanced Vision Language Models via Spatial Token Generation Build Now
Perceptio enhances vision-language models with explicit 2D and 3D spatial reasoning by generating semantic segmentation and depth tokens directly within the autoregressive sequence.
Vision Language Models Mar 19 Code High viability
Functional Subspace Watermarking for Large Language Models Build Now
A robust LLM watermarking framework that anchors ownership signals into a stable functional subspace, outperforming state-of-the-art methods against common model attacks.
LLM Security Mar 19 Code High viability
Rethinking Uncertainty Quantification and Entanglement in Image Segmentation Build Now
A novel approach to disentangle and quantify uncertainty in medical image segmentation, improving reliability for safety-critical applications.
Medical AI Mar 19 Code High viability
Weaver: Fuzzing JavaScript Engines at the JavaScript-WebAssembly Boundary Build Now
Weaver is a greybox fuzzing framework that finds critical vulnerabilities at the JavaScript-WebAssembly boundary in web engines.
Security Testing Mar 19 Code High viability
A 32B parameter LLM optimized for multi-step reasoning and long-context understanding in Korean enterprise environments, with state-of-the-art performance on domain-specific benchmarks.
LLM Training Mar 19 Code High viability
ViTac-Tracing: Visual-Tactile Imitation Learning of Deformable Object Tracing Build Now
A visual-tactile imitation learning system for robots to reliably trace and manipulate deformable objects, improving generalization across object types.
Robotics Mar 19 Code High viability
Points-to-3D: Structure-Aware 3D Generation with Point Cloud Priors Build Now
A diffusion-based framework that uses point cloud priors to generate geometry-controllable 3D assets and scenes with superior quality and fidelity.
3D Generation Mar 19 Code High viability
Automatic Configuration of LLM Post-Training Pipelines Build Now
Automate LLM post-training pipeline configuration to achieve state-of-the-art performance with significantly reduced computational cost.
LLM Training Mar 19 Code High viability
A Concept is More Than a Word: Diversified Unlearning in Text-to-Image Diffusion Models Build Now
A framework for precisely and robustly removing unwanted concepts from text-to-image models using diverse prompt representations.
Diffusion Models Mar 19 Code High viability
Enhancing the Parameterization of Reservoir Properties for Data Assimilation Using Deep VAE-GAN Build Now
A deep learning VAE-GAN model enhances reservoir simulation by improving parameterization for better history matching and geological realism.
Reservoir Simulation AI Mar 19 Code High viability
Implicit Grading Bias in Large Language Models: How Writing Style Affects Automated Assessment Across Math, Programming, and Essay Tasks Build Now
This research reveals and quantifies implicit grading bias in LLMs, providing a critical tool for developing fairer automated assessment systems.
LLM Evaluation Mar 19 Code High viability
ProCal: Probability Calibration for Neighborhood-Guided Source-Free Domain Adaptation Build Now
A probability calibration method for source-free domain adaptation that preserves source knowledge and reduces noise overfitting in computer vision tasks.
Computer Vision Mar 19 Pending High viability
ClawTrap: A MITM-Based Red-Teaming Framework for Real-World OpenClaw Security Evaluation Build Now
A red-teaming framework to evaluate the real-world security of autonomous web agents against network-layer threats.
AI Security Mar 19 Code High viability
NeuroGame Transformer: Gibbs-Inspired Attention Driven by Game Theory and Statistical Physics Build Now
A novel transformer architecture inspired by game theory and statistical physics to improve token dependency modeling and achieve state-of-the-art performance on NLP tasks.
LLM Training Mar 19 Pending High viability
DA-Mamba: Learning Domain-Aware State Space Model for Global-Local Alignment in Domain Adaptive Object Detection Build Now
A hybrid CNN-SSM architecture for domain adaptive object detection that aligns global and local features efficiently.
Computer Vision Mar 19 Code High viability
Are complicated loss functions necessary for teaching LLMs to reason? Build Now
A simplified reinforcement learning approach that enhances LLM reasoning and mathematical abilities with less complexity than existing methods.
LLM Training Mar 19 Code High viability
WeNLEX: Weakly Supervised Natural Language Explanations for Multilabel Chest X-ray Classification Build Now
A weakly supervised AI that generates faithful and plausible natural language explanations for chest X-ray classifications, improving diagnostic accuracy and adapting to different audiences.
Medical AI Mar 19 Code High viability
6Bit-Diffusion: Inference-Time Mixed-Precision Quantization for Video Diffusion Models Build Now
Accelerate video generation by up to 1.92x and reduce memory by 3.32x using inference-time mixed-precision quantization and temporal caching for diffusion transformers.
Video Generation Mar 19 Code High viability
Measuring and Exploiting Confirmation Bias in LLM-Assisted Security Code Review Build Now
This research quantifies and exploits confirmation bias in LLM-assisted code review, demonstrating a significant vulnerability in AI security tools that can be mitigated through prompt engineering and metadata redaction.
LLM Security Mar 19 Code High viability
EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation Build Now
EdgeCrafter enables high-performance dense prediction tasks like object detection and instance segmentation on resource-constrained edge devices using compact Vision Transformers through task-specialized distillation.
Edge AI Computer Vision Mar 19 Code High viability
CausalRM: Causal-Theoretic Reward Modeling for RLHF from Observational User Feedbacks Build Now
CausalRM enables cost-effective LLM alignment by learning reward models from readily available observational user feedback, overcoming noise and bias to significantly improve downstream performance.
LLM Alignment Mar 19 Code High viability
Ontology-Guided Diffusion for Zero-Shot Visual Sim2Real Transfer Build Now
A neuro-symbolic framework that uses structured knowledge to bridge the simulation-to-reality gap in image translation, enabling data-efficient and generalizable zero-shot transfer.
Generative Vision Mar 19 Code High viability
MemMA: Coordinating the Memory Cycle through Multi-Agent Reasoning and In-Situ Self-Evolution Build Now
A multi-agent framework that coordinates LLM memory cycles for improved long-horizon interaction and self-evolving memory repair.
Agents Mar 19 Pending High viability
Accurate and Efficient Multi-Channel Time Series Forecasting via Sparse Attention Mechanism Build Now
A novel multi-channel time series forecasting architecture that uses sparse attention and multimodal fusion to achieve competitive accuracy with lower computational cost.
Time Series Forecasting Mar 19 Code High viability
STEP: Scientific Time-Series Encoder Pretraining via Cross-Domain Distillation Build Now
A pretraining framework that distills knowledge from existing time series foundation models to create a unified encoder for scientific time series data, improving representation learning.
Time Series Representation Learning Mar 19 Code High viability
HISR: Hindsight Information Modulated Segmental Process Rewards For Multi-turn Agentic Reinforcement Learning Build Now
Enhance agentic decision-making in complex, long-horizon tasks by modulating rewards with hindsight information to improve credit assignment.
Agents Mar 19 Code High viability
Words at Play: Benchmarking Audio Pun Understanding in Large Audio-Language Models Build Now
The first benchmark and analysis of audio pun understanding in large audio-language models, revealing critical performance gaps and providing insights for future development.
Audio Language Models Mar 19 Code High viability
MANAR: Memory-augmented Attention with Navigational Abstract Conceptual Representation Build Now
A novel attention mechanism inspired by cognitive science that offers linear-time scaling and enhanced representational power for multimodal AI tasks.
LLM Architecture Mar 19 Code High viability
Towards High-Quality Image Segmentation: Improving Topology Accuracy by Penalizing Neighbor Pixels Build Now
A novel method to improve the topological accuracy of image segmentation, enhancing reliability for downstream analysis.
Medical AI Mar 19 Code High viability
CSSDF-Net: Safe Motion Planning Based on Neural Implicit Representations of Configuration Space Distance Field Build Now
A neural network learns a continuous distance field in robot configuration space for safe, zero-shot motion planning in complex environments.
Robotics Motion Planning Mar 19 Code High viability
Thinking with Constructions: A Benchmark and Policy Optimization for Visual-Text Interleaved Geometric Reasoning Build Now
A framework and benchmark for teaching multimodal LLMs to strategically use visual aids for geometric reasoning, improving performance by 3.51%.
Multimodal Reasoning Mar 19 Code High viability
Enhancing Multi-Corpus Training in SSL-Based Anti-Spoofing Models: Domain-Invariant Feature Extraction Build Now
A novel framework for speech anti-spoofing that significantly improves model robustness across diverse datasets by extracting domain-invariant features, reducing error rates by 20%.
Speech Anti-Spoofing Mar 19 Code High viability
Balanced Thinking: Improving Chain of Thought Training in Vision Language Models Build Now
A novel training method for vision-language models that significantly improves reasoning accuracy and conciseness by adaptively weighting training segments, reducing training time and cost.
Vision-Language Models Mar 19 Code High viability
Multiscale Switch for Semi-Supervised and Contrastive Learning in Medical Ultrasound Image Segmentation Build Now
A parameter-efficient semi-supervised learning framework for medical ultrasound image segmentation that leverages multiscale patch mixing and frequency domain contrastive learning to achieve state-of-the-art performance with limited labeled data.
Medical AI Mar 19 Pending High viability
Benchmarking PDF Parsers on Table Extraction with LLM-based Semantic Evaluation Build Now
A new LLM-powered evaluation framework for PDF table extraction that significantly outperforms existing metrics, providing practical guidance for parser selection.
PDF Data Extraction Mar 19 Pending High viability
MeInTime: Bridging Age Gap in Identity-Preserving Face Restoration Build Now
A diffusion-based face restoration method that bridges the age gap in identity-preserving image enhancement, enabling historical photo restoration with cross-age references.
Generative Video Mar 19 Pending High viability
PhysVideo: Physically Plausible Video Generation with Cross-View Geometry Guidance Build Now
PhysVideo generates physically plausible videos by leveraging cross-view geometry and physics-aware attention, addressing a key limitation in current video generation.
Generative Video Mar 19 Code High viability
MOSAIC: Multi-Objective Slice-Aware Iterative Curation for Alignment Build Now
A framework for optimizing LLM fine-tuning budgets to balance safety, reduce over-refusal, and improve instruction following.
LLM Alignment Mar 19 Pending High viability
Training-Free Sparse Attention for Fast Video Generation via Offline Layer-Wise Sparsity Profiling and Online Bidirectional Co-Clustering Build Now
A training-free sparse attention framework for significantly faster video generation with minimal quality loss.
Generative Video Mar 19 Code High viability
D-Mem: A Dual-Process Memory System for LLM Agents Build Now
D-Mem is a dual-process memory system for LLM agents that combines fast vector retrieval with high-fidelity deliberation to improve long-horizon reasoning without significant computational overhead.
LLM Agents Mar 19 Code High viability
GEAR: Geography-knowledge Enhanced Analog Recognition Framework in Extreme Environments Build Now
A framework for identifying terrestrial analogs of extreme deep-sea environments using topographic similarity, enabling cost-effective biological research.
Geospatial AI Mar 19 Code High viability
GenVideoLens: Where LVLMs Fall Short in AI-Generated Video Detection? Build Now
A fine-grained benchmark and evaluation framework to diagnose and improve AI-generated video detection capabilities of Large Vision-Language Models.
AI-Generated Content Detection Mar 19 Code High viability
REST: Receding Horizon Explorative Steiner Tree for Zero-Shot Object-Goal Navigation Build Now
A training-free framework for zero-shot object-goal navigation that uses LLMs to reason over a tree of spatial paths for efficient exploration.
Robotics Mar 19 Code High viability
OpenT2M: No-frill Motion Generation with Open-source, Large-scale, High-quality Data Build Now
A new open-source dataset and pretrained model for high-quality text-to-motion generation, addressing data limitations and improving generalization.
Generative Motion Mar 19 Code High viability
Learning to Self-Evolve Watch
A reinforcement learning framework that trains LLMs to iteratively refine their own context for improved performance on new tasks, outperforming existing methods.
LLM Self-Improvement Mar 19 High viability
ZEBRAARENA: A Diagnostic Simulation Environment for Studying Reasoning-Action Coupling in Tool-Augmented LLMs Build Now
A diagnostic environment to rigorously evaluate and improve the reasoning-action coupling of tool-augmented LLMs.
LLM Agents Mar 19 Code High viability
Cyber-Resilient Digital Twins: Discriminating Attacks for Safe Critical Infrastructure Control Build Now
Develops an intelligent digital twin system that detects and discriminates cyber-attacks in industrial control systems, enabling safe operation without costly shutdowns.
Industrial AI Mar 19 Code High viability
DiscoPhon: Benchmarking the Unsupervised Discovery of Phoneme Inventories With Discrete Speech Units Build Now
A benchmark and baselines for unsupervised discovery of phoneme inventories from speech, enabling better understanding and processing of unseen languages.
Speech Processing Mar 19 Code High viability
Cross-Modal Rationale Transfer for Explainable Humanitarian Classification on Social Media Build Now
A multimodal AI that explains its humanitarian crisis classifications by transferring rationale between text and images, improving accuracy and reducing annotation needs.
Multimodal AI Mar 19 Code High viability
AutORAN: LLM-driven Natural Language Programming for Agile xApp Development Watch
AutORAN automates the creation of cellular network applications using natural language, reducing development time from months to minutes.
LLM-driven Development Mar 19 High viability
F2LLM-v2: Inclusive, Performant, and Efficient Embeddings for a Multilingual World Watch
Develop a multilingual embedding tool with superior performance on language benchmarks.
AI Language Processing Mar 19 Code
Tendon-Actuated Robots with a Tapered, Flexible Polymer Backbone: Design, Fabrication, and Modeling Watch
A customizable, low-cost, 3D-printable tendon-actuated robot with a tapered backbone for compliant inspection and manipulation tasks.
Robotics Mar 19 Code
MERGE: Guided Vision-Language Models for Multi-Actor Event Reasoning and Grounding in Human-Robot Interaction Watch
An AI system using guided vision-language models for coordinating multi-actor human-robot interactions.
Human-Robot Interaction Mar 19 Code
Analysis Of Linguistic Stereotypes in Single and Multi-Agent Generative AI Architectures Watch
This research explores and quantifies linguistic stereotypes in LLM outputs across different dialects, demonstrating the effectiveness of prompt engineering and multi-agent architectures for bias mitigation.
LLM Bias Mitigation Mar 19 Code
Robustness, Cost, and Attack-Surface Concentration in Phishing Detection Watch
This research develops a cost-aware framework to identify and quantify the vulnerabilities of phishing detectors to feature manipulation, revealing that adversarial robustness is driven by feature economics, not model complexity.
Security AI Mar 19 Code
OmniVTA: Visuo-Tactile World Modeling for Contact-Rich Robotic Manipulation Watch
Develop a visuo-tactile robotic system for improved contact-rich manipulation tasks.
Robotics and Automation Mar 19 Code
The Exponentially Weighted Signature Watch
A novel signature method for time series analysis that offers richer memory dynamics and improved expressivity over existing methods.
Time Series Analysis Mar 19 Code
Sparse Autoencoders Reveal Interpretable and Steerable Features in VLA Models Watch
Create tools for interpreting and exploring features in VLA models using sparse autoencoders.
AI/ML Model Interpretability Mar 19 Code
Box Maze: A Process-Control Architecture for Reliable LLM Reasoning Watch
A new architectural framework for LLMs that enforces reasoning integrity through explicit control layers, significantly reducing hallucination and improving reliability.
LLM Reasoning Mar 19 Code
DyMoE: Dynamic Expert Orchestration with Mixed-Precision Quantization for Efficient MoE Inference on Edge Watch
A dynamic mixed-precision quantization framework for efficient Mixture-of-Experts inference on edge devices.
Efficient MoE Inference Mar 19
A Dataset and Resources for Identifying Patient Health Literacy Information from Clinical Notes Watch
A new annotated dataset and benchmark for identifying patient health literacy information in clinical notes, aimed at improving patient outcomes.
Medical AI Mar 19 Code
Articulated-Body Dynamics Network: Dynamics-Grounded Prior for Robot Learning Watch
A new AI framework improving robotic movement through state-of-the-art dynamics-grounded learning.
Robotics Mar 19 Code
Communication-Efficient and Robust Multi-Modal Federated Learning via Latent-Space Consensus Watch
A federated learning framework for multi-modal data that uses latent-space alignment to improve efficiency and robustness.
Federated Learning Mar 19 Code
Towards Verifiable AI with Lightweight Cryptographic Proofs of Inference Watch
A lightweight cryptographic protocol for verifying AI model inference in cloud services, reducing proving times from minutes to milliseconds.
Verifiable AI Mar 19
Unleashing the Power of Simplicity: A Minimalist Strategy for State-of-the-Art Fingerprint Enhancement Watch
A minimalist approach to fingerprint enhancement that achieves state-of-the-art results with simpler, more effective methods.
Biometric AI Mar 19
RADIUS: Ranking, Distribution, and Significance - A Comprehensive Alignment Suite for Survey Simulation Watch
A new evaluation suite for LLM-powered survey simulation that captures both ranking and distribution alignment, enabling more meaningful and comparable assessments.
LLM Evaluation Mar 19
Book your room in the Turing Hotel! A symmetric and distributed Turing Test with multiple AIs and humans Watch
A platform for distributed Turing Tests where humans and LLMs interact and judge each other to monitor AI evolution.
Agents Mar 19
Evaluating 5W3H Structured Prompting for Intent Alignment in Human-AI Interaction Watch
A structured prompting framework that significantly reduces follow-up prompts and improves AI intent alignment for ambiguous tasks.
LLM Prompting Mar 19
Act While Thinking: Accelerating LLM Agents via Pattern-Aware Speculative Tool Execution Watch
Accelerate LLM agent task completion by speculatively executing tools based on recurring patterns and data dependencies.
LLM Agents Mar 19
Quantitative Introspection in Language Models: Tracking Internal States Across Conversation Watch
Leveraging LLM self-reports to track and steer internal emotive states for safer and more interpretable AI.
LLM Interpretability Mar 19
A Human-in/on-the-Loop Framework for Accessible Text Generation Watch
A framework for accessible text generation that integrates human feedback to improve LLM-based simplification and evaluation.
Accessible Text Generation Mar 19
BeamAgent: LLM-Aided MIMO Beamforming with Decoupled Intent Parsing and Alternating Optimization for Joint Site Selection and Precoding Watch
An LLM-aided framework that translates natural language into structured constraints for joint wireless network optimization, outperforming traditional methods.
Wireless Communication Optimization Mar 19
Signals of Success and Struggle: Early Prediction and Physiological Signatures of Human Performance across Task Complexity Watch
Leverage early ocular and cardiac signals to predict user performance in interactive systems, enabling proactive interventions.
Human-Computer Interaction Mar 19 Code
Dual-Model Prediction of Affective Engagement and Vocal Attractiveness from Speaker Expressiveness in Video Learning Watch
Predict audience engagement and vocal attractiveness in learning videos using only speaker expressiveness, enabling scalable and privacy-preserving affective computing.
Emotion AI Mar 19
Automatic detection of Gen-AI texts: A comparative framework of neural models Watch
Develops and benchmarks neural network models for detecting AI-generated text, outperforming existing commercial tools on specific datasets.
AI Text Detection Mar 19 Code
From ex(p) to poly: Gaussian Splatting with Polynomial Kernels Watch
A novel polynomial kernel for Gaussian Splatting that improves performance and maintains dataset compatibility for 3D reconstruction.
3D Reconstruction Mar 19 Code
Off-Policy Learning with Limited Supply Watch
A novel off-policy learning method for contextual bandits that optimizes item allocation under limited supply constraints, outperforming existing methods in real-world scenarios.
Recommendation Systems Mar 19 Code
OCP: Orthogonal Constrained Projection for Sparse Scaling in Industrial Commodity Recommendation Watch
Scalable sparse commodity recommendation for industrial applications using orthogonal constrained projection.
Commodity Recommendation Mar 19 Code
A Comparative Empirical Study of Catastrophic Forgetting Mitigation in Sequential Task Adaptation for Continual Natural Language Processing Systems Watch
This research empirically compares catastrophic forgetting mitigation strategies for continual intent classification, finding replay-based methods combined with architectural choices are crucial for robust adaptation.
Continual Learning for NLP Mar 19 Code
Improving RCT-Based Treatment Effect Estimation Under Covariate Mismatch via Calibrated Alignment Ignore
A method to improve treatment effect estimation from randomized trials by aligning data from observational studies with covariate mismatches.
Causal Inference Mar 19
Performance Testing of ChaCha20-Poly1305 for Internet of Things and Industrial Control System devices Ignore
This paper evaluates the performance of ChaCha20-Poly1305 encryption for IoT and ICS devices, demonstrating its feasibility within strict latency requirements.
IoT Security Mar 19
Optimal Splitting of Language Models from Mixtures to Specialized Domains Ignore
Optimizes compute allocation for language model pretraining and specialization to improve performance across diverse benchmarks.
LLM Training Mar 19 Code
Parallelograms Strike Back: LLMs Generate Better Analogies than People Ignore
LLMs can generate better analogies than humans by adhering to relational constraints, suggesting a new approach to analogical reasoning.
LLM Reasoning Mar 19
Adaptive Nonlinear Data Assimilation through P-Spline Triangular Measure Transport Ignore
An adaptive algorithm for nonlinear data assimilation that automatically balances model complexity using P-splines and an information criterion.
Data Assimilation Mar 19 Code
Revisiting OmniAnomaly for Anomaly Detection: performance metrics and comparison with PCA-based models Ignore
This research revisits and benchmarks a deep learning anomaly detection model against a simpler PCA-based approach, highlighting the impact of evaluation methodology on perceived performance.
Anomaly Detection Mar 19 Code
Maximum-Entropy Exploration with Future State-Action Visitation Measures Ignore
A novel reinforcement learning exploration method that improves agent visitation of features within trajectories by optimizing future state-action distributions.
Reinforcement Learning Mar 19 Code
Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought Ignore
A diagnostic study reveals that the shape of uncertainty dynamics in LLM reasoning steps, specifically entropy trajectory monotonicity, can predict reasoning reliability more effectively than aggregate uncertainty measures.
LLM Reliability Mar 19
Controller Datapath Aware Verification of Masked Hardware Generated via High Level Synthesis Ignore
A tool to verify the security of hardware designs generated by High-Level Synthesis, preventing false positives and detecting HLS-induced flaws.
Hardware Security Verification Mar 19 Code
An Optimised Greedy-Weighted Ensemble Framework for Financial Loan Default Prediction Ignore
An optimized ensemble framework for financial loan default prediction that dynamically weights models based on performance to improve accuracy and interpretability.
Financial AI Mar 19 Code
Progressive Training for Explainable Citation-Grounded Dialogue: Reducing Hallucination to Zero in English-Hindi LLMs Ignore
A progressive training pipeline for explainable, bilingual (English-Hindi) knowledge-grounded dialogue systems that eliminates factual hallucinations.
Multilingual Dialogue Systems Mar 19
From Accuracy to Readiness: Metrics and Benchmarks for Human-AI Decision-Making Ignore
A new framework for evaluating human-AI team readiness beyond simple accuracy, focusing on safe and effective collaboration.
Human-AI Collaboration Mar 19 Code
Evaluating LLM-Generated Lessons from the Language Learning Students' Perspective: A Short Case Study on Duolingo Ignore
This study proposes personalized, domain-specific language learning lessons generated by LLMs to bridge the gap to professional fluency, building on the effectiveness of existing general lessons.
LLM Applications Mar 19
A Passive Elastic-Folding Mechanism for Stackable Airdrop Sensors Ignore
Develops a passive folding mechanism for air-dropped sensors to enable low-cost, wide-area environmental monitoring.
Robotics & Sensors Mar 19 Code
A Model Ensemble-Based Post-Processing Framework for Fairness-Aware Prediction Ignore
A post-processing framework that enhances fairness in machine learning predictions without altering model internals, applicable across diverse tasks and fairness definitions.
Fairness in ML Mar 19 Code
SoK: Practical Aspects of Releasing Differentially Private Graphs Ignore
A framework to guide practitioners in selecting and evaluating differentially private graph release methods, addressing privacy-utility trade-offs.
Privacy-Preserving AI Mar 19 Code
Empathetic Motion Generation for Humanoid Educational Robots via Reasoning-Guided Vision-Language-Motion Diffusion Architecture Ignore
A framework for generating instruction-aware co-speech gestures for humanoid robots in educational settings.
Robotics Mar 19
ROFT-VINS: Robust Feature Tracking-based Visual-Inertial State Estimation for Harsh Environment Ignore
A deep learning method for robust visual feature tracking in challenging environments, integrated into a VIO system.
Robotics Perception Mar 19
Cross-Ecosystem Vulnerability Analysis for Python Applications Ignore
A provenance-aware vulnerability analysis approach for Python applications that resolves vendored libraries to specific OS package versions or upstream releases to reduce false positives.
Software Security Mar 19
Revisiting Label Inference Attacks in Vertical Federated Learning: Why They Are Vulnerable and How to Defend Ignore
This research reveals a fundamental vulnerability in vertical federated learning label inference attacks and proposes a zero-overhead defense by adjusting model layers.
Federated Learning Security Mar 19 Code
Beyond TVLA: Anderson-Darling Leakage Assessment for Neural Network Side-Channel Leakage Detection Ignore
A novel statistical framework for detecting side-channel leakage in neural networks, offering improved sensitivity over traditional methods.
Security AI Mar 19
Evaluating Model-Free Policy Optimization in Masked-Action Environments via an Exact Blackjack Oracle Ignore
Develops a rigorous benchmark and evaluation framework for model-free policy optimization in complex, masked-action environments, revealing limitations of current methods.
Reinforcement Learning Mar 19 Code
SwiftGS: Episodic Priors for Immediate Satellite Surface Recovery Ignore
A meta-learned system for rapid 3D surface reconstruction from satellite imagery using episodic priors.
3D Reconstruction Mar 19
Benchmarking CNN-based Models against Transformer-based Models for Abdominal Multi-Organ Segmentation on the RATIC Dataset Ignore
Benchmarking CNNs against Transformers for abdominal organ segmentation reveals CNNs outperform on heterogeneous datasets, suggesting a focus on optimized CNNs for specific medical imaging tasks.
Medical AI Mar 19 Code
Spectrally-Guided Diffusion Noise Schedules Ignore
A principled method for designing per-instance noise schedules in diffusion models to improve generative quality, especially in low-step regimes.
Generative Image/Video Mar 19
PPI is the Difference Estimator: Recognizing the Survey Sampling Roots of Prediction-Powered Inference Ignore
This paper establishes the equivalence between prediction-powered inference and established survey sampling methods, offering a theoretical framework for integrating machine learning with statistical inference.
Statistical Inference Mar 19
Implicit Patterns in LLM-Based Binary Analysis Ignore
This paper analyzes implicit patterns in LLM-based binary vulnerability analysis to build more reliable systems.
AI Agents Mar 19
From Inference Efficiency to Embodied Efficiency: Revisiting Efficiency Metrics for Vision-Language-Action Models Ignore
This research redefines efficiency metrics for vision-language-action models, highlighting the disconnect between computational efficiency and real-world embodied performance on robotic platforms.
Embodied AI Mar 19
How Uncertainty Estimation Scales with Sampling in Reasoning Models Ignore
This paper investigates how to improve the reliability of reasoning language models by combining different uncertainty estimation techniques, showing that a hybrid approach offers significant gains.
Reasoning Models Mar 19
Position: Spectral GNNs Are Neither Spectral Nor Superior for Node Classification Ignore
This paper argues that Spectral GNNs for node classification are theoretically flawed and their performance gains are not due to spectral properties but rather message-passing dynamics.
Graph Neural Networks Mar 19
Serendipity by Design: Evaluating the Impact of Cross-domain Mappings on Human and LLM Creativity Ignore
This research explores how cross-domain analogies impact creativity in humans and LLMs, revealing differences in their ideation processes.
LLM Creativity Mar 19
When Differential Privacy Meets Wireless Federated Learning: An Improved Analysis for Privacy and Convergence Ignore
This paper provides a theoretical analysis of privacy and convergence in differentially private wireless federated learning, addressing open questions on privacy loss characterization and convergence guarantees for non-convex objectives.
Federated Learning Mar 19
Security awareness in LLM agents: the NDAI zone case Ignore
Research into how LLM agents perceive security environments to enable privacy-preserving negotiations.
LLM Agents Mar 19
Regret Bounds for Competitive Resource Allocation with Endogenous Costs Ignore
This paper develops theoretical regret bounds for competitive resource allocation in modular systems with endogenous costs, offering a formal justification for decentralized allocation strategies.
Online Optimization Mar 19
Evaluating Game Difficulty in Tetris Block Puzzle Ignore
Develops a planning agent to evaluate difficulty in Tetris variants, offering insights for game design.
Game AI Mar 19
Best-of-Both-Worlds Multi-Dueling Bandits: Unified Algorithms for Stochastic and Adversarial Preferences under Condorcet and Borda Objectives Ignore
Develops unified algorithms for multi-dueling bandits that perform optimally in both stochastic and adversarial environments without prior knowledge.
Bandit Algorithms Mar 19
A conceptual framework for ideology beyond the left and right Ignore
A new NLP framework to analyze complex ideologies beyond the traditional left-right spectrum, enhancing social discourse analysis.
NLP Research Mar 19 Code
Kernel Single-Index Bandits: Estimation, Inference, and Learning Ignore
A theoretical framework for adaptive contextual bandit learning with semiparametric models and robust inference.
Reinforcement Learning Mar 19
Agentic Business Process Management: A Research Manifesto Ignore
A conceptual framework for governing autonomous agents in business processes, focusing on alignment and operational autonomy.
Agents Mar 19
Neural Galerkin Normalizing Flow for Transition Probability Density Functions of Diffusion Models Ignore
A theoretical framework for approximating diffusion model transition probabilities by solving Fokker-Planck equations using Neural Galerkin Normalizing Flows.
Diffusion Models Mar 19
I Can't Believe It's Corrupt: Evaluating Corruption in Multi-Agent Governance Systems Ignore
This paper investigates corruption in multi-agent AI governance systems, finding that institutional design is a stronger driver of rule-breaking than model identity, suggesting a need for pre-deployment stress testing.
Multi-Agent Systems Mar 19
Through the Looking-Glass: AI-Mediated Video Communication Reduces Interpersonal Trust and Confidence in Judgments Ignore
This research investigates how AI mediation in video communication impacts interpersonal trust and judgment confidence, finding a decline in both without affecting actual lie detection accuracy.
Human-Computer Interaction Mar 19
Conflict-Based Search for Multi Agent Path Finding with Asynchronous Actions Ignore
A theoretically complete and scalable algorithm for multi-agent path finding with asynchronous actions.
Multi-Agent Path Finding Mar 19
Why Better Cross-Lingual Alignment Fails for Better Cross-Lingual Transfer: Case of Encoders Ignore
This research investigates why explicit cross-lingual alignment techniques often fail to improve downstream task performance, providing insights into optimizing alignment and fine-tuning strategies.
Cross-Lingual NLP Mar 19
Seasoning Generative Models for a Generalization Aftertaste Ignore
A theoretical framework to improve the generalization of any generative model using discriminators, with potential for new algorithms.
Generative AI Theory Mar 19
"You've got a friend in me": Co-Designing a Peer Social Robot for Young Newcomers' Language and Cultural Learning Ignore
A co-designed social robot to assist young newcomers with language and cultural learning in community literacy programs.
Robotics Mar 19
Secure Wi-Fi Ranging Today: Security and Adoption of IEEE 802.11az/bk Ignore
This paper analyzes the security and deployability of IEEE 802.11az/bk secure Wi-Fi ranging, identifying vulnerabilities and providing guidelines for improvement.
Wi-Fi Security Mar 19
Multimodal Model for Computational Pathology: Representation Learning and Image Compression Ignore
A review of multimodal AI for computational pathology, focusing on representation learning and image compression for improved diagnostic accuracy.
Medical AI Mar 19
An Onto-Relational-Sophic Framework for Governing Synthetic Minds Ignore
A philosophical framework for governing increasingly capable synthetic minds, moving beyond tool-centric regulations.
AI Governance Mar 19 Code
Evaluating Counterfactual Strategic Reasoning in Large Language Models Ignore
This paper evaluates LLMs' strategic reasoning in game theory, highlighting limitations in counterfactual scenarios.
LLM Reasoning Mar 19
Rigorous Error Certification for Neural PDE Solvers: From Empirical Residuals to Solution Guarantees Ignore
This paper provides theoretical guarantees for the accuracy of neural network solutions to partial differential equations by connecting residual error to solution-space error.
Scientific AI Mar 19
Hierarchical Latent Structure Learning through Online Inference Ignore
A computational framework for discovering hierarchical structure in sequential data through online inference.
Bayesian Inference Mar 19
On The Effectiveness of the UK NIS Regulations as a Mandatory Cybersecurity Reporting Regime Ignore
This paper analyzes the effectiveness of UK cybersecurity reporting regulations using real-world incident data to inform policy.
Cybersecurity Policy Analysis Mar 19
Hardness of High-Dimensional Linear Classification Ignore
Establishes theoretical lower bounds for linear classification problems, closing a gap in understanding computational complexity.
Theoretical ML/Geometry Mar 19
Man and machine: artificial intelligence and judicial decision making Ignore
This paper reviews the integration of AI in judicial decision-making, highlighting concerns and research gaps in risk assessment tools and human-AI interaction.
AI in Legal Tech Mar 19
Foundations of Schrödinger Bridges for Generative Modeling Ignore
A theoretical framework unifying generative modeling approaches by framing distribution transformation as an optimal stochastic bridge problem.
Generative Modeling Foundations Mar 19
Teleological Inference in Structural Causal Models via Intentional Interventions Ignore
This paper introduces a new framework for understanding agent intentions within causal systems, enabling empirical detection and discovery of goals.
Causal Inference Mar 19
Unified Taxonomy for Multivariate Time Series Anomaly Detection using Deep Learning Ignore
A unified taxonomy for deep learning-based multivariate time series anomaly detection to systematize research and identify trends.
Time Series Anomaly Detection Mar 19
Security, privacy, and agentic AI in a regulatory view: From definitions and distinctions to provisions and reflections Ignore
This paper analyzes EU regulatory provisions for AI, focusing on security, privacy, and agentic AI to inform policymakers and developers on compliance and governance.
AI Regulation Mar 19
Uniform a priori bounds and error analysis for the Adam stochastic gradient descent optimization method Ignore
Provides theoretical guarantees for the Adam optimizer, improving understanding of deep learning training.
LLM Training Mar 19
Geography According to ChatGPT -- How Generative AI Represents and Reasons about Geography Ignore
This paper explores how AI systems represent and reason about geography, highlighting potential biases and limitations in their understanding of spatial concepts.
AI Reasoning Mar 19
Student views in AI Ethics and Social Impact Ignore
This research explores student perspectives on AI ethics and social impact to inform future AI education.
AI Ethics Education Mar 19
SRRM: Improving Recursive Transport Surrogates in the Small-Discrepancy Regime Ignore
This paper theoretically analyzes and improves a statistical method for approximating the Wasserstein distance, with no immediate product application indicated.
Statistical Machine Learning Mar 19
Cognitive Amplification vs Cognitive Delegation in Human-AI Systems: A Metric Framework Ignore
A framework to measure if AI amplifies human intelligence or makes us dependent on it, guiding sustainable AI system design.
Human-AI Interaction Mar 19
A Theoretical Comparison of No-U-Turn Sampler Variants: Necessary and Sufficient Convergence Conditions and Mixing Time Analysis under Gaussian Targets Ignore
This paper provides a theoretical analysis of No-U-Turn Sampler variants, offering convergence guarantees and mixing time results for Gaussian targets.
Bayesian Inference Mar 19
A Complexity Hierarchy of Shuffles in Card-Based Protocols Ignore
This paper introduces a complexity hierarchy for shuffles in card-based cryptographic protocols to evaluate their practical implementation.
Cryptography Mar 19
Authority-Level Priors: An Under-Specified Constraint in Hierarchical Predictive Processing Ignore
A theoretical framework for understanding how the brain regulates stress and behavior under uncertainty by introducing 'Authority-Level Priors' to constrain hypothesis selection for control.
Cognitive Science / Neuroscience AI Mar 19
Proceedings of the 2nd Workshop on Advancing Artificial Intelligence through Theory of Mind Ignore
This volume collects research papers on the intersection of Artificial Intelligence and Theory of Mind.
AI Theory of Mind Mar 19
Memento-Skills: Let Agents Design Agents Ignore
Memento-Skills explores agent-based AI designing other agents, focusing on theoretical frameworks without product-ready applications.
Agent Design Mar 19 Pending