HiSpatial: Taming Hierarchical 3D Spatial Understanding in Vision-Language Models Build Now
HiSpatial provides state-of-the-art 3D spatial intelligence for vision-language models, suitable for enhancing autonomous systems and smart environments.
3D Spatial Intelligence Mar 26 Code High viability
PackForcing: Short Video Training Suffices for Long Video Sampling and Long Context Inference Build Now
PackForcing: Efficient long-video generation using short-video training with reduced memory footprint and improved temporal coherence.
Video Technology Mar 26 Pending High viability
LEMMA: Laplacian pyramids for Efficient Marine SeMAntic Segmentation Build Now
Deploy a lightweight semantic segmentation model for real-time marine environment analysis on resource-constrained devices.
Semantic Segmentation Mar 26 Code High viability
LaMP: Learning Vision-Language-Action Policies with 3D Scene Flow as Latent Motion Prior Build Now
LaMP is a robotic manipulation framework that uses 3D scene flow as a latent motion prior to align vision, language, and action, outperforming existing models by integrating geometric foresight into control policies.
Robotics and Automation Mar 26 Code High viability
AD-CARE: A Guideline-grounded, Modality-agnostic LLM Agent for Real-world Alzheimer's Disease Diagnosis with Multi-cohort Assessment, Fairness Analysis, and Reader Study Build Now
AD-CARE: An AI-powered, guideline-driven agent for enhancing Alzheimer's diagnosis accuracy and clinical efficiency through modality-agnostic multi-modal data integration.
Healthcare AI Mar 26 Code High viability
LILAC: Language-Conditioned Object-Centric Optical Flow for Open-Loop Trajectory Generation Build Now
LILAC translates natural language into robotic actions using optical flow for efficient task execution.
Robotics and Automation Mar 26 Code High viability
GridVAD: Open-Set Video Anomaly Detection via Spatial Reasoning over Stratified Frame Grids Build Now
GridVAD leverages natural-language anomaly proposals for zero-shot video anomaly detection, achieving state-of-the-art performance without task-specific training.
Video Anomaly Detection Mar 26 Code High viability
PICon: A Multi-Turn Interrogation Framework for Evaluating Persona Agent Consistency Watch
A framework for validating persona agent consistency across multiple dimensions for reliable human simulation.
AI Evaluation Framework Mar 26 Code
Adaptive Learned Image Compression with Graph Neural Networks Build Now
A novel graph neural network approach for adaptive image compression that significantly outperforms state-of-the-art methods by modeling spatially varying redundancy.
Image Compression Mar 26 Pending High viability
Separate Before You Compress: The WWHO Tokenization Architecture Build Now
A novel tokenization architecture and algorithm that significantly reduces token count and inference costs for complex scripts, unlocking LLM potential for the Global South.
LLM Tokenization Mar 26 Pending High viability
Prompt Attack Detection with LLM-as-a-Judge and Mixture-of-Models Build Now
Leveraging lightweight LLMs as low-latency judges to secure public chatbots against prompt attacks in real-time production environments.
LLM Security Mar 26 Code High viability
Trace2Skill: Distill Trajectory-Local Lessons into Transferable Agent Skills Build Now
Automatically distill transferable agent skills from execution experience, enabling LLM agents to tackle complex tasks without parameter updates.
LLM Agents Mar 26 Code High viability
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Build Now
ShotStream enables real-time, interactive multi-shot video generation for storytelling by using a novel causal architecture with dual-cache memory and a two-stage distillation strategy.
Generative Video Mar 26 Pending High viability
Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting Build Now
A feed-forward 3D rendering framework that enables high-fidelity 4K novel view synthesis by predicting compact Gaussian primitives and per-primitive textures, overcoming resolution scaling barriers.
3D Rendering Mar 26 Code High viability
MuRF: Unlocking the Multi-Scale Potential of Vision Foundation Models Build Now
Enhance existing vision foundation models with multi-resolution processing for improved performance across various tasks without retraining.
Vision Foundation Models Mar 26 Pending High viability
RefAlign: Representation Alignment for Reference-to-Video Generation Build Now
A framework to improve identity consistency and reduce artifacts in reference-to-video generation by explicitly aligning visual features.
Generative Video Mar 26 Code High viability
Vega: Learning to Drive with Natural Language Instructions Build Now
A vision-language-action model that learns to drive by following natural language instructions, enabling personalized autonomous driving experiences.
Autonomous Driving Mar 26 Pending High viability
Drive My Way: Preference Alignment of Vision-Language-Action Model for Personalized Driving Build Now
A personalized vision-language-action driving framework that adapts to individual driver habits and real-time natural language instructions.
Personalized Autonomous Driving Mar 26 Code High viability
PSDesigner: Automated Graphic Design with a Human-Like Creative Workflow Build Now
An automated graphic design system that emulates human creative workflows to produce production-quality designs from user instructions.
Generative Design Mar 26 Code High viability
MegaFlow: Zero-Shot Large Displacement Optical Flow Build Now
A zero-shot optical flow model leveraging pre-trained vision priors for accurate large displacement estimation and long-range point tracking.
Computer Vision Mar 26 Pending High viability
How good was my shot? Quantifying Player Skill Level in Table Tennis Build Now
Quantify player skill in table tennis by learning generative models of tactical strokes and embedding them in a latent space that reflects individual characteristics.
Sports Analytics Mar 26 Code High viability
Training the Knowledge Base through Evidence Distillation and Write-Back Enrichment Build Now
Enhance retrieval-augmented generation systems by dynamically distilling and enriching the knowledge base for improved factual accuracy and performance.
RAG Optimization Mar 26 Code High viability
Unleashing Guidance Without Classifiers for Human-Object Interaction Animation Build Now
A data-driven approach to generate realistic human-object interaction animations by leveraging the denoising process itself for guidance, eliminating the need for explicit classifiers.
Generative Animation Mar 26 Code High viability
SlotVTG: Object-Centric Adapter for Generalizable Video Temporal Grounding Build Now
A lightweight adapter for multimodal LLMs that enables precise object-centric temporal grounding in videos with improved out-of-domain generalization.
Video Understanding Mar 26 Code High viability
BizGenEval: A Systematic Benchmark for Commercial Visual Content Generation Build Now
A new benchmark for evaluating and improving AI image generation for commercial design tasks, revealing significant gaps in current models.
Generative Image Mar 26 Code High viability
PixelSmile: Toward Fine-Grained Facial Expression Editing Build Now
A diffusion framework for precise, controllable, and fine-grained facial expression editing with strong identity preservation.
Generative Image Editing Mar 26 Pending High viability
Back to Basics: Revisiting ASR in the Age of Voice Agents Build Now
WildASR provides a diagnostic benchmark and analytical tools to improve the reliability and safety of Automatic Speech Recognition systems in real-world voice agents.
ASR Mar 26 Code High viability
AnyHand: A Large-Scale Synthetic Dataset for RGB(-D) Hand Pose Estimation Build Now
A large-scale synthetic dataset and lightweight module to significantly improve 3D hand pose estimation accuracy and generalization for RGB and RGB-D inputs.
Computer Vision Mar 26 Code High viability
SoftMimicGen: A Data Generation System for Scalable Robot Learning in Deformable Object Manipulation Build Now
An automated data generation pipeline for deformable object manipulation in robotics, enabling scalable robot learning.
Robot Learning Data Generation Mar 26 Code High viability
Natural-Language Agent Harnesses Build Now
Externalize agent harness logic into editable natural language for improved transferability and study, powered by an intelligent runtime.
Agents Mar 26 Code High viability
No Hard Negatives Required: Concept Centric Learning Leads to Compositionality without Degrading Zero-shot Capabilities of Contrastive Models Build Now
A new method for vision-language models that improves compositional understanding without sacrificing zero-shot performance, with code available.
Vision-Language Models Mar 26 Pending High viability
Agent Factories for High Level Synthesis: How Far Can General-Purpose Coding Agents Go in Hardware Optimization? Build Now
Autonomous coding agents that significantly accelerate hardware design optimization by intelligently decomposing and exploring complex configurations.
Hardware Optimization Agents Mar 26 Code High viability
Out of Sight but Not Out of Mind: Hybrid Memory for Dynamic Video World Models Build Now
A novel hybrid memory system for video world models that tracks dynamic subjects even when they are out of view, ensuring motion continuity and realistic simulation.
Video World Models Mar 26 Code High viability
Seeing to Ground: Visual Attention for Hallucination-Resilient MDLLMs Build Now
A training-free framework that uses visual attention to reduce hallucinations in multimodal large language models by keeping text generation visually grounded.
Multimodal LLMs Mar 26 Code High viability
TRACE: Object Motion Editing in Videos with First-Frame Trajectory Guidance Build Now
A framework for easily editing object trajectories in videos by specifying a path in a single frame, producing coherent and realistic edits.
Video Editing Mar 26 Code High viability
Wan-Weaver: Interleaved Multi-modal Generation via Decoupled Training Build Now
Wan-Weaver enables interleaved text and image generation by decoupling planning and visualization, achieving state-of-the-art performance without real interleaved data.
Multi-modal Generation Mar 26 Code High viability
S2D2: Fast Decoding for Diffusion LLMs via Training-Free Self-Speculation Build Now
Accelerate diffusion LLM generation with training-free self-speculative decoding, achieving significant speedups with improved accuracy.
LLM Decoding Mar 26 Pending High viability
The Kitchen Loop: User-Spec-Driven Development for a Self-Evolving Codebase Watch
A framework for autonomous, self-evolving software development using LLM agents and rigorous testing to ensure quality and prevent regressions.
LLM Agents Mar 26 High viability
Just Zoom In: Cross-View Geo-Localization via Autoregressive Zooming Build Now
Autoregressive zooming over satellite imagery for precise GPS-denied localization, outperforming existing retrieval methods.
Geo-localization Mar 26 Code High viability
Persistent Robot World Models: Stabilizing Multi-Step Rollouts via Reinforcement Learning Build Now
Reinforcement learning for robot world models to enable stable, long-term visual prediction for simulation.
Robotics Simulation Mar 26 Code High viability
Longitudinal Digital Phenotyping for Early Cognitive-Motor Screening Build Now
An AI framework using tablet interactions to continuously screen for early cognitive-motor developmental issues in children, identifying persistent deficits for timely intervention.
Medical AI Mar 26 Code High viability
Can Users Specify Driving Speed? Bench2Drive-Speed: Benchmark and Baselines for Desired-Speed Conditioned Autonomous Driving Build Now
Enabling customizable speed and overtaking for autonomous vehicles by leveraging existing driving data.
Autonomous Driving Mar 26 Pending High viability
Uncertainty-Guided Label Rebalancing for CPS Safety Monitoring Build Now
A novel approach to rebalance imbalanced safety data in Cyber-Physical Systems by leveraging behavioral uncertainty, significantly improving safety predictor performance.
CPS Safety Monitoring Mar 26 Code High viability
Fast-dVLA: Accelerating Discrete Diffusion VLA to Real-Time Performance Build Now
Accelerate robot adaptation by decoupling and merging training objectives for enhanced capabilities with reduced computational cost.
Robotics AI Mar 26 Code High viability
Designing Any Imaging System from Natural Language: Agent-Constrained Composition over a Finite Primitive Basis Build Now
Automate the design of complex imaging systems from natural language, overcoming expertise bottlenecks and accelerating scientific prototyping.
AI for Scientific Instrument Design Mar 26 Code High viability
Anchored-Branched Steady-state WInd Flow Transformer (AB-SWIFT): a metamodel for 3D atmospheric flow in urban environments Build Now
A transformer-based metamodel for accurate and efficient 3D atmospheric flow simulation in urban environments, outperforming existing methods.
Environmental AI Mar 26 Pending High viability
LanteRn: Latent Visual Structured Reasoning Build Now
A framework enabling large multimodal models to perform efficient visual reasoning directly in latent space, improving fine-grained understanding.
Multimodal Reasoning Mar 26 Code High viability
Social Hippocampus Memory Learning Build Now
A memory-centric social machine learning framework enabling privacy-preserving collaboration among heterogeneous agents by sharing abstracted knowledge instead of model parameters.
Federated Learning Mar 26 Code High viability
DeepFAN, a transformer-based deep learning model for human-artificial intelligence collaborative assessment of incidental pulmonary nodules in CT scans: a multi-reader, multi-case trial Build Now
A transformer-based AI model that significantly improves radiologist performance in assessing pulmonary nodules, with clinical trial validation.
Medical AI Mar 26 Code High viability
Spatiotemporal System Forecasting with Irregular Time Steps via Masked Autoencoder Build Now
A masked autoencoder for accurate spatiotemporal forecasting of systems with irregular time steps, applicable to climate and fluid dynamics.
Spatiotemporal Forecasting Mar 26 Code High viability
Towards Generalizable Robotic Data Flywheel: High-Dimensional Factorization and Composition Build Now
A framework for structured data factorization and iterative learning to significantly improve robotic model generalization with fewer demonstrations.
Robotics Data Flywheel Mar 26 Code High viability
UNIC: Neural Garment Deformation Field for Real-time Clothed Character Animation Build Now
Real-time neural deformation field for animating complex garment meshes in virtual environments.
3D Animation Mar 26 Code High viability
Hierarchy-Guided Multimodal Representation Learning for Taxonomic Inference Build Now
A hierarchy-aware multimodal AI that accurately identifies species from imperfect image and DNA data, crucial for conservation and environmental monitoring.
Biodiversity AI Mar 26 Code High viability
TAAC: A gate into Trustable Audio Affective Computing Build Now
A framework for trustable audio-based depression diagnosis that encrypts sensitive user identity information while maintaining high diagnostic accuracy.
Medical AI Mar 26 Code High viability
GeoHeight-Bench: Towards Height-Aware Multimodal Reasoning in Remote Sensing Build Now
A new benchmark and baseline model for height-aware reasoning in remote sensing, addressing a critical gap in current multimodal models.
Remote Sensing AI Mar 26 Code High viability
An Integrative Genome-Scale Metabolic Modeling and Machine Learning Framework for Predicting and Optimizing Biofuel-Relevant Biomass Production in Saccharomyces cerevisiae Build Now
A machine learning framework integrating metabolic modeling and optimization to predict and dramatically enhance biofuel-relevant biomass production in yeast.
Biotech AI Mar 26 Code High viability
Towards Comprehensive Real-Time Scene Understanding in Ophthalmic Surgery through Multimodal Image Fusion Build Now
A multimodal AI system fuses operating microscope and OCT images for real-time, precise instrument tracking and tool-tissue distance estimation in ophthalmic surgery.
Medical AI Mar 26 Code High viability
A multilingual text-to-speech model that clones voices from just 3 seconds of audio, outperforming existing solutions in naturalness and expressivity.
Text-to-Speech Mar 26 High viability
Towards Embodied AI with MuscleMimic: Unlocking full-body musculoskeletal motor learning at scale Build Now
A framework for scalable, physiologically realistic full-body musculoskeletal motor learning, enabling rapid training and fine-tuning of human-like movements.
Robotics Mar 26 Pending High viability
PAWS: Perception of Articulation in the Wild at Scale from Egocentric Videos Build Now
Extracts 3D object articulations from unannotated egocentric videos to enable robotics and animation applications.
3D Scene Understanding Mar 26 Code High viability
Missing-Aware Multimodal Fusion for Unified Microservice Incident Management Build Now
A self-supervised framework for robust microservice incident management that handles missing data, improving anomaly detection, failure triage, and root cause localization.
Microservice Incident Management Mar 26 Code High viability
Beyond the Golden Data: Resolving the Motion-Vision Quality Dilemma via Timestep Selective Training Build Now
A novel training method for video generation models that decouples motion and visual quality, enabling superior performance even with imperfect data.
Generative Video Mar 26 Code High viability
CHIRP dataset: towards long-term, individual-level, behavioral monitoring of bird populations in the wild Build Now
A new dataset and method for individual bird re-identification and behavior monitoring in the wild, outperforming state-of-the-art methods.
Animal Behavior Monitoring Mar 26 Code High viability
Lightweight GenAI for Network Traffic Synthesis: Fidelity, Augmentation, and Classification Build Now
Lightweight GenAI models generate realistic network traffic for data augmentation and classification, overcoming data scarcity and privacy concerns with high fidelity and efficiency.
Network Traffic Synthesis Mar 26 Code High viability
RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models Build Now
A state-of-the-art open-source image restoration model trained on a large-scale dataset, outperforming existing methods and closing the gap with closed-source alternatives.
Image Restoration Mar 26 Code High viability
Unveiling the Resilience of LLM-Enhanced Search Engines against Black-Hat SEO Manipulation Build Now
A benchmark of LLM-enhanced search engines under sophisticated black-hat SEO manipulation that identifies their vulnerabilities and offers insights for building more resilient AI search systems.
LLM Security Mar 26 Code High viability
Knowledge-Guided Failure Prediction: Detecting When Object Detectors Miss Safety-Critical Objects Build Now
A runtime monitoring framework for object detectors that predicts failures by measuring semantic misalignment between internal features and foundation model embeddings, significantly improving recall for safety-critical objects.
Computer Vision Mar 26 Code High viability
EcoThink: A Green Adaptive Inference Framework for Sustainable and Accessible Agents Build Now
EcoThink is an adaptive inference framework that significantly reduces LLM energy consumption for generative AI agents without sacrificing performance, enabling sustainable and accessible AI.
LLM Inference Optimization Mar 26 Code High viability
AdaSFormer: Adaptive Serialized Transformers for Monocular Semantic Scene Completion from Indoor Environments Build Now
A novel transformer architecture for accurate 3D scene reconstruction from single indoor images, outperforming existing methods.
3D Scene Understanding Mar 26 Pending High viability
Causal-INSIGHT: Probing Temporal Models to Extract Causal Structure Build Now
A model-agnostic framework to extract directed temporal influence structures from pre-trained time series models, improving interpretability and delay localization.
Causal Inference Mar 26 Code High viability
Maximum Entropy Behavior Exploration for Sim2Real Zero-Shot Reinforcement Learning Build Now
A zero-shot reinforcement learning algorithm that enables direct deployment of policies to real-world robots without finetuning by maximizing exploration entropy.
Reinforcement Learning Mar 26 Code High viability
CIAR: Interval-based Collaborative Decoding for Image Generation Acceleration Build Now
CIAR accelerates image generation by intelligently offloading computation to the device, reducing cloud requests by 70% and achieving 2.18x speed-up.
Image Generation Acceleration Mar 26 Code High viability
Temporally Decoupled Diffusion Planning for Autonomous Driving Build Now
A diffusion model that generates safer and more goal-aligned autonomous driving trajectories by decoupling near-term and long-term planning.
Autonomous Driving Planning Mar 26 Code High viability
Cross-Model Disagreement as a Label-Free Correctness Signal Build Now
A training-free system that uses a second language model to detect confident errors in a primary model, improving LLM safety and reliability.
LLM Safety & Monitoring Mar 26 Code High viability
DC-Reg: Globally Optimal Point Cloud Registration via Tight Bounding with Difference of Convex Programming Build Now
A globally optimal point cloud registration framework that significantly tightens bounding for faster and more robust results, even with extreme noise and outliers.
Point Cloud Registration Mar 26 Code High viability
From Manipulation to Mistrust: Explaining Diverse Micro-Video Misinformation for Robust Debunking in the Wild Build Now
A multi-agent reasoning framework and benchmark for explaining diverse micro-video misinformation, enabling robust debunking.
Misinformation Detection Mar 26 Pending High viability
VideoWeaver: Multimodal Multi-View Video-to-Video Transfer for Embodied Agents Build Now
VideoWeaver enables robot policies to transfer to new environments by performing consistent multi-view video translation, overcoming limitations of single-view approaches.
Embodied AI Mar 26 Code High viability
TAPO: Translation Augmented Policy Optimization for Multilingual Mathematical Reasoning Build Now
A reinforcement learning framework that uses English as a pivot to significantly improve multilingual mathematical reasoning in LLMs by decoupling understanding from reasoning.
Multilingual LLM Reasoning Mar 26 Code High viability
Beyond Content Safety: Real-Time Monitoring for Reasoning Vulnerabilities in Large Language Models Build Now
A real-time monitor for LLM reasoning safety that detects and interrupts unsafe reasoning steps, improving security and reliability.
LLM Safety Mar 26 Code High viability
MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation Build Now
A unified diffusion model for robot manipulation that generates future observations and actions in parallel, improving long-horizon consistency and performance.
Robotics Control Mar 26 Code High viability
Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models Build Now
Algorithmic side-channel attacks on on-device vision-language models that infer sensitive user context, informing proactive security measures for edge AI.
Edge AI Security Mar 26 Code High viability
PMT: Plain Mask Transformer for Image and Video Segmentation with Frozen Vision Encoders Build Now
A fast Transformer-based segmentation decoder that leverages frozen vision foundation models for efficient image and video segmentation.
Image and Video Segmentation Mar 26 Pending High viability
UMBRELLA: Uncertainty-aware Multi-robot Reactive Coordination under Dynamic Temporal Logic Tasks Build Now
A framework for multi-robot coordination that handles dynamic tasks and uncertainties, improving efficiency and reliability.
Multi-Robot Coordination Mar 26 Code High viability
ALPS: Automated Least-Privilege Enforcement for Securing Serverless Functions Build Now
ALPS automates least-privilege enforcement for serverless functions, reducing security risks and improving permission management across cloud providers.
Cloud Security Mar 26 Code High viability
FSGNet: A Frequency-Aware and Semantic Guidance Network for Infrared Small Target Detection Build Now
A lightweight and effective network for infrared small target detection that uses frequency and semantic guidance to improve precision and efficiency.
Computer Vision Mar 26 Pending High viability
Multimodal Dataset Distillation via Phased Teacher Models Build Now
A phased distillation framework that creates compact synthetic datasets from large image-text data, significantly improving student model performance and reducing storage.
Multimodal AI Mar 26 Pending High viability
CLIP-RD: Relational Distillation for Efficient CLIP Knowledge Distillation Build Now
A novel relational distillation framework to create efficient, lightweight CLIP models that preserve structural relationships, outperforming existing methods.
Vision-Language Models Mar 26 Code High viability
IntentReact: Guiding Reactive Object-Centric Navigation via Topological Intent Build Now
A novel framework for robot navigation that bridges global topological planning with local perception for more efficient and robust object-goal navigation.
Robotics Navigation Mar 26 Code High viability
Bayesian Learning-Enhanced Navigation with Deep Smoothing for Inertial-Aided Navigation Build Now
A data-driven framework that uses Bayesian learning and deep smoothing to significantly improve the accuracy of inertial-aided navigation systems for robotics and mapping.
Robotics Navigation Mar 26 Code High viability
InstanceAnimator: Multi-Instance Sketch Video Colorization Build Now
A novel framework for flexible and controllable multi-instance sketch video colorization with enhanced detail fidelity.
Generative Video Mar 26 Code High viability
4OPS: Structural Difficulty Modeling in Integer Arithmetic Puzzles Build Now
Develops a novel method for precisely modeling and predicting the difficulty of integer arithmetic puzzles, enabling personalized learning experiences.
Educational AI Mar 26 Code High viability
SafeGuard ASF: SR Agentic Humanoid Robot System for Autonomous Industrial Safety Build Now
An agentic humanoid robot system for autonomous industrial safety, detecting fires, abnormal temperatures, and intruders.
Robotics Mar 26 Code High viability
From Intent to Evidence: A Categorical Approach for Structural Evaluation of Deep Research Agents Build Now
A new benchmark and theoretical framework for evaluating deep research agents, revealing significant gaps in their ability to perform complex structural synthesis.
Agents Mar 26 Pending High viability
Large Language Model as Token Compressor and Decompressor Build Now
Leverage off-the-shelf LLMs to compress long texts into compact latent codes, enabling token-efficient long-context reasoning and generation.
LLM Compression Mar 26 Code High viability
HeSS: Head Sensitivity Score for Sparsity Redistribution in VGGT Build Now
A novel method to significantly accelerate 3D vision transformers by intelligently redistributing attention sparsity based on head sensitivity, reducing computational cost without sacrificing accuracy.
3D Vision Acceleration Mar 26 Pending High viability
Adaptive Chunking: Optimizing Chunking-Method Selection for RAG Build Now
A framework that adaptively optimizes document chunking for RAG systems, significantly improving answer correctness and question success rates without changing models or prompts.
RAG Optimization Mar 26 Pending High viability
MACRO: Advancing Multi-Reference Image Generation with Structured Long-Context Data Build Now
A new dataset and benchmark for multi-reference image generation that overcomes current model limitations by providing structured, long-context supervision.
Generative Image Mar 26 Code High viability
Towards Controllable Low-Light Image Enhancement: A Continuous Multi-illumination Dataset and Efficient State Space Framework Build Now
A controllable low-light image enhancement framework with a new dataset and efficient state space model architecture.
Image Enhancement Mar 26 Code High viability
DAGverse: Building Document-Grounded Semantic DAGs from Scientific Papers Build Now
A framework and dataset for automatically extracting structured knowledge graphs from scientific papers, enabling deeper document understanding and reasoning.
Document Understanding Mar 26 Code High viability
CSI-tuples-based 3D Channel Fingerprints Construction Assisted by MultiModal Learning Build Now
A multimodal AI framework that constructs 3D channel fingerprints for enhanced 6G communication by fusing diverse data sources, outperforming existing methods by over 27.5% in accuracy.
6G Communications Mar 26 Code High viability
SliderQuant: Accurate Post-Training Quantization for LLMs Build Now
SliderQuant offers accurate post-training quantization for LLMs by adaptively quantizing layers based on their sensitivity, outperforming existing methods across various tasks and models.
LLM Optimization Mar 26 Pending High viability
A Gait Foundation Model Predicts Multi-System Health Phenotypes from 3D Skeletal Motion Watch
A foundation model for 3D skeletal motion predicts multi-system health phenotypes, enabling gait to be used as a scalable, passive vital sign.
Health AI Mar 26 High viability
V2U4Real: A Real-world Large-scale Dataset for Vehicle-to-UAV Cooperative Perception Build Now
A large-scale real-world dataset and benchmarks for Vehicle-to-UAV cooperative perception, addressing limitations of ground-level autonomous vehicle sensing.
Autonomous Vehicle Perception Mar 26 Pending High viability
When Hate Meets Facts: LLMs-in-the-Loop for Check-worthiness Detection in Hate Speech Build Now
An LLM-in-the-loop framework and dataset to improve hate speech detection by assessing claim check-worthiness, reducing moderator effort and improving model accuracy.
Content Moderation AI Mar 26 Code High viability
CRAFT: Grounded Multi-Agent Coordination Under Partial Information Build Now
A benchmark for evaluating and improving pragmatic communication and coordination in large language models for complex tasks.
Multi-Agent Coordination Mar 26 Pending High viability
EagleNet: Energy-Aware Fine-Grained Relationship Learning Network for Text-Video Retrieval Build Now
EagleNet enhances text-video retrieval by learning fine-grained relationships between text and video frames, leading to more accurate context-aware embeddings.
Text-Video Retrieval Mar 26 Pending High viability
ViewSplat: View-Adaptive Dynamic Gaussian Splatting for Feed-Forward Synthesis Build Now
ViewSplat enables high-fidelity, real-time 3D scene reconstruction from unposed images by dynamically adapting Gaussian splatting to each viewpoint.
3D Reconstruction Mar 26 Code High viability
Towards Practical Lossless Neural Compression for LiDAR Point Clouds Build Now
A novel neural compression framework for LiDAR point clouds that achieves real-time speed with competitive compression performance.
LiDAR Compression Mar 26 Pending High viability
Hyperspectral Trajectory Image for Multi-Month Trajectory Anomaly Detection Build Now
A novel vision-based approach for multi-month trajectory anomaly detection that significantly outperforms existing methods and offers substantial speedups.
Trajectory Anomaly Detection Mar 26 Code High viability
MolQuest: A Benchmark for Agentic Evaluation of Abductive Reasoning in Chemical Structure Elucidation Build Now
MolQuest provides a novel agent-based benchmark to evaluate and improve LLMs' abductive reasoning for complex scientific tasks like chemical structure elucidation, revealing significant performance gaps in current frontier models.
AI for Scientific Discovery Mar 26 Code High viability
Activation Matters: Test-time Activated Negative Labels for OOD Detection with Vision-Language Models Build Now
A training-free method for out-of-distribution detection that dynamically selects negative labels based on test-time activation to significantly improve accuracy.
Out-of-Distribution Detection Mar 26 Pending High viability
FEAST: Fully Connected Expressive Attention for Spatial Transcriptomics Build Now
FEAST uses a novel attention mechanism and off-grid sampling to infer spatial gene expression from whole slide images, overcoming limitations of existing graph neural networks and providing biologically plausible insights.
Medical AI Mar 26 Pending High viability
Offline Decision Transformers for Neural Combinatorial Optimization: Surpassing Heuristics on the Traveling Salesman Problem Build Now
Leveraging offline reinforcement learning to outperform classical heuristics on complex combinatorial optimization problems like the Traveling Salesman Problem.
Combinatorial Optimization Mar 26 Code High viability
Training-free Detection and 6D Pose Estimation of Unseen Surgical Instruments Build Now
A training-free system for accurate 6D pose estimation of unseen surgical instruments using only CAD models, enabling robust tracking in clinical environments.
Surgical AI Mar 26 Code High viability
WebTestBench: Evaluating Computer-Use Agents towards End-to-End Automated Web Testing Build Now
A benchmark and baseline framework for evaluating and improving end-to-end automated web testing agents, addressing critical gaps in test completeness and reliability for industrial deployment.
Agents Mar 26 Pending High viability
SDD-YOLO: A Small-Target Detection Framework for Ground-to-Air Anti-UAV Surveillance with Edge-Efficient Deployment Build Now
A highly efficient small-target detection framework for anti-UAV surveillance, optimized for edge deployment with a new dataset and improved accuracy.
Computer Vision Mar 26 Code High viability
A Wireless World Model for AI-Native 6G Networks Build Now
A foundation model that predicts wireless channel evolution by understanding 3D geometry and signal dynamics, enabling physics-aware 6G intelligence.
AI for Networks Mar 26 Code High viability
Free-Lunch Long Video Generation via Layer-Adaptive O.O.D Correction Build Now
A training-free framework that significantly improves long video generation quality from pre-trained short-clip models by adaptively correcting out-of-distribution issues.
Generative Video Mar 26 Pending High viability
Probabilistic Concept Graph Reasoning for Multimodal Misinformation Detection Build Now
An interpretable framework for detecting multimodal misinformation by reasoning over discovered concepts, outperforming existing methods.
Multimodal AI Mar 26 Code High viability
CIV-DG: Conditional Instrumental Variables for Domain Generalization in Medical Imaging Build Now
A causal framework using conditional instrumental variables to achieve robust domain generalization in medical imaging by disentangling pathological semantics from site-specific artifacts.
Medical AI Mar 26 Code High viability
SafeMath: Inference-time Safety improves Math Accuracy Build Now
A safety alignment technique that reduces harmful LLM outputs in mathematical reasoning without sacrificing accuracy, supported by a new dataset and released code.
LLM Safety Mar 26 Pending High viability
A Decade-Scale Benchmark Evaluating LLMs' Clinical Practice Guidelines Detection and Adherence in Multi-turn Conversations Build Now
A benchmark to evaluate and improve LLM adherence to clinical practice guidelines, crucial for safe healthcare deployment.
Medical AI Mar 26 Code High viability
CardioDiT: Latent Diffusion Transformers for 4D Cardiac MRI Synthesis Build Now
A 4D latent diffusion transformer for synthesizing realistic cardiac MRI sequences, improving temporal consistency and physiological accuracy.
Medical AI Mar 26 Pending High viability
AnyID: Ultra-Fidelity Universal Identity-Preserving Video Generation from Any Visual References Build Now
AnyID enables ultra-fidelity video generation from any visual reference, offering precise attribute control for creative expression.
Generative Video Mar 26 Code High viability
Knowledge-Guided Retrieval-Augmented Generation for Zero-Shot Psychiatric Data: Privacy Preserving Synthetic Data Generation Build Now
Generate privacy-preserving synthetic psychiatric data using LLMs guided by clinical knowledge, overcoming limitations of real data access.
Synthetic Data Generation Mar 26 Code High viability
Train at Moving Edge: Online-Verified Prompt Selection for Efficient RL Training of Large Reasoning Model Build Now
A dual-stage framework that efficiently selects high-utility prompts for reinforcement learning in large language models, reducing computational costs without sacrificing performance.
LLM Training Mar 26 Code High viability
Cross-Preference Learning for Sentence-Level and Context-Aware Machine Translation Build Now
A novel training framework for machine translation that adaptively leverages document-level context to improve translation quality and robustness.
Machine Translation Mar 26 Code High viability
VolDiT: Controllable Volumetric Medical Image Synthesis with Diffusion Transformers Build Now
A controllable 3D medical image synthesis tool using diffusion transformers for improved fidelity and spatial guidance.
Medical AI Mar 26 Pending High viability
Bilingual Text-to-Motion Generation: A New Benchmark and Baselines Build Now
A new bilingual text-to-motion benchmark and baseline model that enables high-quality motion generation from diverse language inputs, including zero-shot code-switching.
Generative Motion Mar 26 Code High viability
AG-EgoPose: Leveraging Action-Guided Motion and Kinematic Joint Encoding for Egocentric 3D Pose Estimation Build Now
A novel dual-stream framework for robust egocentric 3D human pose estimation from fisheye cameras, leveraging action-guided motion and kinematic joint encoding.
3D Pose Estimation Mar 26 Pending High viability
ET-SAM: Efficient Point Prompt Prediction in SAM for Unified Scene Text Detection and Layout Analysis Build Now
ET-SAM accelerates scene text detection and layout analysis by efficiently predicting point prompts, enabling faster inference and better data utilization.
Scene Text Analysis Mar 26 Code High viability
Towards Foundation Models for 3D Scene Understanding: Instance-Aware Self-Supervised Learning for Point Clouds Build Now
A self-supervised learning framework for 3D point clouds that enhances instance localization and semantic understanding, enabling scalable foundation models for diverse downstream tasks.
3D Scene Understanding Mar 26 Code High viability
SportSkills: Physical Skill Learning from Sports Instructional Videos Build Now
A dataset and retrieval system for personalized sports skill improvement from instructional videos.
Sports Skill Learning Mar 26 Code High viability
A Semantically Disentangled Unified Model for Multi-category 3D Anomaly Detection Build Now
A unified 3D anomaly detection model that disentangles category semantics to achieve state-of-the-art performance and reliability.
3D Anomaly Detection Mar 26 Code High viability
Vision Hopfield Memory Networks Build Now
A brain-inspired vision foundation model that offers improved data efficiency and interpretability through hierarchical memory mechanisms.
Vision Foundation Models Mar 26 Code High viability
UniAI-GraphRAG: Synergizing Ontology-Guided Extraction, Multi-Dimensional Clustering, and Dual-Channel Fusion for Robust Multi-Hop Reasoning Build Now
An enhanced RAG framework that improves multi-hop reasoning and domain-specific QA through ontology-guided extraction, advanced clustering, and dual-channel retrieval.
RAG Systems Mar 26 Pending High viability
FD$^2$: A Dedicated Framework for Fine-Grained Dataset Distillation Build Now
A framework for fine-grained dataset distillation that creates more discriminative and diverse synthetic datasets, improving recognition performance.
Dataset Distillation Mar 26 Code High viability
SAVe: Self-Supervised Audio-visual Deepfake Detection Exploiting Visual Artifacts and Audio-visual Misalignment Build Now
A self-supervised framework for detecting audio-visual deepfakes by learning solely on authentic videos, overcoming dataset bias and improving generalization.
Deepfake Detection Mar 26 Code High viability
EgoXtreme: A Dataset for Robust Object Pose Estimation in Egocentric Views under Extreme Conditions Build Now
A new dataset and code for robust 6D object pose estimation in egocentric views under extreme real-world conditions, addressing limitations of current benchmarks.
Computer Vision Mar 26 Code High viability
Robust Principal Component Completion Build Now
A novel Bayesian framework for robust principal component completion that directly identifies sparse component support, improving foreground extraction and anomaly detection.
Computer Vision Mar 26 Pending High viability
Denoise and Align: Towards Source-Free UDA for Robust Panoramic Semantic Segmentation Build Now
A framework for robust panoramic semantic segmentation that adapts models from labeled pinhole datasets to unlabeled panoramic data without access to the original source data, improving performance on challenging real-world applications.
Computer Vision Mar 26 Code High viability
AirSplat: Alignment and Rating for Robust Feed-Forward 3D Gaussian Splatting Build Now
Adapt 3D vision foundation models for high-fidelity, pose-free novel view synthesis with a novel alignment and rating framework.
3D Vision Mar 26 Code High viability
MCLMR: A Model-Agnostic Causal Learning Framework for Multi-Behavior Recommendation Build Now
A causal learning framework that improves recommendation systems by intelligently fusing user behaviors and mitigating bias.
Recommendation Systems Mar 26 Pending High viability
CTS-PLL: A Robust and Anytime Framework for Collaborative Task Sequencing and Multi-Agent Path Finding Build Now
A robust and anytime framework for collaborative task sequencing and multi-agent path finding that improves solution quality and success rates in complex environments.
Multi-Agent Path Finding Mar 26 Code High viability
AnyDoc: Enhancing Document Generation via Large-Scale HTML/CSS Data Synthesis and Height-Aware Reinforcement Optimization Build Now
A framework for generating diverse documents from natural language instructions by synthesizing large-scale HTML/CSS datasets and using height-aware reinforcement learning.
Document Generation Mar 26 Code High viability
Do LLMs Know What They Know? Measuring Metacognitive Efficiency with Signal Detection Theory Build Now
A new evaluation framework using signal detection theory to measure LLM metacognitive efficiency, revealing which models truly know what they don't know.
LLM Evaluation Mar 26 Code High viability
MoireMix: A Formula-Based Data Augmentation for Improving Image Classification Robustness Build Now
A formula-based data augmentation technique for image classification that procedurally generates moiré interference patterns on the fly to significantly improve model robustness with negligible computational cost.
Image Classification Augmentation Mar 26 Code High viability
MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning Build Now
Scales generative multimodal reward modeling using a novel multi-stage reinforcement learning approach, reducing reliance on costly multimodal preference data and significantly improving performance on visual understanding and generation tasks.
Multimodal AI Mar 26 Pending High viability
Label What Matters: Modality-Balanced and Difficulty-Aware Multimodal Active Learning Build Now
A reinforcement learning framework for smarter, modality-balanced data labeling in multimodal AI, reducing annotation costs and improving accuracy.
Multimodal Active Learning Mar 26 Code High viability
OMIND: Framework for Knowledge Grounded Finetuning and Multi-Turn Dialogue Benchmark for Mental Health LLMs Build Now
A framework and dataset for training and evaluating LLMs specifically for mental health conversations, demonstrating superior reasoning and performance.
Mental Health LLMs Mar 26 Code High viability
Large Language Models as Optimization Controllers: Adaptive Continuation for SIMP Topology Optimization Build Now
Leveraging LLMs as adaptive controllers for topology optimization to achieve superior engineering designs.
AI for Engineering Design Mar 26 Code High viability
ElephantBroker: A Knowledge-Grounded Cognitive Runtime for Trustworthy AI Agents Watch
ElephantBroker provides a verifiable and trustworthy memory system for LLM agents, enabling secure and auditable operation in high-stakes environments.
AI Agents Mar 26 High viability
Pixelis: Reasoning in Pixels, from Seeing to Acting Build Now
Pixelis is a pixel-space agent that learns to act and adapt in visual environments by directly manipulating images and videos, enabling grounded multimodal perception and embodied adaptation.
Vision-Language Agents Mar 26 Code High viability
THEMIS: Towards Holistic Evaluation of MLLMs for Scientific Paper Fraud Forensics Build Now
A new benchmark and evaluation framework for multimodal LLMs to detect sophisticated fraud in scientific papers.
Multimodal AI for Scientific Integrity Mar 26 Code High viability
Visual Attention Drifts, but Anchors Hold: Mitigating Hallucination in Multimodal Large Language Models via Cross-Layer Visual Anchors Build Now
A training-free method to reduce hallucination in multimodal LLMs by reinforcing intermediate visual features, improving output reliability without significant computational overhead.
Multimodal LLMs Mar 26 Code High viability
Learning domain-invariant features through channel-level sparsification for Out-Of Distribution Generalization Build Now
A novel method for training image analysis systems that reliably generalize across different data sources by enforcing feature sparsity and causal interventions.
OOD Generalization Mar 26 Code High viability
Bridging Perception and Reasoning: Token Reweighting for RLVR in Multimodal LLMs Build Now
A plug-and-play strategy to improve multimodal LLM reasoning by dynamically reweighting critical perception and reasoning tokens during training.
Multimodal LLMs Mar 26 Code High viability
RenoBench: A Citation Parsing Benchmark Watch
A new benchmark and dataset for accurate, machine-readable citation parsing to improve scholarly infrastructure.
NLP Infrastructure Mar 26 Code
Is Mathematical Problem-Solving Expertise in Large Language Models Associated with Assessment Performance? Watch
This research investigates the link between LLM math problem-solving ability and their accuracy in assessing student reasoning steps, suggesting potential for improved AI tutors.
AI in Education Mar 26 Code
Cooperative Deep Reinforcement Learning for Fair RIS Allocation Watch
A cooperative multi-agent reinforcement learning system for fair resource allocation in wireless networks using reconfigurable intelligent surfaces.
Wireless Resource Allocation Mar 26 Code
BFMD: A Full-Match Badminton Dense Dataset for Dense Shot Captioning Watch
A new dataset and multimodal captioning framework for understanding tactical dynamics in full badminton matches.
Computer Vision Mar 26 Code
Visualizing Impedance Control in Augmented Reality for Teleoperation: Design and User Evaluation Watch
Augmented reality visualization of impedance control for teleoperation improves force-critical manipulation tasks by providing intuitive, real-time feedback without haptic hardware.
AR Teleoperation Mar 26
System Design for Maintaining Internal State Consistency in Long-Horizon Robotic Tabletop Games Watch
A system design for robots to maintain internal state consistency in long-horizon tabletop games, demonstrated with Mahjong.
Robotics Mar 26 Code
A Causal Framework for Evaluating ICU Discharge Strategies Watch
A causal inference framework to optimize ICU patient discharge strategies, demonstrated on MIMIC-IV.
Medical AI Mar 26 Code
GlowQ: Group-Shared LOw-Rank Approximation for Quantized LLMs Watch
A novel low-rank approximation technique for quantized LLMs that reduces overhead and improves accuracy by selectively restoring layers.
LLM Quantization Mar 26
A Distribution-to-Distribution Neural Probabilistic Forecasting Framework for Dynamical Systems Watch
A neural framework that directly forecasts probability distributions for dynamical systems, offering improved uncertainty quantification.
Probabilistic Forecasting Mar 26 Code
Image Rotation Angle Estimation: Comparing Circular-Aware Methods Watch
A study comparing circular-aware methods for image rotation estimation, identifying the most robust and accurate approaches for vision pipelines.
Computer Vision Mar 26 Code
Macroscopic Characteristics of Mixed Traffic Flow with Deep Reinforcement Learning Based Automated and Human-Driven Vehicles Watch
Leveraging deep reinforcement learning to optimize autonomous vehicle control in mixed traffic, improving road capacity and fuel efficiency.
Autonomous Vehicles Mar 26 Code
A Minimum-Energy Control Approach for Redundant Mobile Manipulators in Physical Human-Robot Interaction Applications Watch
A novel control method for mobile manipulators that minimizes kinetic energy for safer and more efficient human-robot interaction.
Robotics Control Mar 26 Code
FluxEDA: A Unified Execution Infrastructure for Stateful Agentic EDA Watch
A stateful execution infrastructure for agentic EDA tools to enable iterative optimization and state preservation in production environments.
Agentic EDA Automation Mar 26
A CDF-First Framework for Free-Form Density Estimation Watch
A new framework for free-form density estimation that estimates the CDF first to overcome limitations of direct PDF estimation, outperforming existing methods.
Density Estimation Mar 26 Code
TacSIm: A Dataset and Benchmark for Football Tactical Style Imitation Watch
A new dataset and benchmark for accurately replicating real-world football team tactical behaviors, moving beyond simple reward optimization.
Sports AI Mar 26 Code
zk-X509: Privacy-Preserving On-Chain Identity from Legacy PKI via Zero-Knowledge Proofs Watch
Leverage existing X.509 certificates for privacy-preserving on-chain identity using zero-knowledge proofs.
Decentralized Identity Mar 26
A Catalog of Basque Dialectal Resources: Online Collections and Standard-to-Dialectal Adaptations Watch
This paper catalogs existing Basque dialectal NLP resources and creates new ones, addressing data scarcity for low-resource languages.
NLP Data Resources Mar 26 Code
Photon: Speedup Volume Understanding with Efficient Multimodal Large Language Models Watch
Photon accelerates 3D medical image understanding by efficiently representing volumetric data with adaptive token scheduling for multimodal LLMs.
Medical AI Mar 26
Goodness-of-pronunciation without phoneme time alignment Watch
Enables low-resource language speech evaluation by adapting weakly-supervised ASR models without phoneme time alignment.
Speech AI Mar 26 Code
Layer-Specific Lipschitz Modulation for Fault-Tolerant Multimodal Representation Learning Watch
A mathematically grounded framework for building multimodal AI systems that can tolerate sensor failures and signal degradation.
Multimodal AI Mar 26 Code
Process-Aware AI for Rainfall-Runoff Modeling: A Mass-Conserving Neural Framework with Hydrological Process Constraints Watch
A physics-constrained AI framework for rainfall-runoff modeling that improves predictive accuracy and interpretability by embedding hydrological process knowledge.
Hydrological AI Mar 26 Code
R-C2: Cycle-Consistent Reinforcement Learning Improves Multimodal Reasoning Ignore
A reinforcement learning framework that enforces cross-modal consistency to improve multimodal reasoning accuracy.
Multimodal Reasoning Mar 26
Intelligent Navigation and Obstacle-Aware Fabrication for Mobile Additive Manufacturing Systems Ignore
A universal platform for mobile additive manufacturing robots that integrates navigation and material deposition for adaptable production in dynamic environments.
Robotics and Automation Mar 26
On Neural Scaling Laws for Weather Emulation through Continual Training Ignore
Develops a framework for understanding and optimizing neural network scaling for weather forecasting, enabling more efficient resource allocation and improved prediction accuracy.
Scientific ML Mar 26 Code
Measuring What Matters -- or What's Convenient?: Robustness of LLM-Based Scoring Systems to Construct-Irrelevant Factors Ignore
This research evaluates the robustness of LLM-based essay scoring systems to irrelevant factors, suggesting potential for more reliable automated assessment.
LLM Evaluation Mar 26 Code
Beyond Via: Analysis and Estimation of the Impact of Large Language Models in Academic Papers Ignore
This research analyzes shifts in academic writing driven by LLMs, developing a method to detect LLM-generated text and understand their impact on language.
LLM Analysis Mar 26 Code
Visual or Textual: Effects of Explanation Format and Personal Characteristics on the Perception of Explanations in an Educational Recommender System Ignore
This research investigates how visual versus textual explanations impact user perception in educational recommender systems, offering design guidelines.
Educational Recommender Systems Mar 26
Accurate Surface and Reflectance Modelling from 3D Radar Data with Neural Radiance Fields Ignore
A neural implicit method for improved 3D surface and reflectance modeling from sparse and noisy radar data.
3D Reconstruction Mar 26
Demographic Fairness in Multimodal LLMs: A Benchmark of Gender and Ethnicity Bias in Face Verification Ignore
This research benchmarks demographic fairness in multimodal LLMs for face verification, revealing significant biases and performance disparities across different models and demographic groups.
Multimodal LLM Bias Mar 26 Code
Are LLMs Overkill for Databases?: A Study on the Finiteness of SQL Ignore
This research suggests that simpler, template-based approaches may be more efficient and auditable than LLMs for generating SQL queries in many practical database scenarios.
Database Query Generation Mar 26
Revisiting On-Policy Distillation: Empirical Failure Modes and Simple Fixes Ignore
Improves on-policy distillation for LLMs by addressing empirical failure modes to achieve more stable optimization and better downstream performance.
LLM Training Mar 26
Humans vs Vision-Language Models: A Unified Measure of Narrative Coherence Ignore
A new metric to evaluate narrative coherence in vision-language models, revealing systematic differences from human storytelling.
Vision-Language Models Mar 26 Pending
Adaptive Subspace Modeling With Functional Tucker Decomposition Ignore
A functional Tucker decomposition method that embeds continuity constraints for improved multidimensional data analysis in classification tasks.
Tensor Decomposition Mar 26 Code
Challenges in Hyperspectral Imaging for Autonomous Driving: The HSI-Drive Case Ignore
Developing custom vision algorithms to overcome hyperspectral imaging challenges for real-time autonomous driving applications.
Hyperspectral Imaging for Autonomous Driving Mar 26 Code
An Experimental Comparison of the Most Popular Approaches to Fake News Detection Ignore
This research critically assesses 12 fake news detection approaches, highlighting LLMs' potential for zero/few-shot learning in text-only English scenarios.
Fake News Detection Mar 26 Code
Interpretable PM2.5 Forecasting for Urban Air Quality: A Comparative Study of Operational Time-Series Models Ignore
Lightweight and interpretable time-series models offer competitive performance for urban air quality forecasting, balancing accuracy and computational efficiency.
Time-Series Forecasting Mar 26
Translation Asymmetry in LLMs as a Data Augmentation Factor: A Case Study for 6 Romansh Language Varieties Ignore
A novel data augmentation strategy for low-resource machine translation that aligns with resource gradients, outperforming existing models for specific language varieties.
Low-Resource Machine Translation Mar 26
Not a fragment, but the whole: Map-based evaluation of data-driven Fire Danger Index models Ignore
A novel evaluation method for wildfire prediction models that prioritizes operational decision-making by accurately assessing fire activity and minimizing false alarms.
Wildfire Prediction Mar 26 Code
Navigating the Prompt Space: Improving LLM Classification of Social Science Texts Through Prompt Engineering Ignore
This paper explores prompt engineering techniques to improve LLM classification accuracy for social science texts, finding that minimal increases in context yield the best results.
LLM Prompt Engineering Mar 26
Does Structured Intent Representation Generalize? A Cross-Language, Cross-Model Empirical Study of 5W3H Prompting Ignore
A framework for structured intent representation in human-AI interaction that improves goal alignment and accessibility across languages and models.
LLM Prompting Mar 26
Supercharging Federated Intelligence Retrieval Ignore
A secure federated RAG system for distributed private data retrieval and confidential LLM inference.
Federated Learning Mar 26
Hessian-informed machine learning interatomic potential towards bridging theory and experiments Ignore
Develops a novel machine learning interatomic potential that accurately predicts material properties by incorporating local curvature information, bridging the gap between theoretical simulations and experimental observations.
Materials Science AI Mar 26
Integrating Deep RL and Bayesian Inference for ObjectNav in Mobile Robotics Ignore
A hybrid framework for mobile robots that combines Bayesian inference with deep reinforcement learning to improve autonomous object search efficiency and reliability in partially observable indoor environments.
Mobile Robotics Mar 26
Agentic Trust Coordination for Federated Learning through Adaptive Thresholding and Autonomous Decision Making in Sustainable and Resilient Industrial Networks Ignore
A server-side control layer for federated learning that uses agentic trust coordination to improve reliability in industrial networks.
Federated Learning Mar 26
Evaluating Language Models for Harmful Manipulation Ignore
A framework for evaluating AI manipulation in context-specific human-AI interactions, revealing domain and geographic differences in manipulative behavior.
AI Safety & Evaluation Mar 26
How Pruning Reshapes Features: Sparse Autoencoder Analysis of Weight-Pruned Language Models Ignore
This research analyzes how weight pruning in language models affects learned features, revealing that rare features are surprisingly resilient and suggesting pruning acts as implicit feature selection.
LLM Interpretability Mar 26 Pending
Physical Backdoor Attack Against Deep Learning-Based Modulation Classification Ignore
A novel physical backdoor attack against deep learning-based radio frequency signal classifiers that bypasses existing defenses.
Adversarial ML Security Mar 26 Code
Connectivity-Aware Representations for Constrained Motion Planning via Multi-Scale Contrastive Learning Ignore
A novel representation learning approach for constrained motion planning that improves success rates and reduces planning time by intelligently selecting start and goal configurations.
Motion Planning Mar 26
Does Explanation Correctness Matter? Linking Computational XAI Evaluation to Human Understanding Ignore
This research tests whether computational Explainable AI metrics track human understanding, demonstrating that not all functional correctness improvements translate to better human understanding and highlighting a gap for AI explainability tools.
Explainable AI (XAI) Mar 26 Code
Semantic-Aware Prefix Learning for Token-Efficient Image Generation Ignore
A novel tokenization method for image generation that improves semantic understanding and generation quality.
Generative Image Models Mar 26
Efficient Preemptive Robustification with Image Sharpening Ignore
A novel image sharpening technique that efficiently robustifies deep neural networks against adversarial attacks without requiring complex training or optimization.
Computer Vision Mar 26 Code
A Unified Spatial Alignment Framework for Highly Transferable Transformation-Based Attacks on Spatially Structured Tasks Ignore
A framework for creating more effective adversarial attacks on structured computer vision tasks by aligning input transformations with label transformations.
Adversarial Attacks Mar 26 Code
An Image Dataset of Common Skin Diseases of Bangladesh and Benchmarking Performance with Machine Learning Models Ignore
A publicly available dataset of common skin diseases in Bangladesh, with initial benchmarking, to accelerate AI-powered dermatology diagnostics.
Medical AI Mar 26 Code
Comparing Natural and Synthetic Structured Data: A Study of the Passive Verb Alternation in French and Italian Ignore
Develops a method to evaluate LLM linguistic generalization using structured natural and synthetic data, highlighting the limitations of synthetic data for real-world understanding.
LLM Evaluation Mar 26 Code
Fair regression under localized demographic parity constraints Ignore
A novel regression fairness method that enforces distributional parity at specific quantiles or thresholds, offering a tunable trade-off between fairness and accuracy.
Fairness in ML Mar 26 Code
Translation or Recitation? Calibrating Evaluation Scores for Machine Translation of Extremely Low-Resource Languages Ignore
Develops intrinsic metrics to contextualize machine translation performance for extremely low-resource languages, addressing variability caused by dataset artifacts rather than model capability.
Machine Translation Mar 26 Code
Gap Safe Screening Rules for Fast Training of Robust Support Vector Machines under Feature Noise Ignore
Accelerate the training of robust machine learning models by safely screening out irrelevant data points.
Machine Learning Optimization Mar 26 Code
Knowledge-Guided Adversarial Training for Infrared Object Detection via Thermal Radiation Modeling Ignore
Enhance infrared object detection robustness by embedding physical thermal radiation knowledge into adversarial training.
Computer Vision Mar 26 Code
To Write or to Automate Linguistic Prompts, That Is the Question Ignore
Automating LLM prompt optimization for linguistic tasks shows task-dependent results, sometimes matching expert performance but not yet replacing it.
LLM Prompt Optimization Mar 26
PIDP-Attack: Combining Prompt Injection with Database Poisoning Attacks on Retrieval-Augmented Generation Systems Ignore
A novel attack method that combines prompt injection and database poisoning to compromise Retrieval-Augmented Generation systems, demonstrating significant improvements over existing methods.
LLM Security Mar 26 Code
Learning to Rank Caption Chains for Video-Text Alignment Ignore
A novel ranking optimization approach for video-text alignment that improves caption generation quality over binary preference methods.
Video-Text Alignment Mar 26
Dissimilarity-Based Persistent Coverage Control of Multi-Robot Systems for Improving Solar Irradiance Prediction Accuracy in Solar Thermal Power Plants Ignore
A persistent coverage control algorithm for multi-robot systems to improve solar irradiance prediction accuracy in solar thermal power plants.
Robotics for Environmental Monitoring Mar 26
RubricEval: A Rubric-Level Meta-Evaluation Benchmark for LLM Judges in Instruction Following Ignore
A new benchmark for evaluating the reliability of rubric-based instruction following in LLMs, revealing significant performance gaps even in advanced models.
LLM Evaluation Mar 26 Code
When Sensing Varies with Contexts: Context-as-Transform for Tactile Few-Shot Class-Incremental Learning Ignore
A novel method for few-shot class-incremental learning in tactile sensing that adapts to varying acquisition contexts.
Few-Shot Learning Mar 26 Code
SEVerA: Verified Synthesis of Self-Evolving Agents Ignore
A framework for generating self-evolving LLM agents with formal guarantees of safety and correctness.
Agents Mar 26
Sparse Visual Thought Circuits in Vision-Language Models Ignore
Develops a diagnostic framework to understand and control the internal workings of vision-language models by analyzing sparse autoencoder features.
Vision-Language Models Mar 26 Code
Neural Network Conversion of Machine Learning Pipelines Ignore
This paper explores transferring knowledge from non-neural machine learning pipelines to neural networks to create unified inference engines for multiple ML tasks.
ML Pipeline Optimization Mar 26
A Unified Memory Perspective for Probabilistic Trustworthy AI Ignore
A novel memory architecture to improve the efficiency and scalability of probabilistic computations for trustworthy AI.
Trustworthy AI Hardware Mar 26
Self-Improvement of Large Language Models: A Technical Overview and Future Outlook Ignore
A framework for organizing techniques in self-improving large language models, exploring autonomous data generation, evaluation, and refinement.
LLM Training Mar 26
A Mentalistic Interface for Probing Folk-Psychological Attribution to Non-Humanoid Robots Ignore
A research platform to study how people attribute mental states to robots using LLM-generated explanations.
Robotics AI Mar 26
Insights on back marking for the automated identification of animals Ignore
Optimizing animal back mark design for accurate machine learning-based individual monitoring.
Animal Monitoring AI Mar 26
Synchronous Signal Temporal Logic for Decidable Verification of Cyber-Physical Systems Ignore
A decidable fragment of Signal Temporal Logic for static verification of safety-critical cyber-physical systems, enabling model checking with SPIN.
Formal Verification Mar 26 Code
NERO-Net: A Neuroevolutionary Approach for the Design of Adversarially Robust CNNs Ignore
A neuroevolutionary approach to design CNNs that are intrinsically robust to adversarial attacks, improving post-attack accuracy without sacrificing clean accuracy.
AI Model Design Mar 26
How Class Ontology and Data Scale Affect Audio Transfer Learning Ignore
This research investigates the impact of data scale and class ontology on audio transfer learning performance, identifying key factors for effective pre-training.
Audio Transfer Learning Mar 26
Residual-as-Teacher: Mitigating Bias Propagation in Student-Teacher Estimation Ignore
A theoretical framework to reduce bias propagation in student-teacher AI models.
AI Training Methods Mar 26
Modernising Reinforcement Learning-Based Navigation for Embodied Semantic Scene Graph Generation Ignore
This research modernizes navigation for embodied agents to generate semantic scene graphs more efficiently by improving decision-making policies and action formulations.
Embodied AI Navigation Mar 26
Multi-target Coverage-based Greybox Fuzzing Ignore
A novel fuzzing technique that discovers vulnerabilities in cooperating system and firmware execution.
Security Fuzzing Mar 26
Practical Efficient Global Optimization is No-regret Ignore
This paper theoretically analyzes the regret bounds of a practical Bayesian optimization algorithm, offering theoretical insights into its no-regret properties.
Bayesian Optimization Mar 26
On the Vulnerability of Deep Automatic Modulation Classifiers to Explainable Backdoor Threats Ignore
This paper explores a novel physical backdoor attack on deep learning-based automatic modulation classifiers in wireless communications, guided by explainable AI.
Cybersecurity AI Mar 26
Usability of Passwordless Authentication in Wi-Fi Networks: A Comparative Study of Passkeys and Passwords in Captive Portals Ignore
This paper explores the usability of passkeys versus passwords in Wi-Fi captive portals, identifying design recommendations to improve user experience and reduce error rates.
Authentication Systems Mar 26
Revealing the influence of participant failures on model quality in cross-silo Federated Learning Ignore
This research systematically investigates the impact of participant failures on model quality in cross-silo Federated Learning, providing insights into data skewness and availability patterns.
Federated Learning Mar 26
Mitigating Evasion Attacks in Fog Computing Resource Provisioning Through Proactive Hardening Ignore
This paper proposes an adversarial training method to improve the robustness of k-means based resource provisioning in fog networks against evasion attacks.
AI Security Mar 26
The Competence Shadow: Theory and Bounds of AI Assistance in Safety Engineering Ignore
Formalizing the risks of AI assistance in safety engineering to design shadow-resistant workflows for physical AI systems.
AI Safety Engineering Mar 26 Code
Probing the Lack of Stable Internal Beliefs in LLMs Ignore
This research identifies a fundamental limitation in current LLMs' ability to maintain consistent internal goals, hindering the development of realistic persona-driven applications.
LLM Behavior Mar 26
Factors Influencing the Quality of AI-Generated Code: A Synthesis of Empirical Evidence Ignore
This paper synthesizes empirical evidence on factors influencing the quality of AI-generated code, highlighting the interplay of human and AI system characteristics.
AI Code Generation Mar 26
From Logic Monopoly to Social Contract: Separation of Power and the Institutional Foundations for Autonomous Agent Economies Ignore
A theoretical framework for structuring autonomous agent economies with a separation of powers model to address reliability and deception issues.
Autonomous Agents Mar 26
Retraining as Approximate Bayesian Inference Ignore
A theoretical framework for understanding model retraining as approximate Bayesian inference to minimize 'learning debt'.
LLM Training Mar 26
Decidable By Construction: Design-Time Verification for Trustworthy AI Ignore
A theoretical framework for designing AI models that are provably correct and stable before training, eliminating post-hoc verification overhead.
AI Verification Mar 26
Beyond Detection: Rethinking Education in the Age of AI-writing Ignore
This paper argues that the cognitive benefits of writing are lost when outsourced to AI, proposing new pedagogical approaches and critical literacy skills for the age of generative AI.
AI Education & Pedagogy Mar 26
Probabilistic Abstract Interpretation on Neural Networks via Grids Approximation Ignore
A theoretical framework for analyzing the input-output behavior of neural networks by approximating density distributions.
AI Safety & Verification Mar 26
The Geometry of Efficient Nonconvex Sampling Ignore
Develops a theoretical algorithm for uniform sampling from complex geometric shapes, with potential applications in statistical inference and machine learning.
Sampling Algorithms Mar 26
The Rules-and-Facts Model for Simultaneous Generalization and Memorization in Neural Networks Ignore
A theoretical model to understand how neural networks learn rules and memorize facts simultaneously.
Theoretical ML Mar 26
Distribution and Clusters Approximations as Abstract Domains in Probabilistic Abstract Interpretation to Neural Network Analysis Ignore
This paper introduces novel theoretical approximation methods for analyzing neural network density distributions, but they lack immediate product application.
AI Research Tools Mar 26