Papers
Research Paper·Mar 13, 2026
MoKus: Leveraging Cross-Modal Knowledge Transfer for Knowledge-Aware Concept Customization
Concept customization typically binds rare tokens to a target concept. Unfortunately, these approaches often suffer from unstable performance as the pretraining data seldom contains these rare tokens....
7.0 viability
Research Paper·Feb 27, 2026
A Mixed Diet Makes DINO An Omnivorous Vision Encoder
Pre-trained vision encoders like DINOv2 have demonstrated exceptional performance on unimodal tasks. However, we observe that their feature representations are poorly aligned across different modaliti...
5.0 viability
Research Paper·Jan 26, 2026
Rethinking Cross-Modal Fine-Tuning: Optimizing the Interaction between Feature Alignment and Target Fitting
Adapting pre-trained models to unseen feature modalities has become increasingly important due to the growing need for cross-disciplinary knowledge integration. A key challenge here is how to align th...
5.0 viability