Vision Transformer
Vision Transformer is a model in our research taxonomy.
Related papers
- Diffusion-Driven Deceptive Patches: Adversarial Manipulation and Forensic Detection in Facial Identity Verification
- MetricAnything: Scaling Metric Depth Pretraining with Noisy Heterogeneous Sources
- Do Transformers Understand Ancient Roman Coin Motifs Better than CNNs?
- PEAR: Pixel-aligned Expressive humAn mesh Recovery
- RGB-Event HyperGraph Prompt for Kilometer Marker Recognition based on Pre-trained Foundation Models
- MapViT: A Two-Stage ViT-Based Framework for Real-Time Radio Quality Map Prediction in Dynamic Environments
- FedBCD: Communication-Efficient Accelerated Block Coordinate Gradient Descent for Federated Learning
- Self-learned representation-guided latent diffusion model for breast cancer classification in deep ultraviolet whole surface images
- Xray-Visual Models: Scaling Vision models on Industry Scale Data
- A Framework for Cross-Domain Generalization in Coronary Artery Calcium Scoring Across Gated and Non-Gated Computed Tomography
- A Contrastive Learning Framework Empowered by Attention-based Feature Adaptation for Street-View Image Classification
- ECHOSAT: Estimating Canopy Height Over Space And Time
- Learn from A Rationalist: Distilling Intermediate Interpretable Rationales