Vision-Language Models
- Vision-Language Pretraining
- Multimodal Reasoning
Relevant Papers:
- VLTP: Vision-language guided token pruning for task-oriented segmentation
- TaskCLIP: Extend large vision-language model for task oriented object detection
Key research and applications in vision-language AI.