All Stories

  1. Kvasir-VQA: A Text-Image Pair GI Tract Dataset
  2. Bridging Multimedia Modalities: Enhanced Multimodal AI Understanding and Intelligent Agents
  3. Soccer Game Summarization using Audio Commentary, Metadata, and Captions