What is it about?

In many real-world situations, such as surveillance and autonomous driving, computers need to detect and follow multiple moving objects at the same time. While recent Transformer-based AI models can handle both tasks in an end-to-end manner, training them jointly can cause the tasks to interfere with each other. This work introduces DQFormer, a new model that reduces this conflict between detecting new objects and tracking existing ones, while keeping the pipeline fully end-to-end. We achieve this by decoupling key steps in the model’s processing and designing each part specifically for either detection or tracking. As a result, the model becomes better at both identifying new targets and consistently following those it has already seen.
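The detect-vs-track split described above can be loosely illustrated with a toy tracking step. This is a hypothetical sketch, not the paper's method: the actual model uses separate Transformer query sets, whereas here simple box overlap (IoU) stands in for the tracking branch, and unmatched boxes stand in for the detection branch that spawns new identities. All names and thresholds are illustrative.

```python
# Toy sketch (NOT the paper's code): one branch keeps identities for
# existing objects, another branch creates identities for new objects.
from dataclasses import dataclass
from itertools import count

_ids = count(1)  # global identity counter (illustrative)

@dataclass
class Track:
    track_id: int
    box: tuple  # (x, y, w, h)

def iou(a, b):
    """Intersection-over-union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    x1, y1 = max(ax, bx), max(ay, by)
    x2, y2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0

def step(tracks, detections, iou_thresh=0.5):
    """One frame: 'tracking branch' updates known objects,
    'detection branch' assigns fresh identities to the rest."""
    updated, unmatched = [], list(detections)
    for t in tracks:
        best = max(unmatched, key=lambda d: iou(t.box, d), default=None)
        if best is not None and iou(t.box, best) >= iou_thresh:
            unmatched.remove(best)
            updated.append(Track(t.track_id, best))  # same identity, new box
        else:
            updated.append(t)  # no match this frame; keep the old track
    for d in unmatched:
        updated.append(Track(next(_ids), d))  # new object, new identity
    return updated
```

Keeping the two roles in separate code paths mirrors, at a much simpler level, the paper's idea of giving detection and tracking their own dedicated components instead of forcing one mechanism to do both.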


Read the Original

This page is a summary of: DQFormer: Transformer with Decoupled Query Augmentations for End-to-End Multi-Object Tracking, ACM Transactions on Multimedia Computing, Communications, and Applications, May 2025, ACM (Association for Computing Machinery). DOI: 10.1145/3735510.
