Abstract
We introduce YOLO11-JDE, a fast and accurate multi-object tracking (MOT) solution that combines real-time object detection with self-supervised Re-Identification (Re-ID). By incorporating a dedicated Re-ID branch into YOLO11s, our model performs Joint Detection and Embedding (JDE), generating appearance features for each detection. The Re-ID branch is trained in a fully self-supervised setting, jointly with the detection task, eliminating the need for costly identity-labeled datasets. A triplet loss with hard positive and semi-hard negative mining strategies is used to learn discriminative embeddings. Data association is enhanced with a custom tracking implementation that integrates motion, appearance, and location cues. YOLO11-JDE achieves competitive results on the MOT17 and MOT20 benchmarks, surpassing existing JDE methods in FPS while using up to ten times fewer parameters, which makes our method a highly attractive solution for real-world applications. The code is publicly available at https://github.com/inakierregueab/YOLO11-JDE.
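
To illustrate the loss named in the abstract, the following is a minimal sketch of a triplet loss with hard positive and semi-hard negative mining. It is not taken from the released code. The function names, the margin value, the use of Euclidean distance, the integer pseudo-identity labels, and the choice to skip anchors that have no semi-hard negative are all assumptions made for illustration. The abstract does not state how identities are obtained in the self-supervised setting.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def mined_triplet_loss(embeddings, ids, margin=1.0):
    """Mean triplet loss over a batch (illustrative sketch, not the paper's code).

    For each anchor:
      * hard positive: the same-id sample that is FARTHEST from the anchor;
      * semi-hard negative: the closest different-id sample whose distance d
        satisfies d_ap < d < d_ap + margin.
    Anchors without a positive or without a semi-hard negative are skipped
    (an assumption; other fallbacks are possible).
    """
    n = len(embeddings)
    losses = []
    for a in range(n):
        pos = [euclidean(embeddings[a], embeddings[j])
               for j in range(n) if j != a and ids[j] == ids[a]]
        neg = [euclidean(embeddings[a], embeddings[j])
               for j in range(n) if ids[j] != ids[a]]
        if not pos or not neg:
            continue
        d_ap = max(pos)  # hard positive distance
        semi_hard = [d for d in neg if d_ap < d < d_ap + margin]
        if not semi_hard:
            continue
        d_an = min(semi_hard)  # closest semi-hard negative
        # Inside the semi-hard band this value is always positive.
        losses.append(d_ap - d_an + margin)
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Toy 1-D embeddings with two pseudo-identities (hypothetical data).
    emb = [[0.0], [1.0], [1.5], [10.0]]
    ids = [0, 0, 1, 1]
    print(mined_triplet_loss(emb, ids, margin=1.0))
```

Semi-hard negatives lie farther from the anchor than the positive but still inside the margin, so each selected triplet gives a non-zero loss without focusing on the very hardest negatives.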
Original language | English |
---|---|
Title of host publication | Winter Conference on Applications of Computer Vision Workshops |
Publisher | IEEE (Institute of Electrical and Electronics Engineers) |
Publication date | 2025 |
Publication status | Published - 2025 |
Event | Winter Conference on Applications of Computer Vision Workshops - Tucson, United States. Duration: 28 Feb 2025 → 4 Mar 2025 |
Conference
Conference | Winter Conference on Applications of Computer Vision Workshops |
---|---|
Country/Territory | United States |
City | Tucson |
Period | 28/02/2025 → 04/03/2025 |