A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS

Posted Apr 23, 2024

By Abhijit More

Introduction

This post summarizes the YOLO (You Only Look Once) family of models, tracing its evolution from YOLOv1 to YOLO-NAS. YOLO is one of the most widely used object detection frameworks for real-time applications such as autonomous vehicles, video surveillance, and robotics. Each section gives a snapshot of one model: who developed it, the key innovations it introduced, and, where relevant, its limitations. Let’s dive into the world of YOLO!

1️⃣ YOLOv1 (2016)

Developers: Joseph Redmon et al.
Key Features:
- First real-time object detection model using a single neural network pass.
- Divides image into grids to predict bounding boxes and class probabilities.
Limitations: Struggles to detect small objects and objects that are close to one another.
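
The grid idea can be sketched in a few lines. This is a minimal illustration, assuming the settings usually quoted for the original paper (a 7x7 grid, 2 boxes per cell, 20 classes); the helper names `output_shape` and `responsible_cell` are hypothetical and not taken from any YOLO codebase.

```python
# Assumed YOLOv1 settings: grid size, boxes per cell, number of classes.
S, B, C = 7, 2, 20

def output_shape(s=S, b=B, c=C):
    """Each cell predicts b boxes (x, y, w, h, confidence) plus c class probabilities."""
    return (s, s, b * 5 + c)

def responsible_cell(cx, cy, img_w, img_h, s=S):
    """Return (row, col) of the grid cell that contains the object center (cx, cy), in pixels."""
    col = min(int(cx / img_w * s), s - 1)  # clamp centers that lie on the right edge
    row = min(int(cy / img_h * s), s - 1)  # clamp centers that lie on the bottom edge
    return row, col

print(output_shape())                        # shape of the prediction tensor
print(responsible_cell(224, 224, 448, 448))  # cell holding the image center
```

When two object centers fall into the same cell, they compete for that cell's predictions, which is one way to see why nearby objects are hard for this design.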

2️⃣ YOLOv2 (2017)

Developers: Joseph Redmon and Ali Farhadi.
Key Features:
- Introduced batch normalization and anchor boxes for better bounding box predictions.
- Improved accuracy with multi-scale training.
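
As a rough sketch of how anchor boxes are used, the snippet below decodes one raw prediction into a box, following the parameterization commonly attributed to YOLOv2: the center is a sigmoid offset inside its grid cell, and the size is an exponential rescaling of the anchor. All function and variable names are illustrative assumptions, and the list of training resolutions is the range usually quoted for multi-scale training.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def decode_anchor_box(tx, ty, tw, th, cell_x, cell_y, anchor_w, anchor_h):
    """Decode raw outputs (tx, ty, tw, th) into a box in grid-cell units."""
    bx = sigmoid(tx) + cell_x      # center x stays inside the predicting cell
    by = sigmoid(ty) + cell_y      # center y stays inside the predicting cell
    bw = anchor_w * math.exp(tw)   # width is a scaled version of the anchor width
    bh = anchor_h * math.exp(th)   # height is a scaled version of the anchor height
    return bx, by, bw, bh

# Assumed multi-scale training resolutions: multiples of 32 from 320 to 608.
MULTI_SCALE_SIZES = list(range(320, 609, 32))

print(decode_anchor_box(0.0, 0.0, 0.0, 0.0, 3, 4, 2.0, 1.5))
```

With all raw outputs at zero, the decoded box sits at the center of its cell and has exactly the anchor's shape, so the network only has to learn corrections to a sensible prior.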

3️⃣ YOLOv3 (2018)

Developers: Joseph Redmon and Ali Farhadi.
Key Features:
- New Darknet-53 backbone for better feature extraction.
- Multi-scale predictions for improved detection of small objects.
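
To make the multi-scale idea concrete, the sketch below counts how many boxes are predicted for one image. It assumes the configuration usually described for YOLOv3 (three detection scales at strides 8, 16, and 32, with three anchors per scale); the function names are hypothetical.

```python
STRIDES = (8, 16, 32)      # assumed downsampling factor of each detection scale
ANCHORS_PER_SCALE = 3      # assumed number of anchors per grid cell

def grid_sizes(input_size, strides=STRIDES):
    """Side length of the prediction grid at each scale for a square input."""
    return [input_size // s for s in strides]

def total_predictions(input_size, strides=STRIDES, anchors=ANCHORS_PER_SCALE):
    """Total number of candidate boxes predicted across all scales."""
    return sum(anchors * g * g for g in grid_sizes(input_size, strides))

print(grid_sizes(416))         # finest grid first, coarsest last
print(total_predictions(416))
```

The finest grid (stride 8) has the smallest cells, which is why it carries most of the responsibility for small objects.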

4️⃣ YOLOv4 (2020)

Developers: Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao.
Key Features:
- Introduced CSPDarknet53 and PANet for feature fusion.
- Innovations like Mosaic augmentation and CIoU loss for improved training.
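
For readers unfamiliar with CIoU, here is a minimal pure-Python sketch based on the published Complete IoU formulation: plain IoU minus a normalized center-distance penalty and an aspect-ratio penalty. It is an illustration under that assumption, not the YOLOv4 training code, and the names `ciou` and `ciou_loss` are chosen for this example.

```python
import math

def ciou(box_a, box_b, eps=1e-9):
    """Complete IoU of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Plain IoU from the intersection and union areas.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    wa, ha = ax2 - ax1, ay2 - ay1
    wb, hb = bx2 - bx1, by2 - by1
    iou = inter / (wa * ha + wb * hb - inter + eps)
    # Squared distance between the two box centers.
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0
    # Squared diagonal of the smallest box enclosing both.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(wb / (hb + eps)) - math.atan(wa / (ha + eps))) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v

def ciou_loss(pred, target):
    return 1.0 - ciou(pred, target)

print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # near zero for a perfect match
print(ciou_loss((0, 0, 1, 1), (2, 2, 3, 3)))  # above 1 for disjoint boxes
```

Unlike plain IoU, the loss still changes as disjoint boxes move closer together, so the regression keeps receiving a useful signal even when the prediction does not overlap the target.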

5️⃣ YOLOv5 (2020)

Developer: Glenn Jocher at Ultralytics.
Key Features:
- Developed in PyTorch, easy to use and deploy.
- Scalable models from nano to extra-large, optimized for speed and accuracy.
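
The nano-to-extra-large family can be thought of as one base architecture scaled by a depth multiplier (how many times a block repeats) and a width multiplier (how many channels a layer has). The sketch below illustrates that idea; the multiplier values are the ones commonly listed for the YOLOv5 model configs and should be treated as assumptions here, and `scale_layer` is a hypothetical helper rather than an Ultralytics API.

```python
import math

# Assumed (depth_multiple, width_multiple) per model size.
SCALES = {
    "n": (0.33, 0.25),
    "s": (0.33, 0.50),
    "m": (0.67, 0.75),
    "l": (1.00, 1.00),
    "x": (1.33, 1.25),
}

def make_divisible(x, divisor=8):
    """Round a channel count up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor

def scale_layer(base_repeats, base_channels, variant):
    """Scale a block's repeat count and channel count for the chosen model size."""
    depth, width = SCALES[variant]
    repeats = max(round(base_repeats * depth), 1)  # always keep at least one block
    channels = make_divisible(base_channels * width)
    return repeats, channels

for name in SCALES:
    print(name, scale_layer(9, 256, name))
```

Keeping channel counts at multiples of 8 is a common choice because it tends to map well onto GPU hardware.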

6️⃣ Scaled YOLOv4 (2021)

Developers: Same team as YOLOv4.
Key Features:
- Introduced a network scaling approach that yields both lightweight and high-performance variants of YOLOv4.
- YOLOv4-tiny targets edge devices, while YOLOv4-large targets cloud GPUs.

7️⃣ YOLOR (2021)

Developers: Chien-Yao Wang, I-Hau Yeh, Hong-Yuan Mark Liao.
Key Features:
- Multi-task learning for tasks like detection, classification, and pose estimation.
- Uses implicit knowledge to boost model performance.

8️⃣ YOLOX (2021)

Developers: Megvii Technology.
Key Features:
- Anchor-free architecture for simplified training.
- Decoupled head for better accuracy in classification and regression.
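
To show what "anchor-free" means in practice, the sketch below decodes one prediction without any anchor prior: the box center is an offset from the grid cell, and the size is an exponential scaled only by the feature-map stride. This follows the decoding commonly described for YOLOX, but the function name and argument layout are assumptions made for this example.

```python
import math

def decode_anchor_free(tx, ty, tw, th, grid_x, grid_y, stride):
    """Decode raw outputs into (center_x, center_y, width, height) in input pixels."""
    cx = (grid_x + tx) * stride   # offset from the cell's top-left corner
    cy = (grid_y + ty) * stride
    w = math.exp(tw) * stride     # no anchor width: the stride is the only scale prior
    h = math.exp(th) * stride     # no anchor height either
    return cx, cy, w, h

print(decode_anchor_free(0.5, 0.5, 0.0, 0.0, 2, 3, 8))
```

Because no anchor shapes are involved, there is nothing to cluster or tune per dataset, which is where the simplification in training comes from.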

9️⃣ YOLOv6 (2022)

Developers: Meituan Vision AI Department.
Key Features:
- New EfficientRep backbone based on RepVGG.
- Improved quantization and task alignment for faster and more accurate detection.
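
The RepVGG idea behind the backbone is structural re-parameterization: train with parallel 3x3, 1x1, and identity branches, then fold them into a single 3x3 convolution for inference. The sketch below demonstrates the fold for one channel in pure Python. It deliberately omits batch normalization, which a full implementation would also fold into the kernel, and all function names are hypothetical.

```python
def conv3x3(img, k):
    """Single-channel 3x3 convolution (cross-correlation) with zero padding of 1."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for di in range(3):
                for dj in range(3):
                    y, x = i + di - 1, j + dj - 1
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += img[y][x] * k[di][dj]
    return out

def multi_branch(img, k3, k1):
    """Training-time form: 3x3 branch + 1x1 branch (scalar k1) + identity branch."""
    y3 = conv3x3(img, k3)
    return [[y3[i][j] + k1 * img[i][j] + img[i][j] for j in range(len(img[0]))]
            for i in range(len(img))]

def fuse_branches(k3, k1):
    """Inference-time form: add the 1x1 weight and the identity to the kernel center."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0
    return fused

img = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
k3 = [[0.5, 0.0, -0.5], [1.0, 2.0, 1.0], [0.0, -1.0, 0.0]]
k1 = 0.25
print(multi_branch(img, k3, k1))
print(conv3x3(img, fuse_branches(k3, k1)))  # matches the line above up to rounding
```

Both forms compute the same function, but the fused one needs a single convolution per layer, which is what makes the deployed model faster.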

🔟 YOLOv7 (2022)

Developers: Alexey Bochkovskiy, Chien-Yao Wang, Hong-Yuan Mark Liao.
Key Features:
- Introduced E-ELAN blocks for efficient learning.
- Optimized for small objects with real-time performance improvements.

1️⃣1️⃣ YOLOv8 (2023)

Developer: Ultralytics.
Key Features:
- Anchor-free with a decoupled head that handles classification and box regression in separate branches.
- Supports multiple tasks like segmentation, detection, and pose estimation.

1️⃣2️⃣ YOLO-NAS (2023)

Developer: Deci.
Key Features:
- Designed using AutoNAC, an automatic architecture search tool for real-time applications.
- Enhanced for small object detection and edge-device deployments.

📈 Conclusion

YOLO has come a long way since its inception, balancing real-time performance with increasing accuracy across different tasks. Successive versions, developed by several different teams, build on the ideas of their predecessors, which has kept YOLO a go-to framework for object detection in diverse applications.

This post is licensed under CC BY 4.0 by the author.