Vehicle Detection Using YOLOv5

Overview

This project implements a custom object detection model that identifies and classifies 15 vehicle categories in real time using YOLOv5. The system reaches a mean Average Precision (mAP) of 0.8 across multiple vehicle brands and body types, demonstrating the effectiveness of modern computer vision techniques for automotive applications.

We developed and compared four different YOLOv5 model variants (s, m, l, x) to find the optimal configuration for vehicle detection tasks. The project includes comprehensive dataset creation, model training, and performance analysis using Google Colab's free GPU resources.

What is YOLO?

YOLO (You Only Look Once) changed object detection by processing each image in a single forward pass, which makes it fast enough for real-time applications. Unlike earlier detectors that evaluate many sliding windows or region proposals per image, YOLO divides the image into a grid and predicts the objects in every cell simultaneously.

How YOLO Works:

Grid Division: Splits image into M×M grid cells
Object Detection: Each cell predicts objects within it
Classification: Identifies what type of object it found
Confidence Scoring: Rates how sure it is about each detection
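
The grid division step above can be sketched in a few lines. This is a simplified illustration of the original YOLO formulation, where the cell containing an object's centre is responsible for predicting it; the function name and the 7×7 grid in the usage note are assumptions made for the example, not code from this project.

```python
def responsible_cell(x_center, y_center, img_w, img_h, grid_size):
    """Return (row, col) of the grid cell that contains the box centre.

    Coordinates are in pixels. The result is clamped so that a centre lying
    exactly on the right or bottom image edge still maps to a valid cell.
    """
    col = min(int(x_center / img_w * grid_size), grid_size - 1)
    row = min(int(y_center / img_h * grid_size), grid_size - 1)
    return row, col
```

For a 640×640 image and a hypothetical 7×7 grid, a car centred at pixel (100, 500) falls into row 5, column 1.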

Problem Statement

Can we train an artificial intelligence model to distinguish between different car brands and types as accurately as a human expert?

We decided to tackle the challenging problem of multi-class vehicle detection - not just finding cars in images, but identifying their specific brands (BMW, Ford, Toyota, etc.) and types (Coupe, Sedan, SUV, Truck) simultaneously.

Key Challenges:

Multi-class Detection: Distinguishing between 15 different vehicle categories
Brand Recognition: Identifying subtle visual differences between car manufacturers
Type Classification: Categorizing vehicles by body style and function
Real-time Performance: Achieving fast inference speeds for practical applications
Dataset Quality: Creating high-quality labeled training data

Solution

Our approach employed YOLOv5 with a comprehensive methodology covering dataset creation, model training, and performance evaluation across multiple configurations.

Dataset: 15 Vehicle Categories

Vehicle Brands (11 Classes)

Acura, BMW, Chevrolet, Ford, Honda, Infiniti, Lexus, Mercedes, Nissan, Subaru, Toyota

Vehicle Types (4 Classes)

Coupe, Sedan, SUV, Truck

Research Methodology

Dataset Creation

1000+ images manually labeled using professional annotation tools

Model Training

4 YOLOv5 variants tested with different configurations

Performance Analysis

Comprehensive evaluation across metrics and configurations

Insights & Learning

Key findings for optimal model configuration and deployment

Technical Implementation

Step-by-Step Implementation

1. Environment Setup

Set up the development environment using Google Colab for free GPU access.

# Mount Google Drive for data storage
from google.colab import drive
drive.mount('/content/gdrive')

# Clone YOLOv5 repository
!git clone https://github.com/ultralytics/yolov5
%cd yolov5
!pip install -r requirements.txt

2. Dataset Preparation

We collected 1000+ vehicle images and manually labeled them with the labelImg annotation tool.

Data Collection: 1000+ vehicle images from various sources
Annotation: Bounding boxes using labelImg tool
Data Split: 70% Train / 20% Val / 10% Test
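
A 70/20/10 split like the one above could be produced as follows. This is a minimal sketch, not the script used in the project: the function name, the fixed seed, and the example file names are assumptions.

```python
import random


def split_dataset(items, train=0.7, val=0.2, seed=0):
    """Shuffle items reproducibly and split them into train/val/test lists.

    Whatever remains after the train and val portions becomes the test set.
    """
    items = sorted(items)                 # stable starting order
    random.Random(seed).shuffle(items)    # seeded, so the split is repeatable
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    train_set = items[:n_train]
    val_set = items[n_train:n_train + n_val]
    test_set = items[n_train + n_val:]
    return train_set, val_set, test_set
```

With 1000 file names this yields 700, 200, and 100 items. Each image's label file has to move into the same split as the image itself.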

# Dataset Configuration (data.yaml)
train: ../train/images
val: ../valid/images
test: ../test/images

nc: 15  # number of classes
names: ['BMW', 'Ford', 'Toyota', 'Coupe', 'Sedan', 'SUV', ...]
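
YOLOv5 pairs every image with a text label file that holds one line per object: the class index followed by the box centre, width, and height, all normalized to the image size. The helper below is an illustrative sketch of that conversion from pixel corner coordinates; the function name is an assumption, and labelImg can also export this format directly.

```python
def to_yolo_label(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box into one YOLO-format label line.

    Output: "<class> <x_center> <y_center> <width> <height>", where the four
    box values are fractions of the image width and height.
    """
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
```

The class index refers to the position of the class in the `names` list of the dataset configuration.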

3. Model Training

Trained multiple YOLOv5 variants to find the best performing configuration.

# Train YOLOv5s model
!python train.py --img 640 --batch 16 --epochs 300 \
                 --data '../data.yaml' \
                 --weights yolov5s.pt \
                 --name vehicle_detection \
                 --cache
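
To compare all four variants, the same command can be repeated with different starting weights. The sketch below only builds the argument lists; the per-variant run names are hypothetical, and the batch size may need to be lowered for the larger variants depending on the GPU Colab assigns.

```python
VARIANTS = ["yolov5s", "yolov5m", "yolov5l", "yolov5x"]


def build_train_command(variant, img=640, batch=16, epochs=300, data="../data.yaml"):
    """Return the train.py argument list for one YOLOv5 variant."""
    return [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--weights", f"{variant}.pt",                 # pre-trained checkpoint
        "--name", f"vehicle_detection_{variant}",     # hypothetical run name
        "--cache",
    ]


commands = [build_train_command(v) for v in VARIANTS]
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from inside the `yolov5` directory, or the printed line can be pasted into a notebook cell after a `!`.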

4. Testing & Evaluation

After training, we tested our models on unseen data and analyzed their performance.

# Run detection on test images
!python detect.py --weights runs/train/vehicle_detection/weights/best.pt \
                  --img 640 --conf 0.5 \
                  --source ../test/images
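
The `--conf 0.5` flag above discards every detection whose confidence is below 0.5. The effect can be illustrated on a plain list of detections; the dictionary layout and the sample values here are made up for the example and are not the output format of `detect.py`.

```python
def filter_by_confidence(detections, conf_threshold=0.5):
    """Keep only detections whose confidence reaches the threshold."""
    return [d for d in detections if d["confidence"] >= conf_threshold]


# Hypothetical detections for one image.
sample = [
    {"label": "BMW", "confidence": 0.91},
    {"label": "Sedan", "confidence": 0.62},
    {"label": "Truck", "confidence": 0.34},
]

kept = filter_by_confidence(sample, conf_threshold=0.5)
print([d["label"] for d in kept])
```

Raising the threshold trades recall for precision: fewer false positives survive, but weaker true detections are dropped as well.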

YOLOv5 Architecture Details

Key Innovation: Single Pass Detection

Grid Division: Image split into M×M cells
Direct Prediction: Each cell predicts objects within it
Real-time Speed: No multiple passes needed
End-to-end: The original YOLO network stacked 24 convolutional and 2 fully connected layers, trained as a single model

YOLOv5 Improvements

PyTorch Framework: Modern implementation
Multi-scale Detection: Three output scales at strides 8, 16, and 32 (52×52, 26×26, and 13×13 grids for a 416×416 input)
Advanced Augmentation: Mosaic, MixUp techniques
Model Variants: s/m/l/x for different needs
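
The grid sizes used for multi-scale detection follow directly from the input resolution. A short sketch, assuming the standard YOLOv5 detection strides of 8, 16, and 32 (the function name is illustrative):

```python
def detection_grid_sizes(img_size, strides=(8, 16, 32)):
    """Return the side length of each detection grid for a square input."""
    return [img_size // stride for stride in strides]


for size in (416, 640):
    print(size, detection_grid_sizes(size))
```

A 416×416 input gives 52, 26, and 13 cells per side, while the 640×640 input used for training here gives 80, 40, and 20. The finest grid handles small objects and the coarsest handles large ones.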

Results & Performance

Key Achievements

0.8 mAP Score: Mean Average Precision across all vehicle classes
4 Model Variants: YOLOv5s, m, l, x compared systematically
15 Vehicle Classes: Brands + Types detected simultaneously
<7h Training Time: Even for the largest models on Colab

What We Discovered

Best Model Performance

YOLOv5s performed surprisingly well for our dataset size
YOLOv5l showed best loss function optimization
All models achieved 0.8 mAP with proper training
Validation accuracy exceeded training accuracy (good generalization!)

Training Insights

300 epochs were sufficient for convergence
Transfer learning significantly improved results
Confidence thresholds of 0.5-0.9 worked best
No overfitting observed across any model variant

Image Size Impact

416×416 and 640×640 both achieved 0.8 mAP
Larger images showed better localization accuracy
Training time difference was manageable on Colab
Model robustness across input sizes confirmed

Model Variant Comparison

All four YOLOv5 variants achieved similar performance:

YOLOv5s: 7M params, 0.8 mAP - Best for small datasets
YOLOv5m: 21M params, 0.8 mAP - Balanced performance
YOLOv5l: 47M params, 0.8 mAP - Best loss optimization
YOLOv5x: 87M params, 0.8 mAP - Highest capacity
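
When every variant reaches the same score, the choice comes down to size. The snippet below encodes the figures listed above and picks the smallest model that meets a target mAP; the dictionary layout and function name are illustrative.

```python
# Parameter counts (millions) and mAP as reported in the comparison above.
MODELS = {
    "YOLOv5s": {"params_m": 7, "map": 0.8},
    "YOLOv5m": {"params_m": 21, "map": 0.8},
    "YOLOv5l": {"params_m": 47, "map": 0.8},
    "YOLOv5x": {"params_m": 87, "map": 0.8},
}


def smallest_model_meeting(models, target_map):
    """Return the name of the smallest model whose mAP reaches the target."""
    candidates = [
        (spec["params_m"], name)
        for name, spec in models.items()
        if spec["map"] >= target_map
    ]
    return min(candidates)[1] if candidates else None


print(smallest_model_meeting(MODELS, 0.8))
```

With the numbers from this project the answer is YOLOv5s, which has roughly twelve times fewer parameters than YOLOv5x.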

Lessons Learned

Key Takeaways

For Small Datasets: YOLOv5s is optimal - smaller models can achieve similar performance with proper training
Transfer Learning Works: Pre-trained weights significantly improve both training speed and final accuracy
Google Colab Viable: Free GPU resources sufficient for academic deep learning projects
Data Quality Matters: High-quality annotations and systematic dataset organization are crucial

Technical Insights

Parameter Tuning: Proper selection of confidence thresholds and image sizes is critical
Model Selection: For academic projects, smaller models often perform as well as larger ones
Training Strategy: 300 epochs with transfer learning gave the best results in our experiments
Evaluation Metrics: mAP provides comprehensive performance assessment across all classes
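
For readers new to the metric, the two building blocks of mAP can be sketched as follows. A detection counts as correct when its Intersection over Union (IoU) with a ground-truth box exceeds a threshold, and mAP is the mean of the per-class Average Precision values. The source does not state which IoU threshold the reported 0.8 refers to, and the per-class values in the usage note are invented for illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union of two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def mean_average_precision(ap_per_class):
    """Average the per-class AP values into a single mAP score."""
    return sum(ap_per_class.values()) / len(ap_per_class)
```

Two 10×10 boxes that overlap by half have an IoU of 1/3, and hypothetical per-class APs of 0.9 and 0.7 average to an mAP of 0.8.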

Practical Applications

YOLO is excellent for real-time vehicle detection systems
Multi-class detection enables sophisticated automotive applications
Cloud-based training makes deep learning accessible for academic projects
Proper dataset creation is more important than model complexity

Try It Yourself

Everything you need to replicate this project is available in the interactive Jupyter notebook. The notebook includes complete Google Colab setup instructions, dataset preparation guide, step-by-step training code with explanations, and performance analysis.

Requirements

Google account (for Colab access)
Basic Python knowledge
Understanding of machine learning concepts
~2-4GB of Google Drive storage
Time: 4-6 hours for full implementation

Quick Start Guide

Open notebook in Google Colab
Run environment setup cells
Upload your dataset (or use ours)
Execute training cells
Test your trained model!

Tips for Success

Start Small: Begin with YOLOv5s for faster training and testing
Use Transfer Learning: Always start with pre-trained weights for better results
Monitor Training: Watch the loss curves to ensure proper convergence
Quality Data: Invest time in high-quality dataset annotation
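
One simple way to act on the "Monitor Training" tip is to compare the average loss over the most recent epochs with the average over the epochs just before them. The check below is a rough heuristic on a plain list of per-epoch loss values; the function name, window size, and tolerance are assumptions and are not part of YOLOv5.

```python
def has_plateaued(losses, window=10, tolerance=0.01):
    """Return True when the loss has stopped improving.

    Compares the mean of the last `window` values with the mean of the
    `window` values before them. A relative improvement below `tolerance`
    counts as a plateau; this also covers a rising loss, which is worth
    inspecting in its own right.
    """
    if len(losses) < 2 * window:
        return False  # not enough history to judge
    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous <= 0:
        return True
    return (previous - recent) / previous < tolerance
```

A flat loss curve returns True, while a curve that is still dropping steadily returns False.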

Project Details

Category: Deep Learning

Duration: 4 months

Team Size: 2 Members

Role: Lead Developer

Course: ENSC 424

Status: Completed

Research Achievements

4 YOLOv5 Variant Comparison
0.8 mAP Performance
15-Class Multi-Detection
Loss Function Analysis
Confidence Threshold Study
Transfer Learning Evaluation
Overfitting Prevention
Academic Publication Quality

Dataset Information

Total Images: 1000+

Vehicle Brands: 11 Classes

Vehicle Types: 4 Classes

Data Split: 70/20/10

Professional Quality Control
Dual-Label Annotation
Balanced Class Distribution

Key Metrics

mAP Score: 0.8

Training Time: <7 hours

Epochs: 300

Image Size: 640×640