Tutorial: AI-Powered Traffic Analysis
Detect, Track, and Count Vehicles and Pedestrians in Real-Time Using AI
This is a two-part series created by urban planning and urban technology students at University of Michigan for the Urban AI course. Part I is an overview of a topic of interest, and Part II is a replicable code tutorial.
(Part I) Overview: Computer Vision in the Built Environment
(Part II) Tutorial: AI-powered Traffic Analysis
Introduction
With cities becoming more densely populated and the number of vehicles on the streets increasing, efficient traffic monitoring is crucial for improving urban mobility and reducing congestion. Traditional traffic monitoring methods, such as human-operated cameras and road sensors, are very costly and labor-intensive. However, with advancements in Artificial Intelligence (AI) and computer vision, real-time automated traffic analysis has become a powerful tool for urban planning, traffic management, and even autonomous driving applications.
In this lab, we will explore how object detection and tracking models can be used to monitor traffic patterns. We will use YOLOv5 (You Only Look Once version 5), one of the fastest and most accurate deep learning models for real-time object detection, alongside DeepSORT, a tracking algorithm that assigns unique IDs to detected objects and follows them across multiple frames.
By the end of this lab, you will understand how real-time AI-based traffic monitoring works and learn how to modify and optimize detection parameters to improve accuracy.
Learning Objectives
Understand how AI-powered object detection works in urban sensing.
Learn how to use YOLOv5 and DeepSORT for vehicle detection and tracking.
Modify parameters to analyze different traffic scenarios and optimize detection.
Download Required Folder
Download this zip file containing starter code (.ipynb file) and some video files.
🚨 Before You Begin - Requirements
To follow along, you need:
The following dependencies installed:
pip install deep-sort-realtime
pip install yolov5
pip install opencv-python
*Note*: If package dependency issues occur, it is recommended to use a virtual environment to install the necessary packages and run the Jupyter notebook or Python file within that environment.
# Open your terminal (Mac/Linux) or command prompt (Windows) and execute the following commands
conda create -n yolov5_env python=3.12
conda activate yolov5_env
# Install necessary packages
pip install deep-sort-realtime
pip install yolov5
pip install opencv-python
# Then, add this environment to Jupyter (you may need to install ipykernel in this new env again)
python -m ipykernel install --user --name yolov5_env --display-name "Python (YOLOv5)"
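To confirm the environment is set up correctly, you can run a quick import check in a notebook cell before starting. This is just a minimal sanity-check sketch; the version string printed on your machine will differ.
# Quick sanity check: confirm that all required packages import correctly
import cv2
import yolov5
from deep_sort_realtime.deepsort_tracker import DeepSort

print("OpenCV version:", cv2.__version__)
print("YOLOv5 and DeepSORT imported successfully")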
Step 1: Load YOLOv5 and Initialize the Tracker
First, we load OpenCV (a computer vision library), the YOLOv5 model (object detection), and DeepSORT (for tracking objects across frames).
import cv2
import yolov5
import numpy as np
from collections import defaultdict
from deep_sort_realtime.deepsort_tracker import DeepSort # Import DeepSORT
# Load YOLOv5 model
# model = yolov5.load('yolov5s.pt')
model = yolov5.load('yolov5n.pt') # nano model for faster speed
# set model parameters
model.conf = 0.5 # Confidence threshold
model.iou = 0.5 # IoU threshold for non-maximum suppression (NMS)
model.agnostic = False # Class-agnostic NMS
model.multi_label = False # Multiple labels per box
model.max_det = 1000 # Max detections per image
Explaining the code block above:
model.conf is the confidence threshold (between 0 and 1)
higher = only detections the model is more certain about are kept
lower = less strict labelling, so more (possibly false) detections get through
model.iou is the "Intersection over Union" (IoU) threshold used to merge overlapping detections
IoU is the area of overlap between two bounding boxes divided by the total area covered by both bounding boxes (see the sketch after this list)
higher = more overlapping boxes are kept, which can produce duplicate detections; lower = overlapping boxes are suppressed more aggressively, which can merge nearby objects
YOLOv5 detects objects frame-by-frame.
DeepSORT assigns unique IDs to objects and tracks their movement.
The max_age=30 parameter (set when the tracker is created in Step 3) keeps a track alive for up to 30 frames even if the object is temporarily occluded.
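To make the IoU idea concrete, here is a small self-contained sketch that computes IoU for two boxes in the same [x1, y1, x2, y2] format YOLOv5 uses. The box coordinates are made-up example values.
def iou(box_a, box_b):
    """Intersection over Union for boxes in [x1, y1, x2, y2] format."""
    # Coordinates of the intersection rectangle
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Two hypothetical detections of the same car
print(iou([100, 100, 200, 200], [150, 150, 250, 250]))  # ~0.14, so at model.iou = 0.5 both boxes would be kept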
Step 2: Set Input and Output Files
# Change video_filename of the video you would like to read
video_filename = "cars_highway.mp4"
# Change tracked_objs_filename to your desired name for the output video
tracked_objs_filename = "cars_highway_counted.mp4"
Change video_filename to the filename of the video you would like to read. For example, video_filename = "cars_highway.mp4"
The script will eventually create a new video file with the object tracking analysis. Change tracked_objs_filename to your desired name (e.g., cars_highway_counted.mp4) for the output video. *Make sure the output filename ends in '.mp4'.*
BE SURE TO CHANGE THE OUTPUT FILE NAME EACH TIME YOU CHANGE THE INPUT VIDEO. This will ensure that you do not overwrite a file by accident.
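One way to avoid accidental overwrites is to derive the output name from the input name automatically. This is just a convenience sketch, not part of the starter code; it assumes video_filename is already set as above.
import os

# Derive "cars_highway_counted.mp4" from "cars_highway.mp4" automatically
base, ext = os.path.splitext(video_filename)
tracked_objs_filename = f"{base}_counted{ext}"
print(tracked_objs_filename)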
Step 3: Read Traffic Video and Detect Vehicles
Run the script below! Here is the expected output:
* A window will open, showing the video with detected vehicles highlighted.
* Count of unique vehicles will be displayed in real-time.
* The processed video will be saved in the same folder.
PLEASE NOTE:
* You might see a lot of warning or error messages; they can be ignored as long as a new window opens and you can see the object tracking working.
* The analysis can take a long time, so be sure to wait until the video display is finished updating before moving forward.
cap = cv2.VideoCapture(video_filename)

# Get video properties
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
fps = int(cap.get(cv2.CAP_PROP_FPS))

# Define VideoWriter to save output video
fourcc = cv2.VideoWriter_fourcc(*"mp4v")  # Codec
out = cv2.VideoWriter(tracked_objs_filename, fourcc, fps, (width, height))

# Initialize DeepSORT tracker
tracker = DeepSort(max_age=30)  # Keep a track alive for up to 30 frames without a match

# Set of track IDs that have already been counted
unique_objects = set()
# Cumulative count of unique objects per class
frame_counts = defaultdict(int)

window_name = "object detection"

# Process video frame by frame
while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break  # Stop at end of video

    # Run YOLOv5 inference
    results = model(frame, size=640)

    # Get predictions as a NumPy array: [x1, y1, x2, y2, confidence, class]
    predictions = results.xyxy[0].cpu().numpy()
    boxes = predictions[:, :4]  # Bounding boxes
    scores = predictions[:, 4]  # Confidence scores
    categories = predictions[:, 5].astype(int)  # Class IDs

    # Prepare detections for DeepSORT
    # Format: ([x, y, width, height], confidence, class_id)
    detections = []
    for box, score, category in zip(boxes, scores, categories):
        x1, y1, x2, y2 = map(int, box)
        detections.append(([x1, y1, x2 - x1, y2 - y1], score, int(category)))

    # Update tracker with new detections
    tracked_objects = tracker.update_tracks(detections, frame=frame)

    # Draw bounding boxes and count unique objects
    for track in tracked_objects:
        if not track.is_confirmed():
            continue  # Skip unconfirmed tracks

        track_id = track.track_id  # Unique track ID
        x1, y1, w, h = map(int, track.to_tlwh())  # Bounding box (top-left x, y, width, height)
        class_id = track.det_class if hasattr(track, 'det_class') else None

        if class_id is not None:
            label = f"{model.names[class_id]} ID:{track_id}"
        else:
            label = f"ID:{track_id}"  # Fallback if class_id is missing

        color = (0, 255, 0)  # Green for bounding boxes
        cv2.rectangle(frame, (x1, y1), (x1 + w, y1 + h), color, 2)
        cv2.putText(frame, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

        # Only count an object the first time its track ID appears
        if track_id not in unique_objects:
            unique_objects.add(track_id)
            if class_id is not None:
                frame_counts[model.names[class_id]] += 1

    # Display the running counts on the video
    y_offset = 20
    for obj, count in frame_counts.items():
        cv2.putText(frame, f"{obj}: {count}", (10, y_offset), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 0, 255), 2)
        y_offset += 30  # Move text down for the next class

    # Show the frame in a window
    cv2.imshow(window_name, frame)
    cv2.waitKey(1)

    # Write frame to output video
    out.write(frame)

    # Press 'q' to exit early (uncomment to enable)
    # if cv2.waitKey(1) & 0xFF == ord('q'):
    #     break

# Release resources
cap.release()
out.release()
Explaining the code block above:
YOLOv5 detects objects (cars, buses, etc.) in each frame.
DeepSORT assigns an ID to each object and tracks its movement.
A bounding box is drawn around each detected vehicle.
The final object count is displayed on-screen and printed at the end.
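By default, YOLOv5 will detect and count every COCO class it recognizes, not just traffic-related ones. If you only care about vehicles and pedestrians, one option (a sketch, not part of the starter code) is to filter detections by COCO class ID before passing them to DeepSORT, replacing the detection loop in Step 3:
# COCO class IDs relevant to traffic analysis:
# 0 = person, 1 = bicycle, 2 = car, 3 = motorcycle, 5 = bus, 7 = truck
TRAFFIC_CLASSES = {0, 1, 2, 3, 5, 7}

# Filtered version of the detection loop from Step 3
detections = []
for box, score, category in zip(boxes, scores, categories):
    if int(category) not in TRAFFIC_CLASSES:
        continue  # Ignore non-traffic objects
    x1, y1, x2, y2 = map(int, box)
    detections.append(([x1, y1, x2 - x1, y2 - y1], score, int(category)))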
Step 4: Closing the window and printing final object counts
Run the block below to close the window and print the final counts of the objects. If the window does not close, try running the code block again.
If the window still isn't closing, you can close the window by restarting the kernel. You'll have to re-run the file afterwards. In Jupyter Notebook, go to "Kernel" then "Restart Kernel...". In VS Code, there should be a restart icon at the top of the screen.
cv2.waitKey(1)
cv2.destroyWindow(window_name)
print("\nFINAL UNIQUE OBJECT COUNTS:")
for obj, count in frame_counts.items():
print(f"{obj}: {count}")
Practice Questions, Solo Experimentation
Try these tasks to explore AI-powered urban sensing further.
1. Experiment with different videos
Using the video folder we provided for you, try analyzing different traffic videos and observe how detection performs under different conditions (day/night, heavy rain, grainy video, etc.).
2. Adjust YOLO Parameters
Modify the confidence threshold (model.conf) and IoU threshold (model.iou) values to see how they affect detection accuracy. (A quick sweep sketch follows these questions.)
What happens when you increase model.conf above 0.5?
What happens when you reduce it to 0.1?
What about when you increase/decrease model.iou?
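One way to explore these questions systematically is to run detection on a single frame at several confidence thresholds and compare how many objects are returned. A rough sketch; the threshold values here are just examples.
# Read one frame from the video to experiment on
cap = cv2.VideoCapture(video_filename)
ret, frame = cap.read()
cap.release()

# Compare detection counts at different confidence thresholds
for conf in [0.1, 0.3, 0.5, 0.7]:
    model.conf = conf
    results = model(frame, size=640)
    n_detections = len(results.xyxy[0])
    print(f"conf={conf}: {n_detections} detections")

model.conf = 0.5  # Restore the original threshold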
3. Experiment with max_age
Change the tracking persistence (max_age) in DeepSort(max_age=30). This parameter affects whether an object keeps the same ID when it is briefly missed or occluded across frames.
How does reducing max_age to 10 affect tracking accuracy (if at all)?
4. Use a larger model
We are currently using the nano model of YOLO. You can change the following line in Step 1 from
"model = yolov5.load('yolov5n.pt')"
to
"model = yolov5.load('yolov5s.pt')"
This will use the "small" model instead of the "nano" model. How does this change accuracy, and how long does the script take to run? (A simple timing sketch follows below.)
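To measure the speed difference between model sizes, you could time one inference pass for each model on the same frame. A rough sketch using Python's time module; note that the first call to a model can include one-time warm-up cost, so timings are only indicative.
import time

# Time one inference pass for the nano vs. small model on the same frame
cap = cv2.VideoCapture(video_filename)
ret, frame = cap.read()
cap.release()

for weights in ["yolov5n.pt", "yolov5s.pt"]:
    m = yolov5.load(weights)
    start = time.time()
    m(frame, size=640)
    print(f"{weights}: {time.time() - start:.3f} s per frame")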
Video Sources:
"cars_highway.mp4": Traffic Flow In The Highway - 4K Stock Videos | NoCopyright | AllVideoFree, Free Edit Market.
"highway_two_way.mp4": Image of traffic on the road passing between buildings in the city, the preferred vehicles for land transportation are usually cars Free Video, vecteezy.com. https://www.vecteezy.com/video/23272130-image-of-traffic-on-the-road-passing-between-buildings-in-the-city-the-preferred-vehicles-for-land-transportation-are-usually-cars
"heavy_traffic.mp4": FREE STOCK FOOTAGE - Heavy traffic, M.M Films.
"LA_intersection_dark.mp4": Los Angeles Intersection Stock Video, Motion Array.
"nyc_sidewalk_grainy.mp4": People Walking by on a Sidewalk - 4K Stock Videos | NoCopyright | AllVideoFree, Free Edit Market.
"paris_street_nighttime.mp4": Paris Rainy Street Stock Footage, Free Stock Footage by Motion Places.