Learn how to use the cutting-edge, real-time object detection system You Only Look Once (YOLO) for traffic detection.
Object detection is a crucial computer vision technique that identifies and locates objects within images or videos. Unlike simple classification, which only categorizes objects, object detection provides their precise spatial locations through bounding boxes. This technique is essential for various applications, enabling machines to interpret and understand visual data similarly to human perception.
Object detection plays a vital role in traffic data, offering benefits for road safety, traffic management, and urban planning. By accurately identifying and tracking vehicles, pedestrians, and other objects, it enables more efficient and intelligent transportation systems. In traffic management, its key benefits include enhanced safety through early warning of potential hazards, improved traffic flow from real-time data that supports dynamic traffic signal adjustments, accurate collection of traffic-pattern data, automated incident detection for faster response times, and support for autonomous vehicles that must navigate safely.
Traffic detection involves monitoring and analyzing traffic conditions to manage roadways efficiently, using technologies such as cameras to collect data on vehicle movement, traffic volume, and congestion levels. This data is then used to optimize traffic flow, enhance road safety, and support urban planning. Its key benefits include real-time traffic management, with timely updates to drivers and efficient signal adjustments; reduced congestion, by identifying and relieving bottlenecks; enhanced safety, through quick identification of and response to incidents; and data-driven planning for better road networks and infrastructure projects.
YOLO (You Only Look Once) is a state-of-the-art, real-time object detection system designed to detect objects in images and videos with high accuracy and speed. It was introduced by Joseph Redmon and his collaborators in the paper "You Only Look Once: Unified, Real-Time Object Detection." Two ideas are central to its design:
Single Shot Detection: Unlike traditional detection pipelines that apply a classifier to many regions of an image, YOLO processes the whole image in a single pass. This significantly reduces computation time and increases detection speed.
Grid System: YOLO divides the input image into a grid of cells. Each cell is responsible for predicting a certain number of bounding boxes and their corresponding confidence scores.
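To make the grid idea concrete, here is a minimal, illustrative sketch of how a bounding-box centre maps to a grid cell. The 640x640 input size and the 7x7 grid come from the original YOLOv1 paper; they are not the internal configuration of the Ultralytics model used below.
# Illustrative sketch of the YOLO grid idea (not the Ultralytics model's internals)
image_size = 640  # assume a square 640x640 input image
S = 7  # grid size used in the original YOLOv1 paper
cell_size = image_size / S
# The cell containing an object's centre is responsible for predicting its box
center_x, center_y = 350, 120  # hypothetical object centre in pixels
cell_col = int(center_x // cell_size)  # column of the responsible cell
cell_row = int(center_y // cell_size)  # row of the responsible cell
print(f"Cell ({cell_row}, {cell_col}) predicts this object's box and confidence")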
First, import all the required libraries.
import gradio as gr
import cv2
import math
import numpy as np
from PIL import Image
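If any of these packages are missing, they can be installed with pip install ultralytics gradio opencv-python numpy pillow.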
Next, we’ll use the Ultralytics library to load a pretrained YOLO model.
from ultralytics import YOLO
# Load the YOLO model
model = YOLO("yolov9c.pt")
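Optionally, you can check that the model loaded correctly and inspect the class names it was trained on. The file name sample.jpg below is only a placeholder for any local image.
# Optional sanity check: inspect the class-index-to-name mapping and run a quick prediction
print(model.names)  # {0: 'person', 1: 'bicycle', 2: 'car', ...}
results = model("sample.jpg")  # "sample.jpg" is a placeholder for any local image
print(len(results[0].boxes))  # number of objects detected in that image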
Next, we’ll define the traffic-related object classes we want to detect and count. The names match the COCO labels the model reports, so we can filter detections by name.
# Define the traffic-related object classes we want to count
classNames = ["person", "bicycle", "car", "motorcycle", "bus", "train", "truck", "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "cow", "backpack", "handbag", "suitcase"]
We will now define the function that runs detection on an image, draws bounding boxes around the objects of interest, and counts them.
def detect_objects(image):
    # Convert the PIL image to OpenCV's BGR format
    img = cv2.cvtColor(np.array(image), cv2.COLOR_RGB2BGR)

    # Run object detection
    results = model(img)

    # Initialize a counter for each class of interest
    object_counts = {class_name: 0 for class_name in classNames}

    # Process the results
    for r in results:
        boxes = r.boxes
        for box in boxes:
            cls = int(box.cls[0])          # Class index predicted by the model
            class_name = model.names[cls]  # Look up the class name from the model
            if class_name in classNames:   # Keep only the traffic-related classes
                object_counts[class_name] += 1  # Increment the counter for the detected class

                # Get bounding box coordinates and convert them to int values
                x1, y1, x2, y2 = box.xyxy[0]
                x1, y1, x2, y2 = int(x1), int(y1), int(x2), int(y2)

                # Draw the bounding box
                cv2.rectangle(img, (x1, y1), (x2, y2), (255, 0, 255), 3)

                # Get the confidence score, rounded to two decimal places
                confidence = math.ceil(box.conf[0] * 100) / 100

                # Draw the label above the bounding box
                org = (x1, y1 - 10)  # Position of the label
                font = cv2.FONT_HERSHEY_SIMPLEX
                fontScale = 0.5
                color = (255, 0, 0)
                thickness = 2
                label = f"{class_name}: {confidence}"
                cv2.putText(img, label, org, font, fontScale, color, thickness)

    # Convert the image back to PIL format
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = Image.fromarray(img)

    # Create a summary of the detected objects
    summary = ", ".join([f"{count} {name}" for name, count in object_counts.items() if count > 0])
    return img, f"Detected objects: {summary}"
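Before wiring the function into a user interface, it can be tested directly on a local file; traffic.jpg below is a hypothetical file name.
# Quick local test of detect_objects ("traffic.jpg" is a hypothetical example file)
test_image = Image.open("traffic.jpg")
annotated, summary = detect_objects(test_image)
annotated.save("traffic_annotated.jpg")
print(summary)  # e.g. "Detected objects: 3 car, 1 bus, 2 person"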
As the final step, we’ll build an interactive Gradio interface.
# Gradio interface
iface = gr.Interface(
    fn=detect_objects,
    inputs=gr.Image(type="pil"),
    outputs=[gr.Image(type="pil"), gr.Text()],
    title="Object Detection with YOLO",
    description="Upload an image to detect objects and get the total count."
)
# Launch the Gradio interface
iface.launch()
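By default, launch() serves the app on a local URL. When running on a remote machine or in a notebook, Gradio can also generate a temporary public link:
# Optional: create a temporary public URL instead of the local one
# iface.launch(share=True)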
Object detection with YOLO (You Only Look Once) is a breakthrough in computer vision. It quickly identifies and categorizes objects in real time, which is crucial for applications like autonomous vehicles and traffic control. By using live traffic cameras and advanced algorithms, these systems improve road safety, streamline traffic, and manage incidents effectively. YOLO's speed in analyzing images and videos is key to deploying smart systems across various fields, advancing AI's ability to understand and apply visual data.