This tutorial aims to equip learners with the skills to train, deploy, and integrate YOLO-based fire detection models with multimodal alert systems.
Fire is a significant cause of fatalities, property damage, and economic disruption worldwide. Case in point: the January 2025 Southern California wildfires, a series of destructive fires that raged between January 7 and 31, affecting the Los Angeles metropolitan area and San Diego County in California, United States. Every year, countless fire incidents lead to immense devastation, so adopting innovative and efficient early warning technologies is crucial to mitigating their impact. However, fire detection methods face hurdles such as false alarms, detecting fire in complex environments, and real-time processing constraints.
Vision AI: Vision AI leverages advanced machine learning and computer vision techniques to analyze and interpret visual data, enabling machines to understand images and videos. This enhances automation, decision-making, and real-time monitoring across various applications, including fire detection.
Object Detection: Object detection is a key component of computer vision that identifies and localizes objects within an image or video. In fire detection systems, it aids in the highly accurate recognition of smoke and flames, facilitating prompt reactions and lowering the possibility of false alarms.
YOLO (You Only Look Once) is a state-of-the-art, real-time object detection system, best known for processing an entire image in a single pass, which makes it fast enough for live video while remaining accurate.
In this guide, we'll build a fire hazard detection system using advanced AI tools. We'll use YOLO for quick real-time object detection, add multimodal and vision models for better analysis, and integrate Zapier to automate alerts and responses.
Before we start, please make sure you have the following in place:
We will use Google Colab to train our model. Why? It offers free GPU access (such as an NVIDIA T4), comes with common machine learning libraries pre-installed, and requires no local setup.
!nvidia-smi
!pip -q install ultralytics opencv-python anthropic
from ultralytics import YOLO
from IPython import get_ipython
from IPython.display import display, Image
from google.colab.patches import cv2_imshow
from anthropic import Anthropic
import cv2
import os
import time, base64
There are three essential blocks in the YOLO architecture, and everything happens within them: the Backbone, the Neck, and the Head. These are described below:
Backbone: Extracts feature maps from the input image through successive convolutional layers.
Neck: Aggregates and fuses features across multiple scales so that objects of different sizes can be detected.
Head: Produces the final predictions: bounding boxes, confidence scores, and class labels.
Transfer Learning: Uses a pre-trained YOLO model (trained on large datasets like COCO) and fine-tunes it on fire detection data. This approach significantly reduces training time and improves accuracy when labeled data is limited.
Training from Scratch: Initializes YOLO with random weights and trains it entirely on a domain-focused dataset. This requires extensive data and computational power but allows full customization.
Comparison: Transfer learning is preferred for rapid deployment and limited data availability while training from scratch is beneficial for highly customized models.
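To make the distinction concrete, here is a minimal sketch of how the two approaches differ in the Ultralytics Python API (we use the CLI for the actual training below; the file names are the standard Ultralytics ones):

from ultralytics import YOLO

# Transfer learning: load COCO-pretrained weights, then fine-tune on fire data
model = YOLO("yolov8m.pt")

# Training from scratch: build the same architecture from its YAML definition,
# starting from randomly initialized weights
model_scratch = YOLO("yolov8m.yaml")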
Let’s dive back into our notebook to train our model on the dataset we exported earlier from Roboflow.
!yolo task=detect mode=train model=yolov8m.pt data={dataset.location}/data.yaml epochs=20 imgsz=640 plots=True
The training process will take approximately 3-4 hours on a T4 GPU.
Model evaluation and optimization are important. Evaluate the performance of your fire detection model using several key metrics:
4. F1-Score: The F1-score combines precision and recall into a single metric that balances the two. It is the harmonic mean of precision and recall, providing a good overall indicator of model performance when both false positives and false negatives are important to minimize.
F1 = 2 × (Precision × Recall) / (Precision + Recall)
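For example, with an illustrative precision of 0.90 and recall of 0.80 (hypothetical numbers, not our actual results), F1 = 2 × (0.90 × 0.80) / (0.90 + 0.80) ≈ 0.85.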
Let’s go back to our notebook and check our results for our validation dataset:
!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data={dataset.location}/data.yaml
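If you prefer the Python API over the CLI, a minimal equivalent sketch looks like this (assuming the same weights path and the Roboflow `dataset` object from earlier):

from ultralytics import YOLO

# Load the fine-tuned weights and run validation on the dataset's val split
model = YOLO("runs/detect/train/weights/best.pt")
metrics = model.val(data=f"{dataset.location}/data.yaml")

print(metrics.box.map50)  # mAP at IoU threshold 0.50
print(metrics.box.map)    # mAP averaged over IoU 0.50-0.95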
Here are the precision, recall, and F1 scores for both fire and smoke, based on the validation metrics:
Fire Class:
Smoke Class:
Real-time inference with OpenCV involves loading a trained YOLO model, reading the video stream frame by frame, running predictions on each frame in real time, and drawing the detections as the video plays.
For this part of the implementation, we will use a Fire Video from Shutterstock. Download it and upload it to your Colab session.
model_path = "/content/runs/detect/train/weights/best.pt"
# In a Colab notebook, this is where you will find your trained model

# Check if the weights file exists before loading
if not os.path.exists(model_path):
    raise FileNotFoundError(f"Model file not found at: {model_path}")

# Load your model
model = YOLO(model_path)
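As an optional sanity check, you can print the class names stored in the weights; for this dataset they should include fire and smoke (the exact index mapping below is an assumption):

print(model.names)  # e.g. {0: 'fire', 1: 'smoke'}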
3. Initialize a VideoCapture object to handle the video stream:
cap = cv2.VideoCapture("/content/fire2.mp4")  # Path to your video file

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break  # End of video

    # Run inference on the current frame
    results = model.predict(frame, conf=0.5)
    for r in results:
        for box, conf, cls in zip(r.boxes.xyxy, r.boxes.conf, r.boxes.cls):
            label = r.names[int(cls)]
            if label == "fire":
                x1, y1, x2, y2 = map(int, box)  # Get bounding box coordinates
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
                cv2.putText(frame, f"{label} {conf:.2f}", (x1, y1 - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
                # Save a resized snapshot of the detection for the vision model step below
                resized_frame = cv2.resize(frame, (640, 360))
                cv2.imwrite("snapshot.jpg", resized_frame)

    cv2_imshow(frame)

cap.release()
Results:
This cell will take a while to execute; the expected result is a sequence of frames, either with no detections or with red bounding boxes marking detected fire.
Vision AI Models:
Vision AI models use deep learning techniques to interpret and understand visual data. These models are designed to identify objects, detect anomalies, and segment images for tasks like object detection, facial recognition, and image classification.
Anthropic’s Claude 3 Opus:
Claude 3 Opus, the most advanced model in Anthropic’s Claude 3 family, is designed for superior language understanding and generation. It excels in complex reasoning, multilingual proficiency, and contextual awareness while maintaining a strong focus on safety and ethical AI alignment. As a multimodal model, it can process and analyze both text and visual data, making it highly effective for applications requiring deep comprehension, structured output, and real-time adaptability across various domains.
Begin by setting up the Anthropic API key: store it as a secret in your Colab and share access with the notebook you are working on.
You can access the key using the code:
from google.colab import userdata
API_KEY = userdata.get('anthropic_api_key')
client = Anthropic(api_key=API_KEY)
Setting up the model to generate a warning for fire detection:
MODEL_NAME = "claude-3-opus-20240229"

# Read the snapshot saved during inference and encode it as base64
with open("/content/snapshot.jpg", "rb") as image_file:
    binary_data = image_file.read()
base_64_encoded_data = base64.b64encode(binary_data)
base64_string = base_64_encoded_data.decode('utf-8')

# Prepare the message for the Anthropic API
message_list = [
    {
        "role": "user",
        "content": [
            {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": base64_string}},
            {"type": "text", "text": "Generate a fire alarm warning to alert the user about the detected fire and specify its location. Ensure the message is clear and concise."}
        ]
    }
]

# Request the warning message from the API
response = client.messages.create(
    model=MODEL_NAME,
    max_tokens=100,
    messages=message_list
)
print(response.content[0].text)
Results:
You will get a warning like the one below, describing the surroundings where the fire has started and alerting the user.
WARNING: Fire detected coming from electrical outlet! Unplug all devices immediately and evacuate the area. Call emergency services if the fire spreads or cannot be quickly extinguished safely.
That's it: we have successfully trained our model. Let's export the trained model weights so we can use them in a web application.
from google.colab import files
files.download('/content/runs/detect/train/weights/best.pt')
Now we will develop a web application that prompts the user to upload a video; if fire is detected, a warning is generated and the user is notified through Gmail. The frame-processing and warning-generation code will be largely the same as in the Jupyter Notebook.
The key new piece is sending alerts to the user over Gmail, and for this we will use Zapier.
We will get back to this later. For now, head to your IDE.
If Miniconda is not already installed, you can get it from here. Miniconda should be up and running with your IDE; if you face any issues, follow this guide.
Start by creating a folder for your project. In the terminal, navigate to that directory and create a virtual environment using:
conda create -p C:\Users\HP\OneDrive\Desktop\testing\testenv2 python=3.12
*Replace "C:\Users\HP\OneDrive\Desktop\testing" with the path of your directory and "testenv2" with your environment name.
Check for the environment; this should return your environment’s name:
conda env list
Activate environment:
conda activate .\testenv2
Now that the environment is activated, create a file in the same folder, requirements.txt, and include these packages:
streamlit
opencv-python
ultralytics
anthropic
python-dotenv
requests
Install the requirements:
pip install -r requirements.txt
Create another file, .env, and store your Zapier webhook URL and API keys here:
ANTHROPIC_API_KEY="your-anthropic-api-key"
EMAIL_ADDRESS=user@gmail.com
EMAIL_PASSWORD=password123
ZAPIER_WEBHOOK_URL=https://hooks.zapier.com/hooks/catch/abcde/fghij
Create another file, app.py, in the same folder.
Our main code will live in this file, divided into four essential blocks or functions: loading the YOLO model, processing frames, generating the warning, and sending the notification via Zapier.
# Import necessary modules
import streamlit as st
import cv2
import time
import base64
from ultralytics import YOLO
from anthropic import Anthropic
import os
from dotenv import load_dotenv
import requests
Load the environment variables, fetch your API key, and create another global variable that tracks the warning message generated by the vision model.
load_dotenv()

# Global variable to store the warning message
warning_message = None

# Initialize the Anthropic client
ANTHROPIC_API_KEY = os.getenv("ANTHROPIC_API_KEY")
client = Anthropic(api_key=ANTHROPIC_API_KEY)

# Zapier webhook URL, used later to send notifications
ZAPIER_WEBHOOK_URL = os.getenv("ZAPIER_WEBHOOK_URL")
# (Cached to prevent reloading the model on every Streamlit rerun)
@st.cache_resource
def load_yolo_model():
    """Load and return the YOLO model for fire detection."""
    model_path = "C:\\Users\\HP\\Downloads\\best.pt"  # Path to the exported weights
    return YOLO(model_path)
def process_frame(frame, model):
    """
    Processes a video frame using the YOLO model to detect fire.
    Args:
        frame (numpy array): The video frame.
        model (YOLO): The preloaded YOLO model.
    Returns:
        tuple: Processed frame and a boolean indicating fire detection.
    """
    results = model.predict(frame, conf=0.3)
    fire_detected = False
    for r in results:
        for box, conf, cls in zip(r.boxes.xyxy, r.boxes.conf, r.boxes.cls):
            label = r.names[int(cls)]
            if label == "fire":
                fire_detected = True
                x1, y1, x2, y2 = map(int, box)
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 0, 255), 2)
                cv2.putText(frame, f"{label} {conf:.2f}", (x1, y1 - 10),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
    return frame, fire_detected
def generate_warning(frame):
    """
    Generates a fire warning message using the Anthropic API and takes a snapshot.
    Args:
        frame (numpy array): The video frame where fire is detected.
    Returns:
        str: The generated warning message.
    """
    global warning_message

    # Save a snapshot of the frame
    snapshot_path = "snapshot.jpg"
    resized_frame = cv2.resize(frame, (640, 360))
    cv2.imwrite(snapshot_path, resized_frame)

    # Encode the snapshot as base64
    with open(snapshot_path, "rb") as image_file:
        binary_data = image_file.read()
    base64_encoded_data = base64.b64encode(binary_data).decode('utf-8')

    # Prepare the message for the Anthropic API
    message_list = [
        {
            "role": "user",
            "content": [
                {"type": "image", "source": {"type": "base64", "media_type": "image/jpeg", "data": base64_encoded_data}},
                {"type": "text", "text": "Generate a fire alarm warning to alert the user about the detected fire and specify its location. Ensure the message is clear and concise."}
            ]
        }
    ]

    # Get the warning message from the API
    response = client.messages.create(
        model="claude-3-opus-20240229",
        max_tokens=80,
        messages=message_list
    )
    warning_message = response.content[0].text
    return warning_message
# Function to send a fire warning notification via Zapier webhook
def send_notification_via_zapier(warning_message):
    """
    Sends fire warning notification using Zapier webhook.
    Args:
        warning_message (str): The warning message to be sent.
    """
    data = {
        "message": warning_message,
        "alert_type": "Fire Detected 🚨",
    }
    try:
        response = requests.post(ZAPIER_WEBHOOK_URL, json=data)
        if response.status_code == 200:
            print("Notification sent successfully via Zapier!")
        else:
            print(f"Failed to send notification. Status code: {response.status_code}")
    except Exception as e:
        print(f"Error sending notification: {e}")
# Main Streamlit Application
def main():
    """Main function to run the Streamlit application for fire hazard detection."""
    global warning_message  # Use the global variable to persist warnings

    st.title("Fire Hazard Detection WebApp")
    st.sidebar.header("Controls")
    video_file = st.sidebar.file_uploader("Upload a video file", type=["mp4", "avi", "mov"])

    if video_file:
        # Save uploaded video locally
        video_path = video_file.name
        with open(video_path, "wb") as f:
            f.write(video_file.read())
        cap = cv2.VideoCapture(video_path)
        st.sidebar.text("Video uploaded successfully.")

        # Load the YOLO model
        model = load_yolo_model()

        video_placeholder = st.empty()
        warning_placeholder = st.empty()

        while cap.isOpened():
            ret, frame = cap.read()
            if not ret:
                break

            # Process the frame
            processed_frame, fire_detected = process_frame(frame, model)
            video_placeholder.image(cv2.cvtColor(processed_frame, cv2.COLOR_BGR2RGB), channels="RGB")

            # Generate warning if fire is detected
            if fire_detected and warning_message is None:
                warning_message = generate_warning(frame)  # Generate warning
                send_notification_via_zapier(warning_message)  # Send notification via Zapier

            # Display the warning message (persist if already generated)
            if warning_message:
                warning_placeholder.error(warning_message)

            time.sleep(0.1)  # Add delay for smooth video display

        cap.release()

if __name__ == "__main__":
    main()
streamlit run app.py
This will start a web application.
Click on the sidebar, and upload a video with fire. You can upload the same video we used in our notebook.
Keep this application running until you see a warning generated.
Soon after this, head to your Zap drafts. Here you will see a new record; select it and click Continue with the selected record.
This page should appear – choose Gmail.
This setup box should open. Under Action Event, choose Send Email, add your Gmail account in the account tab, then click Continue.
In the next part, set the receiver’s email address and other details.
Add a subject and, for the body, click on the + sign and choose Message.
Finish and click on Continue; on the next page, click Test Step.
You can check the received mail.
Note: If Gmail setup fails with Streamlit, you can resolve the issues by following the steps mentioned in this discussion.
Hurray! With this, you have successfully set up the alert system.
Launching an AI-powered application or building an AI feature doesn’t require massive upfront investments or a dedicated internal team. Superteams.ai enables businesses to start with a focused, cost-effective proof-of-concept—using your existing data—to validate ROI before scaling.
Ready to get started?
Let’s discuss your data, goals, and challenges. In 30 minutes, we’ll outline a roadmap to build an AI system that delivers accurate, reliable, and actionable results—not hallucinations.
Request a meeting now: