Object Detection

A variety of object detection models for different applications

PyTorch YOLO SSD COCO OpenAI ROS
Computer Vision Project

Overview

Below are a few of the object detection models I've implemented and used in the past. They range from state-of-the-art off-the-shelf models to custom implementations developed for specific project tasks. Several were trained on custom datasets and have been deployed on a variety of robotic platforms.

Key Features

  • Real-time object detection
  • Custom implementations
  • Custom datasets
  • Custom models for specific tasks
  • Integration with ROS (Robot Operating System)

Technical Implementation

Bird Detection and Classification Model

Specialized YOLOv6 model trained on the Caltech-UCSD Birds dataset. Achieves high accuracy in detecting birds in natural environments.

YOLOv6 PyTorch Caltech-UCSD Birds Dataset

Performance Metrics

--% mAP@0.5
2.52ms Inference Time

Training Details

Built on the pretrained YOLOv6s weights and fine-tuned on the Caltech-UCSD Birds dataset, an extensive collection of 11,788 images covering 200 bird species. The model was trained for 100 epochs with a batch size of 16 and an initial learning rate of 0.01.
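Because the Caltech-UCSD annotations ship as plain-text metadata files rather than YOLO-format labels, the dataset has to be converted before fine-tuning. The sketch below shows one way to do that conversion; the directory layout follows the standard CUB_200_2011 release, but the paths and output folder are assumptions, not the project's actual preprocessing code.

```python
# Minimal sketch: convert CUB-200-2011 metadata (images.txt,
# bounding_boxes.txt, image_class_labels.txt) into YOLO-format label files.
# Paths and the "labels" output folder are assumptions.
from pathlib import Path
from PIL import Image

CUB_ROOT = Path("CUB_200_2011")          # assumed dataset location
LABELS_OUT = CUB_ROOT / "labels"         # assumed output folder
LABELS_OUT.mkdir(parents=True, exist_ok=True)

def read_table(name):
    """Read a whitespace-separated CUB metadata file keyed by image id."""
    table = {}
    with open(CUB_ROOT / name) as f:
        for line in f:
            image_id, *rest = line.split()
            table[image_id] = rest
    return table

images = read_table("images.txt")               # id -> [relative image path]
boxes = read_table("bounding_boxes.txt")        # id -> [x, y, w, h] in pixels
classes = read_table("image_class_labels.txt")  # id -> [1-indexed class id]

for image_id, (rel_path,) in images.items():
    img_w, img_h = Image.open(CUB_ROOT / "images" / rel_path).size

    x, y, w, h = map(float, boxes[image_id])
    class_idx = int(classes[image_id][0]) - 1   # YOLO classes are 0-indexed

    # Convert top-left corner + size (pixels) to normalized center + size.
    x_c = (x + w / 2) / img_w
    y_c = (y + h / 2) / img_h
    label = f"{class_idx} {x_c:.6f} {y_c:.6f} {w / img_w:.6f} {h / img_h:.6f}\n"

    (LABELS_OUT / Path(rel_path).with_suffix(".txt").name).write_text(label)
```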

Weed Detection Model

Custom SSD (Single Shot Detector) model designed for MAPLEE (Modular Autonomous Platform for Landscaping and Environmental Engineering), trained on a custom blend of datasets containing 8,000+ labeled images. It identifies weeds in environments such as grassy fields and brick walkways, enabling MAPLEE to locate and pull weeds across the Northeastern campus.

  • SSD architecture
  • Detection in various environments (grass, brick)
  • Custom dataset
  • Integration with ROS

Performance Metrics

--% mAP@0.5
6 FPS Inference Speed

Training Details

The custom implementation of the SSD model allowed us to change the resolution of the input image, training both an SSD300 model and an SSD512 model.
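As a rough illustration of that design, the sketch below treats the input resolution as a single configuration value so the SSD300 and SSD512 variants share one implementation. The class and field names are illustrative rather than the project's actual code; the feature-map grid sizes come from the original SSD paper.

```python
# Illustrative sketch (assumed names, not the project's actual code): the
# input resolution is one config value, so SSD300 and SSD512 share a single
# implementation. Feature-map grid sizes follow the original SSD paper.
from dataclasses import dataclass

@dataclass
class SSDConfig:
    input_size: int           # edge length of the square input image
    feature_map_sizes: tuple  # one grid size per detection head
    num_classes: int = 2      # assumed: background + weed

SSD300_CONFIG = SSDConfig(input_size=300, feature_map_sizes=(38, 19, 10, 5, 3, 1))
SSD512_CONFIG = SSDConfig(input_size=512, feature_map_sizes=(64, 32, 16, 8, 4, 2, 1))
```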

Integration Details

Although it was not used in the final system, the SSD300 model was deployed on MAPLEE's Jetson Xavier NX with ROS. During live testing it ran at only 6 FPS, significantly slower than the YOLOv8n model (~15 FPS) that I ended up training and using for the final implementation.
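For context, the on-robot integration follows the usual rospy pattern of subscribing to a camera topic, running the detector on each frame, and publishing the results. The sketch below shows that pattern only; the topic names, the detect() placeholder, and the JSON output format are assumptions, not MAPLEE's actual interfaces.

```python
#!/usr/bin/env python3
# Minimal rospy sketch of the on-robot detection loop. Topic names, the
# detect() placeholder, and the JSON output format are assumptions.
import json

import rospy
from cv_bridge import CvBridge
from sensor_msgs.msg import Image
from std_msgs.msg import String

bridge = CvBridge()
detections_pub = None

def detect(frame):
    """Placeholder for the detector forward pass; returns a list of
    (label, confidence, [x1, y1, x2, y2]) tuples."""
    return []

def image_callback(msg):
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    detections_pub.publish(String(data=json.dumps(detect(frame))))

if __name__ == "__main__":
    rospy.init_node("weed_detector")
    detections_pub = rospy.Publisher("/weed_detections", String, queue_size=1)
    rospy.Subscriber("/camera/image_raw", Image, image_callback, queue_size=1)
    rospy.spin()
```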

Trash Detection Model

Fine-tuned YOLOv8n model for environmental monitoring and waste management. Detects various types of litter and debris, supporting MAPLEE and CARL-T (Compact Autonomous Robot for Locating Trash) in locating trash and autonomously collecting it across the Northeastern campus.

  • YOLOv8n architecture
  • Custom dataset
  • Integration with ROS

Performance Metrics

--% mAP@0.5
15 FPS Inference Speed

Training Details

Trained on a diverse dataset of 12,000+ images containing various types of litter and debris. The model incorporates instance segmentation for precise boundary detection, enabling accurate waste classification and volume estimation.
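A fine-tuning run along these lines can be expressed in a few lines with the Ultralytics API. In the sketch below, the yolov8n-seg.pt starting checkpoint and the train() call come from the Ultralytics documentation, while the trash.yaml dataset config and the hyperparameter values are placeholders rather than the exact settings used for this model.

```python
# Minimal sketch of fine-tuning a YOLOv8n segmentation model with the
# Ultralytics API. trash.yaml and the hyperparameters are placeholders.
from ultralytics import YOLO

# Start from the pretrained nano segmentation checkpoint, since the trash
# model relies on instance segmentation for precise boundary detection.
model = YOLO("yolov8n-seg.pt")

# trash.yaml would point at the 12,000+ labeled litter/debris images.
model.train(data="trash.yaml", epochs=100, imgsz=640, batch=16)
```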

Integration Details

The YOLOv8n model was implemented on MAPLEE's Jetson Xavier NX as well as CARL-T's Raspberry Pi 4 with ROS. For the Raspberry Pi implementation, the model was offloaded to a ROS base station using action servers. This allowed the robot to use the model in real time, though at a slightly slower effective inference speed of 12 FPS.
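The base-station side of that offload can be sketched as a simple action server that accepts an image goal, runs the detector, and returns a result. The DetectTrash action type and its fields below are hypothetical stand-ins for whatever the real package defines; only the actionlib, cv_bridge, and Ultralytics calls come from those libraries' documented APIs.

```python
#!/usr/bin/env python3
# Sketch of the base-station side of the offload: an actionlib server that
# runs YOLOv8n on images sent from CARL-T's Raspberry Pi. The DetectTrash
# action type and its goal/result fields are hypothetical stand-ins.
import actionlib
import rospy
from cv_bridge import CvBridge
from ultralytics import YOLO

# Hypothetical action definition (e.g. a DetectTrash.action in a custom
# messages package); the real interface would define its own fields.
from carl_t_msgs.msg import DetectTrashAction, DetectTrashResult

bridge = CvBridge()
model = YOLO("yolov8n.pt")

def execute(goal):
    # goal.image is assumed to be a sensor_msgs/Image captured on the Pi.
    frame = bridge.imgmsg_to_cv2(goal.image, desired_encoding="bgr8")
    detections = model.predict(frame, verbose=False)[0]

    result = DetectTrashResult()
    result.num_detections = len(detections.boxes)  # assumed result field
    server.set_succeeded(result)

if __name__ == "__main__":
    rospy.init_node("trash_detection_server")
    server = actionlib.SimpleActionServer(
        "detect_trash", DetectTrashAction, execute_cb=execute, auto_start=False
    )
    server.start()
    rospy.spin()
```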

General Object Detection

Versatile YOLO-12x model for general-purpose object detection. Handles a wide range of everyday objects and scenarios, serving as a fallback when specialized models aren't applicable. Covers 80+ COCO classes with a 0.6 confidence threshold, making it suitable for general computer vision applications and research.

YOLO-12x 80+ COCO Classes General Purpose

Performance Metrics

55.2% mAP@0.5
11.79ms Inference Time

Training Details

The YOLOv12 architecture uses an area-attention mechanism that focuses on the most important features in an image. By attending over smaller regions rather than the full feature map, YOLOv12 gains the benefits of attention without the computational cost of full self-attention. This model was pre-trained on the COCO dataset.
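Running this fallback detector is a short inference call. The sketch below assumes the Ultralytics API with a COCO-pretrained YOLO12 checkpoint (the yolo12x.pt weight name and the sample image path are assumptions) and applies the 0.6 confidence threshold described above.

```python
# Sketch of the general-purpose fallback using the Ultralytics API (the
# yolo12x.pt checkpoint name and the sample image path are assumptions).
from ultralytics import YOLO

model = YOLO("yolo12x.pt")  # COCO-pretrained extra-large weights

# Apply the 0.6 confidence threshold described above.
results = model.predict("scene.jpg", conf=0.6)

for box in results[0].boxes:
    class_name = model.names[int(box.cls)]
    print(f"{class_name}: {float(box.conf):.2f}")
```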

Applications

These models have been successfully deployed on a variety of robotic platforms and applications:

MAPLEE

Autonomous robot for detecting and cleaning weeds and trash across the Northeastern campus. Runs on a Jetson Xavier NX with ROS Noetic.

CARL-T

Compact, lower-cost helper robot for locating trash across Northeastern's campus and labeling its location for MAPLEE to clean. Runs on a Raspberry Pi 4 with ROS Noetic.

Bird Identification App

WIP

Conclusion & Future Work

These models have all been developed and applied to various robotic platforms and applications, and each has performed sufficiently for its task.

Future work includes completing the Bird Identification App.

Try It Yourself

Upload an image to see our intelligent model selection in action. The system uses OpenAI's GPT-4 Vision to analyze your image and automatically choose the most appropriate detection model.
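Behind the scenes, the model-selection step can be sketched as a single vision prompt that asks which detector fits the image. The prompt wording, the gpt-4o model name, and the detector labels below are illustrative assumptions; only the chat-completions call shape comes from OpenAI's Python SDK.

```python
# Sketch of the model-selection step. The prompt, the gpt-4o model name, and
# the detector labels are illustrative; only the chat-completions call shape
# comes from the OpenAI Python SDK.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def choose_detector(image_path: str) -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    response = client.chat.completions.create(
        model="gpt-4o",  # assumed vision-capable model
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Which detector fits this image best? Answer with "
                         "exactly one of: birds, weeds, trash, general."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content.strip().lower()
```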
