AirLens-Vision is a high-performance, real-time object detection application powered by the MediaPipe Tasks API and OpenCV. Utilizing the EfficientDet-Lite0 model, it identifies and labels up to 80 different object categories with low latency, providing accurate bounding boxes and confidence scores directly on the live camera feed.

NKumar-B/ObjectSense-AI

AirLens-Vision: Real-Time Object Detection

AirLens-Vision is a computer vision application that identifies and labels objects in real time. Built with the MediaPipe Tasks API and OpenCV, it uses the EfficientDet-Lite0 model to run object detection directly on a live camera feed.


Features

  • Real-Time Inference: Optimized for live video streams with minimal latency.
  • 80 Categories: Detects common objects (people, vehicles, laptops, bottles, etc.) from the COCO dataset.
  • Dynamic UI: Overlays precise bounding boxes, class labels, and confidence scores.
  • Lightweight Architecture: Designed to run efficiently on standard hardware without requiring a dedicated high-end GPU.

Installation & Setup

1. Clone the Repository

git clone https://github.com/NKumar-B/ObjectSense-AI.git
cd ObjectSense-AI

2. Set Up a Virtual Environment (Recommended)

python -m venv .venv
# Windows
.\.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activate

3. Install Dependencies

Ensure you have Python 3.9+ installed, then run:

pip install -r requirements.txt

4. Download the Model

You must download the EfficientDet-Lite0 (float32) model and place it in the project root.
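A quick way to confirm the model is where the application expects it is a small check like the sketch below. The file name `efficientdet_lite0.tflite` and the `find_model` helper are assumptions made for illustration, not names taken from this repository; adjust them to match the file you actually downloaded.

```python
from pathlib import Path

# Hypothetical file name: replace it with the name of the file you downloaded.
MODEL_FILENAME = "efficientdet_lite0.tflite"


def find_model(project_root: str = ".", filename: str = MODEL_FILENAME) -> Path:
    """Return the model path, or raise a clear error if the file is missing."""
    model_path = Path(project_root) / filename
    if not model_path.is_file():
        raise FileNotFoundError(
            f"Model file not found: {model_path}. Download the "
            "EfficientDet-Lite0 (float32) model and place it in the project root."
        )
    return model_path


if __name__ == "__main__":
    try:
        print(f"Model found at {find_model()}")
    except FileNotFoundError as err:
        print(err)
```

Running this from the project root before starting the application gives a readable message if the model is missing, rather than a lower-level loading error.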


How to Use

  1. Run the application:
python ObjectDetect.py
  2. Interaction:
  • Point your webcam at objects to see bounding boxes and labels in real time.
  • The confidence score (0.0 to 1.0) indicates how certain the model is about each detection.
  3. Exit: Press 'q' on your keyboard to close the window.
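To make the confidence score more concrete, here is a minimal sketch of how detections could be filtered by score and turned into label text. The `Detection` type, the helper names, and the 0.5 threshold are all assumptions for illustration; the application's own code may handle this differently.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical stand-in for a single detector result."""
    label: str
    score: float  # confidence in the range 0.0 to 1.0


def filter_by_confidence(detections: List[Detection],
                         threshold: float = 0.5) -> List[Detection]:
    """Keep only detections whose score is at or above the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return [d for d in detections if d.score >= threshold]


def format_label(detection: Detection) -> str:
    """Build the text drawn next to a bounding box, e.g. 'person (0.91)'."""
    return f"{detection.label} ({detection.score:.2f})"


if __name__ == "__main__":
    sample = [Detection("person", 0.91), Detection("bottle", 0.34)]
    for det in filter_by_confidence(sample):
        print(format_label(det))  # prints "person (0.91)"
```

A higher threshold hides uncertain detections at the cost of missing some real objects; a lower one shows more boxes but with more false positives.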

How It Works

  1. Preprocessing: The input frame is mirrored and converted from BGR (OpenCV standard) to RGB (MediaPipe standard).
  2. Inference: The frame is passed to the ObjectDetector task, which performs a single pass to identify multiple objects simultaneously.
  3. Visualization: The result contains normalized coordinates, which are scaled by the frame's width and height to obtain the pixel coordinates used to draw each bounding box.
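The preprocessing and visualization steps above can be sketched in plain Python. This is an illustration only, not the code in `ObjectDetect.py`: the function names are hypothetical, frames are modeled as nested lists rather than OpenCV arrays, and the box is assumed to arrive as normalized `(x, y, width, height)` values in the range 0 to 1. In a real application, `cv2.flip(frame, 1)` and `cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)` would do the first step, and if the detector already reports pixel values the scaling step is unnecessary.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]
Frame = List[List[Pixel]]  # rows of pixels; a stand-in for an OpenCV image array


def mirror_and_to_rgb(frame_bgr: Frame) -> Frame:
    """Flip each row left to right and reorder every pixel from BGR to RGB."""
    return [[(r, g, b) for (b, g, r) in reversed(row)] for row in frame_bgr]


def to_pixel_box(norm_box: Tuple[float, float, float, float],
                 frame_width: int,
                 frame_height: int) -> Tuple[int, int, int, int]:
    """Map a normalized (x, y, w, h) box to pixel corners (x1, y1, x2, y2).

    The corners are clamped so they always fall inside the frame.
    """
    x, y, w, h = norm_box

    def clamp(value: float, upper: int) -> int:
        return max(0, min(upper, int(round(value))))

    x1 = clamp(x * frame_width, frame_width - 1)
    y1 = clamp(y * frame_height, frame_height - 1)
    x2 = clamp((x + w) * frame_width, frame_width - 1)
    y2 = clamp((y + h) * frame_height, frame_height - 1)
    return x1, y1, x2, y2


if __name__ == "__main__":
    # A 1x2 BGR frame: the pixels swap places and each one becomes RGB.
    print(mirror_and_to_rgb([[(1, 2, 3), (4, 5, 6)]]))
    # A box covering the middle of a 640x480 frame.
    print(to_pixel_box((0.25, 0.5, 0.5, 0.25), 640, 480))
```

The resulting corner pairs are in the form that OpenCV's `cv2.rectangle` expects for its two corner points.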

License

Distributed under the MIT License. See LICENSE for more information.


Acknowledgments

