cdtalley/AI-and-ComputerVision-Development-Project-VisionDetect-

VisionDetect: Advanced Object Detection with Deep Learning

License: MIT | CI Status | Python 3.8+ | PyTorch | TensorFlow

VisionDetect is a computer vision framework for object detection with deep learning. Its modular, extensible architecture supports two backends (PyTorch and TensorFlow) and currently ships Faster R-CNN, with room to add other model architectures.

Features

  • Multiple Model Architectures: Support for Faster R-CNN, with extensibility for other architectures
  • Multiple Backends: Implementations in both PyTorch and TensorFlow
  • Transfer Learning: Start from pre-trained models for faster training and better performance
  • Data Augmentation: Augmentation pipeline for improved model generalization
  • Evaluation Metrics: Detailed performance metrics including mAP, precision, and recall
  • Visualization Tools: Utilities for visualizing predictions and model performance
  • Model Serving: REST API for serving models in production environments
  • Command-Line Interface: Easy-to-use CLI for training, evaluation, and inference
  • Comprehensive Documentation: Detailed documentation and examples
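
For object detection, any geometric augmentation has to transform the bounding boxes together with the pixels. The sketch below illustrates this for a horizontal flip. It assumes boxes in `[x_min, y_min, x_max, y_max]` pixel format; that format and the `hflip_boxes` helper are illustrative assumptions, not part of VisionDetect's API.

```python
from typing import List, Sequence

Box = List[float]  # assumed format: [x_min, y_min, x_max, y_max] in pixels


def hflip_boxes(boxes: Sequence[Sequence[float]], image_width: float) -> List[Box]:
    """Mirror boxes around the vertical centre line of an image.

    Hypothetical helper for illustration; not part of VisionDetect.
    """
    flipped = []
    for x_min, y_min, x_max, y_max in boxes:
        # The old right edge becomes the new left edge, and vice versa.
        flipped.append([image_width - x_max, y_min, image_width - x_min, y_max])
    return flipped


boxes = [[10.0, 20.0, 50.0, 80.0]]
print(hflip_boxes(boxes, image_width=100.0))
```

Flipping the same boxes twice should return the original coordinates, which makes a convenient sanity check for any augmentation of this kind.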

Installation

Prerequisites

  • Python 3.8+
  • CUDA-compatible GPU (recommended for training)

Install from Source

# Clone the repository (replace "yourusername" with the actual repository owner)
git clone https://github.com/yourusername/visiondetect.git
cd visiondetect

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install the package
pip install -e .

Quick Start

from src import VisionDetect

# Create VisionDetect instance
vd = VisionDetect()

# Train a Faster R-CNN model with a ResNet-50 backbone
trainer, metrics = vd.train(
    data_dir="path/to/data",
    model_type="faster_rcnn",
    backbone="resnet50",
    num_classes=91,  # set this to match your dataset's label set
    epochs=50
)

# Make prediction
result = vd.predict(
    image_path="path/to/image.jpg",
    model_path="checkpoints/best_model.pth"
)

Documentation

Project Structure

visiondetect/
├── config/               # Configuration files
├── data/                 # Data storage (gitignored)
├── docs/                 # Documentation
├── notebooks/            # Jupyter notebooks for exploration and demos
├── src/                  # Source code
│   ├── data/             # Data processing modules
│   ├── models/           # Model implementations
│   ├── utils/            # Utility functions
│   └── api/              # API for model serving
├── tests/                # Unit and integration tests
├── train.py              # Training script
├── evaluate.py           # Evaluation script
├── infer.py              # Inference script
├── .gitignore            # Git ignore file
├── LICENSE               # License file
├── README.md             # Project documentation
└── requirements.txt      # Python dependencies

Command-Line Interface

Training

python train.py --data-dir data --model-type faster_rcnn --backbone resnet50 --epochs 50
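
As a rough illustration of how the flags above could be parsed, here is a minimal `argparse` sketch. The flag names come from the command above; the defaults, help strings, and the `build_train_parser` function are assumptions for illustration, and the actual `train.py` may define them differently.

```python
import argparse


def build_train_parser() -> argparse.ArgumentParser:
    """Build a parser for the training flags shown in this README.

    Hypothetical sketch; defaults and help strings are assumptions.
    """
    parser = argparse.ArgumentParser(
        description="Train a detection model (illustrative sketch)."
    )
    parser.add_argument("--data-dir", required=True, help="Dataset root directory")
    parser.add_argument("--model-type", default="faster_rcnn", help="Model architecture")
    parser.add_argument("--backbone", default="resnet50", help="Feature extractor backbone")
    parser.add_argument("--epochs", type=int, default=50, help="Number of training epochs")
    return parser


# Parse the same arguments as the command above, passed as a list for demonstration.
args = build_train_parser().parse_args(
    ["--data-dir", "data", "--model-type", "faster_rcnn",
     "--backbone", "resnet50", "--epochs", "50"]
)
print(args.data_dir, args.model_type, args.backbone, args.epochs)
```

Note that `argparse` converts dashes in flag names to underscores, so `--data-dir` is read back as `args.data_dir`.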

Evaluation

python evaluate.py --model-path checkpoints/best_model.pth --data-dir data --visualize
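
The precision and recall figures listed under Features rest on matching predicted boxes to ground-truth boxes by intersection over union (IoU). The sketch below shows that idea for a single image and a single class. It uses a simplified greedy matching, assumes boxes in `[x_min, y_min, x_max, y_max]` format, and does not compute mAP; the function names are illustrative and not taken from `evaluate.py`.

```python
from typing import Sequence, Tuple


def iou(a: Sequence[float], b: Sequence[float]) -> float:
    """Intersection over union of two [x_min, y_min, x_max, y_max] boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: Sequence[Sequence[float]],
    gts: Sequence[Sequence[float]],
    iou_threshold: float = 0.5,
) -> Tuple[float, float]:
    """Greedy one-to-one matching of predictions to ground truth.

    `preds` is assumed to be sorted by descending confidence.
    """
    matched = set()  # indices of ground-truth boxes already claimed
    true_positives = 0
    for pred in preds:
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(gts):
            if idx in matched:
                continue
            overlap = iou(pred, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched.add(best_idx)
            true_positives += 1
    precision = true_positives / len(preds) if preds else 0.0
    recall = true_positives / len(gts) if gts else 0.0
    return precision, recall


ground_truth = [[0, 0, 10, 10], [20, 20, 30, 30]]
predictions = [[0, 0, 10, 10], [50, 50, 60, 60]]
print(precision_recall(predictions, ground_truth))
```

In this example one of the two predictions matches a ground-truth box and one of the two ground-truth boxes is found, so both precision and recall are 0.5.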

Inference

python infer.py --model-path checkpoints/best_model.pth --input path/to/image.jpg

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • The project structure and design patterns are inspired by best practices in the deep learning community
  • Pre-trained models are based on the work of various research teams
  • Special thanks to the PyTorch and TensorFlow teams for their excellent frameworks
