Object Detection using Detectron2
author: filip-halt, fzliu
Description
This operator uses Facebook's Detectron2 library to compute bounding boxes, class labels, and class scores for detected objects in a given image.
Code Example
from towhee import pipe, ops, DataCollection

p = (
    pipe.input('path')
        .map('path', 'img', ops.image_decode())
        .map('img', ('boxes', 'classes', 'scores'), ops.object_detection.detectron2(model_name='retinanet_resnet50'))
        .output('img', 'boxes', 'classes', 'scores')
)

DataCollection(p('./example.jpg')).show()

Factory Constructor
Create the operator via the following factory method:
object_detection.detectron2(model_name='retinanet_resnet50', thresh=0.5, num_classes=1000, skip_preprocess=False)
Parameters:
model_name: str
A string indicating which model to use. Available options:
faster_rcnn_resnet50_c4
faster_rcnn_resnet50_dc5
faster_rcnn_resnet50_fpn
faster_rcnn_resnet101_c4
faster_rcnn_resnet101_dc5
faster_rcnn_resnet101_fpn
faster_rcnn_resnext101
retinanet_resnet50
retinanet_resnet101
thresh: float
The minimum confidence score a detection must reach to be returned (default value: 0.5). Set this value lower to detect more objects at the expense of accuracy, or higher to return fewer detections of higher quality.
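The effect of thresh can be illustrated with a minimal sketch in plain Python. The detections and the filter_by_score helper are hypothetical and not part of the operator, and whether the operator compares with >= or > is an assumption here.

```python
# Hypothetical raw detections as (label, score) pairs; the values are made up
# for illustration and do not come from a real model run.
raw_detections = [("dog", 0.92), ("cat", 0.61), ("chair", 0.34), ("cup", 0.12)]


def filter_by_score(detections, thresh=0.5):
    """Keep only detections whose score reaches `thresh`.

    Assumption: the comparison is inclusive (>=); the operator may differ.
    """
    return [(label, score) for label, score in detections if score >= thresh]


strict = filter_by_score(raw_detections, thresh=0.8)   # fewer, more confident
default = filter_by_score(raw_detections)              # thresh=0.5
loose = filter_by_score(raw_detections, thresh=0.3)    # more, less confident

print(len(strict), len(default), len(loose))
```

Lowering the threshold only ever adds detections to the result, so the stricter result is always a subset of the looser one.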
Interface
This operator takes an image as input. It detects the objects that appear in the image and generates a bounding box around each one.
Parameters:
img: towhee._types.Image
Image data wrapped as a Towhee Image.
Return: List[numpy.ndarray[4], ...], List[str], numpy.ndarray
The return value is a tuple of (boxes, classes, scores). boxes is a list of bounding boxes. Each bounding box is represented as a 1-dimensional numpy array consisting of the top-left and the bottom-right corners, i.e. numpy.ndarray([x1, y1, x2, y2]). classes is a list of predicted labels, one for each bounding box. scores contains the confidence score for each bounding box and its predicted label.
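As a rough sketch of how the returned tuple might be consumed, the three sequences can be walked in parallel. The summarize_detections helper and the sample values below are hypothetical; plain lists stand in for the numpy arrays the operator returns, and the code only assumes that each box unpacks into four numbers.

```python
def summarize_detections(boxes, classes, scores):
    """Pair each box with its label and score, and derive the box size.

    Each box is expected in [x1, y1, x2, y2] order (top-left, bottom-right).
    """
    summary = []
    for box, label, score in zip(boxes, classes, scores):
        x1, y1, x2, y2 = (float(v) for v in box)
        summary.append({
            "label": label,
            "score": float(score),
            "width": x2 - x1,
            "height": y2 - y1,
        })
    return summary


# Made-up sample output for illustration only, not from a real model run.
boxes = [[10.0, 20.0, 110.0, 220.0], [50.0, 60.0, 90.0, 100.0]]
classes = ["person", "dog"]
scores = [0.97, 0.58]

for item in summarize_detections(boxes, classes, scores):
    print(item)
```

Because the three sequences are aligned by index, zip keeps each label and score attached to its own box.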