# Object Detection using Detectron2

*author: [filip-halt](https://github.com/filip-halt), [fzliu](https://github.com/fzliu)*

## Description

This operator uses Facebook's [Detectron2](https://github.com/facebookresearch/detectron2) library to compute bounding boxes, class labels, and class scores for detected objects in a given image.

## Code Example

```python
from towhee import pipe, ops, DataCollection

# Decode the image at the given path, then run the detector on it.
p = (
    pipe.input('path')
        .map('path', 'img', ops.image_decode())
        .map('img', ('boxes', 'classes', 'scores'), ops.object_detection.detectron2(model_name='retinanet_resnet50'))
        .output('img', 'boxes', 'classes', 'scores')
)

# Display the image together with its boxes, classes, and scores.
DataCollection(p('./example.jpg')).show()
```

## Factory Constructor

Create the operator via the following factory method:

***object_detection.detectron2(model_name='retinanet_resnet50', thresh=0.5, num_classes=1000, skip_preprocess=False)***

**Parameters:**

***model_name:*** `str`

A string indicating which model to use. Available options:

1. `faster_rcnn_resnet50_c4`
2. `faster_rcnn_resnet50_dc5`
3. `faster_rcnn_resnet50_fpn`
4. `faster_rcnn_resnet101_c4`
5. `faster_rcnn_resnet101_dc5`
6. `faster_rcnn_resnet101_fpn`
7. `faster_rcnn_resnext101`
8. `retinanet_resnet50`
9. `retinanet_resnet101`

***thresh:*** `float`

The minimum confidence score at which an object is reported as detected (default value: `0.5`). Lower this value to detect more objects at the expense of accuracy, or raise it to get fewer but higher-quality detections.

### Interface

This operator takes an image as input. It detects the objects that appear in the image and generates a bounding box around each object.

**Parameters:**

**img**: `towhee._types.Image`

Image data wrapped as a Towhee `Image`.

**Return**: `List[numpy.ndarray[4], ...], List[str], numpy.ndarray`

The return value is a tuple of `(boxes, classes, scores)`. `boxes` is a list of bounding boxes. Each bounding box is a 1-dimensional numpy array holding the top-left and bottom-right corners, i.e. `numpy.array([x1, y1, x2, y2])`. `classes` is a list of predicted labels, one for each bounding box. `scores` holds the confidence score for each bounding box and its predicted class.

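As a minimal sketch of how the returned `(boxes, classes, scores)` tuple might be consumed, the snippet below walks the three parallel sequences and keeps only detections above a chosen score. The helper `summarize_detections` and the sample values are hypothetical, written for illustration only; they are not part of Towhee or Detectron2. It assumes the three outputs have the same length and that each box follows the `[x1, y1, x2, y2]` layout described above.

```python
def summarize_detections(boxes, classes, scores, min_score=0.5):
    """Hypothetical helper: return one dict per detection scoring at least min_score.

    Accepts any parallel sequences (lists or numpy arrays) where each box is
    laid out as [x1, y1, x2, y2].
    """
    results = []
    for box, label, score in zip(boxes, classes, scores):
        if float(score) < min_score:
            continue
        x1, y1, x2, y2 = (float(v) for v in box)
        results.append({
            'label': label,
            'score': float(score),
            'box': (x1, y1, x2, y2),
            'width': x2 - x1,    # horizontal extent in pixels
            'height': y2 - y1,   # vertical extent in pixels
        })
    return results


# Made-up values standing in for the operator's output.
boxes = [[10.0, 20.0, 110.0, 220.0], [5.0, 5.0, 25.0, 15.0]]
classes = ['dog', 'ball']
scores = [0.92, 0.41]

detections = summarize_detections(boxes, classes, scores, min_score=0.5)
for d in detections:
    print(d['label'], d['score'], d['width'], d['height'])
```

Filtering here happens after detection, so it can only tighten the result set; to surface lower-scoring objects, the operator's own `thresh` parameter would need to be lowered instead.
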
# More Resources

- [Approximate Nearest Neighbors Oh Yeah (Annoy) - Zilliz blog](https://zilliz.com/learn/approximate-nearest-neighbor-oh-yeah-ANNOY): Discover the capabilities of Annoy, an innovative algorithm revolutionizing approximate nearest neighbor searches for enhanced efficiency and precision.
- [CLIP Object Detection: Merging AI Vision with Language Understanding - Zilliz blog](https://zilliz.com/learn/CLIP-object-detection-merge-AI-vision-with-language-understanding): CLIP Object Detection combines CLIP's text-image understanding with object detection tasks, allowing CLIP to locate and identify objects in images using texts.
- [What is a Convolutional Neural Network? An Engineer's Guide](https://zilliz.com/glossary/convolutional-neural-network): Convolutional Neural Network is a type of deep neural network that processes images, speeches, and videos. Let's find out more about CNN.
- [Using Vector Search to Better Understand Computer Vision Data - Zilliz blog](https://zilliz.com/blog/use-vector-search-to-better-understand-computer-vision-data): How Vector Search improves your understanding of Computer Vision Data
- [Understanding ImageNet: A Key Resource for Computer Vision and AI Research](https://zilliz.com/glossary/imagenet): The large-scale image database with over 14 million annotated images. Learn how this dataset supports advancements in computer vision.
- [What is Detection Transformers (DETR)? - Zilliz blog](https://zilliz.com/learn/detection-transformers-detr-end-to-end-object-detection-with-transformers): DETR (DEtection TRansformer) is a deep learning model for end-to-end object detection using transformers.
- [What is approximate nearest neighbor search (ANNS)?](https://zilliz.com/glossary/anns): Learn how to use Approximate nearest neighbor search (ANNS) for efficient nearest-neighbor search in large datasets.