# Object Detection with YOLOv5

*author: shiyu22*


<br />


### Description

**Object Detection** is a computer vision technique that locates and identifies people, items, or other objects in an image. It has applications in many areas of computer vision, such as image retrieval, image annotation, vehicle counting, and object tracking.

This operator uses [PyTorch.yolov5](https://pytorch.org/hub/ultralytics_yolov5/) to detect objects in an image.


<br />


### Code Example

Load an image from the path './test.png' and use the YOLOv5 model to detect the objects in it.

*Write a pipeline with explicit input and output names:*

```Python
from towhee import pipe, ops, DataCollection

p = (
    pipe.input('path')
        .map('path', 'img', ops.image_decode())
        .map('img', ('box', 'class', 'score'), ops.object_detection.yolov5())
        .map(('img', 'box'), 'object', ops.image_crop(clamp=True))
        .output('img', 'object', 'class')
)

DataCollection(p('./test.png')).show()
```

<img src="./result.png" alt="result" height="140px"/>
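The `image_crop` step in the pipeline receives each image together with its detected boxes. The README does not show how `clamp=True` is implemented, so the following is only a rough sketch of what clamping a box to the image bounds could look like. The helper `clamp_box` and the sample sizes are hypothetical and are not part of the operator.

```python
def clamp_box(box, width, height):
    """Clip an (x1, y1, x2, y2) box so it lies inside a width x height image.

    Hypothetical helper for illustration; not the operator's own code.
    """
    x1, y1, x2, y2 = box
    # Force every x coordinate into [0, width] and every y into [0, height].
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    return (x1, y1, x2, y2)


# Made-up box that extends past the right and bottom edges of a 200x100 image.
clamped = clamp_box((150, 50, 250, 120), width=200, height=100)
print(clamped)
```

With the image held as a NumPy array, the clamped box would then select the region `img[y1:y2, x1:x2]`.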

<br />


## Factory Constructor

Create the operator via the following factory method:

***object_detection.yolov5()***


<br />


### Interface

The operator takes an image as input. It detects the objects that appear in the image and generates a bounding box around each one.

**Parameters:**

- **img**: *numpy.ndarray*

  Image data in ndarray format.


**Returns**: Tuple[List[Tuple[int, int, int, int]], List[str], List[float]]

The return value is a tuple of (boxes, classes, scores). *boxes* is a list of bounding boxes, each represented by its top-left and bottom-right points, i.e. (x1, y1, x2, y2). *classes* is a list of predicted labels. *scores* is a list of confidence scores.
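As an illustration of working with this return value, the sketch below keeps only the detections above a confidence threshold. It assumes the three lists are aligned by index, so that the i-th box, label, and score describe the same object. The function `filter_detections`, the threshold, and the sample values are hypothetical and not part of the operator.

```python
def filter_detections(boxes, classes, scores, min_score=0.5):
    """Return (box, label, score) triples whose score is at least min_score.

    Hypothetical helper; assumes the three lists are index-aligned.
    """
    return [
        (box, label, score)
        for box, label, score in zip(boxes, classes, scores)
        if score >= min_score
    ]


# Made-up values in the documented shape: (x1, y1, x2, y2) boxes, labels, scores.
boxes = [(448, 153, 663, 375), (10, 20, 110, 220), (300, 40, 340, 90)]
classes = ['dog', 'person', 'bicycle']
scores = [0.92, 0.78, 0.31]

for (x1, y1, x2, y2), label, score in filter_detections(boxes, classes, scores):
    # Width and height follow from the two corner points.
    print(f'{label}: score={score:.2f}, size={x2 - x1}x{y2 - y1}')
```

Returning triples rather than three separate lists keeps each box attached to its label and score after filtering.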


# More Resources

- [CLIP Object Detection: Merging AI Vision with Language Understanding - Zilliz blog](https://zilliz.com/learn/CLIP-object-detection-merge-AI-vision-with-language-understanding): CLIP Object Detection combines CLIP's text-image understanding with object detection tasks, allowing CLIP to locate and identify objects in images using texts.
- [Computer Vision with FiftyOne | Milvus & Zilliz Cloud](https://zilliz.com/product/integrations/FiftyOne)
- [What is a Convolutional Neural Network? An Engineer's Guide](https://zilliz.com/glossary/convolutional-neural-network): Convolutional Neural Network is a type of deep neural network that processes images, speech, and videos. Let's find out more about CNN.
- [Understanding Computer Vision - Zilliz blog](https://zilliz.com/learn/what-is-computer-vision): Computer Vision is a field of Artificial Intelligence that enables machines to capture and interpret visual information from the world just like humans do.
- [Using Vector Search to Better Understand Computer Vision Data - Zilliz blog](https://zilliz.com/blog/use-vector-search-to-better-understand-computer-vision-data): How Vector Search improves your understanding of Computer Vision Data
- [What are Vision Transformers (ViT)? - Zilliz blog](https://zilliz.com/learn/understanding-vision-transformers-vit): Vision Transformers (ViTs) are neural network models that use transformers to perform computer vision tasks like object detection and image classification.
- [What is Detection Transformers (DETR)? - Zilliz blog](https://zilliz.com/learn/detection-transformers-detr-end-to-end-object-detection-with-transformers): DETR (DEtection TRansformer) is a deep learning model for end-to-end object detection using transformers.