# Image Crop Implementation with CV2
*author: David Wang*
<br />
## Description
An image crop operator implementation with OpenCV.
<br />
## Code Example
Crop the detected faces from 'avengers.jpg'.
```python
from towhee import pipe, ops, DataCollection

p = (
    pipe.input('path')
        # Decode the image file into an ndarray.
        .map('path', 'img', ops.image_decode())
        # Detect faces, producing bounding boxes and confidence scores.
        .map('img', ('box', 'score'), ops.face_detection.retinaface())
        # Crop each detected face out of the image.
        .map(('img', 'box'), 'crop', ops.image_crop(clamp=True))
        .output('img', 'crop')
)

DataCollection(p('./avengers.jpg')).show()
```
<img src="./result2.png" height="150px"/>
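`DataCollection(...).show()` renders the results as a table inside a notebook. Outside a notebook, the outputs can also be read back directly from the pipeline call; a minimal sketch, assuming the pipeline `p` defined above:

```python
# Run the pipeline and read the outputs back as plain Python objects.
# get() returns one row of values, ordered like the output schema ('img', 'crop').
row = p('./avengers.jpg').get()
img, crop = row
```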
<br />
## Factory Constructor
Create the operator via the following factory method:

***image_crop(clamp=True)***
**Parameters:**

**clamp:** *bool*

If set to True, bounding-box coordinates are clamped to the image size.
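To illustrate the clamping behaviour, here is a minimal NumPy sketch (an illustration of the idea, not the operator's actual source); `clamp_bboxes` is a hypothetical helper:

```python
import numpy as np

def clamp_bboxes(bboxes: np.ndarray, height: int, width: int) -> np.ndarray:
    """Clip (x1, y1, x2, y2) boxes so they stay inside the image."""
    clamped = bboxes.copy()
    clamped[:, [0, 2]] = np.clip(clamped[:, [0, 2]], 0, width)   # x1, x2
    clamped[:, [1, 3]] = np.clip(clamped[:, [1, 3]], 0, height)  # y1, y2
    return clamped
```

Without clamping, a detector box that runs past the image edge (for example, a negative x1) would produce a wrong or empty slice when cropping.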
<br />
## Interface
An image crop operator takes an image and bounding boxes as input and crops the image into ROIs (regions of interest).
**Parameters:**

**img:** *towhee.types.Image (a subclass of numpy.ndarray)*

The image to be cropped.

**bboxes:** *numpy.ndarray*

An n×4 numpy tensor of n bounding boxes to crop; each row is formatted as (x1, y1, x2, y2).

**Returns:** *towhee.types.Image (a subclass of numpy.ndarray)*

The cropped image data as a numpy.ndarray.
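To make the interface concrete, here is a minimal NumPy sketch of the crop semantics described above (`crop_rois` is a hypothetical helper for illustration, not the operator's source); clamping, when enabled, would be applied to the boxes first:

```python
import numpy as np

def crop_rois(img: np.ndarray, bboxes: np.ndarray):
    """Return one ROI per (x1, y1, x2, y2) row of bboxes."""
    crops = []
    for x1, y1, x2, y2 in bboxes.astype(int):
        crops.append(img[y1:y2, x1:x2])  # NumPy indexes rows (y) first, then columns (x)
    return crops
```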
## More Resources
- [Supercharged Semantic Similarity Search in Production - Zilliz blog](https://zilliz.com/learn/supercharged-semantic-similarity-search-in-production): Building a Blazing Fast, Highly Scalable Text-to-Image Search with CLIP embeddings and Milvus, the most advanced open-source vector database.
- [The guide to clip-vit-base-patch32 | OpenAI](https://zilliz.com/ai-models/clip-vit-base-patch32): clip-vit-base-patch32: a CLIP multimodal model variant by OpenAI for image and text embedding.
- [Using Vector Search to Better Understand Computer Vision Data - Zilliz blog](https://zilliz.com/blog/use-vector-search-to-better-understand-computer-vision-data): How Vector Search improves your understanding of Computer Vision Data
- [Demystifying Color Histograms: A Guide to Image Processing and Analysis - Zilliz blog](https://zilliz.com/learn/demystifying-color-histograms): Mastering color histograms is indispensable for anyone involved in image processing and analysis. By understanding the nuances of color distributions and leveraging advanced techniques, practitioners can unlock the full potential of color histograms in various imaging projects and research endeavors.
- [Understanding ImageNet: A Key Resource for Computer Vision and AI Research](https://zilliz.com/glossary/imagenet): The large-scale image database with over 14 million annotated images. Learn how this dataset supports advancements in computer vision.
- [From Text to Image: Fundamentals of CLIP - Zilliz blog](https://zilliz.com/blog/fundamentals-of-clip): Search algorithms rely on semantic similarity to retrieve the most relevant results. With the CLIP model, the semantics of texts and images can be connected in a high-dimensional vector space. Read this simple introduction to see how CLIP can help you build a powerful text-to-image service.