# Image Embedding with Timm

*author: Jael Gu, Filip*

<br />

## Description
An image embedding operator generates a vector when given an image.
This operator extracts features for images with pretrained models provided by [Timm](https://github.com/rwightman/pytorch-image-models).
Timm is a deep-learning library developed by [Ross Wightman](https://twitter.com/wightmanr)
that maintains state-of-the-art (SOTA) deep-learning models and tools for computer vision.


<br />

## Code Example
Load an image from the path './towhee.jpg'
and use the pretrained ResNet50 model ('resnet50') to generate an image embedding.

*Write the pipeline in simplified style:*

```python
import towhee

towhee.glob('./towhee.jpg') \
      .image_decode.cv2() \
      .image_embedding.timm(model_name='resnet50') \
      .show()
```

<img src="./result1.png" height="50px"/>

*Write the same pipeline with explicit input/output name specifications:*

```python
import towhee

towhee.glob['path']('./towhee.jpg') \
      .image_decode.cv2['path', 'img']() \
      .image_embedding.timm['img', 'vec'](model_name='resnet50') \
      .select('img', 'vec') \
      .show()
```

<img src="./result2.png" height="150px"/>

<br />

## Factory Constructor
Create the operator via the following factory method:

***image_embedding.timm(model_name='resnet34', num_classes=1000, skip_preprocess=False)***

**Parameters:**

***model_name***: *str*

The name of the model, as a string. The default value is "resnet34".
Refer to the [Timm Docs](https://fastai.github.io/timmdocs/#List-Models-with-Pretrained-Weights) for a full list of supported models.

***num_classes***: *int*

The number of classes. The default value is 1000.
The appropriate value depends on the model and the dataset.

***skip_preprocess***: *bool*

A flag that controls whether to skip image preprocessing.
The default value is False.
If set to True, the operator skips the image preprocessing steps (transforms).
In this case, the input image data must be prepared in advance so that it properly fits the model.

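
With `skip_preprocess=True`, the transforms become the caller's responsibility. The sketch below shows one plausible manual preparation in NumPy. It assumes the model expects an RGB image scaled to [0, 1], normalized with the commonly used ImageNet mean and standard deviation, and laid out channel-first. The helper name `normalize_for_model`, the constants, and the layout are assumptions for illustration only; the exact input format this operator expects in that mode is not specified here, and resizing or cropping to the model's input size is left out.

```python
import numpy as np

# Commonly used ImageNet statistics in RGB order.
# Assumption: the chosen model was trained with these values; verify for your model.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def normalize_for_model(img_rgb: np.ndarray) -> np.ndarray:
    """Hypothetical helper: uint8 H x W x 3 RGB image -> float32 3 x H x W array."""
    if img_rgb.ndim != 3 or img_rgb.shape[2] != 3:
        raise ValueError("expected an H x W x 3 RGB image")
    scaled = img_rgb.astype(np.float32) / 255.0            # scale pixels to [0, 1]
    normalized = (scaled - IMAGENET_MEAN) / IMAGENET_STD   # per-channel normalization
    return np.transpose(normalized, (2, 0, 1))             # HWC -> CHW


# Usage with a dummy all-black 224 x 224 image standing in for real data.
dummy = np.zeros((224, 224, 3), dtype=np.uint8)
prepared = normalize_for_model(dummy)
print(prepared.shape, prepared.dtype)
```

Keeping the preparation in a small standalone function makes it easy to swap in the statistics and layout that a specific model actually requires.
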

<br />

## Interface
An image embedding operator takes a towhee image as input.
It uses the pretrained model specified by the model name to generate an image embedding as an ndarray.

**Parameters:**

***img***: *towhee.types.Image (a sub-class of numpy.ndarray)*

The decoded image data as a numpy.ndarray.

**Returns:**

*numpy.ndarray*

The image embedding extracted by the model.

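
Because the returned embedding is a plain NumPy vector, it can be compared with other embeddings using ordinary array operations. The following is a minimal sketch, assuming two embeddings of equal length; `cosine_similarity` is a hypothetical helper and not part of this operator, and the random vectors only stand in for real embeddings.

```python
import numpy as np


def cosine_similarity(a, b) -> float:
    """Hypothetical helper: cosine similarity between two embedding vectors."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    if a.shape != b.shape:
        raise ValueError("embeddings must have the same length")
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


# Placeholder vectors standing in for embeddings returned by the operator.
rng = np.random.default_rng(0)
vec_a = rng.standard_normal(8)
vec_b = rng.standard_normal(8)

print(cosine_similarity(vec_a, vec_a))  # identical vectors give a value close to 1.0
print(cosine_similarity(vec_a, vec_b))  # some value in [-1, 1]
```

Values near 1 indicate embeddings that point in nearly the same direction, which is a common proxy for visual similarity.
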