diff --git a/README.md b/README.md
index d431bdf..9242a55 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,81 @@
+# Image Embedding with data2vec
+*author: David Wang*
-# More Resources
-
-
\ No newline at end of file
+
+
+
+
+## Description
+
+This operator extracts image features with [data2vec](https://arxiv.org/abs/2202.03555). The core idea of data2vec is to predict latent representations of the full input from a masked view of that input, in a self-distillation setup built on a standard Transformer architecture.
+
+
+
+
+## Code Example
+
+Load an image from the path './towhee.jpeg' to generate an image embedding.
+
+*Write a pipeline with explicitly named inputs and outputs:*
+
+```python
+from towhee import pipe, ops, DataCollection
+
+# Decode the image at 'path' into 'img', then embed 'img' into 'vec'.
+p = (
+    pipe.input('path')
+        .map('path', 'img', ops.image_decode())
+        .map('img', 'vec', ops.image_embedding.data2vec(model_name='facebook/data2vec-vision-base-ft1k'))
+        .output('img', 'vec')
+)
+
+# Run the pipeline on one image and display the decoded image with its embedding.
+DataCollection(p('towhee.jpeg')).show()
+```
+
+
+
+
+
+
+## Factory Constructor
+
+Create the operator via the following factory method:
+
+***data2vec(model_name='facebook/data2vec-vision-base-ft1k')***
+
+**Parameters:**
+
+
+***model_name***: *str*
+
+The name of the model, as a string.
+The default value is "facebook/data2vec-vision-base-ft1k".
+
+Supported model names:
+- facebook/data2vec-vision-base-ft1k
+- facebook/data2vec-vision-large-ft1k
+
+
+
+
+
+## Interface
+
+An image embedding operator takes a [towhee image](link/to/towhee/image/api/doc) as input.
+It uses the pre-trained model specified by the model name to generate an image embedding as a numpy.ndarray.
+
+
+**Parameters:**
+
+***img:*** *towhee.types.Image (a subclass of numpy.ndarray)*
+
+The decoded image data as a towhee.types.Image (numpy.ndarray).
+
+
+
+**Returns:** *numpy.ndarray*
+
+The image embedding extracted by the model.
+
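
Since the operator is documented as returning a plain numpy.ndarray, two embeddings can be compared with ordinary vector math. The following is a minimal sketch of one common choice, cosine similarity. It rests on assumptions: `cosine_similarity` is a hypothetical helper that is not part of towhee, and the short lists are made-up stand-ins for real embeddings, which would be much longer. The helper only iterates over its inputs, so it should also accept 1-D numpy arrays.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length 1-D vectors.

    Hypothetical helper for illustration; not part of the towhee API.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up stand-ins for embeddings; real ones would come from the 'vec' output.
vec_a = [1.0, 3.0, 5.0]
vec_b = [2.0, 6.0, 10.0]  # same direction as vec_a, twice the length
vec_c = [3.0, -1.0, 0.0]  # orthogonal to vec_a: 3 - 3 + 0 = 0

sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)
```

A score close to 1 means the two vectors point in nearly the same direction, and a score close to 0 means they are nearly orthogonal. Whether cosine similarity is the right metric for a given application is a separate design decision that this README does not settle.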