This operator extracts image features with [data2vec](https://arxiv.org/abs/2202.03555). The core idea of data2vec is to predict latent representations of the full input from a masked view of that input, in a self-distillation setup built on a standard Transformer architecture.
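
To make the self-distillation idea concrete, here is a toy sketch of the training signal in plain Python. It is not this operator's code and not the reference implementation. The function names, the use of scalars in place of representation vectors, the `tau` and `beta` values, and the choice of a smooth L1 loss are all assumptions made for illustration. The teacher is an exponential moving average (EMA) of the student, and the student is scored only on positions that were masked in its input.

```python
# Toy illustration only: scalars stand in for per-patch representation vectors.
# All names and default values here are hypothetical.

def ema_update(teacher, student, tau=0.999):
    """Move teacher weights toward the student: t <- tau * t + (1 - tau) * s."""
    return [tau * t + (1.0 - tau) * s for t, s in zip(teacher, student)]


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss: quadratic for small errors, linear for large ones."""
    diff = abs(pred - target)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


def masked_regression_loss(student_pred, teacher_target, mask):
    """Average the loss over masked positions only.

    student_pred:   student outputs computed from the masked input
    teacher_target: teacher outputs computed from the full, unmasked input
    mask:           True where the position was hidden from the student
    """
    losses = [
        smooth_l1(p, t)
        for p, t, m in zip(student_pred, teacher_target, mask)
        if m
    ]
    return sum(losses) / len(losses) if losses else 0.0


if __name__ == "__main__":
    # Two of three positions are masked; the third does not contribute.
    loss = masked_regression_loss([0.0, 2.0, 5.0], [0.5, 0.0, 5.0], [True, True, False])
    print(loss)
    # After a student step, the teacher drifts slowly toward the student.
    print(ema_update([1.0, 1.0], [0.0, 2.0], tau=0.9))
```

At inference time none of this machinery is involved: the operator only runs the trained encoder on an image and returns its embedding.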
<br/>
## Code Example
Load an image from the path './towhee.jpg' and generate an image embedding.
*Write a pipeline with explicit input and output names:*