Text Embedding with data2vec

author: David Wang

Description

This operator extracts text features with data2vec. The core idea of data2vec is to predict latent representations of the full input from a masked view of that input, in a self-distillation setup built on a standard Transformer architecture.
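The masked-prediction idea can be illustrated with a toy NumPy sketch. This is not the operator's code or the real model: the "encoder" is a single linear map with `tanh`, masking is done by zeroing token vectors, and the teacher is assumed to be an exponential moving average (EMA) of the student, as in the published data2vec method. All names here (`encode`, `ema_update`, `tau`) are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(weights, tokens):
    # Toy stand-in for a Transformer encoder: one linear map plus tanh.
    return np.tanh(tokens @ weights)

def ema_update(teacher, student, tau=0.999):
    # Assumption: teacher weights track the student as an exponential moving average.
    return tau * teacher + (1.0 - tau) * student

seq_len, dim = 6, 4
tokens = rng.normal(size=(seq_len, dim))   # full input as toy token vectors
student_w = rng.normal(size=(dim, dim))    # student parameters
teacher_w = student_w.copy()               # teacher starts as a copy of the student

# Masked view: hide two positions (zeroing is a simplification of a mask token).
mask = np.array([False, True, False, False, True, False])
masked_tokens = tokens.copy()
masked_tokens[mask] = 0.0

targets = encode(teacher_w, tokens)              # teacher sees the full input
predictions = encode(student_w, masked_tokens)   # student sees only the masked view

# Regression loss on the masked positions only.
loss = np.mean((predictions[mask] - targets[mask]) ** 2)

# After a student gradient step (omitted here), the teacher is refreshed by EMA.
teacher_w = ema_update(teacher_w, student_w)
```

In a real training loop the student would be updated by gradient descent on `loss` before each EMA step; that part is left out to keep the sketch short.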

Code Example

Use the pre-trained model to generate a text embedding for the sentence "Hello, world.".

Write the pipeline in simplified style:

import towhee

towhee.dc(["Hello, world."]) \
      .text_embedding.data2vec() \
      .show()

Factory Constructor

Create the operator via the following factory method:

data2vec()

Interface

Parameters:

text: str

The input text, as a string.

Returns: numpy.ndarray

The text embedding extracted by the model.
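Because the result is a `numpy.ndarray`, it can be compared with other embeddings using ordinary NumPy code. The sketch below computes cosine similarity between two embeddings. The shape of the returned array is not stated here, so the hypothetical helper `to_vector` assumes that a 2-D result holds one row per token and mean-pools it into a single vector. The arrays `emb_a` and `emb_b` are made-up stand-ins, not real model output.

```python
import numpy as np

def to_vector(embedding):
    # Assumption: a 2-D array is (tokens, dim); mean-pool it to one vector.
    arr = np.asarray(embedding, dtype=np.float64)
    if arr.ndim == 2:
        return arr.mean(axis=0)
    return arr.ravel()

def cosine_similarity(a, b):
    # Cosine of the angle between two embeddings, in the range [-1, 1].
    va, vb = to_vector(a), to_vector(b)
    denom = np.linalg.norm(va) * np.linalg.norm(vb)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(va @ vb / denom)

# Hypothetical stand-ins for two embeddings returned by the operator.
emb_a = np.array([0.2, 0.1, 0.7])
emb_b = np.array([0.4, 0.2, 1.4])
score = cosine_similarity(emb_a, emb_b)
```

Values close to 1 indicate embeddings that point in nearly the same direction; values near 0 indicate unrelated directions.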
