Text Embedding with data2vec

author: David Wang


Description

This operator extracts text features with data2vec. The core idea of data2vec is to predict latent representations of the full input from a masked view of that input, in a self-distillation setup built on a standard Transformer architecture.
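The self-distillation idea can be illustrated with a toy sketch. It is not this operator's code or the model's training code: the function names `ema_update` and `masked_mse`, the plain-list weights, and the squared-error loss are all assumptions chosen for brevity. The teacher sees the full input and its weights track the student as an exponential moving average, while the student sees the masked input and is scored only at the masked positions.

```python
def ema_update(teacher, student, tau=0.999):
    """Move each teacher weight a small step toward the matching student weight.

    Hypothetical helper: weights are flat lists of floats for illustration.
    """
    return [tau * t + (1.0 - tau) * s for t, s in zip(teacher, student)]


def masked_mse(student_pred, teacher_target, mask):
    """Mean squared error between student predictions and teacher targets,
    computed over masked positions only (mask[i] is True where the input
    token was hidden from the student)."""
    errors = [
        (p - t) ** 2
        for p_vec, t_vec, m in zip(student_pred, teacher_target, mask)
        if m
        for p, t in zip(p_vec, t_vec)
    ]
    return sum(errors) / len(errors)


# Two positions with 2-dimensional representations; only the first is masked,
# so only the first contributes to the loss.
loss = masked_mse(
    student_pred=[[1.0, 1.0], [0.0, 0.0]],
    teacher_target=[[0.0, 0.0], [5.0, 5.0]],
    mask=[True, False],
)
new_teacher = ema_update([1.0, 1.0], [0.0, 2.0], tau=0.5)
```

At inference time, which is what this operator does, none of this machinery runs; only the trained Transformer encoder is used to produce the embedding.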


Code Example

Use the pre-trained model to generate a text embedding for the sentence "Hello, world."

Write the pipeline in simplified style:

import towhee

towhee.dc(["Hello, world."]) \
      .text_embedding.data2vec() \
      .show()


Factory Constructor

Create the operator via the following factory method:

data2vec(model_name='facebook/data2vec-text-base')

Parameters:

model_name: str

The name of the model, as a string. The default value is "facebook/data2vec-text-base".

Supported model names:

  • facebook/data2vec-text-base


Interface

Parameters:

text: str

The input text, as a string.

Returns: numpy.ndarray

The text embedding extracted by the model.
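A common next step is to compare two returned embeddings. The sketch below is one way to do that, under an assumption this README does not confirm: the returned array may be either a single vector or a per-token matrix of shape (tokens, dim). The helpers `mean_pool` and `cosine_similarity` are hypothetical names, not part of the operator, and the arrays in the example are stand-ins rather than real model output.

```python
import numpy as np


def mean_pool(embedding):
    """Collapse an embedding to one vector.

    A 1-D input is returned unchanged; a higher-rank input (assumed to be
    per-token vectors along the last axis) is averaged over all tokens.
    """
    emb = np.asarray(embedding, dtype=np.float32)
    if emb.ndim == 1:
        return emb
    return emb.reshape(-1, emb.shape[-1]).mean(axis=0)


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings after mean pooling."""
    a, b = mean_pool(a), mean_pool(b)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    # Guard against zero vectors, for which the cosine is undefined.
    return float(np.dot(a, b) / denom) if denom else 0.0


# Stand-in arrays in place of real operator output.
emb_a = np.array([[1.0, 3.0], [3.0, 5.0]])  # two tokens, dim 2
emb_b = np.array([2.0, 4.0])                # already a single vector
score = cosine_similarity(emb_a, emb_b)
```

Mean pooling is only one choice; if the operator already returns a single sentence vector, `mean_pool` leaves it untouched and the comparison works the same way.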
