
Updated 3 years ago

text-embedding

Text Embedding with data2vec

author: David Wang


Description

This operator extracts text features (embeddings) with data2vec. The core idea of data2vec is to predict latent representations of the full input from a masked view of that input, in a self-distillation setup built on a standard Transformer architecture.
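The self-distillation idea described above can be illustrated with a toy sketch. It is not the operator's code or the data2vec implementation: the function names `ema_update` and `masked_regression_loss` are hypothetical, plain Python lists stand in for model weights and per-token representations, and a mean squared error is swapped in as a simplified regression loss. The sketch assumes the data2vec setup in which a teacher sees the full input and tracks the student as an exponential moving average, while the student sees the masked input and is trained to match the teacher's representations at the masked positions.

```python
from typing import List, Sequence


def ema_update(teacher: Sequence[float], student: Sequence[float], tau: float) -> List[float]:
    """Move teacher weights toward student weights: t <- tau * t + (1 - tau) * s."""
    return [tau * t + (1.0 - tau) * s for t, s in zip(teacher, student)]


def masked_regression_loss(
    student_pred: Sequence[Sequence[float]],
    teacher_target: Sequence[Sequence[float]],
    mask: Sequence[bool],
) -> float:
    """Mean squared error between student and teacher vectors, masked positions only.

    student_pred   -- per-token vectors from the student (masked input)
    teacher_target -- per-token vectors from the teacher (full input)
    mask           -- True where the token was masked in the student's input
    """
    errors = [
        sum((p - q) ** 2 for p, q in zip(pred, target)) / len(pred)
        for pred, target, is_masked in zip(student_pred, teacher_target, mask)
        if is_masked
    ]
    if not errors:
        raise ValueError("at least one position must be masked")
    return sum(errors) / len(errors)


# Toy values: two tokens, two-dimensional representations, first token masked.
loss = masked_regression_loss(
    student_pred=[[1.0, 1.0], [0.0, 0.0]],
    teacher_target=[[1.0, 3.0], [5.0, 5.0]],
    mask=[True, False],
)
new_teacher = ema_update(teacher=[1.0, 1.0], student=[0.0, 2.0], tau=0.5)
```

Only the masked position contributes to the loss, so the unmasked second token is ignored even though its prediction is far from the target.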


Code Example

Use the pre-trained model to generate a text embedding for the sentence "Hello, world."

Write the pipeline in simplified style:

import towhee

towhee.dc(["Hello, world."]) \
      .text_embedding.data2vec() \
      .show()


Factory Constructor

Create the operator via the following factory method:

data2vec()


Interface

Parameters:

text: str

The input text, as a string.

Returns: numpy.ndarray

The text embedding extracted by the model.
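A common next step is to compare two returned embeddings. The sketch below is an illustration under assumptions, not part of this operator: `mean_pool` and `cosine_similarity` are hypothetical helper names, and the shape of the returned array is not stated here. If the operator returns one vector per token, mean pooling is one simple (assumed) way to reduce it to a single sentence vector before comparison. The helpers use only the standard library and accept any sequence of numbers, including rows of a `numpy.ndarray`.

```python
import math
from typing import List, Sequence


def mean_pool(token_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average a (num_tokens, dim) sequence of vectors into one vector of length dim."""
    if len(token_vectors) == 0:
        raise ValueError("token_vectors must not be empty")
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(float(vec[i]) for vec in token_vectors) / n for i in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-ins for two embeddings; real values would come from the operator.
sentence_a = mean_pool([[1.0, 2.0], [3.0, 4.0]])
sentence_b = mean_pool([[2.0, 3.0]])
similarity = cosine_similarity(sentence_a, sentence_b)
```

Values close to 1 indicate vectors pointing in nearly the same direction; how well that tracks semantic similarity depends on the model and the pooling choice.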
