# BERT Text Embedding Operator (PyTorch)

Authors: Kyle He

## Overview

This operator transforms text into an embedding using BERT[1], which stands for Bidirectional Encoder Representations from Transformers.

## Interface

```python
__call__(self, text: str)
```

**Args:**

- text:
  - the text to be embedded
  - supported types: str

**Returns:**

The operator returns a tuple `Tuple[('embs', numpy.ndarray)]` containing the following fields:

- embs:
  - embeddings of the text
  - data type: `numpy.ndarray`
  - shape: 768

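
The return value described above is a named tuple, so the embedding can be read through the `embs` field. The following is a minimal sketch of consuming such a result, not part of the operator itself. `Outputs` is a hypothetical stand-in for the operator's return type, `cosine_similarity` is an illustrative helper, and the short toy vectors replace the 768-element `numpy.ndarray` the real operator would return.

```python
import math
from typing import NamedTuple, Sequence


class Outputs(NamedTuple):
    """Hypothetical stand-in for the operator's named-tuple return value."""
    embs: Sequence[float]  # a numpy.ndarray of shape 768 in the real operator


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy results standing in for two operator calls (assumed values, not real embeddings).
out_a = Outputs(embs=[1.0, 0.0, 2.0])
out_b = Outputs(embs=[2.0, 0.0, 4.0])

# Read the embedding by field name, then compare the two vectors.
score = cosine_similarity(out_a.embs, out_b.embs)
```

Because the helper only iterates over its arguments, it should also accept one-dimensional NumPy arrays, although `numpy.dot` and `numpy.linalg.norm` would be the more usual choice for real embeddings.
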
## Requirements

The required Python packages are listed in [requirements.txt](./requirements.txt).

## How it works

The `towhee/torch-bert` operator is based on the Hugging Face Transformers library[2].

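
As a rough illustration of what a BERT text embedding built on the Transformers library can look like, here is a minimal sketch. It is not the operator's actual implementation. The function name `embed_text`, the use of the pooled `[CLS]` output as the sentence embedding, and the `bert-base-uncased` checkpoint mentioned in the comments are all assumptions made for this example.

```python
import numpy as np
import torch
from transformers import BertModel, BertTokenizer


def embed_text(text: str, tokenizer, model) -> np.ndarray:
    """Return one embedding vector for `text`.

    Assumption: the pooled [CLS] output is used as the embedding.
    The real operator may pool token states differently.
    """
    model.eval()  # disable dropout so repeated calls give the same vector
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    # pooler_output has shape (batch, hidden_size); take the single item.
    return outputs.pooler_output[0].numpy()


# Usage (downloads weights on first run; the checkpoint name is an assumption):
#   tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
#   model = BertModel.from_pretrained("bert-base-uncased")
#   vec = embed_text("Hello, world.", tokenizer, model)
```

With a base-sized BERT checkpoint the hidden size is 768, which matches the embedding shape documented in the Interface section.
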
## Reference

[1]. https://arxiv.org/pdf/1810.04805.pdf

[2]. https://huggingface.co/docs/transformers