@ -4,17 +4,17 @@

<br />

## Description

A text embedding operator takes a sentence, paragraph, or document as a string input
and outputs an embedding vector as an ndarray that captures the input's core semantic elements.
This operator is implemented with pre-trained models from [Hugging Face Transformers](https://huggingface.co/docs/transformers).

<br />

## Code Example

Use the pre-trained model 'distilbert-base-cased'
to generate a text embedding for the sentence "Hello, world.".

*Write the pipeline*:

@ -30,7 +30,7 @@ towhee.dc(["Hello, world."]) \

## Factory Constructor

Create the operator via the following factory method:

***text_embedding.transformers(model_name="bert-base-uncased")***

@ -38,10 +38,10 @@ Create the operator via the following factory method

***model_name***: *str*

The model name as a string.
The default model name is "bert-base-uncased".

Supported model names:

<details><summary>Albert</summary> |
|
|
<details><summary>Albert</summary> |
@ -294,7 +294,7 @@ The default model name is "bert-base-uncased".

## Interface

The operator takes a piece of text as a string input.
It loads the tokenizer and pre-trained model by model name
and then returns the text embedding as an ndarray.
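
The interface described above (string in, tokenizer and model loaded by name, ndarray out) can be sketched directly with the Hugging Face Transformers library. This is only an illustration, not the operator's actual source: the function name `embed_text` is hypothetical, and returning the token-level `last_hidden_state` without pooling is an assumption about what the embedding contains.

```python
def embed_text(text: str, model_name: str = "bert-base-uncased"):
    """Hypothetical helper: return token-level embeddings for `text` as an ndarray."""
    # Imported lazily so that defining the function needs no heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # Load the tokenizer and the pre-trained model by name.
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()

    # Tokenize the input string into PyTorch tensors.
    inputs = tokenizer(text, return_tensors="pt")

    # Run a forward pass without tracking gradients.
    with torch.no_grad():
        outputs = model(**inputs)

    # last_hidden_state has shape (batch, tokens, hidden); drop the batch axis
    # and convert to a NumPy array. (Assumption: no pooling is applied.)
    return outputs.last_hidden_state.squeeze(0).numpy()
```

Calling `embed_text("Hello, world.", model_name="distilbert-base-cased")` requires `torch` and `transformers` to be installed and downloads the model weights on first use.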