From 75499219efb46945750cb357212889f420dac6f8 Mon Sep 17 00:00:00 2001
From: Jael Gu
Date: Thu, 7 Apr 2022 17:33:59 +0800
Subject: [PATCH] Update README

Signed-off-by: Jael Gu
---
 README.md | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index a692b19..42c1394 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,7 @@
 *author: Kyle He*
+
 ## Desription
@@ -22,12 +23,14 @@ length, making it easy to process documents of thousands of tokens or longer[2].
 [2].https://arxiv.org/pdf/2004.05150.pdf
+
+
 ## Code Example
 Use the pretrained model "facebook/dpr-ctx_encoder-single-nq-base" to generate a text embedding for the sentence "Hello, world.".
- *Write the pipeline*:
+*Write the pipeline*:
 ```python
 from towhee import dc
@@ -38,6 +41,8 @@ dc.stream(["Hello, world."]) \
 .to_list()
 ```
+
+
 ## Factory Constructor
 Create the operator via the following factory method
@@ -46,12 +51,13 @@ Create the operator via the following factory method
 **Parameters:**
-​ ***model_name***: *str*
+***model_name***: *str*
 
-​ The model name in string.
+The model name in string.
 The default value is "allenai/longformer-base-4096".
 You can get the list of supported model names by calling `get_model_list` from [longformer.py](https://towhee.io/text-embedding/longformer/src/branch/main/longformer.py).
+
 ## Interface
@@ -61,10 +67,9 @@ and then return text embedding in ndarray.
 **Parameters:**
-​ ***text***: *str*
-
-​ The text in string.
+***text***: *str*
 
+The text in string.
 **Returns**: