# Text Embedding with DPR

*author: Kyle He*

<br />

## Description

This operator uses Dense Passage Retrieval (DPR) to convert long text to embeddings.

**DPR** models were proposed in "Dense Passage Retrieval for Open-Domain Question Answering"[2].

In this work, they show that retrieval can be practically implemented using dense representations alone,
where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework[2].
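
To make the dual-encoder idea concrete, here is a minimal sketch using the DPR classes from Hugging Face `transformers`; the question-encoder checkpoint name is the published companion of the context encoder wrapped by this operator, and is an assumption rather than something taken from the original example:

```python
import torch
from transformers import (
    DPRContextEncoder, DPRContextEncoderTokenizer,
    DPRQuestionEncoder, DPRQuestionEncoderTokenizer,
)

# Context (passage) encoder: the model family this operator wraps.
ctx_encoder = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")
ctx_tokenizer = DPRContextEncoderTokenizer.from_pretrained("facebook/dpr-ctx_encoder-single-nq-base")

# Question encoder: the other half of the dual-encoder framework.
q_encoder = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-single-nq-base")
q_tokenizer = DPRQuestionEncoderTokenizer.from_pretrained("facebook/dpr-question_encoder-single-nq-base")

with torch.no_grad():
    ctx_emb = ctx_encoder(**ctx_tokenizer("Paris is the capital of France.",
                                          return_tensors="pt")).pooler_output
    q_emb = q_encoder(**q_tokenizer("What is the capital of France?",
                                    return_tensors="pt")).pooler_output

# Retrieval reduces to a dot product between question and passage embeddings.
score = (q_emb @ ctx_emb.T).item()
print(score)
```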

### References

[2] Vladimir Karpukhin, Barlas Oğuz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih. "Dense Passage Retrieval for Open-Domain Question Answering." EMNLP 2020.

## Code Example

Use the pre-trained model "facebook/dpr-ctx_encoder-single-nq-base"
to generate a text embedding for the sentence "Hello, world.".

*Write the pipeline*:
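
A minimal sketch, assuming the Towhee `DataCollection` API; the chained operator call mirrors the factory signature in the next section:

```python
import towhee

# Wrap the sentence in a DataCollection and chain the DPR operator
# to convert it into an embedding.
towhee.dc(["Hello, world."]) \
      .text_embedding.dpr(model_name="facebook/dpr-ctx_encoder-single-nq-base")
```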

## Factory Constructor

Create the operator via the following factory method:

***text_embedding.dpr(model_name="facebook/dpr-ctx_encoder-single-nq-base")***
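
The operator can also be instantiated on its own; a minimal sketch, assuming the standard `towhee.ops` accessor:

```python
from towhee import ops

# Instantiate the operator with the default DPR context encoder.
op = ops.text_embedding.dpr(model_name="facebook/dpr-ctx_encoder-single-nq-base")
```

The `model_name` argument is assumed to accept any DPR context-encoder checkpoint that `transformers` can load, such as "facebook/dpr-ctx_encoder-multiset-base".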

Supported model names:

## Interface

The operator takes a text string as input.
It loads the tokenizer and the pre-trained model using the model name,
and then returns the text embedding as an ndarray.
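
A short usage sketch of this interface; calling the operator directly and the `(768,)` output shape (the hidden size of the base DPR encoder) are assumptions rather than documented behavior:

```python
import numpy as np
from towhee import ops

op = ops.text_embedding.dpr(model_name="facebook/dpr-ctx_encoder-single-nq-base")

# The operator tokenizes the string, runs the DPR context encoder,
# and returns the pooled embedding.
emb = op("Hello, world.")

assert isinstance(emb, np.ndarray)
print(emb.shape)  # expected: (768,) for the base encoder
```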

**Parameters:**