3 changed files with 87 additions and 3 deletions
@ -1,2 +1,86 @@ |
# rerank

# Rerank QA Content

## Description

The Rerank operator reorders a list of candidate documents by their relevance to a query. It uses an [MS MARCO Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#ms-marco) model to compute a relevance score for each query-document pair and then reorders the documents by that score.

<br />
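
The score-then-reorder flow described above can be pictured with a minimal sketch. It does not load the cross-encoder: `toy_score` is a hypothetical stand-in that simply counts shared words, and `rerank_sketch` is an illustrative name, not part of the operator. Ordering from highest to lowest score is also an assumption about how the operator sorts.

```python
from typing import Callable, List, Tuple


def toy_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: counts words shared by query and doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    return float(len(query_words & doc_words))


def rerank_sketch(query: str, docs: List[str],
                  score_fn: Callable[[str, str], float] = toy_score
                  ) -> Tuple[List[str], List[float]]:
    """Score every (query, doc) pair, then order docs from highest to lowest score."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # list.sort is stable, so docs with equal scores keep their input order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in scored], [score for _, score in scored]


docs, scores = rerank_sketch('what is towhee',
                             ['the weather is good', 'towhee is a framework'])
print(docs, scores)
```

In the real operator the scoring function would be the cross-encoder model; only the surrounding score-and-sort logic is illustrated here.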

## Code Example

- Run with ops

```python
from towhee import ops

# Create the operator with the default model.
op = ops.rerank()
# Arguments: query, candidate documents, score threshold.
res = op('What is Towhee?',
         ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'],
         0)
```

- Run a pipeline

```python
from towhee import ops, pipe, DataCollection

p = (pipe.input('query', 'doc', 'threshold')
         .map(('query', 'doc', 'threshold'), ('doc', 'score'), ops.rerank())
         # Expand the parallel doc and score lists into one row per document.
         .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: [(i, j) for i, j in zip(x, y)])
         .output('query', 'doc', 'score')
    )

DataCollection(p('What is Towhee?',
                 ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'],
                 0)
               ).show()
```

<img src="./result.png" height="100px"/>

<br />
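
In the pipeline above, the `map` step yields one list of documents and one list of scores, and the `flat_map` step pairs them up element by element. As a rough plain-Python picture of that reshaping (not Towhee's implementation; `flatten_rows` is a hypothetical helper, the scores are made up, and repeating the query on every row is an assumption based on the output columns):

```python
from typing import List, Tuple


def flatten_rows(query: str, docs: List[str],
                 scores: List[float]) -> List[Tuple[str, str, float]]:
    """Pair each doc with its score and repeat the query on every row."""
    return [(query, doc, score) for doc, score in zip(docs, scores)]


rows = flatten_rows('What is Towhee?',
                    ['Towhee has many powerful operators.', 'The weather is good'],
                    [0.9, 0.1])  # made-up scores, for illustration only
for row in rows:
    print(row)
```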

## Factory Constructor

Create the operator via the following factory method:

***towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')***

**Parameters:**

***model_name***: str

The name of the CrossEncoder model. You can choose one from the [Model List](https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#models-performance).

<br />

## Interface

The operator sorts the documents by their relevance to the query and returns them together with their scores. An optional threshold filters the results by score.

**Parameters:**

***query***: str

The query content.

***docs***: list

The list of sentences to score against the query.

***threshold***: float

The score threshold used for filtering. Defaults to None, i.e., no filtering.

<br />

**Return**: List[str], List[float]

The reranked list of documents and the list of their corresponding scores.
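
The threshold behaviour and the shape of the return value can be sketched in plain Python. This is an illustration under stated assumptions, not the operator's code: `filter_by_threshold` is a hypothetical helper, the scores are made up, and keeping scores greater than or equal to the threshold is assumed, since the text above does not say whether the comparison is inclusive.

```python
from typing import List, Optional, Tuple


def filter_by_threshold(docs: List[str], scores: List[float],
                        threshold: Optional[float] = None
                        ) -> Tuple[List[str], List[float]]:
    """Return parallel lists of docs and scores, dropping pairs below the threshold."""
    if threshold is None:
        # No threshold: return everything unchanged.
        return list(docs), list(scores)
    # Assumption: a score equal to the threshold is kept.
    kept = [(doc, score) for doc, score in zip(docs, scores) if score >= threshold]
    return [doc for doc, _ in kept], [score for _, score in kept]


kept_docs, kept_scores = filter_by_threshold(['a', 'b', 'c'], [0.8, 0.3, -1.2], threshold=0)
all_docs, all_scores = filter_by_threshold(['a', 'b', 'c'], [0.8, 0.3, -1.2])
print(kept_docs, kept_scores)
```

Returning two parallel lists mirrors the `List[str], List[float]` return described above, so a document and its score always share the same index.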