rerank

Update README

Signed-off-by: shiyu22 <shiyu.chen@zilliz.com>
main
shiyu22 2 years ago
parent
commit ef89201e5b
  1. README.md (86)
  2. rerank.py (2)
  3. result.png (BIN)

86
README.md

@@ -1,2 +1,86 @@
# rerank
# Rerank QA Content
## Description
The Rerank operator reorders a list of candidate documents by their relevance to a query. It uses an [MS MARCO Cross-Encoders](https://www.sbert.net/docs/pretrained_cross-encoders.html#ms-marco) model to compute a relevance score for each document and then reorders the documents by those scores.
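
The score-then-reorder idea can be sketched without any model. The snippet below is only an illustration: `toy_score` is a hypothetical word-overlap scorer standing in for the cross-encoder, and `rerank` is an illustrative helper, not the operator's actual code. It also assumes that higher scores mean more relevant documents.

```python
from typing import Callable, List, Tuple


def toy_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query words found in the doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def rerank(query: str, docs: List[str],
           score_fn: Callable[[str, str], float] = toy_score) -> List[Tuple[str, float]]:
    """Score every document against the query, then sort by score, highest first."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranked = rerank('what is towhee', ['the weather is good', 'towhee is a framework'])
for doc, score in ranked:
    print(f'{score:.2f}  {doc}')
```

Swapping `toy_score` for a function backed by a real cross-encoder leaves the sorting step unchanged.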
<br />
## Code Example
- Run with ops
```python
from towhee import ops
# Create the operator with its default cross-encoder model.
op = ops.rerank()
# Arguments: query, candidate documents, score threshold.
res = op('What is Towhee?',
         ['Towhee is a cutting-edge framework to deal with unstructured data.',
          'I do not know about towhee',
          'Towhee has many powerful operators.',
          'The weather is good'],
         0)
```
- Run a pipeline
```python
from towhee import ops, pipe, DataCollection

p = (pipe.input('query', 'doc', 'threshold')
         .map(('query', 'doc', 'threshold'), ('doc', 'score'), ops.rerank())
         # Expand the parallel lists into one (doc, score) row per document.
         .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: list(zip(x, y)))
         .output('query', 'doc', 'score')
     )

DataCollection(p('What is Towhee?',
                 ['Towhee is a cutting-edge framework to deal with unstructured data.',
                  'I do not know about towhee',
                  'Towhee has many powerful operators.',
                  'The weather is good'],
                 0)
               ).show()
```
<img src="./result.png" height="100px"/>
<br />
## Factory Constructor
Create the operator via the following factory method:
***towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')***
**Parameters:**
***model_name***: str
The name of the CrossEncoder model. Choose one from the [Model List](https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#models-performance).
<br />
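As a rough sketch of what `model_name` controls, the snippet below scores (query, document) pairs directly with the `sentence-transformers` `CrossEncoder` class. The helper names `build_pairs` and `score_docs` are hypothetical, and this is not the operator's code. The `max_length=1000` value mirrors the constructor call in `rerank.py`.

```python
from typing import List, Tuple

DEFAULT_MODEL = 'cross-encoder/ms-marco-MiniLM-L-12-v2'


def build_pairs(query: str, docs: List[str]) -> List[Tuple[str, str]]:
    """Pair the query with each document; a cross-encoder scores one pair at a time."""
    return [(query, doc) for doc in docs]


def score_docs(query: str, docs: List[str], model_name: str = DEFAULT_MODEL) -> List[float]:
    """Score each (query, doc) pair with the named model (weights are downloaded on first use)."""
    # Imported lazily so build_pairs works without sentence-transformers installed.
    from sentence_transformers import CrossEncoder
    model = CrossEncoder(model_name, max_length=1000)
    return [float(s) for s in model.predict(build_pairs(query, docs))]


pairs = build_pairs('What is Towhee?',
                    ['Towhee has many powerful operators.', 'The weather is good'])
```

Calling `score_docs(..., model_name='cross-encoder/ms-marco-MiniLM-L-6-v2')` would select the smaller model from the same family instead of the default.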
## Interface
This operator ranks the documents by their relevance to the query and returns them together with their scores. An optional threshold filters the results.
**Parameters:**
***query***: str
The query content.
***docs***: list
A list of sentences to score for relevance against the query content.
***threshold***: float
The score threshold used for filtering; defaults to none, i.e., no filtering.
<br />
**Return**: List[str], List[float]
The reranked list of documents and the list of corresponding scores.
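
The interface described above (two parallel lists out, optional threshold) might look like the following minimal sketch. `sort_and_filter` is a hypothetical helper. Two points are assumptions rather than documented behavior: documents scoring exactly at the threshold are kept, and scores are sorted highest first.

```python
from typing import List, Optional, Tuple


def sort_and_filter(docs: List[str], scores: List[float],
                    threshold: Optional[float] = None) -> Tuple[List[str], List[float]]:
    """Sort documents by score (highest first) and optionally drop low-scoring ones."""
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
    if threshold is not None:
        # Assumption: a score equal to the threshold passes the filter.
        ranked = [(doc, score) for doc, score in ranked if score >= threshold]
    if not ranked:
        return [], []
    kept_docs, kept_scores = zip(*ranked)
    return list(kept_docs), list(kept_scores)


docs_out, scores_out = sort_and_filter(['a', 'b', 'c'], [0.2, 0.9, 0.5], threshold=0.5)
print(docs_out, scores_out)
```

With `threshold=None` the function only sorts, which matches the "no filtering" default described in the parameter list.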

2
rerank.py

@@ -6,7 +6,7 @@ from towhee.operator import NNOperator
 class ReRank(NNOperator):
-    def __init__(self, model_name: str = 'cross-encoder/ms-marco-MiniLM-L-6-v2'):
+    def __init__(self, model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2'):
         super().__init__()
         self._model_name = model_name
         self._model = CrossEncoder(self._model_name, max_length=1000)

BIN
result.png

Binary file not shown. Size: 47 KiB
