Updated 1 year ago
towhee
Rerank QA Content
Description
The Rerank operator reorders a list of candidate documents by their relevance to a query. It uses an MS MARCO Cross-Encoder model to compute a relevance score for each document, then reorders the documents according to those scores.
Code Example
- Run with ops
from towhee import ops

op = ops.rerank(threshold=0)
# res holds the reranked documents and their corresponding scores
res = op('What is Towhee?',
         ['Towhee is a cutting-edge framework to deal with unstructured data.',
          'I do not know about towhee',
          'Towhee has many powerful operators.',
          'The weather is good'])
- Run a pipeline
from towhee import ops, pipe, DataCollection

p = (pipe.input('query', 'doc')
         .map(('query', 'doc'), ('doc', 'score'), ops.rerank(threshold=0))
         # emit one (doc, score) row per reranked document
         .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: [(i, j) for i, j in zip(x, y)])
         .output('query', 'doc', 'score')
)
DataCollection(p('What is Towhee?',
                 ['Towhee is a cutting-edge framework to deal with unstructured data.',
                  'I do not know about towhee',
                  'Towhee has many powerful operators.',
                  'The weather is good'])
).show()
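The lambda passed to flat_map in the pipeline can be tried on its own, without Towhee installed. The sketch below is plain Python: pair_up is a hypothetical name for that lambda, and the scores are made-up placeholders, not real model output.

```python
# Plain-Python illustration of the flat_map lambda from the pipeline.
# The scores below are invented for illustration; they are not model output.
docs = ['Towhee has many powerful operators.', 'The weather is good']
scores = [0.9, 0.1]


def pair_up(x, y):
    # Same body as the lambda: one (doc, score) tuple per position.
    return [(i, j) for i, j in zip(x, y)]


rows = pair_up(docs, scores)
for doc, score in rows:
    print(f'{score:.1f}  {doc}')
```

Each tuple becomes its own output row, which is why the pipeline output shows one document and one score per line instead of two parallel lists.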
Factory Constructor
Create the operator via the following factory method:
towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')
Parameters:
model_name: str
The name of the CrossEncoder model. Choose one from the Model List.
threshold: float
The score threshold used to filter the results.
device: str
The device on which to run the model.
Interface
This operator sorts the documents by their relevance to the query and returns them together with their scores. A threshold can also be set to filter the results by score.
Parameters:
query: str
The query content.
docs: list
A list of sentences to score for relevance to the query.
Return: List[str], List[float]
The list of documents after reranking and the list of corresponding scores.
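The interface can be sketched in plain Python as score, sort, then filter. This is a minimal illustration under stated assumptions, not the operator's actual implementation: toy_score is a hypothetical word-overlap stand-in for the cross-encoder, the descending sort order is assumed, and treating the threshold as inclusive (score >= threshold) is also an assumption.

```python
from typing import Callable, List, Optional, Tuple


def toy_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: share of query words found in doc."""
    query_words = set(query.lower().replace('?', '').split())
    doc_words = set(doc.lower().replace('.', '').split())
    return len(query_words & doc_words) / len(query_words)


def rerank(query: str,
           docs: List[str],
           score_fn: Callable[[str, str], float] = toy_score,
           threshold: Optional[float] = None) -> Tuple[List[str], List[float]]:
    # Score every (query, doc) pair, then sort by score, highest first (assumed order).
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Assumption: documents scoring below the threshold are dropped.
    if threshold is not None:
        scored = [pair for pair in scored if pair[1] >= threshold]
    return [doc for doc, _ in scored], [score for _, score in scored]


query = 'What is Towhee?'
candidates = ['The weather is good',
              'Towhee is a framework for unstructured data.',
              'I do not know about towhee']
top_docs, top_scores = rerank(query, candidates, threshold=0.5)
print(top_docs, top_scores)
```

Passing the scorer as a parameter keeps the sort-and-filter logic separate from the model, so a real cross-encoder's scoring call could be substituted for toy_score without changing the rest.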