towhee
Rerank QA Content
Description
The Rerank operator reorders a list of documents by their relevance to a query. It uses an MS MARCO Cross-Encoder model to compute a relevance score for each document and then sorts the documents by that score.
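The score-then-sort step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the operator's actual source: `rerank_sketch`, `toy_scores`, `overlap_score`, and `cross_encoder_scores` are hypothetical names, and the word-overlap scorer is only a toy stand-in so the example runs without downloading a model. `cross_encoder_scores` shows how scores could be obtained with `CrossEncoder.predict` from the sentence-transformers package, assuming that package is installed; highest-score-first ordering is also an assumption.

```python
from typing import Callable, List, Sequence, Tuple


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that also occur in doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / len(q_words) if q_words else 0.0


def toy_scores(query: str, docs: Sequence[str]) -> List[float]:
    """Hypothetical stand-in scorer used so the sketch runs offline."""
    return [overlap_score(query, d) for d in docs]


def cross_encoder_scores(
    query: str,
    docs: Sequence[str],
    model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2',
) -> List[float]:
    """Score (query, doc) pairs with a cross-encoder.

    Assumes sentence-transformers is installed; imported lazily so the
    rest of the sketch works without it.
    """
    from sentence_transformers import CrossEncoder
    model = CrossEncoder(model_name)
    return [float(s) for s in model.predict([(query, d) for d in docs])]


def rerank_sketch(
    query: str,
    docs: Sequence[str],
    score_fn: Callable[[str, Sequence[str]], List[float]] = toy_scores,
) -> Tuple[List[str], List[float]]:
    """Score every document, then return docs and scores, highest score first."""
    scores = score_fn(query, docs)
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order], [scores[i] for i in order]
```

Passing `score_fn=cross_encoder_scores` swaps the toy scorer for a real cross-encoder while leaving the sorting logic unchanged.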
Code Example
- Run with ops
from towhee import ops
op = ops.rerank()
res = op('What is Towhee?',
         ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'],
         0)
- Run a pipeline
from towhee import ops, pipe, DataCollection
p = (pipe.input('query', 'doc', 'threshold')
         .map(('query', 'doc', 'threshold'), ('doc', 'score'), ops.rerank())
         .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: [(i, j) for i, j in zip(x, y)])
         .output('query', 'doc', 'score')
     )
DataCollection(p('What is Towhee?',
                 ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'],
                 0)
               ).show()

Factory Constructor
Create the operator via the following factory method:
towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')
Parameters:
model_name: str
The name of the CrossEncoder model to load. Choose one from the Model List.
Interface
This operator scores each document against the query, sorts the documents by score, and returns both the sorted documents and their scores. An optional threshold filters out low-scoring results.
Parameters:
query: str
The query content.
docs: list
A list of sentences whose relevance to the query is scored.
threshold: float
The score threshold used for filtering. Defaults to None, i.e., no filtering.
Return: List[str], List[float]
The reranked documents and the list of their corresponding scores.
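To illustrate the threshold behavior and the shape of the return value, here is a small sketch. `apply_threshold` is a hypothetical helper, not part of Towhee, and the inclusive comparison (`>=`) is an assumption, since the description above does not say whether a score equal to the threshold is kept.

```python
from typing import List, Optional, Sequence, Tuple


def apply_threshold(
    docs: Sequence[str],
    scores: Sequence[float],
    threshold: Optional[float] = None,
) -> Tuple[List[str], List[float]]:
    """Drop (doc, score) pairs scoring below threshold; None keeps everything.

    Assumption: the comparison is inclusive (score >= threshold).
    """
    if threshold is None:
        return list(docs), list(scores)
    kept = [(d, s) for d, s in zip(docs, scores) if s >= threshold]
    # Return two parallel lists, matching the List[str], List[float] contract.
    return [d for d, _ in kept], [s for _, s in kept]
```

The two returned lists stay aligned index by index, so `zip(docs, scores)` pairs each document with its score.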