# Rerank QA Content
## Description
The Rerank operator reorders a list of candidate documents by their relevance to a query. It uses one of the [MS MARCO Cross-Encoders](https://www.sbert.net/docs/pretrained_cross-encoders.html#ms-marco) models to compute a relevance score for each document and then sorts the documents by that score.
<br />
## Code Example
- Run with ops
```python
from towhee import ops
op = ops.rerank(threshold=0)
# Returns the reranked documents and their relevance scores.
res = op('What is Towhee?',
         ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee',
          'Towhee has many powerful operators.', 'The weather is good'])
```
- Run a pipeline
```python
from towhee import ops, pipe, DataCollection

p = (
    pipe.input('query', 'doc')
        # Rerank the documents for the query; outputs the sorted docs and their scores.
        .map(('query', 'doc'), ('doc', 'score'), ops.rerank(threshold=0))
        # Emit one (doc, score) row per reranked document.
        .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: [(i, j) for i, j in zip(x, y)])
        .output('query', 'doc', 'score')
)

docs = ['Towhee is a cutting-edge framework to deal with unstructured data.',
        'I do not know about towhee',
        'Towhee has many powerful operators.',
        'The weather is good']
DataCollection(p('What is Towhee?', docs)).show()
```
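The `flat_map` step in the pipeline above takes the two parallel lists produced by the operator and turns them into one row per document. A plain-Python sketch of just that pairing step follows; `pair_docs_with_scores` is a hypothetical helper name, and the scores are made-up values for illustration, not model output.

```python
from typing import List, Tuple


def pair_docs_with_scores(docs: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Turn two parallel lists into one (doc, score) row per document."""
    return [(doc, score) for doc, score in zip(docs, scores)]


# Illustrative values only; these are not real model scores.
rows = pair_docs_with_scores(['doc A', 'doc B'], [0.9, 0.1])
for doc, score in rows:
    print(doc, score)
```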
<br />
## Factory Constructor
Create the operator via the following factory method:
***towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')***
**Parameters:**
***model_name***: str
The name of the CrossEncoder model. Choose one from the [Model List](https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#models-performance).
***threshold***: float
The score threshold used to filter the results.
***device***: str
The device on which to run the model.
<br />
## Interface
This operator sorts the documents by their relevance to the query and returns them together with their scores. A threshold can also be set to filter the results.
**Parameters:**
***query***: str
The query content.
***docs***: list
A list of sentences to score for relevance to the query content.
<br />
**Return**: List[str], List[float]
The list of documents after reranking and the list of corresponding scores.
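
The behaviour described above (score each document against the query, filter by threshold, sort by score) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the operator's implementation: `rerank_sketch` and `overlap_score` are hypothetical names, the word-overlap scorer only stands in for the cross-encoder model, and treating the threshold as an inclusive lower bound is an assumption.

```python
from typing import Callable, List, Tuple


def overlap_score(query: str, doc: str) -> float:
    """Toy relevance score: fraction of query words that also appear in the document.

    Hypothetical stand-in for the cross-encoder; real scores come from the model.
    """
    query_words = set(query.lower().replace('?', '').split())
    doc_words = set(doc.lower().replace('.', '').split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank_sketch(
    query: str,
    docs: List[str],
    score_fn: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.0,
) -> Tuple[List[str], List[float]]:
    """Return (docs, scores) sorted by descending score, keeping scores >= threshold."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # Assumption: the threshold is an inclusive lower bound on the score.
    kept = [(doc, score) for doc, score in scored if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in kept], [score for _, score in kept]


example_docs = [
    'The weather is good',
    'Towhee is a cutting-edge framework to deal with unstructured data.',
    'I do not know about towhee',
]
top_docs, top_scores = rerank_sketch('What is Towhee?', example_docs, threshold=0.5)
print(top_docs, top_scores)
```

Passing the scorer as a parameter keeps the sorting and filtering logic independent of the model, so the toy scorer can be replaced by any function that maps a (query, document) pair to a float.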
# More Resources
- [The guide to rerank-english-v3.0 | Cohere](https://zilliz.com/ai-models/rerank-english-v3.0): rerank-english-v3.0: a reranking model for English documents and semi-structured data (JSON); context length: 4096 tokens.
- [Optimizing RAG with Rerankers: The Role and Trade-offs - Zilliz blog](https://zilliz.com/learn/optimize-rag-with-rerankers-the-role-and-tradeoffs): Rerankers can enhance the accuracy and relevance of answers in RAG systems, but these benefits come with increased latency and computational costs.
- [What Are Rerankers and How They Enhance Information Retrieval - Zilliz blog](https://zilliz.com/learn/what-are-rerankers-enhance-information-retrieval): Rerankers are specialized components in information retrieval systems that perform a crucial second-stage evaluation of search results.
- [Building an Intelligent QA System with NLP and Milvus - Zilliz blog](https://zilliz.com/blog/building-intelligent-chatbot-with-nlp-and-milvus): The Next-Gen QA Bot is here
- [The guide to rerank-english-v2.0 | Cohere](https://zilliz.com/ai-models/rerank-english-v2.0): rerank-english-v2.0: a reranking model for English language documents with a context length of 512 tokens.