
# Rerank QA Content

## Description

The Rerank operator reorders a list of candidate documents by their relevance to a query. It uses an [MS MARCO Cross-Encoder](https://www.sbert.net/docs/pretrained_cross-encoders.html#ms-marco) model to compute a relevance score for each document and then reorders the documents accordingly.
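
The score-then-reorder step can be sketched in plain Python. This is a minimal illustration, not the operator's implementation: `rerank_by_score` and `toy_overlap_score` are hypothetical names, and the toy scorer (a shared-word count) only stands in for the cross-encoder, which would score each `(query, document)` pair with a neural model.

```python
from typing import Callable, List, Sequence, Tuple


def toy_overlap_score(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Hypothetical stand-in for a cross-encoder: count words shared by query and document."""
    scores = []
    for query, doc in pairs:
        query_words = set(query.lower().replace('?', '').split())
        doc_words = set(doc.lower().replace('.', '').split())
        scores.append(float(len(query_words & doc_words)))
    return scores


def rerank_by_score(
    query: str,
    docs: Sequence[str],
    score_fn: Callable[[Sequence[Tuple[str, str]]], List[float]],
) -> Tuple[List[str], List[float]]:
    """Score every (query, doc) pair, then sort the documents by descending score."""
    pairs = [(query, doc) for doc in docs]
    scores = score_fn(pairs)
    ranked = sorted(zip(docs, scores), key=lambda item: item[1], reverse=True)
    return [doc for doc, _ in ranked], [score for _, score in ranked]


docs = ['The weather is good', 'Towhee has many powerful operators.', 'Towhee is a framework.']
ranked_docs, ranked_scores = rerank_by_score('What is Towhee?', docs, toy_overlap_score)
print(ranked_docs[0])
```

Passing the scorer in as a function keeps the sorting logic independent of the model, so a real cross-encoder could replace the toy scorer without changing `rerank_by_score`.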

<br />


## Code Example

- Run with ops

```python
from towhee import ops

# Create the operator; threshold is the score cutoff used for filtering.
op = ops.rerank(threshold=0)

# Returns the reranked documents and their corresponding scores.
res = op('What is Towhee?',
         ['Towhee is a cutting-edge framework to deal with unstructured data.',
          'I do not know about towhee',
          'Towhee has many powerful operators.',
          'The weather is good'])
```

- Run a pipeline

```python
from towhee import ops, pipe, DataCollection

p = (pipe.input('query', 'doc')
         .map(('query', 'doc'), ('doc', 'score'), ops.rerank(threshold=0))
         .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: [(i, j) for i, j in zip(x, y)])
         .output('query', 'doc', 'score')
     )

DataCollection(p('What is Towhee?',
                 ['Towhee is a cutting-edge framework to deal with unstructured data.',
                  'I do not know about towhee',
                  'Towhee has many powerful operators.',
                  'The weather is good'])
              ).show()
```
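
The `flat_map` step above takes the two parallel lists produced by the operator and pairs them up so that each document ends up with its own score. What the lambda returns can be checked in plain Python, outside any pipeline; the document texts and score values below are made up for illustration.

```python
# Made-up values standing in for the operator's two outputs.
docs = ['Towhee has many powerful operators.', 'The weather is good']
scores = [0.9, 0.1]

# Same lambda as in the pipeline: pair each document with its score.
expand = lambda x, y: [(i, j) for i, j in zip(x, y)]

rows = expand(docs, scores)
for doc, score in rows:
    print(f'{score:.2f}  {doc}')
```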


<br />


## Factory Constructor

Create the operator via the following factory method:

***towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')***

**Parameters:**

   ***model_name***: str

   The name of the CrossEncoder model. Choose one from the [Model List](https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#models-performance).

   ***threshold***: float

   The score threshold used to filter the results.

   ***device***: str

   The device on which to run the model.

<br />


## Interface

This operator scores the documents against the query and returns them reordered by score, together with the scores. A threshold can be set to filter the results.

**Parameters:**

   ***query***: str

   The query content.

   ***docs***: list

   A list of sentences to score for relevance to the query.


<br />

**Return**: List[str], List[float]

The reranked documents and the list of their corresponding scores.
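
How a score threshold shapes the two returned lists can be sketched as follows. `filter_by_threshold` is a hypothetical helper, the scores are made up, and treating the cutoff as inclusive (`>=`) is an assumption; the operator's exact comparison is not stated here.

```python
from typing import List, Sequence, Tuple


def filter_by_threshold(
    docs: Sequence[str], scores: Sequence[float], threshold: float
) -> Tuple[List[str], List[float]]:
    """Keep only the (doc, score) pairs whose score reaches the threshold (assumed inclusive)."""
    kept = [(doc, score) for doc, score in zip(docs, scores) if score >= threshold]
    # Return two parallel lists, matching the List[str], List[float] return shape.
    return [doc for doc, _ in kept], [score for _, score in kept]


# Made-up documents, already ordered by descending score.
docs = ['doc a', 'doc b', 'doc c']
scores = [0.92, 0.40, 0.05]

kept_docs, kept_scores = filter_by_threshold(docs, scores, threshold=0.4)
print(list(zip(kept_docs, kept_scores)))
```

Because the two lists stay aligned by position, `zip(kept_docs, kept_scores)` recovers each document's score.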

## More Resources

- [The guide to rerank-english-v3.0 | Cohere](https://zilliz.com/ai-models/rerank-english-v3.0): rerank-english-v3.0: a reranking model for English documents and semi-structured data (JSON); context length: 4096 tokens.
- [Optimizing RAG with Rerankers: The Role and Trade-offs  - Zilliz blog](https://zilliz.com/learn/optimize-rag-with-rerankers-the-role-and-tradeoffs): Rerankers can enhance the accuracy and relevance of answers in RAG systems, but these benefits come with increased latency and computational costs.
- [What Are Rerankers and How They Enhance Information Retrieval  - Zilliz blog](https://zilliz.com/learn/what-are-rerankers-enhance-information-retrieval): Rerankers are specialized components in information retrieval systems that perform a crucial second-stage evaluation of search results.
- [Building an Intelligent QA System with NLP and Milvus - Zilliz blog](https://zilliz.com/blog/building-intelligent-chatbot-with-nlp-and-milvus): The Next-Gen QA Bot is here
- [The guide to rerank-english-v2.0 | Cohere](https://zilliz.com/ai-models/rerank-english-v2.0): rerank-english-v2.0: a reranking model for English language documents with a context length of 512 tokens.