3 changed files with 87 additions and 3 deletions
			
			
		| @ -1,2 +1,86 @@ | |||||
| # rerank |  | ||||
|  | # Rerank QA Content | ||||
| 
 | 
 | ||||
|  | ## Description | ||||
|  | 
 | ||||
|  | The Rerank operator reorders a list of candidate documents by their relevance to a query. It uses an [MS MARCO Cross-Encoders](https://www.sbert.net/docs/pretrained_cross-encoders.html#ms-marco) model to compute a relevance score for each document and then reorders the documents by those scores. | ||||
|  | 
 | ||||
|  | <br /> | ||||
|  | 
 | ||||
|  | 
 | ||||
|  | 
 | ||||
|  | ## Code Example | ||||
|  | 
 | ||||
|  | - Run with ops | ||||
|  | 
 | ||||
|  | ```python | ||||
|  | from towhee import ops | ||||
|  | 
 | ||||
|  | op = ops.rerank() | ||||
|  | res = op('What is Towhee?', | ||||
|  |          ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'], | ||||
|  |          0) | ||||
|  | ``` | ||||
|  | 
 | ||||
|  | - Run a pipeline | ||||
|  | 
 | ||||
|  | ```python | ||||
|  | from towhee import ops, pipe, DataCollection | ||||
|  | 
 | ||||
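|  | # Rerank the docs for the query, then flatten the result into one (doc, score) row per document. | ||||
|  | p = (pipe.input('query', 'doc', 'threshold') | ||||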
|  |          .map(('query', 'doc', 'threshold'), ('doc', 'score'), ops.rerank()) | ||||
|  |          .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: [(i, j) for i, j in zip(x, y)]) | ||||
|  |          .output('query', 'doc', 'score') | ||||
|  |      ) | ||||
|  | 
 | ||||
|  | DataCollection(p('What is Towhee?', | ||||
|  |                  ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'], | ||||
|  |                  0) | ||||
|  |               ).show() | ||||
|  | ``` | ||||
|  | 
 | ||||
|  | <img src="./result.png" height="100px"/> | ||||
|  | 
 | ||||
|  | <br /> | ||||
|  | 
 | ||||
|  | 
 | ||||
|  | 
 | ||||
|  | ## Factory Constructor | ||||
|  | 
 | ||||
|  | Create the operator via the following factory method: | ||||
|  | 
 | ||||
|  | ***towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')*** | ||||
|  | 
 | ||||
|  | **Parameters:** | ||||
|  | 
 | ||||
|  |    ***model_name***: str | ||||
|  | 
 | ||||
|  |    The name of the CrossEncoder model. Choose one from the [Model List](https://www.sbert.net/docs/pretrained-models/ce-msmarco.html#models-performance). | ||||
|  | 
 | ||||
|  | <br /> | ||||
|  | 
 | ||||
|  | 
 | ||||
|  | 
 | ||||
|  | ## Interface | ||||
|  | 
 | ||||
|  | This operator sorts the documents by their relevance to the query and returns them together with their scores. An optional threshold can be set to filter the results by score. | ||||
|  | 
 | ||||
|  | **Parameters:** | ||||
|  | 
 | ||||
|  |    ***query***: str | ||||
|  | 
 | ||||
|  |    The query content. | ||||
|  | 
 | ||||
|  |    ***docs***: list | ||||
|  | 
 | ||||
|  |    A list of sentences to score for relevance to the query content. | ||||
|  | 
 | ||||
|  |    ***threshold***: float | ||||
|  | 
 | ||||
|  |    The score threshold used for filtering. Defaults to None, i.e., no filtering. | ||||
|  | 
 | ||||
|  | 
 | ||||
|  | <br /> | ||||
|  | 
 | ||||
|  | **Return**: List[str], List[float] | ||||
|  | 
 | ||||
|  | The reranked list of documents and the list of their corresponding scores. | ||||
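
As a rough illustration of the interface described above, the following sketch reorders documents by score and applies an optional threshold. It is not the operator's implementation: `rerank_sketch` and `toy_score` are hypothetical names, the word-overlap scorer merely stands in for the cross-encoder model, and both the descending sort order and the `>=` comparison against the threshold are assumptions.

```python
from typing import Callable, List, Optional, Tuple


def toy_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: fraction of query words found in doc."""
    query_words = set(query.lower().split())
    doc_words = set(doc.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & doc_words) / len(query_words)


def rerank_sketch(
    query: str,
    docs: List[str],
    threshold: Optional[float] = None,
    score_fn: Callable[[str, str], float] = toy_score,
) -> Tuple[List[str], List[float]]:
    """Return the docs sorted by descending score, plus the matching scores."""
    scored = [(doc, score_fn(query, doc)) for doc in docs]
    # Assumption: a threshold keeps only documents scoring at or above it.
    if threshold is not None:
        scored = [(doc, s) for doc, s in scored if s >= threshold]
    # Assumption: higher scores mean more relevant, so sort in descending order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in scored], [s for _, s in scored]


docs = ['The weather is good', 'Towhee is a framework', 'I do not know']
ranked_docs, scores = rerank_sketch('What is Towhee', docs, threshold=0.3)
```

Swapping a real model's scoring function in for `score_fn` would leave the sorting and filtering logic unchanged.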
| After Width: | Height: | Size: 47 KiB | 