Rerank QA Content

Description

The Rerank operator reorders a list of candidate documents by their relevance to a query. It uses an MS MARCO Cross-Encoder model to compute a relevance score for each document, then reorders the documents by those scores.
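The score-then-reorder idea can be sketched in plain Python. This is only an illustration, not the operator's implementation: `toy_score` is a hypothetical word-overlap scorer standing in for the cross-encoder model, `rerank_by_score` is a made-up helper name, and sorting from highest to lowest score is an assumption.

```python
from typing import Callable, List, Tuple


def _words(text: str) -> set:
    """Lowercase the text and strip basic punctuation from each word."""
    return {w.strip('?.,!').lower() for w in text.split()}


def toy_score(query: str, doc: str) -> float:
    """Hypothetical stand-in for the cross-encoder: share of query words found in the doc."""
    q = _words(query)
    return len(q & _words(doc)) / len(q) if q else 0.0


def rerank_by_score(query: str, docs: List[str],
                    score_fn: Callable[[str, str], float] = toy_score
                    ) -> Tuple[List[str], List[float]]:
    """Score every (query, doc) pair, then order docs from highest to lowest score (assumed)."""
    scores = [score_fn(query, doc) for doc in docs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in order], [scores[i] for i in order]


docs = ['The weather is good', 'I like tea', 'Towhee is a framework for unstructured data.']
ranked_docs, ranked_scores = rerank_by_score('What is Towhee?', docs)
print(ranked_docs[0])
```

A real cross-encoder reads the query and the document together, so its scores reflect meaning rather than shared words; the toy scorer only keeps the sketch self-contained.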


Code Example

  • Run with ops
from towhee import ops

# Create the operator with the default model.
op = ops.rerank()
# Arguments: query, list of documents, score threshold.
res = op('What is Towhee?',
         ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'],
         0)
  • Run a pipeline
from towhee import ops, pipe, DataCollection

p = (pipe.input('query', 'doc', 'threshold')
         # Rerank: replaces 'doc' with the reordered documents and adds their scores.
         .map(('query', 'doc', 'threshold'), ('doc', 'score'), ops.rerank())
         # Flatten the two lists into one (doc, score) row per document.
         .flat_map(('doc', 'score'), ('doc', 'score'), lambda x, y: list(zip(x, y)))
         .output('query', 'doc', 'score')
     )

DataCollection(p('What is Towhee?',
                 ['Towhee is a cutting-edge framework to deal with unstructured data.', 'I do not know about towhee', 'Towhee has many powerful operators.', 'The weather is good'],
                 0)
              ).show()
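To make the flat_map step in the pipeline above easier to follow, here is a plain-Python sketch of what its lambda does to one pair of lists. The helper name `flatten` and the scores are made up for illustration; they are not output from the model.

```python
from typing import List, Tuple


def flatten(docs: List[str], scores: List[float]) -> List[Tuple[str, float]]:
    """Pair each document with its score, giving one (doc, score) row per document."""
    return list(zip(docs, scores))


# Illustrative values only; real scores come from the cross-encoder.
rows = flatten(['Towhee has many powerful operators.', 'The weather is good'], [0.9, 0.1])
for doc, score in rows:
    print(f'{score:.1f}  {doc}')
```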


Factory Constructor

Create the operator via the following factory method:

towhee.rerank(model_name: str = 'cross-encoder/ms-marco-MiniLM-L-12-v2')

Parameters:

model_name: str

The name of the CrossEncoder model. Choose one from the Model List.


Interface

This operator sorts the documents by their relevance to the query and returns them with their scores. An optional threshold filters the results by score.

Parameters:

query: str

The query content.

docs: list

A list of sentences to score for relevance to the query.

threshold: float

The score threshold used for filtering. Defaults to None, i.e., no filtering.


Return: List[str], List[float]

The reranked list of documents and the list of their corresponding scores.
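The threshold behaviour described above can be sketched as a small filter over already-scored documents. This is a hedged illustration: `filter_by_threshold` is a hypothetical helper, the scores are made up, and keeping documents whose score is greater than or equal to the threshold is an assumption (the source does not say whether the comparison is inclusive).

```python
from typing import List, Optional, Tuple


def filter_by_threshold(docs: List[str], scores: List[float],
                        threshold: Optional[float] = None
                        ) -> Tuple[List[str], List[float]]:
    """Return (docs, scores), keeping only pairs whose score meets the threshold."""
    if threshold is None:
        # No threshold: nothing is filtered out.
        return list(docs), list(scores)
    # Assumption: a score equal to the threshold is kept.
    kept = [(d, s) for d, s in zip(docs, scores) if s >= threshold]
    return [d for d, _ in kept], [s for _, s in kept]


# Illustrative, already-sorted scores; not real model output.
docs = ['Towhee has many powerful operators.', 'I do not know about towhee', 'The weather is good']
scores = [0.8, 0.3, 0.01]

kept_docs, kept_scores = filter_by_threshold(docs, scores, threshold=0.2)
all_docs, all_scores = filter_by_threshold(docs, scores)
print(kept_docs)
```

The return shape mirrors the interface: two parallel lists, so the document at index i always corresponds to the score at index i.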
