# eqa-search
# Enhanced QA Search
## Description
**Enhanced question answering** is the process of building a knowledge base and generating answers from it with an LLM (large language model), which helps prevent hallucinations. It involves two steps: inserting data into the knowledge base and querying it with questions. **eqa-search** handles the second step, querying the knowledge base.
<br />
## Code Example
### **Create pipeline and set the configuration**
> For more parameters, refer to the Enhanced QA Search Config section.
```python
from towhee import AutoPipes, AutoConfig
config = AutoConfig.load_config('eqa-search')
config.host = '127.0.0.1'
config.port = '19530'
config.collection_name = 'chatbot'
config.top_k = 5
# If using Zilliz Cloud, replace the placeholders with your credentials
config.user = '<zilliz-cloud-username>'
config.password = '<zilliz-cloud-password>'
# OpenAI API key (replace the placeholder with your key)
config.openai_api_key = '<your-openai-api-key>'
# Embedding model
config.embedding_model = 'all-MiniLM-L6-v2'
# Embedding model device
config.embedding_device = -1
# The threshold to filter milvus search result
config.threshold = 0.5
# The llm model source, openai or dolly
config.llm_src = 'openai'
# The openai model name
config.openai_model = 'gpt-3.5-turbo'
# The dolly model name
# config.dolly_model = 'databricks/dolly-v2-12b'
p = AutoPipes.pipeline('eqa-search', config=config)
res = p('What is towhee?', [])
```
<br />
## Enhanced QA Search Config
### Configuration for Sentence Embedding
***model (str):***
The model name in the sentence embedding pipeline, defaults to `'all-MiniLM-L6-v2'`.
Refer to the [Model(s) list](https://towhee.io/tasks/detail/operator?field_name=Natural-Language-Processing&task_name=Sentence-Embedding) to choose a model. Some of these models are from [HuggingFace](https://huggingface.co/) (open source), and some are from [OpenAI](https://openai.com/) (not open source, API key required).
***openai_api_key (str):***
The OpenAI API key, defaults to `None`.
This key is required if the model is from OpenAI. You can check the model provider in the [Model(s) list](https://towhee.io/sentence-embedding/openai).
***embedding_device (int):***
The ID of the device to run the embedding model on, defaults to `-1`, which means using the CPU.
If set to a value other than `-1`, the GPU device with that ID is used.
### Configuration for [Milvus](https://towhee.io/ann-search/milvus-client)
***host (str):***
Host of the Milvus vector database, defaults to `'127.0.0.1'`.
***port (str):***
Port of the Milvus vector database, defaults to `'19530'`.
***top_k (int):***
The number of nearest search results to return, defaults to `5`.
***collection_name (str):***
The name of the collection in the Milvus vector database.
***user (str):***
The username for [Zilliz Cloud](https://zilliz.com/cloud) users, defaults to `None`.
***password (str):***
The password for [Zilliz Cloud](https://zilliz.com/cloud) users, defaults to `None`.
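
The Milvus fields above can be filled from the environment so that the same script works for a local Milvus and for Zilliz Cloud. The following is a minimal sketch, not part of the pipeline: the helper `apply_milvus_settings` and the `MILVUS_*` variable names are hypothetical, the fallback collection name `'chatbot'` is borrowed from the code example, and `SimpleNamespace` stands in for the object returned by `AutoConfig.load_config('eqa-search')`.

```python
import os
from types import SimpleNamespace


def apply_milvus_settings(config, env=os.environ):
    """Copy Milvus connection settings from environment variables onto config.

    The MILVUS_* variable names are hypothetical, chosen for this sketch.
    """
    config.host = env.get('MILVUS_HOST', '127.0.0.1')
    config.port = env.get('MILVUS_PORT', '19530')  # kept as a string, as documented
    # 'chatbot' is the collection name used in the code example, not a documented default.
    config.collection_name = env.get('MILVUS_COLLECTION', 'chatbot')
    # user and password stay None unless set, matching the documented defaults.
    config.user = env.get('MILVUS_USER')
    config.password = env.get('MILVUS_PASSWORD')
    return config


# SimpleNamespace is a stand-in for the real config object.
local = apply_milvus_settings(SimpleNamespace(), env={})
cloud = apply_milvus_settings(
    SimpleNamespace(),
    env={'MILVUS_HOST': 'example-host', 'MILVUS_USER': 'demo-user', 'MILVUS_PASSWORD': 'demo-password'},
)
print(local.host, local.port, local.user)
print(cloud.host, cloud.port, cloud.user)
```

With the real config object, the same helper would be called before `AutoPipes.pipeline('eqa-search', config=config)`.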
### Configuration for Similarity Evaluation
***threshold (Union[float, int]):***
The threshold used to filter the Milvus search results.
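
To illustrate what threshold filtering does, here is a minimal sketch in plain Python. It assumes that each search hit carries a similarity score where higher means more similar, and that hits scoring below the threshold are dropped. The direction of the comparison inside the pipeline is not documented here, so treat it as an assumption. The function `filter_by_threshold` and the sample hits are hypothetical.

```python
from typing import List, Tuple


def filter_by_threshold(hits: List[Tuple[str, float]], threshold: float) -> List[Tuple[str, float]]:
    """Keep only hits whose similarity score is at least `threshold`.

    Assumes a higher score means a closer match (an assumption of this sketch).
    """
    return [(text, score) for text, score in hits if score >= threshold]


# Hypothetical search results: (document text, similarity score).
hits = [
    ("document A", 0.82),
    ("document B", 0.55),
    ("document C", 0.21),
]

kept = filter_by_threshold(hits, threshold=0.5)
for text, score in kept:
    print(f"{score:.2f}  {text}")
```

A higher threshold passes fewer and more relevant documents on as context, at the risk of passing none at all.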
### Configuration for LLM
***llm_src (str):***
The LLM source, either `openai` or `dolly`; defaults to `openai`.
***openai_model (str):***
The OpenAI model name, defaults to `gpt-3.5-turbo`.
***dolly_model (str):***
The Dolly model name, defaults to `databricks/dolly-v2-3b`.
***customize_llm (Any):***
A custom LLM supplied by the user.
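
Because `llm_src` decides which model-name field is relevant, it can help to set both together. The following is a minimal sketch: the helper `select_llm` is hypothetical, it only uses the field names listed above, and `SimpleNamespace` stands in for the object returned by `AutoConfig.load_config('eqa-search')`.

```python
from types import SimpleNamespace

# Maps each documented LLM source to the config field holding its model name.
MODEL_FIELD = {'openai': 'openai_model', 'dolly': 'dolly_model'}


def select_llm(config, src, model=None):
    """Set llm_src and, optionally, the matching model-name field on config."""
    if src not in MODEL_FIELD:
        raise ValueError(f"llm_src must be one of {sorted(MODEL_FIELD)}, got {src!r}")
    config.llm_src = src
    if model is not None:
        setattr(config, MODEL_FIELD[src], model)
    return config


# SimpleNamespace is a stand-in for the real config object.
config = select_llm(SimpleNamespace(), 'dolly', 'databricks/dolly-v2-3b')
print(config.llm_src, config.dolly_model)
```

Rejecting an unknown source early gives a clearer error than a failure later during pipeline construction.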
<br />
## Interface
Query the Milvus knowledge base with a question.
**Parameters:**
- ***question (str):*** The question to query.
- ***history (List[str]):*** The chat history to provide background information.
**Returns:**
- ***Answer (str):*** The answer to the question.
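
A multi-turn conversation can be sketched by carrying `history` from one call to the next. The example below is an illustration under stated assumptions: `chat` and `fake_ask` are hypothetical, `fake_ask` stands in for a function that wraps the pipeline call and returns the answer as a string, and storing history as alternating question and answer strings is an assumption (the interface only specifies `List[str]`).

```python
from typing import Callable, List, Tuple


def chat(ask: Callable[[str, List[str]], str], questions: List[str]) -> Tuple[List[str], List[str]]:
    """Ask each question in turn, passing the accumulated history along."""
    history: List[str] = []
    answers: List[str] = []
    for question in questions:
        # Pass a copy so the callee cannot mutate our running history.
        answer = ask(question, list(history))
        answers.append(answer)
        # Assumed format: alternating question and answer strings.
        history.extend([question, answer])
    return answers, history


def fake_ask(question: str, history: List[str]) -> str:
    """Stand-in for the real pipeline call; reports how much history it saw."""
    return f"answer to {question!r} (history length: {len(history)})"


answers, history = chat(fake_ask, ["What is towhee?", "How do I install it?"])
for answer in answers:
    print(answer)
```

The first call receives an empty history, as in `p('What is towhee?', [])`, and each later call sees all earlier turns.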