# eqa-search

# Enhanced QA Search

## Description

**Enhanced question answering** builds a knowledge base and generates answers with LLMs (large language models) grounded in that knowledge base, which helps prevent hallucinations. The workflow has two parts: inserting data into the knowledge base and querying it with questions. **eqa-search** handles the query side, retrieving relevant documents from the knowledge base and answering questions against them.
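The insertion side is handled by a separate pipeline. As a minimal sketch, assuming the companion `eqa-insert` built-in pipeline and that its configuration fields mirror the Milvus and embedding settings used by `eqa-search`, building the knowledge base before querying could look like this:

```python
from towhee import AutoPipes, AutoConfig

# Sketch only: assumes the companion 'eqa-insert' pipeline and a running Milvus instance,
# and that the config field names mirror those documented for 'eqa-search'.
insert_config = AutoConfig.load_config('eqa-insert')
insert_config.host = '127.0.0.1'
insert_config.port = '19530'
insert_config.collection_name = 'chatbot'           # must match the collection queried by eqa-search
insert_config.embedding_model = 'all-MiniLM-L6-v2'  # must match the embedding model used at query time

insert_pipe = AutoPipes.pipeline('eqa-insert', config=insert_config)
# Insert a document (e.g. a URL or local file path) into the knowledge base.
insert_pipe('https://github.com/towhee-io/towhee/blob/main/README.md')
```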
## Code Example

### **Create pipeline and set the configuration**

> For more parameters, refer to the Configuration section below.

```python
from towhee import AutoPipes, AutoConfig

config = AutoConfig.load_config('eqa-search')
config.host = '127.0.0.1'
config.port = '19530'
config.collection_name = 'chatbot'
config.top_k = 5

# If using Zilliz Cloud
# config.user = [zilliz-cloud-username]
# config.password = [zilliz-cloud-password]

# OpenAI API key
config.openai_api_key = [your-openai-api-key]
# Embedding model
config.embedding_model = 'all-MiniLM-L6-v2'
# Embedding model device
config.embedding_device = -1

# Rerank the docs searched from the knowledge base
config.rerank = True

# The LLM source, openai or dolly
config.llm_src = 'openai'
# The OpenAI model name
config.openai_model = 'gpt-3.5-turbo'
# The Dolly model name
# config.dolly_model = 'databricks/dolly-v2-12b'

p = AutoPipes.pipeline('eqa-search', config=config)
res = p('What is towhee?', [])
```
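The second positional argument is the chat history (see the Interface section below). As a hedged sketch, a follow-up question that reuses the pipeline created above and passes the earlier turn as history might look like this; the exact history entry format is an assumption and should be checked against the pipeline's prompt operator:

```python
# Sketch only: reuse the pipeline 'p' created above for a multi-turn conversation.
res = p('What is towhee?', [])
answer = res.get()[0]  # assumption: the returned DataQueue holds the answer as its first element

# Pass earlier turns as chat history so the LLM has background context.
# The (question, answer) pair format is an assumption here.
history = [('What is towhee?', answer)]
follow_up = p('How do I install towhee?', history)
```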
## Enhanced QA Search Config

### Configuration for Sentence Embedding

***embedding_model (str):*** The model name used in the sentence embedding pipeline, defaults to `'all-MiniLM-L6-v2'`. You can refer to the [Model(s) list](https://towhee.io/tasks/detail/operator?field_name=Natural-Language-Processing&task_name=Sentence-Embedding) to set the model; some of these models are from [HuggingFace](https://huggingface.co/) (open source), and some are from [OpenAI](https://openai.com/) (not open source, API key required).

***openai_api_key (str):*** The OpenAI API key, defaults to `None`. This key is required if the model is from OpenAI; you can check the model provider in the [Model(s) list](https://towhee.io/sentence-embedding/openai).

***embedding_device (int):*** The device number, defaults to `-1`, which means using the CPU. If set to a value other than `-1`, the specified GPU device will be used.

### Configuration for [Milvus](https://towhee.io/ann-search/milvus-client)

***host (str):*** Host of the Milvus vector database, defaults to `'127.0.0.1'`.

***port (str):*** Port of the Milvus vector database, defaults to `'19530'`.

***top_k (int):*** The number of nearest search results, defaults to `5`.

***collection_name (str):*** The collection name for the Milvus vector database.

***user (str):*** The user name for [Zilliz Cloud](https://zilliz.com/cloud), defaults to `None`.

***password (str):*** The user password for [Zilliz Cloud](https://zilliz.com/cloud), defaults to `None`.

### Configuration for Rerank

***rerank (bool):*** Whether to rerank the docs retrieved from the knowledge base, defaults to `False`. If set to `True`, the [rerank](https://towhee.io/towhee/rerank) operator is used.

***rerank_model (str):*** The name of the rerank model; you can set it according to the [rerank](https://towhee.io/towhee/rerank) operator.

***threshold (Union[float, int]):*** The threshold for filtering, defaults to `0.6`. If `rerank` is `False`, it is used to filter the Milvus search results directly; otherwise the filtering is done by the [rerank](https://towhee.io/towhee/rerank) operator.

### Configuration for LLM

***llm_src (str):*** The LLM source, `openai` or `dolly`, defaults to `openai`.

***openai_model (str):*** The OpenAI model name, defaults to `gpt-3.5-turbo`.

***dolly_model (str):*** The Dolly model name, defaults to `databricks/dolly-v2-3b`.

***customize_llm (Any):*** A user-customized LLM.

***customize_prompt (Any):*** A user-customized prompt.

***ernie_api_key (str):*** The API key for ERNIE Bot.

***ernie_secret_key (str):*** The secret key for ERNIE Bot.
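For example, if you prefer not to use an OpenAI key for answer generation, the documented `llm_src` and `dolly_model` options suggest a configuration along these lines. This is a minimal sketch using only the config fields listed above; running Dolly locally assumes the model weights are downloaded from HuggingFace and that enough memory is available:

```python
from towhee import AutoPipes, AutoConfig

# Sketch only: switch the answer-generating LLM from OpenAI to Dolly
# using the documented llm_src / dolly_model config fields.
config = AutoConfig.load_config('eqa-search')
config.host = '127.0.0.1'
config.port = '19530'
config.collection_name = 'chatbot'

config.llm_src = 'dolly'                       # use a local Dolly model instead of the OpenAI API
config.dolly_model = 'databricks/dolly-v2-3b'  # documented default; larger variants need more memory
config.embedding_model = 'all-MiniLM-L6-v2'
config.embedding_device = -1                   # -1 = CPU; set a GPU index to run the embedding model on GPU

p = AutoPipes.pipeline('eqa-search', config=config)
res = p('What is towhee?', [])
```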
## Interface

Query a question from the Milvus knowledge base.

**Parameters:**

- ***question (str):*** The question to query.
- ***history (List[str]):*** The chat history to provide background information.

**Returns:**

- ***Answer (str):*** The answer to the question.

# More Resources

- [Search and Information Retrieval in the Era of Generative AI - Zilliz blog](https://zilliz.com/learn/search-still-matters-enhance-information-retrieval-with-genai-and-vector-databases): Despite advances in LLMs like ChatGPT, search still matters. Combining GenAI with search and vector databases enhances search accuracy and experience.
- [Semantic Search with Milvus and OpenAI - Zilliz blog](https://zilliz.com/learn/semantic-search-with-milvus-and-openai): In this guide, we’ll explore semantic search capabilities through the integration of Milvus and OpenAI’s Embedding API, using a book title search as an example use case.
- [Enhancing RAG with Knowledge Graphs - Zilliz blog](https://zilliz.com/blog/enhance-rag-with-knowledge-graphs): Knowledge Graphs (KGs) store and link data based on their relationships. KG-enhanced RAG can significantly improve retrieval capabilities and answer quality.
- [Compare Vector Databases, Vector Search Libraries and Plugins - Zilliz blog](https://zilliz.com/learn/comparing-vector-database-vector-search-library-and-vector-search-plugin): A deep dive into vector databases and how they compare to vector search libraries and vector search plugins.
- [Metrics-Driven Development of RAGs - Zilliz blog](https://zilliz.com/blog/metrics-driven-development-of-rags): Evaluating and improving Retrieval-Augmented Generation (RAG) systems is a nuanced but essential task in AI-driven information retrieval. By leveraging a metrics-driven approach, as demonstrated by Jithin James and Shahul Es, you can systematically refine your RAG systems to ensure they deliver accurate, relevant, and trustworthy information.
- [What Is Semantic Search?](https://zilliz.com/glossary/semantic-search): Semantic search is a search technique that uses natural language processing (NLP) and machine learning (ML) to understand the context and meaning behind a user's search query.
- [Similarity Metrics for Vector Search - Zilliz blog](https://zilliz.com/blog/similarity-metrics-for-vector-search): Exploring similarity metrics for vector search: L2 or Euclidean distance, cosine distance, inner product, and Hamming distance.