# eqa-search
# Enhanced QA Search
## Description
**Enhanced question answering (QA)** builds a knowledge base and generates answers with large language models (LLMs), which helps prevent hallucinations. The workflow involves inserting data into the knowledge base and then querying it with questions; **eqa-search** is the pipeline used to query questions against the knowledge base.
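The knowledge base is typically built first and then queried with **eqa-search**. Below is a minimal sketch of the insert step, assuming the companion `eqa-insert` pipeline accepts a document URL or file path and shares the same Milvus collection (and embedding model) as the search pipeline; check that pipeline's documentation for the exact input format.
```python
from towhee import AutoPipes, AutoConfig

# Sketch: build the knowledge base with the companion 'eqa-insert' pipeline.
# Assumption: it accepts a document URL or file path and should point at the
# same Milvus collection (and use the same embedding model) as 'eqa-search'.
insert_config = AutoConfig.load_config('eqa-insert')
insert_config.host = '127.0.0.1'
insert_config.port = '19530'
insert_config.collection_name = 'chatbot'

insert_pipe = AutoPipes.pipeline('eqa-insert', config=insert_config)
insert_pipe('https://github.com/towhee-io/towhee/blob/main/README.md')  # placeholder document source
```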
## Code Example
### **Create pipeline and set the configuration**
> For more parameters, refer to the Enhanced QA Search Config section below.
```python
from towhee import AutoPipes, AutoConfig
config = AutoConfig.load_config('eqa-search')
config.host = '127.0.0.1'
config.port = '19530'
config.collection_name = 'chatbot'
config.top_k = 5
# If using Zilliz Cloud, set the credentials
# config.user = 'your-zilliz-cloud-username'
# config.password = 'your-zilliz-cloud-password'
# OpenAI API key (replace the placeholder with your own key)
config.openai_api_key = 'your-openai-api-key'
# Embedding model
config.embedding_model = 'all-MiniLM-L6-v2'
# Embedding model device
config.embedding_device = -1
# Rerank the docs searched from knowledge base
config.rerank = True
# The llm model source, openai or dolly
config.llm_src = 'openai'
# The openai model name
config.openai_model = 'gpt-3.5-turbo'
# The dolly model name
# config.dolly_model = 'databricks/dolly-v2-12b'
p = AutoPipes.pipeline('eqa-search', config=config)
# The second argument is the chat history (an empty list for the first turn)
res = p('What is Towhee?', [])
```
## Enhanced QA Search Config
### Configuration for Sentence Embedding
***model (str):***
The model name in the sentence embedding pipeline, defaults to `'all-MiniLM-L6-v2'`.
You can refer to the [Sentence Embedding model list](https://towhee.io/tasks/detail/operator?field_name=Natural-Language-Processing&task_name=Sentence-Embedding) to set the model. Some of these models are from [HuggingFace](https://huggingface.co/) (open source), and some are from [OpenAI](https://openai.com/) (not open source, requires an API key).
***openai_api_key (str):***
The OpenAI API key, defaults to `None`.
This key is required if the model is from OpenAI; you can check the model provider in the [Model(s) list](https://towhee.io/sentence-embedding/openai).
***embedding_device (int):***
The device number, defaults to `-1`, which means the CPU is used.
If it is set to a value other than `-1`, the GPU with that device ID is used.
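A minimal sketch of the embedding settings, using the attribute names from the code example above; the OpenAI model name in the commented lines is an assumption, so check the Model(s) list for supported values.
```python
from towhee import AutoConfig

config = AutoConfig.load_config('eqa-search')

# Local open-source sentence-embedding model running on GPU 0 (-1 would mean CPU)
config.embedding_model = 'all-MiniLM-L6-v2'
config.embedding_device = 0

# Or an OpenAI embedding model (not open source, requires an API key); the model
# name here is an assumption, see the Model(s) list for supported names.
# config.embedding_model = 'text-embedding-ada-002'
# config.openai_api_key = 'your-openai-api-key'
```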
### Configuration for [Milvus](https://towhee.io/ann-search/milvus-client)
***host (str):***
Host of the Milvus vector database, defaults to `'127.0.0.1'`.
***port (str):***
Port of the Milvus vector database, defaults to `'19530'`.
***top_k (int):***
The number of nearest search results to return, defaults to `5`.
***collection_name (str):***
The collection name for the Milvus vector database.
***user (str):***
The user name for [Zilliz Cloud](https://zilliz.com/cloud), defaults to `None`.
***password (str):***
The user password for [Zilliz Cloud](https://zilliz.com/cloud), defaults to `None`.
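A minimal sketch of the connection settings; the Zilliz Cloud values are placeholders, so take the actual endpoint and credentials from your Zilliz Cloud console.
```python
from towhee import AutoConfig

config = AutoConfig.load_config('eqa-search')

# Self-hosted Milvus
config.host = '127.0.0.1'
config.port = '19530'
config.collection_name = 'chatbot'
config.top_k = 5

# Zilliz Cloud (placeholders; use the endpoint and credentials from your console)
# config.host = 'your-cluster-endpoint'
# config.port = 'your-cluster-port'
# config.user = 'your-zilliz-cloud-username'
# config.password = 'your-zilliz-cloud-password'
```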
### Configuration for Rerank
***rerank (bool):***
Whether to rerank the documents retrieved from the knowledge base, defaults to `False`. If set to `True`, the [rerank](https://towhee.io/towhee/rerank) operator is used.
***rerank_model (str):***
The name of the rerank model; you can set it according to the [rerank](https://towhee.io/towhee/rerank) operator.
***threshold (Union[float, int]):***
The score threshold for filtering, defaults to `0.6`. If `rerank` is `False`, the threshold is applied directly to the Milvus search results; otherwise the results are filtered by the [rerank](https://towhee.io/towhee/rerank) operator.
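A minimal sketch of the two filtering modes described above; the rerank model name is left as a placeholder, see the [rerank](https://towhee.io/towhee/rerank) operator for supported names.
```python
from towhee import AutoConfig

config = AutoConfig.load_config('eqa-search')

# Mode 1: no rerank; the threshold filters the Milvus search results directly
config.rerank = False
config.threshold = 0.6

# Mode 2: rerank the retrieved documents with the rerank operator and filter
# them by the same threshold
# config.rerank = True
# config.rerank_model = 'your-rerank-model-name'  # placeholder, see the rerank operator
```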
### Configuration for LLM
***llm_src (str):***
The LLM source, `openai` or `dolly`, defaults to `openai`.
***openai_model (str):***
The OpenAI model name, defaults to `gpt-3.5-turbo`.
***dolly_model (str):***
The Dolly model name, defaults to `databricks/dolly-v2-3b`.
***customize_llm (Any):***
A user-customized LLM.
***customize_prompt (Any):***
A user-customized prompt.
***ernie_api_key (str):***
The API key for ERNIE Bot.
***ernie_secret_key (str):***
The secret key for ERNIE Bot.
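A minimal sketch of switching the LLM source; all keys are placeholders, and the ERNIE settings are shown only as placeholders since the matching `llm_src` value is not documented here.
```python
from towhee import AutoConfig

config = AutoConfig.load_config('eqa-search')

# OpenAI (requires an API key)
config.llm_src = 'openai'
config.openai_model = 'gpt-3.5-turbo'
config.openai_api_key = 'your-openai-api-key'

# Dolly (runs locally, no API key needed)
# config.llm_src = 'dolly'
# config.dolly_model = 'databricks/dolly-v2-3b'

# ERNIE Bot credentials (placeholders); check the pipeline docs for the
# corresponding llm_src value before enabling them
# config.ernie_api_key = 'your-ernie-api-key'
# config.ernie_secret_key = 'your-ernie-secret-key'
```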
## Interface
Query a question against the Milvus knowledge base and return the generated answer.
**Parameters:**
- ***question (str):*** The question to query.
- ***history (List[str]):*** The chat history to provide background information.
**Returns:**
- ***Answer (str):*** The answer to the question.
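A hedged usage sketch for multi-turn chat: pass earlier turns back in as `history`. It assumes the pipeline result exposes the answer via `.get()` as in other Towhee pipelines and that `history` is a flat list of previous turns; adjust both if your version differs.
```python
from towhee import AutoPipes, AutoConfig

config = AutoConfig.load_config('eqa-search')
config.collection_name = 'chatbot'
config.openai_api_key = 'your-openai-api-key'
p = AutoPipes.pipeline('eqa-search', config=config)

history = []

question1 = 'What is Towhee?'
answer1 = p(question1, history).get()[0]  # assumption: the answer is the first field of the result
history += [question1, answer1]           # assumption: history is a flat list of previous turns

question2 = 'How do I install it?'
answer2 = p(question2, history).get()[0]
```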
## More Resources
- [Search and Information Retrieval in the Era of Generative AI - Zilliz blog](https://zilliz.com/learn/search-still-matters-enhance-information-retrieval-with-genai-and-vector-databases): Despite advances in LLMs like ChatGPT, search still matters. Combining GenAI with search and vector databases enhances search accuracy and experience.
- [Semantic Search with Milvus and OpenAI - Zilliz blog](https://zilliz.com/learn/semantic-search-with-milvus-and-openai): In this guide, we'll explore semantic search capabilities through the integration of Milvus and OpenAI's Embedding API, using a book title search as an example use case.
- [Enhancing RAG with Knowledge Graphs - Zilliz blog](https://zilliz.com/blog/enhance-rag-with-knowledge-graphs): Knowledge Graphs (KGs) store and link data based on their relationships. KG-enhanced RAG can significantly improve retrieval capabilities and answer quality.
- [Compare Vector Databases, Vector Search Libraries and Plugins - Zilliz blog](https://zilliz.com/learn/comparing-vector-database-vector-search-library-and-vector-search-plugin): Deep diving into better understanding vector databases and comparing them to vector search libraries and vector search plugins.
- [Metrics-Driven Development of RAGs - Zilliz blog](https://zilliz.com/blog/metrics-driven-development-of-rags): Evaluating and improving Retrieval-Augmented Generation (RAG) systems is a nuanced but essential task in the realm of AI-driven information retrieval. By leveraging a metrics-driven approach, as demonstrated by Jithin James and Shahul Es, you can systematically refine your RAG systems to ensure they deliver accurate, relevant, and trustworthy information.
- [What Is Semantic Search?](https://zilliz.com/glossary/semantic-search): Semantic search is a search technique that uses natural language processing (NLP) and machine learning (ML) to understand the context and meaning behind a user's search query.
- [Similarity Metrics for Vector Search - Zilliz blog](https://zilliz.com/blog/similarity-metrics-for-vector-search): Exploring five similarity metrics for vector search: L2 or Euclidean distance, cosine distance, inner product, and hamming distance.