# Llama-2 Chat

*author: Jael*

<br />

## Description

A LLM operator generates answer given prompt in messages using a large language model or service.
This operator uses a pretrained [Llama-2](https://ai.meta.com/llama) to generate response.
By default, it will download the model file from [HuggingFace](https://huggingface.co/TheBloke) 
and then run the model with [Llama-cpp](https://github.com/ggerganov/llama.cpp).

This operator will automatically install and run model with llama-cpp.
If the automatic installation fails in your environment, please refer to [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) for instructions of manual installation.

<br />

## Code Example

Use the default model to continue the conversation from given messages.

*Use operator:*

```python
from towhee import ops

chat = ops.LLM.Llama_2('llama-2-13b-chat', n_ctx=4096, max_tokens=200)

message = [
    {'system': 'You are a very helpful assistant.'},
    {'question': 'Who won the world series in 2020?', 'answer': 'The Los Angeles Dodgers won the World Series in 2020.'},
    {'question': 'Where was it played?'}
]
answer = chat(message)
```

*Write a pipeline with explicit inputs/outputs name specifications:*

```python
from towhee import pipe, ops

p = (
    pipe.input('question', 'docs', 'history')
        .map(('question', 'docs', 'history'), 'prompt', ops.prompt.question_answer())
        .map('prompt', 'answer', ops.LLM.Llama_2('llama-2-7b-chat'))
        .output('answer')
)

history=[('What is Towhee?', 'Towhee is a cutting-edge framework designed to streamline the processing of unstructured data through the use of Large Language Model (LLM) based pipeline orchestration.')]
knowledge = ['You can install towhee via `pip install towhee`.']
question = 'How to install it?'
answer = p(question, knowledge, history).get()[0]
```

<br />

## Factory Constructor

Create the operator via the following factory method:

***LLM.Llama_2(model_name_or_file: str)***

**Parameters:**

***model_name_or_file***: *str*

The model name or path to the model file in string, defaults to 'llama-2-7b-chat'.
If model name is in `supported_model_names`, it will download corresponding model file from HuggingFace models.
You can also use the local path of a model file, which can be ran by llama-cpp-python.

***\*\*kwargs***

Other model parameters such as temperature, max_tokens.

<br />

## Interface

The operator takes a piece of text in string as input.
It returns answer in json.

***\_\_call\_\_(txt)***

**Parameters:**

***messages***: *list*

​	A list of messages to set up chat.
Must be a list of dictionaries with key value from "system", "question", "answer". For example, [{"question": "a past question?", "answer": "a past answer."}, {"question": "current question?"}]

**Returns**:

*answer: str*

​	The answer generated.

<br />

***supported_model_names()***

**Returns**:

A dictionary of supported models with model name as key and huggingface hub id & model filename as value.

    {
        'llama-2-7b-chat': {
            'hf_id': 'TheBloke/Llama-2-7B-Chat-GGML',
            'filename': 'llama-2-7b-chat.ggmlv3.q4_0.bin'
            },
        'llama-2-13b-chat': {
            'hf_id': 'TheBloke/Llama-2-13B-chat-GGML',
            'filename': 'llama-2-13b-chat.ggmlv3.q4_0.bin'
        }
    }


# More Resources

- [Local RAG Setup with Llama 3, Ollama, Milvus & LangChain - Zilliz blog](https://zilliz.com/blog/a-beginners-guide-to-using-llama-3-with-ollama-milvus-langchain): A Beginner's Guide to Using Llama 3 with Ollama, Milvus, and Langchain
- [LLama2 vs ChatGPT: How They Perform in Question Answering - Zilliz blog](https://zilliz.com/blog/comparing-meta-ai-Llama2-openai-chatgpt): What is Llama 2, and how does it perform in question answering compared to ChatGPT?
- [Boost your LLM with Private Data Using LlamaIndex | Zilliz Webinar](https://zilliz.com/event/boost-your-llm-with-private-data-using-llamaindex/success): Zilliz webinar covering how to boost your LLM with private data with LlamaIndex to generate accurate and meaningful responses that reflect unique data inputs.
- [Chat with Towards Data Science Using LlamaIndex - Zilliz blog](https://zilliz.com/learn/chat-with-towards-data-science-using-llamaindex): In this second post of the four-part Chat Towards Data Science blog series, we show why LlamaIndex is the leading open source data retrieval framework.
- [What is Llama 2?](https://zilliz.com/glossary/llama2): Learn all about Llama 2, get  how to create vector embeddings, and more.