# Llama-2 Chat
*author: Jael*
<br />
## Description
An LLM operator generates an answer to a prompt, passed in as messages, using a large language model or service.
This operator uses a pretrained [Llama-2](https://ai.meta.com/llama) model to generate the response.
By default, it downloads the model file from [HuggingFace](https://huggingface.co/TheBloke)
and runs the model with [llama.cpp](https://github.com/ggerganov/llama.cpp).
The operator installs llama-cpp automatically and uses it to run the model.
If the automatic installation fails in your environment, refer to [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) for manual installation instructions.
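
If the automatic setup does not work, the steps the operator automates can be approximated by hand. The following is a minimal sketch, not the operator's actual implementation: it assumes `huggingface_hub` and `llama-cpp-python` are installed, and the helper names `download_model`, `extract_text`, and `generate` are illustrative only. Depending on the installed llama-cpp-python version, GGML `.bin` files may not load, since newer releases expect GGUF files.

```python
def download_model(hf_id: str, filename: str) -> str:
    """Download one model file from the Hugging Face Hub and return its local path."""
    from huggingface_hub import hf_hub_download  # lazy import: optional dependency
    return hf_hub_download(repo_id=hf_id, filename=filename)


def extract_text(response: dict) -> str:
    """Pull the generated text out of a llama-cpp-python completion response."""
    return response["choices"][0]["text"].strip()


def generate(model_path: str, prompt: str, **kwargs) -> str:
    """Load a model file with llama-cpp-python and complete the prompt."""
    from llama_cpp import Llama  # lazy import: optional dependency
    llm = Llama(model_path=model_path)
    # kwargs are forwarded to the completion call, e.g. max_tokens or temperature.
    return extract_text(llm(prompt, **kwargs))


# Example usage (downloads a multi-gigabyte file, so it is left commented out):
# path = download_model("TheBloke/Llama-2-13B-chat-GGML",
#                       "llama-2-13b-chat.ggmlv3.q4_0.bin")
# print(generate(path, "Building a website can be done in 10 simple steps:",
#                max_tokens=256))
```

The imports sit inside the functions so that the module can be loaded even when one of the optional packages is missing.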
<br />
## Code Example
Continue a conversation from the given messages, using either a local model file or a supported model name.
*Use the operator directly:*
```python
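from towhee import ops

# Load a local model file; extra keyword arguments such as max_tokens are model parameters.
chat = ops.LLM.Llama_2('path/to/model_file.bin', max_tokens=2048)
# Each message is a dictionary with "system", "question", or "answer" keys.
message = [{"question": "Building a website can be done in 10 simple steps:"}]
answer = chat(message)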
```
*Write a pipeline with explicitly named inputs and outputs:*
```python
from towhee import pipe, ops
p = (
    pipe.input('question', 'docs', 'history')
        .map(('question', 'docs', 'history'), 'prompt', ops.prompt.question_answer())
        .map('prompt', 'answer', ops.LLM.Llama_2('llama-2-7b-chat', stop='</s>'))
        .output('answer')
)

history = [('Who won the world series in 2020?', 'The Los Angeles Dodgers won the World Series in 2020.')]
question = 'Where was it played?'
answer = p(question, [], history).get()[0]
```
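
The pipeline above takes `history` as a list of (question, answer) tuples, so a multi-turn chat only needs to append each new pair after every call. The following is a minimal sketch of that bookkeeping; `chat_turn` and `ask` are hypothetical names, and `ask` stands for any callable that accepts `(question, docs, history)` and returns the answer string.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def chat_turn(ask: Callable[[str, list, History], str],
              question: str,
              history: History) -> str:
    """Ask one question and record the (question, answer) pair in history."""
    answer = ask(question, [], history)  # no retrieved docs in this sketch
    history.append((question, answer))
    return answer


# With the pipeline `p` defined above, `ask` could be (assumption):
# ask = lambda question, docs, history: p(question, docs, history).get()[0]
```

Keeping the history outside the pipeline leaves the caller free to truncate or persist it between turns.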
<br />
## Factory Constructor
Create the operator via the following factory method:
***LLM.Llama_2(model_name_or_file: str = 'llama-2-7b-chat', \*\*kwargs)***
**Parameters:**
***model_name_or_file***: *str*
The model name or the path to a model file, as a string. Defaults to 'llama-2-7b-chat'.
If the model name is in `supported_model_names`, the operator downloads the corresponding model file from HuggingFace models.
You can also pass the local path of any model file that can be run by llama-cpp-python.
***\*\*kwargs***
Other model parameters, such as `temperature` and `max_tokens`.
<br />
## Interface
The operator takes a list of messages as input.
It returns the generated answer as a string.
***\_\_call\_\_(messages)***
**Parameters:**
***messages***: *list*
A list of messages to set up the chat.
Must be a list of dictionaries whose keys are taken from "system", "question", and "answer". For example, `[{"question": "a past question?", "answer": "a past answer."}, {"question": "current question?"}]`
**Returns**:
*answer: str*
The generated answer.
<br />
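
The message format described above can be assembled from a plain question/answer history. The following is a minimal sketch with hypothetical helpers `build_messages` and `validate_messages`; placing the system prompt in its own leading dictionary is an assumption, not something this README specifies.

```python
from typing import Dict, List, Optional, Tuple

ALLOWED_KEYS = {"system", "question", "answer"}


def build_messages(question: str,
                   history: Optional[List[Tuple[str, str]]] = None,
                   system: Optional[str] = None) -> List[Dict[str, str]]:
    """Turn (question, answer) history plus a new question into a messages list."""
    messages: List[Dict[str, str]] = []
    if system:
        messages.append({"system": system})  # assumption: system prompt goes first
    for past_question, past_answer in history or []:
        messages.append({"question": past_question, "answer": past_answer})
    messages.append({"question": question})  # the current, unanswered question
    return messages


def validate_messages(messages: object) -> None:
    """Raise if messages is not a list of dicts using only the allowed keys."""
    if not isinstance(messages, list):
        raise TypeError("messages must be a list of dictionaries")
    for index, message in enumerate(messages):
        if not isinstance(message, dict):
            raise TypeError(f"message {index} is not a dictionary")
        unknown = set(message) - ALLOWED_KEYS
        if unknown:
            raise ValueError(f"message {index} has unsupported keys: {sorted(unknown)}")
```

Validating before calling the operator surfaces malformed input early, with an error that names the offending message.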
***supported_model_names()***
**Returns**:
A dictionary of supported models, with the model name as key and the HuggingFace Hub ID and model filename as value:
{
'llama-2-7b-chat': {
'hf_id': 'TheBloke/Llama-2-7B-GGML',
'filename': 'llama-2-7b.ggmlv3.q4_0.bin'
},
'llama-2-13b-chat': {
'hf_id': 'TheBloke/Llama-2-13B-chat-GGML',
'filename': 'llama-2-13b-chat.ggmlv3.q4_0.bin'
}
}
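
The lookup described for `model_name_or_file` can be pictured as follows. This is a sketch under stated assumptions, not the operator's source: `SUPPORTED_MODELS` copies the dictionary above, and `resolve_model` is a hypothetical helper that tries supported names first and local files second.

```python
import os

# Copied from the supported_model_names() output shown above.
SUPPORTED_MODELS = {
    'llama-2-7b-chat': {
        'hf_id': 'TheBloke/Llama-2-7B-GGML',
        'filename': 'llama-2-7b.ggmlv3.q4_0.bin'
    },
    'llama-2-13b-chat': {
        'hf_id': 'TheBloke/Llama-2-13B-chat-GGML',
        'filename': 'llama-2-13b-chat.ggmlv3.q4_0.bin'
    }
}


def resolve_model(model_name_or_file: str = 'llama-2-7b-chat') -> dict:
    """Classify the argument as a supported hub model or a local model file."""
    if model_name_or_file in SUPPORTED_MODELS:
        # Known name: the file would be fetched from the HuggingFace Hub.
        return {'source': 'hub', **SUPPORTED_MODELS[model_name_or_file]}
    if os.path.isfile(model_name_or_file):
        # Otherwise treat the argument as a path to an existing model file.
        return {'source': 'local', 'path': model_name_or_file}
    raise ValueError(
        f"{model_name_or_file!r} is neither a supported model name "
        f"({', '.join(SUPPORTED_MODELS)}) nor an existing file"
    )
```

Checking names before paths means a local file that happens to share a supported model's name is never picked up by accident.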