Readme
Files and versions
Updated 2 months ago
LLM
Llama-2 Chat
author: Jael
Description
A LLM operator generates answer given prompt in messages using a large language model or service. This operator uses a pretrained Llama-2 to generate response. By default, it will download the model file from HuggingFace and then run the model with Llama-cpp.
This operator will automatically install and run model with llama-cpp. If the automatic installation fails in your environment, please refer to llama-cpp-python for instructions of manual installation.
Code Example
Use the default model to continue the conversation from given messages.
Use operator:
from towhee import ops
chat = ops.LLM.Llama_2('llama-2-13b-chat', n_ctx=4096, max_tokens=200)
message = [
{'system': 'You are a very helpful assistant.'},
{'question': 'Who won the world series in 2020?', 'answer': 'The Los Angeles Dodgers won the World Series in 2020.'},
{'question': 'Where was it played?'}
]
answer = chat(message)
Write a pipeline with explicit inputs/outputs name specifications:
from towhee import pipe, ops
p = (
pipe.input('question', 'docs', 'history')
.map(('question', 'docs', 'history'), 'prompt', ops.prompt.question_answer())
.map('prompt', 'answer', ops.LLM.Llama_2('llama-2-7b-chat'))
.output('answer')
)
history=[('What is Towhee?', 'Towhee is a cutting-edge framework designed to streamline the processing of unstructured data through the use of Large Language Model (LLM) based pipeline orchestration.')]
knowledge = ['You can install towhee via `pip install towhee`.']
question = 'How to install it?'
answer = p(question, knowledge, history).get()[0]
Factory Constructor
Create the operator via the following factory method:
LLM.Llama_2(model_name_or_file: str)
Parameters:
model_name_or_file: str
The model name or path to the model file in string, defaults to 'llama-2-7b-chat'.
If model name is in supported_model_names
, it will download corresponding model file from HuggingFace models.
You can also use the local path of a model file, which can be ran by llama-cpp-python.
**kwargs
Other model parameters such as temperature, max_tokens.
Interface
The operator takes a piece of text in string as input. It returns answer in json.
__call__(txt)
Parameters:
messages: list
A list of messages to set up chat. Must be a list of dictionaries with key value from "system", "question", "answer". For example, [{"question": "a past question?", "answer": "a past answer."}, {"question": "current question?"}]
Returns:
answer: str
The answer generated.
supported_model_names()
Returns:
A dictionary of supported models with model name as key and huggingface hub id & model filename as value.
{
'llama-2-7b-chat': {
'hf_id': 'TheBloke/Llama-2-7B-Chat-GGML',
'filename': 'llama-2-7b-chat.ggmlv3.q4_0.bin'
},
'llama-2-13b-chat': {
'hf_id': 'TheBloke/Llama-2-13B-chat-GGML',
'filename': 'llama-2-13b-chat.ggmlv3.q4_0.bin'
}
}
More Resources
- Local RAG Setup with Llama 3, Ollama, Milvus & LangChain - Zilliz blog: A Beginner's Guide to Using Llama 3 with Ollama, Milvus, and Langchain
- LLama2 vs ChatGPT: How They Perform in Question Answering - Zilliz blog: What is Llama 2, and how does it perform in question answering compared to ChatGPT?
- Boost your LLM with Private Data Using LlamaIndex | Zilliz Webinar: Zilliz webinar covering how to boost your LLM with private data with LlamaIndex to generate accurate and meaningful responses that reflect unique data inputs.
- Chat with Towards Data Science Using LlamaIndex - Zilliz blog: In this second post of the four-part Chat Towards Data Science blog series, we show why LlamaIndex is the leading open source data retrieval framework.
- What is Llama 2?: Learn all about Llama 2, get how to create vector embeddings, and more.
Jael Gu
43ee51e95e
| 12 Commits | ||
---|---|---|---|
.gitattributes |
1.1 KiB
|
1 year ago | |
README.md |
4.5 KiB
|
2 months ago | |
__init__.py |
98 B
|
1 year ago | |
llama2.py |
3.2 KiB
|
1 year ago | |
requirements.txt |
33 B
|
1 year ago |