From d79b451d91a67e0cdd1b03db89e9bc522b7e31dd Mon Sep 17 00:00:00 2001
From: Jael Gu
Date: Wed, 8 Feb 2023 15:39:51 +0800
Subject: [PATCH] Update

Signed-off-by: Jael Gu
---
 README.md | 86 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 85 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 39de539..4ff78c4 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,86 @@
-# openai
+# Sentence Embedding with OpenAI
+
+*author: Junjie, Jael*
+
+
+
+## Description
+
+A sentence embedding operator generates one embedding vector, as a numpy.ndarray, for each input text.
+The embedding represents the semantic information of the whole input text as a single vector.
+This operator is implemented with embedding models from [OpenAI](https://platform.openai.com/docs/guides/embeddings).
+Note that you need an [OpenAI API key](https://platform.openai.com/account/api-keys) to access the OpenAI API.
+
+
+
+## Code Example
+
+Use the pre-trained model 'text-embedding-ada-002' to generate an embedding for the sentence "Hello, world.".
+
+*Write a pipeline with explicit input and output names:*
+
+```python
+from towhee.dc2 import pipe, ops, DataCollection
+
+OPENAI_API_KEY = 'your-openai-api-key'  # placeholder: replace with your own key
+p = (
+    pipe.input('text')
+        .map('text', 'vec',
+             ops.sentence_embedding.openai(model_name='text-embedding-ada-002', api_key=OPENAI_API_KEY))
+        .output('text', 'vec')
+)
+
+DataCollection(p('Hello, world.')).show()
+```
+
+
+
+## Factory Constructor
+
+Create the operator via the following factory method:
+
+***sentence_embedding.openai(model_name='text-embedding-ada-002', api_key=None)***
+
+**Parameters:**
+
+***model_name***: *str*
+
+The model name as a string; defaults to 'text-embedding-ada-002'. Supported model names:
+- text-embedding-ada-002
+- text-similarity-davinci-001
+- text-similarity-curie-001
+- text-similarity-babbage-001
+- text-similarity-ada-001
+
+***api_key***: *str=None*
+
+The OpenAI API key as a string; defaults to None.
+
+
+
+## Interface
+
+The operator takes a piece of text as a string input.
+It returns the text embedding as a numpy.ndarray.
+
+***\_\_call\_\_(txt)***
+
+**Parameters:**
+
+***txt***: *str*
+
+The input text as a string.
+
+**Returns:**
+
+*numpy.ndarray or list*
+
+The text embedding extracted by the model.
+
+
+
+***supported_model_names()***
+
+Get a list of supported model names.
+
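
A list of supported model names makes it possible to reject a mistyped name before any request is sent to OpenAI. The sketch below is only an illustration of that idea, not the operator's actual implementation: the helper `check_model_name` and the constant `SUPPORTED_MODEL_NAMES` are hypothetical, and the list is copied from the Factory Constructor section rather than obtained from `supported_model_names()`.

```python
from typing import List

# Model names copied from the "Factory Constructor" section of this README.
# Assumption: the real operator may report a different list at runtime.
SUPPORTED_MODEL_NAMES: List[str] = [
    'text-embedding-ada-002',
    'text-similarity-davinci-001',
    'text-similarity-curie-001',
    'text-similarity-babbage-001',
    'text-similarity-ada-001',
]


def check_model_name(model_name: str = 'text-embedding-ada-002') -> str:
    """Return model_name unchanged if it is documented as supported.

    Hypothetical helper for illustration; it is not part of the operator.
    """
    if model_name not in SUPPORTED_MODEL_NAMES:
        raise ValueError(
            f'Unsupported model name {model_name!r}; '
            f'expected one of {SUPPORTED_MODEL_NAMES}'
        )
    return model_name


# The documented default passes the check and is returned as-is.
print(check_model_name())
```

In practice the validated name would then be passed as `model_name` to the factory method, so a typo fails locally with a readable error instead of surfacing as an API failure.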