
Modify the readme.

Signed-off-by: jinlingxu06 <jinling.xu@zilliz.com>
main
jinlingxu06 2 years ago
parent
commit
97050578b9
  1. README.md (25 changed lines)
  2. azure_openai_embedding.py (4 changed lines)

README.md (25 changed lines)

@@ -1,6 +1,6 @@
-# Sentence Embedding with OpenAI
+# Sentence Embedding with Azure OpenAI
-*author: Junjie, Jael*
+*author: David*
 <br />
@@ -9,7 +9,7 @@
 A sentence embedding operator generates one embedding vector in ndarray for each input text.
 The embedding represents the semantic information of the whole input text as one vector.
 This operator is implemented with embedding models from [OpenAI](https://platform.openai.com/docs/guides/embeddings).
-Please note you need an [OpenAI API key](https://platform.openai.com/account/api-keys) to access OpenAI.
+This operator is designed specifically for Azure OpenAI, get more information from [link](https://learn.microsoft.com/en-us/azure/ai-services/openai/tutorials/embeddings?tabs=command-line)
 <br />
@@ -25,11 +25,9 @@ from towhee import pipe, ops, DataCollection
 p = (
     pipe.input('text')
-        .map('text', 'vec',
-             ops.sentence_embedding.openai(model_name='text-embedding-ada-002', api_key=OPENAI_API_KEY))
+        .map('text', 'vec', ops.sentence_embedding.azure_openai(model_name='text-embedding-ada-002', api_key=api_key, api_base=api_base))
         .output('text', 'vec')
 )
 DataCollection(p('Hello, world.')).show()
 ```
@@ -39,7 +37,7 @@ DataCollection(p('Hello, world.')).show()
 Create the operator via the following factory method:
-***sentence_embedding.openai(model_name='text-embedding-ada-002')***
+***sentence_embedding.azure_openai(model_name='text-embedding-ada-002')***
 **Parameters:**
@@ -52,10 +50,23 @@ The model name in string, defaults to 'text-embedding-ada-002'. Supported model
 - text-similarity-babbage-001
 - text-similarity-ada-001
+***api_type***: *str='azure'*
+The OpenAI type in string, defaults to 'azure'.
+***api_version***: *str='2023-07-01-preview'*
+The OpenAI version in string, defaults to '2023-07-01-preview'.
 ***api_key***: *str=None*
 The OpenAI API key in string, defaults to None.
+***api_base***: *str=None*
+The OpenAI base in string, defaults to None.
 <br />
 ## Interface
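
The updated README example passes `api_key` and `api_base` as plain variables without showing where they come from. The following is a minimal sketch, assuming the credentials are kept in environment variables. The helper name `azure_openai_kwargs` and the variable names `AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_API_BASE` are hypothetical and not part of this commit; the keyword names and default values are the ones documented in the diff above. Raising on missing values is a choice made for this sketch, since the operator itself defaults both to None.

```python
import os


def azure_openai_kwargs(env=None):
    """Collect keyword arguments for ops.sentence_embedding.azure_openai.

    The environment variable names below are assumptions made for this
    sketch; pick whatever names your deployment uses.
    """
    env = os.environ if env is None else env
    api_key = env.get('AZURE_OPENAI_API_KEY')
    api_base = env.get('AZURE_OPENAI_API_BASE')
    if not api_key or not api_base:
        # Fail early instead of passing None through to the operator.
        raise ValueError('set AZURE_OPENAI_API_KEY and AZURE_OPENAI_API_BASE')
    return {
        'model_name': 'text-embedding-ada-002',  # default named in the README
        'api_type': 'azure',                     # documented default
        'api_version': '2023-07-01-preview',     # documented default
        'api_key': api_key,
        'api_base': api_base,
    }
```

With towhee installed, the returned dictionary could be unpacked into the operator from the README example, for instance `ops.sentence_embedding.azure_openai(**azure_openai_kwargs())`.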

azure_openai_embedding.py (4 changed lines)

@@ -20,12 +20,12 @@ from towhee.operator.base import PyOperator
 class AzureOpenaiEmbeding(PyOperator):
     def __init__(self,
-                 engine='text-embedding-ada-002',
+                 model_name='text-embedding-ada-002',
                  api_type: str = 'azure',
                  api_version: str = '2023-07-01-preview',
                  api_key=None,
                  api_base=None):
-        self._engine = engine
+        self._engine = model_name
         self._api_type = api_type
         self._api_version = api_version
         self._api_key = api_key
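
The rename in this hunk changes the public keyword from `engine` to `model_name` while the private attribute stays `_engine`. A small sketch of what that means for callers, using a stand-in class: `AzureOpenaiEmbedingSketch` is hypothetical and only mirrors the constructor signature shown above, without subclassing towhee's `PyOperator`. The `_api_base` assignment lies outside the visible hunk and is assumed here.

```python
class AzureOpenaiEmbedingSketch:
    """Stand-in that copies the constructor signature from the diff.

    Hypothetical: the real operator subclasses towhee's PyOperator and
    has more behavior than is shown here.
    """

    def __init__(self,
                 model_name='text-embedding-ada-002',
                 api_type: str = 'azure',
                 api_version: str = '2023-07-01-preview',
                 api_key=None,
                 api_base=None):
        # The keyword is now model_name; the attribute keeps its old name.
        self._engine = model_name
        self._api_type = api_type
        self._api_version = api_version
        self._api_key = api_key
        self._api_base = api_base  # assumed; not visible in the hunk


# The new keyword is stored under the old attribute name.
op = AzureOpenaiEmbedingSketch(model_name='text-embedding-ada-002', api_key='k')
print(op._engine)

# The old keyword no longer matches the signature.
try:
    AzureOpenaiEmbedingSketch(engine='text-embedding-ada-002')
except TypeError as err:
    print('old keyword rejected:', err)
```

Because the signature has no catch-all keyword arguments, any caller still passing `engine=` by name would need to switch to `model_name=`, which is what the README change in this commit does.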
