Sentence Embedding with OpenAI

author: Junjie, Jael

Description

A sentence embedding operator generates one embedding vector, as a numpy.ndarray, for each input text. The vector represents the semantic information of the whole input text. This operator is implemented with embedding models from OpenAI, so an OpenAI API key is required to use it.

Code Example

Use the model 'text-embedding-ada-002' to generate an embedding for the sentence "Hello, world.".

Write a pipeline that names its inputs and outputs explicitly:

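import os

from towhee import pipe, ops, DataCollection

# Assumption: the API key is stored in the OPENAI_API_KEY environment variable.
OPENAI_API_KEY = os.environ['OPENAI_API_KEY']

# Map the input column 'text' to an embedding column 'vec'.
p = (
    pipe.input('text')
        .map('text', 'vec',
             ops.sentence_embedding.openai(model_name='text-embedding-ada-002', api_key=OPENAI_API_KEY))
        .output('text', 'vec')
)

DataCollection(p('Hello, world.')).show()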

Factory Constructor

Create the operator via the following factory method:

sentence_embedding.openai(model_name='text-embedding-ada-002', api_key=None)

Parameters:

model_name: str

The name of the model, as a string. Defaults to 'text-embedding-ada-002'. Supported model names:

text-embedding-ada-002
text-similarity-davinci-001
text-similarity-curie-001
text-similarity-babbage-001
text-similarity-ada-001
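
Because only the names listed above are accepted, it can help to check a configured name before building a pipeline. The following is a minimal sketch and not part of the operator: `SUPPORTED_MODELS` and `check_model_name` are hypothetical names, and the tuple simply copies the list in this README.

```python
# Hypothetical helper; the names below are copied from the list in this README.
SUPPORTED_MODELS = (
    'text-embedding-ada-002',
    'text-similarity-davinci-001',
    'text-similarity-curie-001',
    'text-similarity-babbage-001',
    'text-similarity-ada-001',
)


def check_model_name(model_name='text-embedding-ada-002'):
    """Return model_name unchanged, or raise ValueError if it is not listed."""
    if model_name not in SUPPORTED_MODELS:
        raise ValueError(
            f'Unsupported model {model_name!r}; expected one of {SUPPORTED_MODELS}'
        )
    return model_name
```

A hard-coded tuple like this can drift out of date, so treat the list the operator itself reports as authoritative.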

api_key: str=None

The OpenAI API key, as a string. Defaults to None.
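
One common way to avoid hard-coding the key is to read it from an environment variable and pass it to `api_key` explicitly. The sketch below assumes the variable is named `OPENAI_API_KEY`; `resolve_api_key` is a hypothetical helper, and this README does not say whether the operator itself falls back to the environment when `api_key` is None.

```python
import os


def resolve_api_key(api_key=None, env_var='OPENAI_API_KEY'):
    """Return an explicit key if given, else the value of env_var.

    Hypothetical helper: the environment variable name is an assumption.
    """
    key = api_key or os.environ.get(env_var)
    if not key:
        # Fail early with a clear message instead of at request time.
        raise RuntimeError(f'No API key passed and {env_var} is not set')
    return key
```

The returned value can then be passed as `api_key` when creating the operator.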

Interface

The operator takes a piece of text as a string and returns a text embedding as a numpy.ndarray.

__call__(text)

Parameters:

text: str

The input text, as a string.

Returns:

numpy.ndarray or list

The text embedding extracted by the model.
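
Since the result may arrive as either a numpy.ndarray or a list, downstream code is simplest when it treats the embedding as a plain sequence of floats. As an illustration, the sketch below compares two embeddings by cosine similarity using only the standard library. `cosine_similarity` is a hypothetical helper, and the vectors are toy values, not real model output.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length 1-D embeddings (list or ndarray)."""
    a = [float(x) for x in a]
    b = [float(x) for x in b]
    if len(a) != len(b):
        raise ValueError('embeddings must have the same length')
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError('cosine similarity is undefined for a zero vector')
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings.
same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, which is why cosine similarity is a common choice for comparing sentence embeddings.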

supported_model_names()

Returns a list of supported model names.

More Resources

All-Mpnet-Base-V2: Enhancing Sentence Embedding with AI - Zilliz blog: Delve into one of the deep learning models that has played a significant role in the development of sentence embedding: MPNet.
The guide to text-embedding-ada-002 model | OpenAI: text-embedding-ada-002: OpenAI's legacy text embedding model; average price/performance compared to text-embedding-3-large and text-embedding-3-small.
OpenAI text-embedding-3-large | Zilliz: Building GenAI applications with text-embedding-3-large model and Zilliz Cloud / Milvus
What Are Vector Embeddings?: Learn the definition of vector embeddings, how to create vector embeddings, and more.
A Guide to Using OpenAI Text Embedding Models for NLP Tasks - Zilliz blog: A comprehensive guide to using OpenAI text embedding models for embedding creation and semantic search.
Training Text Embeddings with Jina AI - Zilliz blog: In a recent talk by Bo Wang, he discussed the creation of Jina text embeddings for modern vector search and RAG systems. He also shared methodologies for training embedding models that effectively encode extensive information, along with guidance o
