From ccf0085e5286f23498d8103be1f1753423020447 Mon Sep 17 00:00:00 2001
From: Jael Gu
Date: Wed, 18 Sep 2024 13:37:50 +0800
Subject: [PATCH] Add more resources

Signed-off-by: Jael Gu
---
 README.md | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 020d8c2..8136591 100644
--- a/README.md
+++ b/README.md
@@ -40,4 +40,13 @@
 The `towhee/torch-bert` Operator is based on Huggingface[2].
 
 [1]. https://arxiv.org/pdf/1810.04805.pdf
-[2]. https://huggingface.co/docs/transformers
\ No newline at end of file
+[2]. https://huggingface.co/docs/transformers
+
+# More Resources
+
+- [The guide to text-embedding-ada-002 model | OpenAI](https://zilliz.com/ai-models/text-embedding-ada-002): text-embedding-ada-002 is OpenAI's legacy text embedding model, offering average price and performance compared to text-embedding-3-large and text-embedding-3-small.
+- [Sentence Transformers for Long-Form Text - Zilliz blog](https://zilliz.com/learn/Sentence-Transformers-for-Long-Form-Text): A deep dive into modern transformer-based embeddings for long-form text.
+- [What is BERT (Bidirectional Encoder Representations from Transformers)? - Zilliz blog](https://zilliz.com/learn/what-is-bert): Learn what Bidirectional Encoder Representations from Transformers (BERT) is and how it uses pre-training and fine-tuning to achieve its remarkable performance.
+- [Training Your Own Text Embedding Model - Zilliz blog](https://zilliz.com/learn/training-your-own-text-embedding-model): Explore how to train your own text embedding model using the `sentence-transformers` library and generate training data by leveraging a pre-trained LLM.
+- [The guide to gte-base-en-v1.5 | Alibaba](https://zilliz.com/ai-models/gte-base-en-v1.5): gte-base-en-v1.5 is specialized for English text and built upon the transformer++ encoder backbone (BERT + RoPE + GLU).
+- [Training Text Embeddings with Jina AI - Zilliz blog](https://zilliz.com/blog/training-text-embeddings-with-jina-ai): In a recent talk, Bo Wang discussed the creation of Jina text embeddings for modern vector search and RAG systems, and shared methodologies for training embedding models that effectively encode extensive information.
\ No newline at end of file