diff --git a/README.md b/README.md
index 6912861..6c81d30 100644
--- a/README.md
+++ b/README.md
@@ -74,3 +74,16 @@ The operator split incoming the text and return chunks.
 A list of the chunked document.
+
+
+# More Resources
+
+- [Experiment with 5 Chunking Strategies via LangChain for LLM - Zilliz blog](https://zilliz.com/blog/experimenting-with-different-chunking-strategies-via-langchain): Explore the complexities of text chunking in retrieval augmented generation applications and learn how different chunking strategies impact the same piece of data.
+- [A Guide to Chunking Strategies for Retrieval Augmented Generation (RAG) - Zilliz blog](https://zilliz.com/learn/guide-to-chunking-strategies-for-rag): We explored various facets of chunking strategies within Retrieval-Augmented Generation (RAG) systems in this guide.
+- [Sentence Transformers for Long-Form Text - Zilliz blog](https://zilliz.com/learn/Sentence-Transformers-for-Long-Form-Text): Deep diving into modern transformer-based embeddings for long-form text.
+- [Key Strategies for Smart Retrieval Augmented Generation (RAG) - Zilliz blog](https://zilliz.com/blog/exploring-rag-chunking-llms-and-evaluations): Three key strategies to get the most out of RAG: smart text chunking, iterating on different embedding models, and experimenting with different LLMs
+- [The guide to jina-embeddings-v2-small-en | Jina AI](https://zilliz.com/ai-models/jina-embeddings-v2-small-en): jina-embeddings-v2-small-en: specialized text embedding model for long English documents; up to 8192 tokens.
+- [Massive Text Embedding Benchmark (MTEB)](https://zilliz.com/glossary/massive-text-embedding-benchmark-(mteb)): A standardized way to evaluate text embedding models across a range of tasks and languages, leading to better text embedding models for your app
+- [OpenAI text-embedding-3-large | Zilliz](https://zilliz.com/ai-models/text-embedding-3-large): Building GenAI applications with text-embedding-3-large model and Zilliz Cloud / Milvus
+- [The guide to jina-embeddings-v2-base-en | Jina AI](https://zilliz.com/ai-models/jina-embeddings-v2-base-en): jina-embeddings-v2-base-en: specialized embedding model for English text and long documents; support sequences of up to 8192 tokens
+- [Text as Data, From Anywhere to Anywhere - Zilliz blog](https://zilliz.com/blog/text-as-data-from-anywhere-to-anywhere): Whether you prefer a no-code or minimal-code approach, Airbyte and PyAirbyte offer robust solutions for integrating both structured and unstructured data. AJ Steers' painted a good picture of the potential of these tools in revolutionizing data workflows.
\ No newline at end of file