From 6c76f3aa1c04d699719e14b0905d4c640b374ae2 Mon Sep 17 00:00:00 2001
From: Jael Gu
Date: Wed, 18 Sep 2024 13:37:04 +0800
Subject: [PATCH] Add more resources

Signed-off-by: Jael Gu
---
 README.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/README.md b/README.md
index 4ffbcd0..565e3c9 100644
--- a/README.md
+++ b/README.md
@@ -73,3 +73,16 @@ An image captioning operator takes a [towhee image](link/to/towhee/image/api/doc
 
 ​
 The caption generated by model.
+
+
+# More Resources
+
+- [What is a Generative Adversarial Network? An Easy Guide](https://zilliz.com/glossary/generative-adversarial-networks): Just as we classify animal fossils into domains, kingdoms, and phyla, we classify AI networks too. At the highest level, we classify them as "discriminative" or "generative." A generative neural network creates something new, while a discriminative network classifies something that already exists into particular buckets, much like we are doing right now by bucketing generative adversarial networks (GANs) into appropriate classifications.
+So, if you wanted to use textual tags to create a new visual image, as with Midjourney, you would use a generative network. If instead you had a large pile of data to classify and tag, you would use a discriminative model.
+- [Multimodal RAG locally with CLIP and Llama3 - Zilliz blog](https://zilliz.com/blog/multimodal-RAG-with-CLIP-Llama3-and-milvus): A tutorial that walks you through building a multimodal RAG system with CLIP, Llama3, and Milvus.
+- [Supercharged Semantic Similarity Search in Production - Zilliz blog](https://zilliz.com/learn/supercharged-semantic-similarity-search-in-production): Building a blazing-fast, highly scalable text-to-image search with CLIP embeddings and Milvus, an advanced open-source vector database.
+- [The guide to clip-vit-base-patch32 | OpenAI](https://zilliz.com/ai-models/clip-vit-base-patch32): clip-vit-base-patch32 is a CLIP multimodal model variant by OpenAI for image and text embedding.
+- [The guide to gte-base-en-v1.5 | Alibaba](https://zilliz.com/ai-models/gte-base-en-v1.5): gte-base-en-v1.5 is specialized for English text and built upon the transformer++ encoder backbone (BERT + RoPE + GLU).
+- [Multimodal RAG with Milvus and GPT-4o](https://zilliz.com/event/multimodal-rag-with-milvus-and-gpt-4o): Join us for a webinar demonstrating multimodal RAG with Milvus and GPT-4o.
+- [An LLM Powered Text to Image Prompt Generation with Milvus - Zilliz blog](https://zilliz.com/blog/llm-powered-text-to-image-prompt-generation-with-milvus): An LLM project powered by the Milvus vector database for generating more efficient text-to-image prompts.
+- [From Text to Image: Fundamentals of CLIP - Zilliz blog](https://zilliz.com/blog/fundamentals-of-clip): Search algorithms rely on semantic similarity to retrieve the most relevant results. With the CLIP model, the semantics of texts and images can be connected in a high-dimensional vector space. Read this short introduction to see how CLIP can help you build a powerful text-to-image service.
\ No newline at end of file
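The CLIP-related links above all rest on the same idea: texts and images are embedded into one shared vector space, and retrieval reduces to a nearest-neighbor search by cosine similarity. A minimal sketch of that retrieval step, using toy 4-d vectors and made-up file names as stand-ins for real CLIP embeddings (which are typically 512-d):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the vectors divided by
    # the product of their norms, in [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings standing in for CLIP's text and image outputs.
text_embedding = np.array([0.1, 0.9, 0.2, 0.0])  # e.g. "a photo of a cat"
image_embeddings = {
    "cat.jpg": np.array([0.0, 1.0, 0.1, 0.1]),
    "car.jpg": np.array([1.0, 0.0, 0.0, 0.2]),
}

# Text-to-image search: rank images by similarity to the query text.
best_match = max(
    image_embeddings,
    key=lambda name: cosine_similarity(text_embedding, image_embeddings[name]),
)
print(best_match)  # the image whose embedding is closest to the query
```

In production, the brute-force `max` over a dict is replaced by an approximate nearest-neighbor index in a vector database such as Milvus, which is exactly what the "Supercharged Semantic Similarity Search" post above scales up.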