From 6c76f3aa1c04d699719e14b0905d4c640b374ae2 Mon Sep 17 00:00:00 2001
From: Jael Gu
Date: Wed, 18 Sep 2024 13:37:04 +0800
Subject: [PATCH] Add more resources

Signed-off-by: Jael Gu
---
 README.md | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/README.md b/README.md
index 4ffbcd0..565e3c9 100644
--- a/README.md
+++ b/README.md
@@ -73,3 +73,16 @@ An image captioning operator takes a [towhee image](link/to/towhee/image/api/doc
 
 ​
 The caption generated by model.
+
+
+# More Resources
+
+- [What is a Generative Adversarial Network? An Easy Guide](https://zilliz.com/glossary/generative-adversarial-networks): Just as we classify animal fossils into domains, kingdoms, and phyla, we classify AI networks too. At the highest level, we classify them as "discriminative" or "generative." A generative neural network creates something new, while a discriminative network classifies something that already exists into particular buckets, much like we are doing right now by bucketing generative adversarial networks (GANs) into appropriate classifications.
+So, if you wanted to use textual tags to create a new visual image, as with Midjourney, you would use a generative network. If instead you had a large pile of data to classify and tag, you would use a discriminative model.
+- [Multimodal RAG locally with CLIP and Llama3 - Zilliz blog](https://zilliz.com/blog/multimodal-RAG-with-CLIP-Llama3-and-milvus): A tutorial that walks you through building a multimodal RAG system with CLIP, Llama3, and Milvus.
+- [Supercharged Semantic Similarity Search in Production - Zilliz blog](https://zilliz.com/learn/supercharged-semantic-similarity-search-in-production): Building a blazing-fast, highly scalable text-to-image search with CLIP embeddings and Milvus, an advanced open-source vector database.
+- [The guide to clip-vit-base-patch32 | OpenAI](https://zilliz.com/ai-models/clip-vit-base-patch32): clip-vit-base-patch32 is a CLIP multimodal model variant by OpenAI for image and text embedding.
+- [The guide to gte-base-en-v1.5 | Alibaba](https://zilliz.com/ai-models/gte-base-en-v1.5): gte-base-en-v1.5 is specialized for English text and built upon the transformer++ encoder backbone (BERT + RoPE + GLU).
+- [Multimodal RAG with Milvus and GPT-4o](https://zilliz.com/event/multimodal-rag-with-milvus-and-gpt-4o): Join us for a webinar demonstrating multimodal RAG with Milvus and GPT-4o.
+- [An LLM Powered Text to Image Prompt Generation with Milvus - Zilliz blog](https://zilliz.com/blog/llm-powered-text-to-image-prompt-generation-with-milvus): An LLM project powered by the Milvus vector database for generating more efficient text-to-image prompts.
+- [From Text to Image: Fundamentals of CLIP - Zilliz blog](https://zilliz.com/blog/fundamentals-of-clip): Search algorithms rely on semantic similarity to retrieve the most relevant results. With the CLIP model, the semantics of texts and images can be connected in a high-dimensional vector space. Read this short introduction to see how CLIP can help you build a powerful text-to-image service.
\ No newline at end of file
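The CLIP-related links above all rest on the same idea: texts and images are embedded into one shared vector space, and retrieval reduces to a nearest-neighbor search by cosine similarity. A minimal sketch of that retrieval step, using toy 4-d vectors and made-up file names as stand-ins for real CLIP embeddings (which are typically 512-d):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product of the vectors divided by
    # the product of their norms, in [-1, 1].
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings standing in for CLIP's text and image outputs.
text_embedding = np.array([0.1, 0.9, 0.2, 0.0])  # e.g. "a photo of a cat"
image_embeddings = {
    "cat.jpg": np.array([0.0, 1.0, 0.1, 0.1]),
    "car.jpg": np.array([1.0, 0.0, 0.0, 0.2]),
}

# Text-to-image search: rank images by similarity to the query text.
best_match = max(
    image_embeddings,
    key=lambda name: cosine_similarity(text_embedding, image_embeddings[name]),
)
print(best_match)  # the image whose embedding is closest to the query
```

In production, the brute-force `max` over a dict is replaced by an approximate nearest-neighbor index in a vector database such as Milvus, which is exactly what the "Supercharged Semantic Similarity Search" post above scales up.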