@ -82,9 +82,11 @@ An image captioning operator takes a [towhee image](link/to/towhee/image/api/doc
# More Resources
- [What is a Transformer Model? An Engineer's Guide](https://zilliz.com/glossary/transformer-models): A transformer model is a neural network architecture. It's proficient in converting a particular type of input into a distinct output. Its core strength lies in its ability to handle inputs and outputs of different sequence length. It does this through encoding the input into a matrix with predefined dimensions and then combining that with another attention matrix to decode. This transformation unfolds through a sequence of collaborative layers, which deconstruct words into their corresponding numerical representations.
# More Resources
- [What is a Transformer Model? An Engineer's Guide](https://zilliz.com/glossary/transformer-models): A transformer model is a neural network architecture. It's proficient in converting a particular type of input into a distinct output. Its core strength lies in its ability to handle inputs and outputs of different sequence length. It does this through encoding the input into a matrix with predefined dimensions and then combining that with another attention matrix to decode. This transformation unfolds through a sequence of collaborative layers, which deconstruct words into their corresponding numerical representations.
At its heart, a transformer model is a bridge between disparate linguistic structures, employing sophisticated neural network configurations to decode and manipulate human language input. An example of a transformer model is GPT-3, which ingests human language and generates text output.
- [Multimodal RAG locally with CLIP and Llama3 - Zilliz blog](https://zilliz.com/blog/multimodal-RAG-with-CLIP-Llama3-and-milvus): A tutorial walks you through how to build a multimodal RAG with CLIP, Llama3, and Milvus.
@ -95,4 +97,3 @@ At its heart, a transformer model is a bridge between disparate linguistic struc
- [An LLM Powered Text to Image Prompt Generation with Milvus - Zilliz blog](https://zilliz.com/blog/llm-powered-text-to-image-prompt-generation-with-milvus): An interesting LLM project powered by the Milvus vector database for generating more efficient text-to-image prompts.
- [From Text to Image: Fundamentals of CLIP - Zilliz blog](https://zilliz.com/blog/fundamentals-of-clip): Search algorithms rely on semantic similarity to retrieve the most relevant results. With the CLIP model, the semantics of texts and images can be connected in a high-dimensional vector space. Read this simple introduction to see how CLIP can help you build a powerful text-to-image service.
- [Zilliz partnership with PyTorch - View image search solution tutorial](https://zilliz.com/partners/pytorch): Zilliz partnership with PyTorch