diff --git a/README.md b/README.md
index 9d11e46..84b4cf2 100644
--- a/README.md
+++ b/README.md
@@ -67,3 +67,20 @@ An image captioning operator takes a [towhee image](link/to/towhee/image/api/doc
 **Returns:** *str*
 
 	The caption generated by the model.
+
+
+# More Resources
+
+- [What is a Transformer Model? An Engineer's Guide](https://zilliz.com/glossary/transformer-models): A transformer model is a neural network architecture that excels at converting one type of input into a distinct output. Its core strength is its ability to handle inputs and outputs of different sequence lengths. It does this by encoding the input into a matrix of predefined dimensions and then combining it with an attention matrix to decode. This transformation unfolds through a sequence of cooperating layers, which break words down into their corresponding numerical representations.
+
+  At its heart, a transformer model is a bridge between disparate linguistic structures, employing sophisticated neural network configurations to decode and manipulate human language input. An example of a transformer model is GPT-3, which ingests human language and generates text output.
+- [The guide to instructor-xl | HKU NLP](https://zilliz.com/ai-models/instructor-xl): instructor-xl: an instruction-finetuned model tailored for text embeddings, with the best performance compared to `instructor-base` and `instructor-large`.
+- [The guide to voyage-large-2 | Voyage AI](https://zilliz.com/ai-models/voyage-large-2): voyage-large-2: a general-purpose text embedding model optimized for retrieval quality; ideal for tasks like summarization, clustering, and classification.
+- [The guide to instructor-large | HKU NLP](https://zilliz.com/ai-models/instructor-large): instructor-large: an instruction-finetuned model tailored for text embeddings; better performance than `instructor-base`, but worse than `instructor-xl`.
+- [OpenAI text-embedding-3-large | Zilliz](https://zilliz.com/ai-models/text-embedding-3-large): Building GenAI applications with the text-embedding-3-large model and Zilliz Cloud / Milvus.
+- [The guide to clip-vit-base-patch32 | OpenAI](https://zilliz.com/ai-models/clip-vit-base-patch32): clip-vit-base-patch32: a CLIP multimodal model variant by OpenAI for image and text embedding.
+- [Understanding ImageNet: A Key Resource for Computer Vision and AI Research](https://zilliz.com/glossary/imagenet): The large-scale image database with over 14 million annotated images. Learn how this dataset supports advancements in computer vision.
+- [What is a Generative Adversarial Network? An Easy Guide](https://zilliz.com/glossary/generative-adversarial-networks): Just as we classify animal fossils into domains, kingdoms, and phyla, we classify AI networks, too. At the highest level, we classify AI networks as "discriminative" and "generative." A generative neural network is an AI that creates something new. This differs from a discriminative network, which classifies something that already exists into particular buckets. Kind of like we're doing right now, by bucketing generative adversarial networks (GANs) into appropriate classifications.
+
+  So, if you wanted to use textual tags to create a new visual image, as with Midjourney, you'd use a generative network. However, if you had a giant pile of data that you needed to classify and tag, you'd use a discriminative model.
+- [Zilliz partnership with PyTorch - View image search solution tutorial](https://zilliz.com/partners/pytorch): An image search solution tutorial built on the Zilliz partnership with PyTorch.
\ No newline at end of file
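The transformer entry added above notes that these models can handle inputs and outputs of different sequence lengths. As a minimal illustration of that point (a sketch assuming PyTorch, which is not a stated dependency of this operator; all hyperparameters here are arbitrary), the encoder-decoder model below maps a 10-token source sequence to a 7-token target sequence:

```python
import torch
import torch.nn as nn

# Minimal encoder-decoder transformer; sizes chosen only for illustration.
model = nn.Transformer(d_model=64, nhead=4,
                       num_encoder_layers=2, num_decoder_layers=2)

# Default layout is (sequence_length, batch, embedding_dim); the source and
# target sequences deliberately have different lengths.
src = torch.rand(10, 1, 64)  # 10-token input sequence
tgt = torch.rand(7, 1, 64)   # 7-token output sequence

out = model(src, tgt)
print(out.shape)  # torch.Size([7, 1, 64]) -- output follows the target length
```

Cross-attention is what makes this work: each decoder position attends over all encoder outputs, so nothing ties the output length to the input length.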