This operator uses [BLIP](https://arxiv.org/abs/2201.12086) to generate a caption that describes the content of the given image. It is adapted from [salesforce/BLIP](https://github.com/salesforce/BLIP).
This operator uses [ClipCap](https://arxiv.org/abs/2111.09734) to generate a caption that describes the content of the given image. ClipCap passes the CLIP encoding of the image through a simple mapping network to produce a prefix for the caption, then fine-tunes a language model to generate the caption from that prefix. It is adapted from [rmokady/CLIP_prefix_caption](https://github.com/rmokady/CLIP_prefix_caption).
<br/>
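The prefix idea described above can be illustrated with a minimal, dependency-free sketch. It is not the operator's implementation: the function names, the tiny dimensions, the bias-free two-layer MLP with a `tanh` activation, and the random stand-in vectors are all assumptions chosen for illustration. A real setup would take the image embedding from CLIP and the token embeddings from the language model.

```python
import math
import random


def init_mapper(clip_dim, prefix_len, lm_dim, seed=0):
    """Create random weights for a hypothetical two-layer mapping network."""
    rng = random.Random(seed)
    hidden = (prefix_len * lm_dim) // 2  # assumed hidden size, illustration only

    def matrix(rows, cols):
        return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]

    return {"w1": matrix(clip_dim, hidden), "w2": matrix(hidden, prefix_len * lm_dim)}


def matvec(vec, mat):
    """Multiply a row vector by a matrix stored as a list of rows."""
    return [sum(v * row[j] for v, row in zip(vec, mat)) for j in range(len(mat[0]))]


def map_to_prefix(clip_embedding, params, prefix_len, lm_dim):
    """Map one image embedding to prefix_len vectors in the LM embedding space."""
    hidden = [math.tanh(x) for x in matvec(clip_embedding, params["w1"])]
    flat = matvec(hidden, params["w2"])
    return [flat[i * lm_dim:(i + 1) * lm_dim] for i in range(prefix_len)]


def build_lm_input(prefix, caption_embeddings):
    """Place the image prefix in front of the caption token embeddings."""
    return prefix + caption_embeddings


# Tiny, made-up dimensions so the sketch runs instantly.
CLIP_DIM, PREFIX_LEN, LM_DIM = 8, 3, 4
rng = random.Random(1)

params = init_mapper(CLIP_DIM, PREFIX_LEN, LM_DIM)
clip_embedding = [rng.gauss(0.0, 1.0) for _ in range(CLIP_DIM)]  # stand-in for a CLIP image encoding
caption_embeddings = [[rng.gauss(0.0, 1.0) for _ in range(LM_DIM)] for _ in range(2)]  # stand-in tokens

prefix = map_to_prefix(clip_embedding, params, PREFIX_LEN, LM_DIM)
lm_input = build_lm_input(prefix, caption_embeddings)
print(len(lm_input), len(lm_input[0]))
```

During training, the language model would see `lm_input` and be optimized to predict the caption tokens that follow the prefix. At inference time only the prefix is supplied and the caption is decoded from it.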
@@ -17,17 +17,16 @@ This operator generates the caption with [BLIP](https://arxiv.org/abs/2201.12086
## Code Example
Load an image from path './animals.jpg' to generate the caption.
Load an image from path './hulk.jpg' to generate the caption.