This operator generates a caption describing the content of a given image using [CLIPReward](https://arxiv.org/abs/2205.13115). CLIPReward uses CLIP as a reward function, together with a simple fine-tuning strategy for the CLIP text encoder that improves grammar without requiring extra text annotation, which leads to more descriptive and distinctive captions. This is an adaptation of [j-min/CLIP-Caption-Reward](https://github.com/j-min/CLIP-Caption-Reward).
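
To make the idea of "CLIP as a reward function" concrete, here is a minimal sketch of a CLIP-style reward: a scaled, non-negative cosine similarity between an image embedding and a caption embedding. It is an illustration only and not this operator's implementation. The embeddings are hypothetical toy vectors rather than real CLIP outputs, and the names `cosine_similarity`, `clip_style_reward`, and the `scale` value of 2.5 are assumptions made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clip_style_reward(image_emb, caption_emb, scale=2.5):
    """Scaled cosine similarity, floored at zero (scale is an assumed value)."""
    return scale * max(cosine_similarity(image_emb, caption_emb), 0.0)


# Hypothetical toy embeddings standing in for CLIP image/text encoder outputs.
image_emb = [0.9, 0.1, 0.4]
candidates = {
    "a dog": [0.5, 0.5, 0.5],
    "a brown dog running on grass": [0.8, 0.2, 0.5],
}

# Score every candidate caption against the image and keep the best one.
rewards = {
    caption: clip_style_reward(image_emb, emb)
    for caption, emb in candidates.items()
}
best_caption = max(rewards, key=rewards.get)
print(best_caption)
```

In this toy setup the caption whose embedding lies closer to the image embedding receives the higher reward. During training, a reward of this kind would be fed back to the captioning model so that it favors captions that match the image more specifically.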
<br/>
## Code Example
Load an image from the path './animals.jpg' and generate its caption.