# clip-caption-reward

# Fine-grained Image Captioning with CLIP Reward

*author: David Wang*

<br />

## Description

This operator generates a caption with [CLIPReward](https://arxiv.org/abs/2205.13115) that describes the content of the given image. CLIPReward uses CLIP as a reward function, together with a simple finetuning strategy for the CLIP text encoder that improves grammar without requiring extra text annotation, to produce more descriptive and distinctive captions. This is an adaptation of [j-min/CLIP-Caption-Reward](https://github.com/j-min/CLIP-Caption-Reward).

<br />

## Code Example

Load an image from the path './animals.jpg' to generate a caption.

*Write the pipeline in simplified style:*

```python
import towhee

towhee.glob('./animals.jpg') \
      .image_decode() \
      .image_captioning.clip_caption_reward(model_name='clipRN50_clips_grammar') \
      .show()
```

<img src="./cap.png" alt="result1" style="height:20px;"/>

*Write the same pipeline with explicit input/output name specifications:*

```python
import towhee

towhee.glob['path']('./animals.jpg') \
      .image_decode['path', 'img']() \
      .image_captioning.clip_caption_reward['img', 'text'](model_name='clipRN50_clips_grammar') \
      .select['img', 'text']() \
      .show()
```

<img src="./tabular.png" alt="result2" style="height:60px;"/>
<br />

## Factory Constructor

Create the operator via the following factory method:

***clip_caption_reward(model_name)***

**Parameters:**

***model_name:*** *str*

The model name of CLIPReward. Supported model names:
- clipRN50_clips_grammar

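As a quick reference, here is a minimal sketch of constructing the operator directly through Towhee's `ops` namespace; the namespace path is assumed to mirror the `image_captioning.clip_caption_reward` calls in the pipelines above, not an additional documented API:

```python
from towhee import ops

# Instantiate the operator with the only supported model name.
op = ops.image_captioning.clip_caption_reward(model_name='clipRN50_clips_grammar')
```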
<br />

## Interface

An image captioning operator takes a [towhee image](link/to/towhee/image/api/doc) as input and generates the corresponding caption.

**Parameters:**

***img:*** *towhee.types.Image (a sub-class of numpy.ndarray)*

The image from which to generate the caption.

**Returns:** *str*

The caption generated by the model.

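Putting the interface together, a minimal sketch of captioning a single decoded image, assuming the operator and `image_decode` can be invoked directly through `towhee.ops` as in the factory sketch above:

```python
from towhee import ops

# Decode an image from disk into a towhee image (numpy-backed).
img = ops.image_decode()('./animals.jpg')

# Construct the captioning operator and generate a caption string.
op = ops.image_captioning.clip_caption_reward(model_name='clipRN50_clips_grammar')
caption = op(img)  # -> str
print(caption)
```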