From 23178db4c32a8980e3b54d0434d9ccf7b331f35e Mon Sep 17 00:00:00 2001
From: wxywb
Date: Wed, 12 Oct 2022 20:02:11 +0800
Subject: [PATCH] update the readme.

Signed-off-by: wxywb
---
 README.md | 81 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 80 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 5b1d780..c909341 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,81 @@
-# clip-caption-reward
+# Fine-grained Image Captioning with CLIP Reward
+
+*author: David Wang*
+
+
+<br />
+
+## Description
+
+This operator generates a caption that describes the content of a given image using [CLIPReward](https://arxiv.org/abs/2205.13115). CLIPReward uses CLIP as a reward function, together with a simple finetuning strategy for the CLIP text encoder that improves grammar without requiring extra text annotation, leading to more descriptive and distinctive captions. It is an adaptation of [j-min/CLIP-Caption-Reward](https://github.com/j-min/CLIP-Caption-Reward).
+
+
+<br />
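At its core, the CLIP reward is the similarity between the CLIP embeddings of the image and of the candidate caption (the CLIP-S score). Below is a minimal sketch of that score, with random vectors standing in for real CLIP features; the `clip_score` helper is illustrative only and is not part of this operator's API:

```python
import numpy as np

def clip_score(image_emb: np.ndarray, text_emb: np.ndarray, w: float = 2.5) -> float:
    """CLIP-S style reward: scaled, clipped cosine similarity between
    a CLIP image embedding and a CLIP caption embedding."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_emb = text_emb / np.linalg.norm(text_emb)
    # Negative similarities are clipped to zero before scaling.
    return w * max(float(image_emb @ text_emb), 0.0)

# Toy 512-d vectors standing in for real CLIP features.
rng = np.random.default_rng(0)
img, cap = rng.normal(size=512), rng.normal(size=512)
print(clip_score(img, cap))
```

During training, a higher score rewards captions whose CLIP text embedding lies closer to the image embedding.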
+
+
+## Code Example
+
+Load an image from path './animals.jpg' to generate its caption.
+
+*Write the pipeline in simplified style*:
+
+```python
+import towhee
+
+towhee.glob('./animals.jpg') \
+      .image_decode() \
+      .image_captioning.clip_caption_reward(model_name='clipRN50_clips_grammar') \
+      .show()
+```
+result1
+
+*Write the same pipeline with explicit input/output name specifications:*
+
+```python
+import towhee
+
+towhee.glob['path']('./animals.jpg') \
+      .image_decode['path', 'img']() \
+      .image_captioning.clip_caption_reward['img', 'text'](model_name='clipRN50_clips_grammar') \
+      .select['img', 'text']() \
+      .show()
+```
+result2
+
+
+<br />
+
+## Factory Constructor
+
+Create the operator via the following factory method:
+
+***clip_caption_reward(model_name)***
+
+**Parameters:**
+
+***model_name:*** *str*
+
+The name of the model to load. Supported model names:
+- clipRN50_clips_grammar
+
+<br />
+
+
+
+## Interface
+
+An image captioning operator takes a [towhee image](link/to/towhee/image/api/doc) as input and generates the corresponding caption.
+
+**Parameters:**
+
+***img:*** *towhee.types.Image (a sub-class of numpy.ndarray)*
+
+The image from which to generate the caption.
+
+**Returns:** *str*
+
+The caption generated by the model.