Image Captioning with ExpansionNet v2

author: David Wang


Description

This operator uses ExpansionNet v2 to generate a caption that describes the content of the given image. ExpansionNet v2 introduces Block Static Expansion, which distributes and processes the input over a heterogeneous and arbitrarily large collection of sequences whose lengths differ from that of the input. This operator is adapted from jchenghu/ExpansionNet_v2.


Code Example

Load an image from the path './image.jpg' and generate its caption.

Write the pipeline in the simplified style:

import towhee

towhee.glob('./image.jpg') \
      .image_decode() \
      .image_captioning.expansionnet_v2(model_name='expansionnet_rf') \
      .show()

Write the same pipeline with explicit input/output name specifications:

import towhee

towhee.glob['path']('./image.jpg') \
      .image_decode['path', 'img']() \
      .image_captioning.expansionnet_v2['img', 'text'](model_name='expansionnet_rf') \
      .select['img', 'text']() \
      .show()


Factory Constructor

Create the operator via the following factory method:

expansionnet_v2(model_name)

Parameters:

model_name: str

The model name of ExpansionNet v2. Supported model names:

  • expansionnet_rf
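
The snippet below is a minimal sketch of constructing the operator on its own via towhee.ops rather than inside a pipeline; the towhee.ops call style shown here is an assumption and not part of this README:

import towhee

# Build the operator from the hub with the only supported model name listed above.
# (Assumes the towhee.ops factory interface is available.)
op = towhee.ops.image_captioning.expansionnet_v2(model_name='expansionnet_rf')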


Interface

An image captioning operator takes a towhee image as input and generates the corresponding caption.

Parameters:

data: towhee.types.Image (a sub-class of numpy.ndarray)

The image from which to generate the caption.

Returns: str

The caption generated by the model.

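As a rough illustration of this interface, the sketch below decodes an image and calls the operator directly; the standalone towhee.ops call style and the image path './image.jpg' are assumptions:

import towhee

# Decode './image.jpg' into a towhee image (towhee.types.Image, a sub-class of numpy.ndarray).
img = towhee.ops.image_decode()('./image.jpg')

# Pass the decoded image to the captioning operator; the return value is the caption string.
caption = towhee.ops.image_captioning.expansionnet_v2(model_name='expansionnet_rf')(img)
print(caption)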