Our Implementation of the ZeroCap Baseline Model

Catalogue:

1. Environment Preparation
2. Image Captioning on MSCOCO
3. Image Captioning on Flickr30k
4. Cross Domain Image Captioning on MSCOCO
5. Cross Domain Image Captioning on Flickr30k
6. Citation
7. Acknowledgements

1. Environment Preparation:

To install the required dependencies, run the following command:

pip install -r requirements.txt
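
If you would rather not install into the system Python, the same step can be done inside a virtual environment. The sketch below is one way to do that on Linux or macOS; the function name `setup_env` and the `.venv` directory are assumptions made for illustration and are not defined by this repository.

```shell
#!/bin/sh
# Sketch: create an isolated environment and install the listed dependencies.
# "setup_env" and ".venv" are hypothetical names, not part of this repository.
setup_env() {
    env_dir="${1:-.venv}"
    if [ ! -f requirements.txt ]; then
        echo "requirements.txt not found; run this from the repository root" >&2
        return 1
    fi
    python3 -m venv "$env_dir" || return 1
    # Use the environment's own pip so nothing is installed system-wide.
    "$env_dir/bin/pip" install -r requirements.txt
}
```

After sourcing or pasting the function, `setup_env && . .venv/bin/activate` should leave you in a shell where the scripts below can find the installed packages.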

2. Image Captioning on MSCOCO:

To perform image captioning on MSCOCO, run the following commands:

chmod +x ./mscoco_zerocap.sh
./mscoco_zerocap.sh

3. Image Captioning on Flickr30k:

To perform image captioning on Flickr30k, run the following commands:

chmod +x ./flickr30k_zerocap.sh
./flickr30k_zerocap.sh

4. Cross Domain Image Captioning on MSCOCO:

To perform image captioning on MSCOCO using the language model from the Flickr30k domain, run the following commands:

chmod +x ./flickr30k_to_mscoco_zerocap.sh
./flickr30k_to_mscoco_zerocap.sh

5. Cross Domain Image Captioning on Flickr30k:

To perform image captioning on Flickr30k using the language model from the MSCOCO domain, run the following commands:

chmod +x ./mscoco_to_flickr30k_zerocap.sh
./mscoco_to_flickr30k_zerocap.sh
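
The four runs in sections 2 to 5 can also be chained in one pass. The following is a minimal sketch, assuming it is executed from the directory that contains the scripts; `run_experiment` is a hypothetical helper introduced here, and any script that is not present is skipped rather than treated as an error.

```shell
#!/bin/sh
# Hypothetical helper: run one experiment script from the current directory.
# Returns 2 if the script is missing, otherwise the script's own exit status.
run_experiment() {
    script="./$1"
    if [ ! -f "$script" ]; then
        echo "skipping $1: not found" >&2
        return 2
    fi
    chmod +x "$script"
    "$script"
}

# Script names are the ones given in sections 2 to 5 of this README.
for name in mscoco_zerocap.sh flickr30k_zerocap.sh \
            flickr30k_to_mscoco_zerocap.sh mscoco_to_flickr30k_zerocap.sh; do
    run_experiment "$name" || echo "$name did not complete (status $?)" >&2
done
```

Running the experiments one after another keeps their output separate and lets a failure in one run leave the others unaffected.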

6. Citation:

If you find our code helpful, please cite the original paper:

@article{tewel2021zero,
  title={Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic},
  author={Tewel, Yoad and Shalev, Yoav and Schwartz, Idan and Wolf, Lior},
  journal={arXiv preprint arXiv:2111.14447},
  year={2021}
}

7. Acknowledgements:

We thank the authors for releasing their code. Our reimplementation of the baseline is based on their original codebase [here].
