logo
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
Readme
Files and versions

Updated 2 years ago

image-text-embedding

Universal-Image-Text-Transformer

Research code for pre-training universal vision and language models

Requirements

nvidia driver (418.xx), docker(19.03+), nvidia-container-toolkit

docker pull convaicontainerregistry1.azurecr.io/img-txt

lauching the environment

# can use CUDA_VISIBLE_DEVICES to seperate GPUs for each container
source launch_container.sh $TXT_DB $IMG_DIR $OUTPUT $PRETRAIN_PATH
# TXT_DB: convaistorage2share2/TXT_DB_v3
# IMG_DIR: convaistorage2share2/Bottom-up-features/adaptive/npy_per_img_id
# OUTPUT: somewhere to store model checkpoint (can be on share storage)
# PRETRAIN: path to pretrained model

# when need to preprocessing
source launch_container.sh $TXT_DB $IMG_DIR $OUTPUT $PRETRAIN_PATH --prepro
# this will make /db writable


# multi-node training
source launch_container_dist.sh $TXT_DB $IMG_DIR $OUTPUT $PRETRAIN_PATH

Pretrain

# inside the docker container
horovodrun -np $N_GPU -H localhost:$N_GPU \
    python pretrain.py --config config/config-pretrain-alltask.json

finetune VQA

horovodrun -np 2 -H localhost:2 \
    python train_vqa.py --config config/config-vqa-bert-2gpu-alldata.json

VQA inference

# single node only
# please refer to code for commandline options
horovodrun -np $N_GPU -H localhost:$N_GPU \
    python eval_vqa.py --txt_db /db/vqa_test_[base/large]-cased.db/ \
        --img_dir /img/coco_test2015 --checkpoint [NUM] \
        --output_dir /path/to/trained/vqa

NLVR2 official evaluation

Use official script to get both acc (our validation matched this) and consistency

# concat all output files
cat $OUTPUT/result/[val/test]_results_$STEP_rank*.csv > $OUTPUT.csv
python eval/nlvr2.py $OUTPUT.csv ANNOTATION.json

Referring Expression Comprehension: Finetuning and Evaluation

# train on gd-truth pairs of (ref, sent)
horovodrun -np $N_GPU -H localhost:$N_GPU \
    python train_re.py --config config/hps-refcoco+.json

# evaluate multiple splits on gd-truth boxes
horovodrun -np $N_GPU -H localhost:$N_GPU \
    python eval_re.py \
        --txt_db /db/refcoco+_val_base-cased.db:/db/refcoco+_testA_base-cased.db:/db/refcoco+_testB_base-cased.db \
        --img_dir /img/visual_grounding_coco_gt \
        --output_dir /storage/refcoco+/bert-base_mlm+itm+mrfr_pretrain-refcoco+_lr1e-4 \
        --checkpoint 26

# evaluate multiple splits on detected boxes
horovodrun -np $N_GPU -H localhost:$N_GPU \
    python eval_re.py \
        --txt_db /db/refcoco+_val_base-cased.db:/db/refcoco+_testA_base-cased.db:/db/refcoco+_testB_base-cased.db \
        --img_dir /img/visual_grounding_det_coco \
        --output_dir /storage/refcoco+/bert-base_mlm+itm+mrfr_pretrain-refcoco+_lr1e-4 \
        --checkpoint 26

Misc

  1. w/o horovodrun it will run on single GPU
    • useful for debugger (-m pdb)
  2. try --pin_mem it might give a tiny performance improvement
  3. --img_format [lmdb/lmdb-compress]
    • trade-off between memory/CPU
    • use --n_workers $N_CPU to specify data workers (default: 4)
wxywb 4059059c1b update the operator. 12 Commits
..
folder-icon config update the operator. 2 years ago
folder-icon data update the operator. 2 years ago
folder-icon eval update the operator. 2 years ago
folder-icon experiments update the operator. 2 years ago
folder-icon misc update the operator. 2 years ago
folder-icon model update the operator. 2 years ago
folder-icon optim update the operator. 2 years ago
folder-icon scripts update the operator. 2 years ago
folder-icon tests update the operator. 2 years ago
folder-icon utils update the operator. 2 years ago
file-icon Dockerfile
646 B
download-icon
update the operator. 2 years ago
file-icon LICENSE
1.0 KiB
download-icon
update the operator. 2 years ago
file-icon README.md
3.0 KiB
download-icon
update the operator. 2 years ago
file-icon eval_re.py
7.8 KiB
download-icon
update the operator. 2 years ago
file-icon eval_vcr.py
10 KiB
download-icon
update the operator. 2 years ago
file-icon eval_vqa.py
6.5 KiB
download-icon
update the operator. 2 years ago
file-icon format_vcr_predictions.py
2.0 KiB
download-icon
update the operator. 2 years ago
file-icon inf_itm.py
5.7 KiB
download-icon
update the operator. 2 years ago
file-icon launch_container.sh
668 B
download-icon
update the operator. 2 years ago
file-icon launch_container_dist.sh
818 B
download-icon
update the operator. 2 years ago
file-icon prepro.py
28 KiB
download-icon
update the operator. 2 years ago
file-icon pretrain.py
33 KiB
download-icon
update the operator. 2 years ago
file-icon pretrain_vcr.py
31 KiB
download-icon
update the operator. 2 years ago
file-icon requirements.txt
42 B
download-icon
update the operator. 2 years ago
file-icon train_itm.py
24 KiB
download-icon
update the operator. 2 years ago
file-icon train_itm_v2.py
21 KiB
download-icon
update the operator. 2 years ago
file-icon train_nlvr2.py
18 KiB
download-icon
update the operator. 2 years ago
file-icon train_re.py
18 KiB
download-icon
update the operator. 2 years ago
file-icon train_vcr.py
26 KiB
download-icon
update the operator. 2 years ago
file-icon train_ve.py
17 KiB
download-icon
update the operator. 2 years ago
file-icon train_vqa.py
17 KiB
download-icon
update the operator. 2 years ago