

Video deduplication with Distill-and-Select

author: Chen Zhang


Description

This operator performs video deduplication based on DnS: Distill-and-Select for Efficient and Accurate Video Indexing and Retrieval.
Trained with knowledge distillation on large, unlabelled datasets, DnS learns: a) Student Networks at different trade-offs between retrieval performance and computational efficiency, and b) a Selection Network that, at test time, rapidly directs samples to the appropriate student to maintain both high retrieval performance and high computational efficiency.
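
To make the division of labor between the students and the Selection Network concrete, here is a minimal sketch of selector-based routing. It is not the DnS implementation. It assumes that a higher selector score means the candidate should be re-scored by a fine-grained student; the function name rerank_with_selector, the fine_score_fn callback and the 0.5 threshold are hypothetical.

```python
from typing import Callable, List, Sequence


def rerank_with_selector(
    coarse_scores: Sequence[float],
    selector_scores: Sequence[float],
    fine_score_fn: Callable[[int], float],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from DnS
) -> List[float]:
    """Return one similarity per candidate.

    The cheap coarse-grained score is kept unless the selector score
    exceeds the threshold, in which case the (expensive) fine-grained
    score is computed for that candidate only.
    """
    final_scores = []
    for idx, (coarse, selected) in enumerate(zip(coarse_scores, selector_scores)):
        if selected > threshold:
            # Assumption: a high selector score requests fine-grained re-scoring.
            final_scores.append(fine_score_fn(idx))
        else:
            final_scores.append(coarse)
    return final_scores
```

With this layout the fine-grained model runs only for the candidates the selector flags, which is where the efficiency gain would come from.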


Code Example

Load a video from the path './demo_video.flv' and decode it with the ffmpeg operator.

Then use the distill_and_select operator to get the output from the specified model.

A fine-grained student model returns a 3-D output that keeps the temporal dimension. A coarse-grained student model returns a 1-D output representing the whole video. A selector model returns a scalar.
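
As a quick sanity check on these output shapes, the sketch below classifies a returned array by its dimensionality. The helper describe_output and its labels are hypothetical and not part of the operator; it only assumes the outputs are NumPy arrays with the dimensionalities described above.

```python
import numpy as np


def describe_output(output) -> str:
    """Guess which model family produced an output, based only on its shape."""
    arr = np.asarray(output)
    if arr.size == 1:
        # A single value, as described for the selector models.
        return 'selector-like scalar'
    if arr.ndim == 1:
        # One vector for the whole video, as described for cg_student.
        return 'coarse-grained video vector'
    if arr.ndim == 3:
        # A 3-D array that keeps the temporal dimension, as described
        # for the fine-grained students.
        return 'fine-grained temporal tensor'
    return 'unrecognized'
```

The shapes used to exercise this helper would be placeholders; the actual sizes depend on the model and the video.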

For the feature_extractor model:

import towhee
towhee.dc(['./demo_video.flv']) \
    .video_decode.ffmpeg(start_time=0.0, end_time=1000.0, sample_type='time_step_sample', args={'time_step': 1}) \
    .runas_op(func=lambda x: [y for y in x]) \
    .distill_and_select(model_name='feature_extractor') \
    .show()

For the fg_att_student model:

import towhee
towhee.dc(['./demo_video.flv']) \
    .video_decode.ffmpeg(start_time=0.0, end_time=1000.0, sample_type='time_step_sample', args={'time_step': 1}) \
    .runas_op(func=lambda x: [y for y in x]) \
    .distill_and_select(model_name='fg_att_student') \
    .show()

For the fg_bin_student model:

import towhee
towhee.dc(['./demo_video.flv']) \
    .video_decode.ffmpeg(start_time=0.0, end_time=1000.0, sample_type='time_step_sample', args={'time_step': 1}) \
    .runas_op(func=lambda x: [y for y in x]) \
    .distill_and_select(model_name='fg_bin_student') \
    .show()

For the cg_student model:

import towhee
towhee.dc(['./demo_video.flv']) \
    .video_decode.ffmpeg(start_time=0.0, end_time=1000.0, sample_type='time_step_sample', args={'time_step': 1}) \
    .runas_op(func=lambda x: [y for y in x]) \
    .distill_and_select(model_name='cg_student') \
    .show()

For the selector_att model:

import towhee
towhee.dc(['./demo_video.flv']) \
    .video_decode.ffmpeg(start_time=0.0, end_time=1000.0, sample_type='time_step_sample', args={'time_step': 1}) \
    .runas_op(func=lambda x: [y for y in x]) \
    .distill_and_select(model_name='selector_att') \
    .show()

For the selector_bin model:

import towhee
towhee.dc(['./demo_video.flv']) \
    .video_decode.ffmpeg(start_time=0.0, end_time=1000.0, sample_type='time_step_sample', args={'time_step': 1}) \
    .runas_op(func=lambda x: [y for y in x]) \
    .distill_and_select(model_name='selector_bin') \
    .show()

Write the same pipeline with explicit input/output name specifications, taking the cg_student model as an example:

import towhee
towhee.dc['path'](['./demo_video.flv']) \
    .video_decode.ffmpeg['path', 'frames'](start_time=0.0, end_time=1000.0, sample_type='time_step_sample', args={'time_step': 1}) \
    .runas_op['frames', 'frames'](func=lambda x: [y for y in x]) \
    .distill_and_select['frames', 'vec'](model_name='cg_student') \
    .show()


Factory Constructor

Create the operator via the following factory method:

distill_and_select(model_name, **kwargs)

Parameters:

model_name: str

One of the following:
feature_extractor: Feature Extractor only,
fg_att_student: Fine Grained Student with attention,
fg_bin_student: Fine Grained Student with binarization,
cg_student: Coarse Grained Student,
selector_att: Selector Network with attention,
selector_bin: Selector Network with binarization.

model_weight_path: str

Default is None, in which case the original pretrained weights are downloaded and used.

feature_extractor: Union[str, nn.Module]

None, 'default' or a PyTorch nn.Module instance.
None means the operator does not extract features from video data and instead expects embedding features as input.
'default' means the operator uses the original pretrained feature-extraction weights and can process video data as input.
Alternatively, pass an nn.Module instance to use as a custom feature extractor.
Default is 'default'.

device: str

Device to run the model on: cpu or cuda.

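The documented values for model_name and device can be checked before the operator is constructed. The sketch below is a hypothetical helper, not part of the operator: check_operator_args and the two constants are names introduced here, and the device check accepts only the two strings listed above.

```python
# Model names as listed in the parameter description above.
VALID_MODEL_NAMES = (
    'feature_extractor',
    'fg_att_student',
    'fg_bin_student',
    'cg_student',
    'selector_att',
    'selector_bin',
)
# Assumption: only the two device strings named in this README are accepted.
VALID_DEVICES = ('cpu', 'cuda')


def check_operator_args(model_name: str, device: str = 'cpu') -> dict:
    """Validate arguments and return them as keyword arguments."""
    if model_name not in VALID_MODEL_NAMES:
        raise ValueError(
            f'Unknown model_name {model_name!r}; expected one of {VALID_MODEL_NAMES}'
        )
    if device not in VALID_DEVICES:
        raise ValueError(f'Unknown device {device!r}; expected one of {VALID_DEVICES}')
    return {'model_name': model_name, 'device': device}
```

The returned dictionary could then be unpacked into the factory call, for example distill_and_select(**check_operator_args('cg_student')).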

Interface

Get the output from your specified model.

Parameters:

data: List[towhee.types.VideoFrame] or Any

The input type is List[VideoFrame] when using the default feature_extractor; otherwise it is whatever type your custom feature_extractor accepts.

Returns: numpy.ndarray

The output of the specified model.

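For deduplication, the returned arrays still have to be compared. The following is a minimal sketch under stated assumptions: that two cg_student outputs are 1-D vectors, and that cosine similarity is a reasonable way to compare them. The function names and the 0.9 threshold are hypothetical and would need tuning on real data.

```python
import numpy as np


def cosine_similarity(vec_a, vec_b) -> float:
    """Cosine similarity between two video-level vectors."""
    a = np.asarray(vec_a, dtype=np.float64).ravel()
    b = np.asarray(vec_b, dtype=np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # Avoid division by zero for all-zero vectors.
        return 0.0
    return float(np.dot(a, b) / denom)


def is_duplicate(vec_a, vec_b, threshold: float = 0.9) -> bool:
    """Flag two videos as duplicates when their similarity reaches the threshold."""
    # Assumption: 0.9 is an illustrative cut-off, not a value from DnS.
    return cosine_similarity(vec_a, vec_b) >= threshold
```
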
More Resources
