An action classification operator generates labels of human activities (with corresponding scores)
and extracts features for the input video.
It transforms the video into frames and loads pre-trained models by model names.
-This operator has implemented pre-trained models from [TimeSformer](https://arxiv.org/abs/2102.05095)
+This operator has implemented pre-trained models from [VideoSwinTransformer](https://arxiv.org/abs/2106.13230)
and maps vectors with labels.
<br/>
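The label-and-score output described above can be illustrated with a minimal, dependency-free sketch. It is not the operator's implementation: it assumes the model produces one raw score (logit) per class and that the scores are turned into probabilities with a softmax before the top-ranked labels are returned. The function names `softmax` and `top_k_labels`, the label names, and the logit values are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits, labels, k=5):
    """Return the k highest-scoring (label, score) pairs, best first."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    scores = softmax(logits)
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical labels and logits, for illustration only.
example_labels = ["archery", "bowling", "golf putting"]
example_logits = [4.0, 1.0, 0.5]

predictions = top_k_labels(example_logits, example_labels, k=2)
for label, score in predictions:
    print(f"{label}: {score:.3f}")
```

Subtracting the largest logit before exponentiating leaves the probabilities unchanged and avoids overflow for large scores.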
## Code Example
-Use the pretrained TimeSformer model ('timesformer_k400_8x224')
+Use the pretrained VideoSwinTransformer model ('swin_tiny_patch244_window877_kinetics400_1k')
to classify and generate a vector for the given video path './archery.mp4' ([download](https://dl.fbaipublicfiles.com/pytorchvideo/projects/archery.mp4)).
*Write the pipeline in simplified style*:
@@ -25,9 +25,8 @@ to classify and generate a vector for the given video path './archery.mp4' ([dow