omnivore
gexy5
2 years ago
5 changed files with 108 additions and 2 deletions
@ -1,2 +1,108 @@

# omnivore

# Video Classification with Omnivore

*Author: [Xinyu Ge](https://github.com/gexy185)*

<br />

## Description

A video classification operator generates labels (with corresponding scores) and extracts features for the input video.
It transforms the video into frames and loads a pre-trained model by model name.
This operator implements pre-trained models from [Omnivore](https://arxiv.org/abs/2201.08377)
and maps the output vectors to the labels provided by the datasets used for pre-training.
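
As a rough illustration of how an output vector can be mapped to labels with scores, here is a minimal sketch in plain Python. It assumes the model emits one raw value per class and that scores are obtained with a softmax; the operator's actual post-processing is not shown in this README, and the label list and values below are hypothetical.

```python
import math


def softmax(logits):
    """Convert raw model outputs into scores that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical label list and raw outputs, for illustration only.
labels = ['archery', 'bowling', 'juggling']
logits = [2.0, 0.5, -1.0]

scores = softmax(logits)

# Pair each label with its score and rank from most to least likely.
ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
```

The first entry of `ranked` is the most likely label together with its score.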

<br />

## Code Example

Use the pre-trained Omnivore model to classify the video at './archery.mp4' and generate a feature vector for it
([download](https://dl.fbaipublicfiles.com/pytorchvideo/projects/archery.mp4)).

*Write the pipeline in simplified style*:

- Predict labels (default):
```python
import towhee

(
    towhee.glob('./archery.mp4')
          .video_decode.ffmpeg()
          .action_classification.omnivore(
            model_name='omnivore_swinT', topk=5)
          .show()
)
```
<img src="./result1.png" height="px"/>

*Write the same pipeline with explicit input/output name specifications*:

```python
import towhee

(
    towhee.glob['path']('./archery.mp4')
          .video_decode.ffmpeg['path', 'frames']()
          .action_classification.omnivore['frames', ('labels', 'scores', 'features')](
            model_name='omnivore_swinT')
          .select['path', 'labels', 'scores', 'features']()
          .show(formatter={'path': 'video_path'})
)
```

<img src="./result2.png" height="px"/>

<br />

## Factory Constructor

Create the operator via the following factory method:

***action_classification.omnivore(
model_name='tsm_k400_r50_seg8', skip_preprocess=False, classmap=None, topk=5)***

**Parameters:**

***model_name***: *str*

The name of the pre-trained Omnivore model.

Supported model names:
- omnivore_swinT
- omnivore_swinS
- omnivore_swinB
- omnivore_swinB_in21k
- omnivore_swinL_in21k
- omnivore_swinB_epic

***skip_preprocess***: *bool*

Flag to control whether to skip video transforms, defaults to False.
If set to True, the step that transforms videos is skipped.
In this case, the user must guarantee that all input video frames are already preprocessed properly
and can therefore be fed to the model directly.
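
The effect of such a flag can be sketched as follows. This is only an illustration of the control flow: `preprocess`, `prepare_input`, and the mean/std values are hypothetical and do not reflect the transforms the operator really applies.

```python
def preprocess(frames, mean=0.5, std=0.25):
    """Hypothetical transform: scale 0-255 pixel values to 0-1, then normalize."""
    return [[(p / 255.0 - mean) / std for p in frame] for frame in frames]


def prepare_input(frames, skip_preprocess=False):
    """Return model-ready frames, transforming them unless told to skip."""
    # With skip_preprocess=True the caller vouches that frames are model-ready.
    return frames if skip_preprocess else preprocess(frames)


# Two tiny "frames" of two pixels each, for illustration only.
raw = [[0, 255], [127.5, 127.5]]

ready = prepare_input(raw)                              # transformed
untouched = prepare_input(ready, skip_preprocess=True)  # passed through as-is
```

Passing raw frames with `skip_preprocess=True` would hand unnormalized values to the model, which is why the guarantee above rests with the caller.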

***classmap***: *Dict[str, int]*

Dictionary that maps class names to one-hot vectors.
If not given, the operator loads the default class map dictionary.
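
A minimal sketch of how a class map of this shape might be used, assuming (as the type annotation suggests) that each class name maps to an integer index. The entries below are hypothetical and are not taken from the default class map.

```python
# Hypothetical class map: class name -> integer class index.
classmap = {'archery': 0, 'bowling': 1, 'juggling': 2}

# Invert it once so a predicted index can be turned back into a name.
index_to_name = {index: name for name, index in classmap.items()}

# Indices as a model might predict them, best first (illustrative values).
predicted_indices = [2, 0]
predicted_names = [index_to_name[i] for i in predicted_indices]
```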

***topk***: *int*

The number of top-scoring labels and scores to include in the result. The default value is 5.
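
Top-k selection itself is simple to sketch with the standard library. The helper `top_k` below is hypothetical and only illustrates the idea; it is not the operator's own code, and the labels and scores are made up.

```python
import heapq


def top_k(labels, scores, k=5):
    """Return the k highest-scoring labels and their scores, best first."""
    best = heapq.nlargest(k, zip(scores, labels))
    return [label for _, label in best], [score for score, _ in best]


# Illustrative labels and scores only.
labels = ['archery', 'bowling', 'juggling', 'surfing']
scores = [0.70, 0.05, 0.20, 0.05]

top_labels, top_scores = top_k(labels, scores, k=2)
```

If `k` exceeds the number of classes, `heapq.nlargest` simply returns every entry in ranked order.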

## Interface

A video classification operator generates a list of class labels
and a corresponding vector in numpy.ndarray for the given input video data.

**Parameters:**

***video***: *Union[str, numpy.ndarray]*

Input video data, given as a local path string or as video frames in an ndarray.


**Returns**: *(list, list, torch.Tensor)*

A tuple of (labels, scores, features),
which contains lists of predicted class names and corresponding scores, along with the extracted features.

Binary file not shown.
After Width: | Height: | Size: 32 KiB |
After Width: | Height: | Size: 114 KiB |