omnivore
copied
gexy5
2 years ago
5 changed files with 108 additions and 2 deletions
@ -1,2 +1,108 @@ |
|||
# omnivore |
|||
# Video Classification with Omnivore |
|||
|
|||
*Author: [Xinyu Ge](https://github.com/gexy185)* |
|||
|
|||
<br /> |
|||
|
|||
## Description |
|||
|
|||
A video classification operator generates labels (and corresponding scores) and extracts features for the input video. |
|||
It transforms the video into frames and loads pre-trained models by model names. |
|||
This operator implements pre-trained models from [Omnivore](https://arxiv.org/abs/2201.08377) |
|||
and maps output vectors to the labels of the datasets used for pre-training. |
|||
|
|||
<br /> |
|||
|
|||
## Code Example |
|||
|
|||
Use the pretrained Omnivore model to classify and generate a vector for the given video path './archery.mp4' |
|||
([download](https://dl.fbaipublicfiles.com/pytorchvideo/projects/archery.mp4)). |
|||
|
|||
*Write the pipeline in simplified style*: |
|||
|
|||
- Predict labels (default): |
|||
```python |
|||
import towhee |
|||
|
|||
( |
|||
towhee.glob('./archery.mp4') |
|||
.video_decode.ffmpeg() |
|||
.action_classification.omnivore( |
|||
model_name='omnivore_swinT', topk=5) |
|||
.show() |
|||
) |
|||
``` |
|||
<img src="./result1.png" height="px"/> |
|||
|
|||
*Write the same pipeline with explicit input/output name specifications*: |
|||
|
|||
```python |
|||
import towhee |
|||
|
|||
( |
|||
towhee.glob['path']('./archery.mp4') |
|||
.video_decode.ffmpeg['path', 'frames']() |
|||
.action_classification.omnivore['frames', ('labels', 'scores', 'features')]( |
|||
model_name='omnivore_swinT') |
|||
.select['path', 'labels', 'scores', 'features']() |
|||
.show(formatter={'path': 'video_path'}) |
|||
) |
|||
``` |
|||
|
|||
<img src="./result2.png" height="px"/> |
|||
|
|||
<br /> |
|||
|
|||
## Factory Constructor |
|||
|
|||
Create the operator via the following factory method: |
|||
|
|||
***action_classification.omnivore( |
|||
model_name='omnivore_swinT', skip_preprocess=False, classmap=None, topk=5)*** |
|||
|
|||
**Parameters:** |
|||
|
|||
***model_name***: *str* |
|||
|
|||
The name of the pre-trained Omnivore model. |
|||
|
|||
Supported model names: |
|||
- omnivore_swinT |
|||
- omnivore_swinS |
|||
- omnivore_swinB |
|||
- omnivore_swinB_in21k |
|||
- omnivore_swinL_in21k |
|||
- omnivore_swinB_epic |
|||
|
|||
***skip_preprocess***: *bool* |
|||
|
|||
Flag to control whether to skip video transforms, defaults to False. |
|||
If set to True, the step to transform videos will be skipped. |
|||
In this case, the user must ensure that all input video frames are already preprocessed properly |
|||
and can therefore be fed to the model directly. |
|||
|
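As a rough illustration of what "preprocessed properly" could involve when `skip_preprocess=True`, the sketch below shows two steps that video models commonly expect: picking evenly spaced frames from a clip and normalizing pixel values. The function names, the sample counts, and the mean/std values are assumptions chosen for illustration; they are not taken from this operator's source.

```python
from typing import List


def uniform_sample_indices(num_frames: int, num_samples: int) -> List[int]:
    """Pick `num_samples` frame indices spread evenly over a clip (hypothetical helper)."""
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    if num_samples == 1:
        return [0]
    # Spread the samples from the first frame to the last frame inclusive.
    step = (num_frames - 1) / (num_samples - 1)
    return [int(round(i * step)) for i in range(num_samples)]


def normalize_pixel(value: int, mean: float, std: float) -> float:
    """Scale an 8-bit pixel value to [0, 1], then standardize it.

    The mean/std passed in are placeholders, not the operator's real statistics.
    """
    return (value / 255.0 - mean) / std


if __name__ == "__main__":
    print(uniform_sample_indices(10, 4))
    print(normalize_pixel(255, mean=0.5, std=0.5))
```

If the frames are prepared this way outside the operator, the operator's own transform step would be redundant, which is the situation `skip_preprocess=True` is meant for.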
|||
***classmap***: *Dict[str, int]* |
|||
|
|||
Dictionary that maps class names to integer class indices. |
|||
If not given, the operator will load the default class map dictionary. |
|||
|
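To make the shape of a custom `classmap` concrete, here is a minimal sketch that reads the parameter's type annotation as a mapping from class-name strings to integers. The class names and indices are made up for illustration, and `invert_classmap` is a hypothetical helper, not part of the operator.

```python
from typing import Dict

# Hypothetical class map: class name -> integer index (values are illustrative only).
classmap: Dict[str, int] = {
    "archery": 0,
    "bowling": 1,
    "juggling balls": 2,
}


def invert_classmap(name_to_index: Dict[str, int]) -> Dict[int, str]:
    """Build the index -> name lookup needed to turn model outputs into labels."""
    index_to_name: Dict[int, str] = {}
    for name, index in name_to_index.items():
        if index in index_to_name:
            raise ValueError(f"duplicate class index: {index}")
        index_to_name[index] = name
    return index_to_name


if __name__ == "__main__":
    print(invert_classmap(classmap))
```

Rejecting duplicate indices keeps the lookup unambiguous when predictions are mapped back to names.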
|||
***topk***: *int* |
|||
|
|||
The number of top-scoring labels and scores to include in the result. The default value is 5. |
|||
|
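The effect of `topk` can be sketched in plain Python: rank the classes by score and keep the best `k`. Whether the operator applies a softmax before ranking is an assumption here, and `softmax` and `top_k` are hypothetical helpers written for this illustration.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores to probabilities (subtracting the max for numerical stability)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits: List[float], index_to_name: Dict[int, str], k: int = 5) -> Tuple[List[str], List[float]]:
    """Return the k highest-scoring labels and their probabilities, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    labels = [index_to_name[i] for i in ranked]
    scores = [probs[i] for i in ranked]
    return labels, scores


if __name__ == "__main__":
    # Made-up logits and class names, for illustration only.
    names = {0: "archery", 1: "bowling", 2: "juggling balls"}
    print(top_k([2.0, 0.5, 1.0], names, k=2))
```

With `k` larger than the number of classes, the slice simply returns every class in ranked order.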
|||
## Interface |
|||
|
|||
Given input video data, the operator generates a list of class labels with corresponding scores |
|||
and a feature vector for the video. |
|||
|
|||
**Parameters:** |
|||
|
|||
***video***: *Union[str, numpy.ndarray]* |
|||
|
|||
Input video data, given either as a local path (string) or as video frames (ndarray). |
|||
|
|||
|
|||
**Returns**: *(list, list, torch.Tensor)* |
|||
|
|||
A tuple of (labels, scores, features), |
|||
which contains a list of predicted class names, a list of corresponding scores, and the extracted video features. |
|||
|
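A caller will usually unpack the returned tuple and pair each label with its score. The sketch below does this for a hand-written stand-in result; the labels, scores, and the short feature list are placeholder values (a real call returns a tensor of features), and `summarize` is a hypothetical helper.

```python
from typing import Dict, List, Sequence, Tuple


def summarize(result: Tuple[List[str], List[float], Sequence[float]]) -> Dict[str, float]:
    """Pair each predicted label with its score; the features are left untouched."""
    labels, scores, _features = result
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    return {label: float(score) for label, score in zip(labels, scores)}


if __name__ == "__main__":
    # Stand-in for an operator result; every value here is invented for illustration.
    fake_result = (["archery", "bowling"], [0.9, 0.1], [0.0, 0.0, 0.0])
    print(summarize(fake_result))
```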
Binary file not shown.
After Width: | Height: | Size: 32 KiB |
After Width: | Height: | Size: 114 KiB |