|
@@ -21,7 +21,7 @@ Generate embeddings for the audio "test.wav". |
|
|
|
|
|
|
|
|
*Write the same pipeline with explicit input/output name specifications:* |
|
|
*Write the same pipeline with explicit input/output name specifications:* |
|
|
|
|
|
|
|
|
- option 1 (towhee>=0.9.0): |
|
|
|
|
|
|
|
|
- **option 1 (towhee>=0.9.0):** |
|
|
```python |
|
|
```python |
|
|
from towhee.dc2 import pipe, ops, DataCollection |
|
|
from towhee.dc2 import pipe, ops, DataCollection |
|
|
|
|
|
|
|
@@ -36,7 +36,7 @@ DataCollection(p('test.wav')).show() |
|
|
``` |
|
|
``` |
|
|
<img src="./result.png" width="800px"/> |
|
|
<img src="./result.png" width="800px"/> |
|
|
|
|
|
|
|
|
- option 2: |
|
|
|
|
|
|
|
|
- **option 2:** |
|
|
```python |
|
|
```python |
|
|
import towhee |
|
|
import towhee |
|
|
|
|
|
|
|
@@ -55,18 +55,17 @@ import towhee |
|
|
|
|
|
|
|
|
Create the operator via the following factory method: |
|
|
Create the operator via the following factory method: |
|
|
|
|
|
|
|
|
***audio_embedding.nnfp(params=None, model_path=None, framework='pytorch')*** |
|
|
|
|
|
|
|
|
***audio_embedding.nnfp(model_name='nnfp_default', model_path=None, framework='pytorch')*** |
|
|
|
|
|
|
|
|
**Parameters:** |
|
|
**Parameters:** |
|
|
|
|
|
|
|
|
*params: dict* |
|
|
|
|
|
|
|
|
*model_name: str* |
|
|
|
|
|
|
|
|
A dictionary of model parameters. If None, it will use default parameters to create model. |
|
|
|
|
|
|
|
|
The model name, which selects the set of parameters used to create the nnfp model. |
|
|
|
|
|
|
|
|
*model_path: str* |
|
|
*model_path: str* |
|
|
|
|
|
|
|
|
The path to the model. If None, the default model weights are loaded. |
|
|
The path to the model. If None, the default model weights are loaded. |
|
|
If the path ends with '.onnx', the operator uses ONNX inference. |
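
The '.onnx' rule for *model_path* can be illustrated with a minimal sketch. The helper `select_backend` below is hypothetical and is not part of the operator's API; it only mirrors the documented behaviour that a path ending in '.onnx' switches the operator to ONNX inference, and it assumes that any other path (or None) keeps the requested framework.

```python
from typing import Optional


def select_backend(model_path: Optional[str], framework: str = 'pytorch') -> str:
    """Hypothetical helper: choose an inference backend from the model path."""
    # A path ending in '.onnx' selects ONNX inference, as the description states.
    if model_path is not None and model_path.endswith('.onnx'):
        return 'onnx'
    # Assumption: any other path, or None (default weights), keeps the framework.
    return framework
```

For example, `select_backend('weights/nnfp.onnx')` returns `'onnx'`, while `select_backend(None)` returns `'pytorch'`.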
|
|
|
|
|
|
|
|
|
|
|
*framework: str* |
|
|
*framework: str* |
|
|
|
|
|
|
|
@@ -88,7 +87,6 @@ An audio embedding operator generates vectors in numpy.ndarray given towhee audi |
|
|
Input audio data is a list of towhee audio frames. |
|
|
Input audio data is a list of towhee audio frames. |
|
|
The audio input should be at least 1 second long. |
|
|
The audio input should be at least 1 second long. |
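
The minimum-length requirement can be checked before calling the operator. The sketch below is illustrative only: `is_long_enough` is a hypothetical helper, and it assumes the total sample count and the sample rate of the decoded audio are already known. How those values are obtained from towhee audio frames is not covered here.

```python
def is_long_enough(num_samples: int, sample_rate: int, min_seconds: float = 1.0) -> bool:
    """Hypothetical helper: check that the audio lasts at least `min_seconds`."""
    if sample_rate <= 0:
        raise ValueError('sample_rate must be positive')
    # Duration in seconds = number of samples per channel / samples per second.
    return num_samples / sample_rate >= min_seconds
```

With a 16 kHz signal, 16000 samples pass the check and 8000 samples (0.5 s) do not.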
|
|
|
|
|
|
|
|
|
|
|
|
|
|
**Returns**: |
|
|
**Returns**: |
|
|
|
|
|
|
|
|
*numpy.ndarray* |
|
|
*numpy.ndarray* |
|
@@ -96,6 +94,7 @@ The audio input should be at least 1s. |
|
|
Audio embeddings with shape (num_clips, 128). |
|
|
Audio embeddings with shape (num_clips, 128). |
|
|
Each embedding represents the features of a 1-second audio clip. |
|
|
Each embedding represents the features of a 1-second audio clip. |
|
|
|
|
|
|
|
|
|
|
|
<br /> |
|
|
|
|
|
|
|
|
***save_model(format='pytorch', path='default')*** |
|
|
***save_model(format='pytorch', path='default')*** |
|
|
|
|
|
|
|
|