@@ -15,7 +15,7 @@ The [default model weight](clmr_checkpoint_10000.pt) provided is pretrained on [

## Code Example

Generate embeddings for the audio "test.wav".

*Write the pipeline in simplified style*:

@@ -42,6 +42,7 @@ import towhee
.audio_decode.ffmpeg['path', 'frames']()
.runas_op['frames', 'frames'](func=lambda x:[y[0] for y in x])
.audio_embedding.clmr['frames', 'vecs']()
.select['path', 'vecs']()
.show()
)
```

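The `runas_op` step in the pipeline keeps only the first element of every decoded frame before passing the result to the embedding operator. A minimal sketch of that function on made-up data follows; the `(data, sample_rate)` pair layout and the name `take_first` are assumptions for illustration only, not the decoder's documented output.

```python
def take_first(x):
    # Equivalent to the lambda passed to runas_op: lambda x: [y[0] for y in x]
    return [y[0] for y in x]


# Hypothetical stand-in for decoded frames: each item is a (data, sample_rate) pair.
frames = [("frame0", 22050), ("frame1", 22050), ("frame2", 22050)]

first_elements = take_first(frames)
print(first_elements)  # ['frame0', 'frame1', 'frame2']
```
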
@@ -91,4 +92,4 @@ The input data should represent for an audio longer than 2s.

*numpy.ndarray*

Audio embeddings in shape (num_clips, 512).

Each embedding stands for features of an audio clip with length of 2s.

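Since the output holds one 512-dimensional vector per 2-second clip, a downstream step often needs a single vector for the whole file. The sketch below shows one possible way to do that, mean pooling followed by L2 normalization. This is an assumption about usage, not something the operator does itself, and the random array only stands in for real output so the example runs without the model.

```python
import numpy as np

# Placeholder for the operator output: 3 clips, 512 dimensions each.
# Random values are used only so the sketch runs without the model.
rng = np.random.default_rng(0)
vecs = rng.standard_normal((3, 512)).astype(np.float32)

num_clips, dim = vecs.shape

# Average over the clip axis to get one vector for the whole audio file.
track_vec = vecs.mean(axis=0)

# L2-normalize so that dot products between tracks act as cosine similarity.
track_vec = track_vec / np.linalg.norm(track_vec)

print(num_clips, dim, track_vec.shape)
```

Mean pooling discards the order of the clips. If temporal structure matters, keep the full `(num_clips, 512)` array instead.
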