diff --git a/README.md b/README.md
index 1b3c305..e9b0963 100644
--- a/README.md
+++ b/README.md
@@ -13,7 +13,7 @@ The pre-trained model used here is from the paper **PANNs: Large-Scale Pretraine
 
 ## Code Example
 
-Predict labels and generate embeddings given the audio path "test.wav". 
+Predict labels and generate embeddings given the audio path "test.wav".
 
 *Write the pipeline in simplified style*:
 
@@ -25,7 +25,7 @@ import towhee
       .audio_decode.ffmpeg()
       .runas_op(func=lambda x:[y[0] for y in x])
       .audio_classification.panns()
-      .show() 
+      .show()
 )
 ```
 
@@ -39,9 +39,11 @@ import towhee
       .audio_decode.ffmpeg['path', 'frames']()
       .runas_op['frames', 'frames'](func=lambda x:[y[0] for y in x])
       .audio_classification.panns['frames', ('labels', 'scores', 'vec')]()
-      .show()
+      .select['path', 'labels', 'scores', 'vec']()
+      .show()
 )
 ```
+
@@ -93,4 +95,3 @@ The input data should represent for an audio longer than 2s.
 - labels: a list of topk predicted labels by model.
 - scores: a list of scores corresponding to labels, representing for possibility.
 - vec: a audio embedding generated by model, shape of which is (2048,)
-
diff --git a/result.png b/result.png
new file mode 100644
index 0000000..f7861af
Binary files /dev/null and b/result.png differ
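The substantive change in this patch is the `.select['path', 'labels', 'scores', 'vec']()` step inserted before `.show()` in the explicit-style pipeline, which restricts the displayed output to the named columns. Below is a minimal illustrative sketch of that column-selection idea using plain Python dicts; the `select` helper and the sample row are hypothetical stand-ins, not towhee's actual `DataCollection` implementation:

```python
def select(rows, *cols):
    """Keep only the named fields of each row (hypothetical helper,
    mimicking the effect of .select['path', 'labels', 'scores', 'vec']())."""
    return [{c: row[c] for c in cols} for row in rows]

# A made-up row shaped like one record of the pipeline's output:
# intermediate 'frames' data alongside the final results.
rows = [
    {
        "path": "test.wav",
        "frames": [b"..."],          # decoded audio frames (intermediate)
        "labels": ["Speech"],        # topk predicted labels
        "scores": [0.91],            # scores corresponding to labels
        "vec": [0.1] * 4,            # audio embedding (shape (2048,) in practice)
    },
]

# Only the selected columns survive; 'frames' is dropped before display.
shown = select(rows, "path", "labels", "scores", "vec")
print(sorted(shown[0].keys()))
```

This mirrors why the patch adds the step: without `.select`, `.show()` would also render the bulky intermediate `frames` column.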