Audio Classification with PANNS
Author: Jael Gu
Description
The audio classification operator classifies the given audio data with 527 labels from the large-scale AudioSet dataset. The pre-trained model used here is from the paper PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition (paper link).
import numpy as np
from towhee import ops
audio_classifier = ops.audio_classification.panns()
# Path or url as input
tags, audio_embedding = audio_classifier("/audio/path/or/url/")
# Audio data as input
audio_data = np.zeros((2, 441344))
sample_rate = 44100
tags, audio_embedding = audio_classifier(audio_data, sample_rate)
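The waveform input above appears to be laid out as (channels, samples). Assuming that layout, a synthetic test signal can be sketched with NumPy alone. The helper name make_sine_waveform and its parameters are hypothetical and not part of towhee.

```python
import numpy as np


def make_sine_waveform(duration_s, sample_rate=44100, freq_hz=440.0, channels=2):
    """Return a float32 array of shape (channels, samples) holding a sine tone.

    Hypothetical helper for illustration only. The (channels, samples) layout
    is an assumption inferred from the np.zeros((2, 441344)) example.
    """
    n_samples = int(round(duration_s * sample_rate))
    # Time stamps in seconds for each sample
    t = np.arange(n_samples) / sample_rate
    tone = np.sin(2.0 * np.pi * freq_hz * t).astype(np.float32)
    # Copy the same mono tone into every channel
    return np.tile(tone, (channels, 1))


sample_rate = 44100
audio_data = make_sine_waveform(duration_s=1.0, sample_rate=sample_rate)
```

An array built this way could then be passed together with sample_rate in place of the all-zero array, which makes it easier to see whether the predicted tags change with the input.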
Factory Constructor
Create the operator via the following factory method:
ops.audio_classification.panns()
Interface
Given an audio input (file path, URL, or waveform), the audio classification operator generates a list of labels and an embedding vector as a numpy.ndarray.
Parameters:
None.
Returns: list, numpy.ndarray
A list of labels [(tag, score)] and an audio embedding of shape (2048,).
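As a rough sketch of how the two return values might be consumed, the snippet below keeps the highest-scoring tags and checks the embedding shape. The helper top_k_labels is hypothetical, and the labels and embedding are placeholder values standing in for real operator output, not actual results.

```python
import numpy as np


def top_k_labels(labels, k=3):
    """Return the k (tag, score) pairs with the highest scores.

    Hypothetical helper; assumes labels is a list of (tag, score) tuples
    as described in the Interface section.
    """
    return sorted(labels, key=lambda pair: pair[1], reverse=True)[:k]


# Placeholder values for illustration, not real model output
labels = [("Speech", 0.12), ("Music", 0.81), ("Guitar", 0.45), ("Silence", 0.02)]
audio_embedding = np.zeros(2048, dtype=np.float32)

best = top_k_labels(labels, k=2)
for tag, score in best:
    print(f"{tag}: {score:.2f}")

# The embedding is documented as a flat vector of length 2048
assert audio_embedding.shape == (2048,)
```

Sorting explicitly avoids relying on the order in which the operator returns its labels, which the description above does not specify.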
Code Example
Generate embeddings for the audio "test.wav".
Write the pipeline in simplified style:
from towhee import dc
dc.glob('test.wav') \
  .audio_classification.panns() \
  .show()
Write the same pipeline with explicit input/output name specifications:
from towhee import dc
dc.glob['path']('test.wav') \
  .audio_classification.panns['path', 'vecs']() \
  .select('vecs') \
  .show()