diff --git a/README.md b/README.md
index acd0d2d..8abd1ee 100644
--- a/README.md
+++ b/README.md
@@ -1,2 +1,76 @@
-# panns
+# Audio Classification with PANNS
+
+*Author: Jael Gu*
+
+
+## Description
+
+The audio classification operator classifies the given audio data against 527 labels from the large-scale [AudioSet dataset](https://research.google.com/audioset/).
+The pre-trained model used here is from the paper **PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition** ([paper link](https://arxiv.org/abs/1912.10211)).
+
+```python
+import numpy as np
+from towhee import ops
+
+audio_classifier = ops.audio_classification.panns()
+
+# Path or URL as input
+tags, audio_embedding = audio_classifier("/audio/path/or/url/")
+
+# Audio data as input
+audio_data = np.zeros((2, 441344))
+sample_rate = 44100
+tags, audio_embedding = audio_classifier(audio_data, sample_rate)
+```
+
+## Factory Constructor
+
+Create the operator via the following factory method:
+
+***ops.audio_classification.panns()***
+
+
+## Interface
+
+Given an audio input (file path, URL, or waveform),
+the audio classification operator generates a list of labels
+and an audio embedding as a numpy.ndarray.
+
+
+**Parameters:**
+
+None.
+
+
+**Returns:** *list*, *numpy.ndarray*
+
+A list of labels as (tag, score) tuples, and the audio embedding in shape (2048,).
+
+
+
+## Code Example
+
+Generate embeddings for the audio "test.wav".
+
+*Write the pipeline in simplified style:*
+
+```python
+from towhee import dc
+
+dc.glob('test.wav') \
+  .audio_classification.panns() \
+  .show()
+```
+
+*Write the same pipeline with explicit input/output name specifications:*
+
+```python
+from towhee import dc
+
+dc.glob['path']('test.wav') \
+  .audio_classification.panns['path', 'vecs']() \
+  .select('vecs') \
+  .show()
+```
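
As a reference for the waveform input described in the README above, here is a minimal sketch of decoding a local file into a (channels, samples) array and passing it to the operator. It assumes `librosa` is available for decoding; the file name `example.wav` is hypothetical.

```python
import librosa
import numpy as np
from towhee import ops

# Decode the file; sr=None keeps the native sample rate, mono=False keeps all channels.
waveform, sample_rate = librosa.load("example.wav", sr=None, mono=False)

# Make sure the array is 2-D (channels, samples), matching the waveform example in the README.
if waveform.ndim == 1:
    waveform = np.expand_dims(waveform, 0)

audio_classifier = ops.audio_classification.panns()
tags, audio_embedding = audio_classifier(waveform, sample_rate)

# Assuming tags is a list of (tag, score) pairs as documented, print the top 5 labels.
print(sorted(tags, key=lambda t: t[1], reverse=True)[:5])
print(audio_embedding.shape)  # expected: (2048,)
```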