audio-classification

Audio Classification with PANNS

Author: Jael Gu

Description

The audio classification operator classifies the given audio data using the 527 labels of the large-scale AudioSet dataset. The pre-trained model comes from the paper "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition".

import numpy as np
from towhee import ops

audio_classifier = ops.audio_classification.panns()

# Path or url as input
tags, audio_embedding = audio_classifier("/audio/path/or/url/")

# Audio data as input
audio_data = np.zeros((2, 441344))
sample_rate = 44100
tags, audio_embedding = audio_classifier(audio_data, sample_rate)
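For orientation, the zero-filled array above can be read as roughly ten seconds of two-channel audio. This reading assumes the array is laid out as (channels, samples), which this README does not state explicitly. The helper name `describe_waveform` is hypothetical and only illustrates the arithmetic.

```python
def describe_waveform(shape, sample_rate):
    """Return (channels, duration in seconds) for a waveform shape.

    Assumes a (channels, samples) layout; this is an assumption,
    not something the operator's documentation confirms.
    """
    channels, samples = shape
    return channels, samples / sample_rate


# Shape and sample rate taken from the usage snippet above.
channels, seconds = describe_waveform((2, 441344), 44100)
print(f"{channels} channels, {seconds:.2f} s")
```

If the operator expects a different layout, only the unpacking order in the helper would need to change.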

Factory Constructor

Create the operator via the following factory method:

ops.audio_classification.panns()

Interface

Given audio input (a file path, a URL, or a waveform with its sample rate), the audio classification operator returns a list of labels and an embedding vector as a numpy.ndarray.

Parameters:

None.

Returns: list, numpy.ndarray

labels [(tag, score)], audio embedding in shape (2048,).

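Because the labels come back as (tag, score) pairs, a common follow-up is keeping only the strongest tags. The sketch below is a minimal pure-Python example. The helper `top_tags`, its parameters, and the sample scores are all made up for illustration and are not part of the operator.

```python
def top_tags(labels, k=3, min_score=0.0):
    """Return up to k (tag, score) pairs with score >= min_score, highest first."""
    kept = [(tag, score) for tag, score in labels if score >= min_score]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:k]


# Made-up labels for illustration only; real scores come from the operator.
example_labels = [("Speech", 0.62), ("Music", 0.81), ("Dog", 0.05), ("Guitar", 0.40)]
best = top_tags(example_labels, k=2, min_score=0.1)
print(best)
```

With real output, `labels` would be the first value returned by the operator, assuming it follows the [(tag, score)] layout described above.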
Code Example

Generate embeddings for the audio "test.wav".

Write the pipeline in simplified style:

from towhee import dc

(
    dc.glob('test.wav')
      .audio_classification.panns()
      .show()
)

Write the same pipeline with explicit input/output name specifications:

from towhee import dc

(
    dc.glob['path']('test.wav')
      .audio_classification.panns['path', 'vecs']()
      .select('vecs')
      .show()
)
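Once embeddings such as those in the 'vecs' column are available, one typical use is comparing two clips by cosine similarity. The function below is a minimal sketch in plain Python, not part of Towhee. It assumes each embedding can be iterated as a flat sequence of floats, and the name `cosine_similarity` is illustrative.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Tiny stand-in vectors; real embeddings would have 2048 entries.
same = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Values near 1 suggest similar audio content under this model, and values near 0 suggest unrelated content. Any threshold would need to be chosen empirically.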