Audio Classification with PANNs

Author: Jael Gu

Description

The audio classification operator classifies the given audio data with 527 labels from the large-scale AudioSet dataset. The pre-trained model used here is from the paper "PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition".

import numpy as np
from towhee import ops

audio_classifier = ops.audio_classification.panns()

# Path or url as input
tags, audio_embedding = audio_classifier("/audio/path/or/url/")

# Audio data as input
audio_data = np.zeros((2, 441344))
sample_rate = 44100
tags, audio_embedding = audio_classifier(audio_data, sample_rate)
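To try the waveform input with something other than silence, a test signal can be generated with NumPy. The following is a minimal sketch, assuming the operator accepts arrays shaped (channels, samples) as in the example above; the helper make_stereo_tone and its parameters are hypothetical and not part of Towhee.

```python
import numpy as np

def make_stereo_tone(freq_hz=440.0, duration_s=1.0, sample_rate=44100):
    """Return a (2, n_samples) sine wave with the same tone on both channels."""
    n_samples = int(round(duration_s * sample_rate))
    t = np.arange(n_samples) / sample_rate           # time axis in seconds
    mono = 0.5 * np.sin(2.0 * np.pi * freq_hz * t)   # amplitude kept below 1.0
    return np.stack([mono, mono])                    # shape: (channels, samples)

sample_rate = 44100
audio_data = make_stereo_tone(sample_rate=sample_rate)
print(audio_data.shape)  # (2, 44100)

# audio_data and sample_rate could then be passed to the operator, e.g.
# tags, audio_embedding = audio_classifier(audio_data, sample_rate)
```

The call to the operator is left commented out so the sketch runs without Towhee installed.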

Factory Constructor

Create the operator via the following factory method:

ops.audio_classification.panns()

Interface

Given an audio input (file path, URL, or waveform), the audio classification operator generates a list of labels and an embedding vector as a numpy.ndarray.

Parameters:

None.

Returns: list, numpy.ndarray

A list of labels [(tag, score)] and an audio embedding of shape (2048,).
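Since the labels come back as (tag, score) pairs, a common follow-up step is to keep only the highest-scoring tags. The sketch below illustrates this with made-up values standing in for the operator's output; the tag names, scores, and the helper top_k_tags are hypothetical, and it assumes scores are floats where higher means more confident.

```python
import numpy as np

# Hypothetical stand-ins for the operator's two outputs.
tags = [("Speech", 0.71), ("Music", 0.12), ("Dog", 0.05), ("Silence", 0.02)]
audio_embedding = np.zeros(2048)

def top_k_tags(tags, k=3, min_score=0.0):
    """Sort (tag, score) pairs by score, drop low scores, keep the first k."""
    ranked = sorted(tags, key=lambda pair: pair[1], reverse=True)
    return [(tag, score) for tag, score in ranked if score >= min_score][:k]

best = top_k_tags(tags, k=2, min_score=0.1)
print(best)                   # [('Speech', 0.71), ('Music', 0.12)]
print(audio_embedding.shape)  # (2048,)
```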

Code Example

Generate embeddings for the audio "test.wav".

Write the pipeline in simplified style:

from towhee import dc

# Wrap the chain in parentheses so Python accepts the line breaks
(
    dc.glob('test.wav')
      .audio_classification.panns()
      .show()
)

Write the same pipeline with explicit input/output name specifications:

from towhee import dc

# Wrap the chain in parentheses so Python accepts the line breaks
(
    dc.glob['path']('test.wav')
      .audio_classification.panns['path', 'vecs']()
      .select('vecs')
      .show()
)
