Pipeline: Audio Embedding using VGGish

Authors: Jael Gu

Overview

This pipeline extracts features from a given audio file using a VGGish model implemented in PyTorch. VGGish is a supervised model pre-trained on AudioSet, which contains over 2 million sound clips.

Interface

Input Arguments:

audio_path:
- the input audio in .wav
- supported types: str (path to the audio)
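Since the pipeline expects a path to a .wav file, it can help to check the file before passing it in. The following is a minimal sketch using only Python's standard `wave` module; the helper name `is_readable_wav` is hypothetical and not part of Towhee. Note that `wave` only reads uncompressed PCM, so a file rejected here may still be a valid WAV in another encoding.

```python
import wave


def is_readable_wav(audio_path: str) -> bool:
    """Return True if audio_path opens as a non-empty PCM WAV file.

    Hypothetical pre-check, not part of the Towhee API.
    """
    try:
        with wave.open(audio_path, "rb") as wav_file:
            # A file with a valid header but no frames has nothing to embed.
            return wav_file.getnframes() > 0
    except (wave.Error, EOFError, OSError):
        # wave.Error / EOFError: not a parseable PCM WAV;
        # OSError: missing or unreadable path.
        return False
```

A caller could run this check first and skip or report files for which it returns False.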

Pipeline Output:

The pipeline returns a tuple Tuple[('embs', numpy.ndarray)] containing the following field:

embs:
- embeddings of input audio
- data type: numpy.ndarray
- shape: (num_clips,128)
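Because the output has one 128-dimensional row per clip, comparing two audio files usually requires reducing each (num_clips, 128) array to a single vector first. The sketch below shows one common option, mean pooling followed by cosine similarity. The function names are hypothetical, and mean pooling is an illustrative choice rather than something this pipeline prescribes.

```python
import numpy as np


def pool_clips(embs: np.ndarray) -> np.ndarray:
    """Average clip embeddings of shape (num_clips, dim) into one unit-length vector (dim,)."""
    if embs.ndim != 2 or embs.shape[0] == 0:
        raise ValueError("expected a non-empty 2-D array of shape (num_clips, dim)")
    pooled = embs.mean(axis=0)
    norm = np.linalg.norm(pooled)
    # Leave an all-zero vector unchanged to avoid dividing by zero.
    return pooled / norm if norm > 0 else pooled


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity of two 1-D vectors; 0.0 if either has zero length."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0
```

Given the `embs` arrays for two files, `cosine_similarity(pool_clips(embs_a), pool_clips(embs_b))` yields a single score between -1 and 1.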

How to use

Install Towhee

$ pip3 install towhee

You can refer to Getting Started with Towhee for more details. If you have any questions, you can submit an issue to the towhee repository.

Run it with Towhee

>>> from towhee import pipeline

>>> embedding_pipeline = pipeline('towhee/audio-embedding-vggish')
>>> embedding = embedding_pipeline('path/to/your/audio')

How it works

This pipeline includes one main operator: audio-embedding (default: towhee/torch-vggish). The audio embedding operator encodes the audio file and outputs a set of embedding vectors for it.
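To get a feel for how many vectors to expect, the sketch below estimates num_clips from the audio duration. It assumes the operator follows the reference VGGish setup, which splits audio into non-overlapping windows of about 0.96 seconds; this value is an assumption and is not confirmed by this README, so treat the result as a rough estimate. The function name is hypothetical.

```python
# Assumption: reference VGGish window length in seconds, non-overlapping.
# The towhee/torch-vggish operator may use different preprocessing.
CLIP_SECONDS = 0.96


def estimate_num_clips(duration_seconds: float, clip_seconds: float = CLIP_SECONDS) -> int:
    """Roughly estimate how many full clips fit into the given duration."""
    if duration_seconds < 0 or clip_seconds <= 0:
        raise ValueError("duration must be >= 0 and clip length must be > 0")
    # Only complete windows are counted; a trailing partial window is dropped.
    return int(duration_seconds // clip_seconds)
```

Under this assumption, a 10-second file would yield an output of roughly shape (10, 128). The exact count near window boundaries depends on the operator's framing.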
