towhee

Audio Embedding

Description

The audio embedding pipeline converts input audio into dense vectors that represent the semantics of the audio. Each vector represents a fixed-length audio clip of around 0.9s. The pipeline is built on top of VGGish with PyTorch.
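Because each vector covers a fixed-length clip, the number of vectors grows with the length of the input. The sketch below estimates that count. It assumes the VGGish reference settings (16 kHz input, 0.96 s non-overlapping windows, trailing partial window dropped); these values and the helper name `count_embeddings` are illustrative assumptions, not taken from this pipeline's code.

```python
# Assumptions (not confirmed by this README): VGGish reference settings.
SAMPLE_RATE_HZ = 16_000   # assumed input sample rate after decoding
WINDOW_SECONDS = 0.96     # assumed clip length ("around 0.9s" in the text)


def count_embeddings(num_samples: int,
                     sample_rate_hz: int = SAMPLE_RATE_HZ,
                     window_seconds: float = WINDOW_SECONDS) -> int:
    """Return how many full, non-overlapping windows fit in the audio.

    Hypothetical helper: a trailing partial window is dropped.
    """
    if num_samples < 0:
        raise ValueError("num_samples must be non-negative")
    window_samples = round(window_seconds * sample_rate_hz)
    return num_samples // window_samples


# A 10-second clip at the assumed sample rate.
print(count_embeddings(10 * SAMPLE_RATE_HZ))
```

Under these assumptions, audio shorter than one window yields no vectors, so very short inputs may need padding before embedding.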

Code Example

  • Create an audio embedding pipeline with the default configuration.
from towhee import AutoPipes

p = AutoPipes.pipeline('audio-embedding')
res = p('test.wav')
res.get()

Interface

AudioEmbeddingConfig

Some of these parameters are documented in the audio_decode.ffmpeg and audio_embedding.vggish operators.

weights_path: str

The path to model weights. If None, it will load default model weights.

framework: str

The framework of the model implementation. The default value is "pytorch", since the model is implemented in PyTorch.

device: int

The index of the GPU device to use. Defaults to -1, which means the CPU is used.
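The three fields and their defaults can be summarized in a small stand-in class. This is a minimal sketch using a standard-library dataclass, not the pipeline's actual config class; the names `AudioEmbeddingConfigSketch` and `device_string` are hypothetical, and the mapping of a GPU index to a `cuda:N` string follows the usual PyTorch convention rather than anything stated in this README.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioEmbeddingConfigSketch:
    """Hypothetical mirror of the documented fields and defaults."""
    weights_path: Optional[str] = None  # None: load default model weights
    framework: str = "pytorch"          # model implementation framework
    device: int = -1                    # -1: CPU, otherwise a GPU index


def device_string(device: int) -> str:
    """Map the integer device field to a PyTorch-style device name (assumed)."""
    if device < -1:
        raise ValueError("device must be -1 (CPU) or a non-negative GPU index")
    return "cpu" if device == -1 else f"cuda:{device}"


default_conf = AudioEmbeddingConfigSketch()
gpu_conf = AudioEmbeddingConfigSketch(device=0)
print(device_string(default_conf.device), device_string(gpu_conf.device))
```

Keeping the device as an integer with -1 as the CPU sentinel matches the field description above; the string conversion is only needed at the point where a framework expects a named device.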
