Audio Embedding

Description

The audio embedding pipeline converts input audio into a dense vector that represents the audio clip's semantics. Each vector represents an audio clip with a fixed length of around 0.9s. This pipeline is built on top of VGGish with PyTorch.
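To make the fixed clip length concrete, here is a rough sketch of how many vectors to expect for a recording of a given duration. It is not part of the pipeline: the helper name `expected_num_embeddings` is hypothetical, the default of 0.96 s is an assumption (the window length commonly cited for VGGish, consistent with "around 0.9s" above), and it assumes non-overlapping clips with any trailing partial clip dropped. The actual pipeline may pad or segment differently.

```python
import math


def expected_num_embeddings(duration_s: float, clip_s: float = 0.96) -> int:
    """Estimate how many fixed-length clips fit in `duration_s` seconds.

    Assumptions (not confirmed by the pipeline's documentation):
    - clips do not overlap,
    - a trailing remainder shorter than one clip is discarded,
    - clip_s defaults to 0.96 s, the window commonly cited for VGGish.
    """
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    if clip_s <= 0:
        raise ValueError("clip_s must be positive")
    return math.floor(duration_s / clip_s)


if __name__ == "__main__":
    # A 10-second recording under the assumptions above.
    print(expected_num_embeddings(10.0))
```

Under these assumptions a clip shorter than one window would yield no vectors, so it may be worth checking the length of very short recordings before embedding them.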

Code Example

Create an audio embedding pipeline with the default configuration.

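from towhee import AutoPipes

# Build the pipeline with its default configuration.
p = AutoPipes.pipeline('audio-embedding')
# Run it on a local audio file, then fetch the result.
res = p('test.wav')
res.get()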

Interface

AudioEmbeddingConfig

Some of these parameters come from the audio_decode.ffmpeg and audio_embedding.vggish operators; see those operators for details.

weights_path: str

The path to the model weights. If None, the default model weights are loaded.

framework: str

The framework of the model implementation. The default value is "pytorch" since the model is implemented in PyTorch.

device: int

The index of the GPU device to use. Defaults to -1, which means the CPU is used.
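The `device` convention above (-1 for CPU, otherwise a GPU index) can be illustrated with a small sketch. The helper `device_to_torch_name` is hypothetical and not part of Towhee; it only shows one plausible way such an integer could map to a PyTorch-style device string, and the pipeline's own handling may differ.

```python
def device_to_torch_name(device: int = -1) -> str:
    """Map an integer device setting to a PyTorch-style device string.

    Hypothetical helper for illustration only:
    -1 means CPU, and any non-negative value is treated as a GPU index.
    """
    if device < -1:
        raise ValueError("device must be -1 (CPU) or a non-negative GPU index")
    return "cpu" if device == -1 else f"cuda:{device}"


if __name__ == "__main__":
    print(device_to_torch_name())   # default setting
    print(device_to_torch_name(0))  # first GPU
```

Rejecting values below -1 is a design choice in this sketch, so that a typo in the setting fails early instead of silently falling back to the CPU.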

More Resources

Exploring Multimodal Embeddings with FiftyOne and Milvus - Zilliz blog: This post explored how multimodal embeddings work with Voxel51 and Milvus.
How to Get the Right Vector Embeddings - Zilliz blog: A comprehensive introduction to vector embeddings and how to generate them with popular open-source models.
Audio Retrieval Based on Milvus - Zilliz blog: Create an audio retrieval system using Milvus, an open-source vector database. Classify and analyze sound data in real time.
Vector Database Use Case: Audio Similarity Search - Zilliz: Building agile and reliable audio similarity search with Zilliz vector database (fully managed Milvus).
Sparse and Dense Embeddings: A Guide for Effective Information Retrieval with Milvus | Zilliz Webinar: Zilliz webinar covering what sparse and dense embeddings are and when you'd want to use one over the other.
Understanding Neural Network Embeddings - Zilliz blog: This article is dedicated to going a bit more in-depth into embeddings/embedding vectors, along with how they are used in modern ML algorithms and pipelines.
An Introduction to Vector Embeddings: What They Are and How to Use Them - Zilliz blog: In this blog post, we will understand the concept of vector embeddings and explore its applications, best practices, and tools for working with embeddings.
