# Audio Embedding with CLMR

*Author: Jael Gu*

## Description

The audio embedding operator converts an input audio clip into a dense vector that represents the clip's semantics. The operator is built on top of the original implementation of [CLMR](https://github.com/Spijkervet/CLMR). The [default model weights](./checkpoints/clmr_checkpoint_10000.pt) are pretrained on the [MagnaTagATune dataset](https://paperswithcode.com/dataset/magnatagatune) with [SampleCNN](./models/sample_cnn.py).

```python
import numpy as np
from towhee import ops

audio_encoder = ops.audio_embedding.clmr()

# Path or URL as input
audio_embedding = audio_encoder("/audio/path/or/url/")

# Audio data as input
audio_data = np.zeros((2, 441344))
sample_rate = 44100
audio_embedding = audio_encoder(audio_data, sample_rate)
```

## Factory Constructor

Create the operator via the following factory method:

***ops.audio_embedding.clmr()***

## Interface

An audio embedding operator generates vectors as numpy.ndarray, given either an audio file path or audio data as numpy.ndarray.

**Parameters:**

None.

**Returns**: *numpy.ndarray*

Audio embeddings in shape (num_clips, 512).

## Code Example

Generate embeddings for the audio "test.wav".

*Write the pipeline in simplified style*:

```python
from towhee import dc

# Parentheses let the method chain span several lines.
(
    dc.glob('test.wav')
      .audio_embedding.clmr()
      .show()
)
```

*Write the same pipeline with explicit input/output name specifications:*

```python
from towhee import dc

# 'path' names the input column; 'vecs' names the embedding output column.
(
    dc.glob['path']('test.wav')
      .audio_embedding.clmr['path', 'vecs']()
      .select('vecs')
      .show()
)
```
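
Because the operator returns one 512-dimensional vector per clip, in shape (num_clips, 512), a common follow-up step is to reduce these to a single vector per file and compare files. The snippet below is a minimal sketch of that idea using only NumPy. Mean pooling and cosine similarity are assumptions chosen for illustration, not something the operator does itself, and the helpers `pool_clips` and `cosine_similarity` are hypothetical names. Random arrays stand in for real operator output.

```python
import numpy as np


def pool_clips(embeddings: np.ndarray) -> np.ndarray:
    """Average per-clip embeddings of shape (num_clips, dim) into one (dim,) vector."""
    embeddings = np.asarray(embeddings, dtype=np.float64)
    if embeddings.ndim != 2:
        raise ValueError("expected a 2-D array of shape (num_clips, dim)")
    return embeddings.mean(axis=0)


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity of two 1-D vectors; returns 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        return 0.0
    return float(np.dot(a, b) / denom)


# Stand-in arrays with the documented output shape (num_clips, 512).
# In practice these would be the arrays returned by the operator.
rng = np.random.default_rng(0)
emb_a = rng.normal(size=(3, 512))
emb_b = rng.normal(size=(5, 512))

vec_a = pool_clips(emb_a)  # shape (512,)
vec_b = pool_clips(emb_b)  # shape (512,)
score = cosine_similarity(vec_a, vec_b)
print(vec_a.shape, vec_b.shape)
```

Mean pooling keeps the example simple. If clip order or local detail matters for your task, it may be better to keep the per-clip vectors and compare them individually.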