diff --git a/README.md b/README.md
index 19a229d..19426bd 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,48 @@
-# audio-embedding-clmr
+# Pipeline: Audio Embedding using CLMR
 
-This is another test repo
\ No newline at end of file
+Authors: Jael Gu
+
+## Overview
+
+The pipeline uses a pre-trained CLMR model to extract embeddings from a given audio file. It first converts the input audio to a wave file with a sample rate of 22050 Hz. Then the model splits the audio data into shorter clips of a fixed length. Finally, it generates a vector for each clip; together, these vectors compose the fingerprint of the input audio.
+
+## Interface
+
+**Input Arguments:**
+
+- filepath:
+  - the input audio
+  - supported types: `str` (path to the audio)
+
+**Pipeline Output:**
+
+The Operator returns a tuple `Tuple[('embs', numpy.ndarray)]` containing the following fields:
+
+- embs:
+  - embeddings of the input audio
+  - data type: numpy.ndarray
+  - shape: (num_clips, 512)
+
+## How to use
+
+1. Install [Towhee](https://github.com/towhee-io/towhee)
+
+```bash
+$ pip3 install towhee
+```
+
+> You can refer to [Getting Started with Towhee](https://towhee.io/) for more details. If you have any questions, you can [submit an issue to the towhee repository](https://github.com/towhee-io/towhee/issues).
+
+2. Run it with Towhee
+
+```python
+>>> from towhee import pipeline
+
+>>> embedding_pipeline = pipeline('towhee/audio-embedding-clmr')
+>>> embedding = embedding_pipeline('path/to/your/audio')
+```
+
+## How it works
+
+This pipeline includes one main operator: [audio embedding](https://hub.towhee.io/towhee/audio-embedding-operator-template) (implemented as [towhee/clmr-magnatagatune](https://hub.towhee.io/towhee/clmr-magnatagatune)). The audio embedding operator encodes fixed-length clips of the audio data and outputs a set of vectors for the given audio.
+
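
Note on the "How it works" text in this diff: the clip-splitting step and the `(num_clips, 512)` output shape may be easier to follow with a small sketch. The code below is only an illustration, not the pipeline's actual implementation. The 2-second clip length, the choice to drop trailing samples, and the helper names `split_into_clips` and `placeholder_encoder` are assumptions introduced here; the placeholder encoder merely pools samples and stands in for the pre-trained CLMR model.

```python
import numpy as np

SAMPLE_RATE = 22050             # from the README: audio is converted to 22050 Hz
EMBEDDING_DIM = 512             # from the README: output shape is (num_clips, 512)
CLIP_SAMPLES = 2 * SAMPLE_RATE  # ASSUMPTION: the README does not state the clip length


def split_into_clips(waveform: np.ndarray, clip_samples: int = CLIP_SAMPLES) -> np.ndarray:
    """Split a mono waveform into non-overlapping fixed-length clips.

    ASSUMPTION: trailing samples that do not fill a whole clip are dropped.
    """
    num_clips = len(waveform) // clip_samples
    return waveform[: num_clips * clip_samples].reshape(num_clips, clip_samples)


def placeholder_encoder(clips: np.ndarray, dim: int = EMBEDDING_DIM) -> np.ndarray:
    """Hypothetical stand-in for the CLMR model: one `dim`-sized vector per clip.

    It averages each clip's samples into `dim` equal bins, which only
    reproduces the output shape, not real embeddings.
    """
    bin_size = clips.shape[1] // dim
    usable = bin_size * dim
    return clips[:, :usable].reshape(clips.shape[0], dim, bin_size).mean(axis=2)


# 10 seconds of silence as a dummy input signal
waveform = np.zeros(SAMPLE_RATE * 10, dtype=np.float32)
clips = split_into_clips(waveform)
embs = placeholder_encoder(clips)
print(clips.shape)  # (5, 44100)
print(embs.shape)   # (5, 512)
```

With these assumed settings, a 10-second input yields five clips, so `embs` has one 512-dimensional row per clip, matching the `(num_clips, 512)` shape the README documents.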