diff --git a/README.md b/README.md
index 1acf752..65fde3d 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,47 @@
-# audio-embedding-vggish
+# Pipeline: Audio Embedding using VGGish
 
-This is another test repo
\ No newline at end of file
+Authors: Jael Gu
+
+## Overview
+
+This pipeline extracts features from a given audio file using a VGGish model implemented in TensorFlow. VGGish is a supervised model pre-trained on [AudioSet](https://research.google.com/audioset/), which contains over 2 million sound clips.
+
+## Interface
+
+**Input Arguments:**
+
+- filepath:
+  - the input audio
+  - supported types: `str` (path to the audio)
+
+**Pipeline Output:**
+
+The pipeline returns a tuple `Tuple[('embs', numpy.ndarray)]` containing the following fields:
+
+- embs:
+  - embeddings of the input audio
+  - data type: numpy.ndarray
+  - shape: (num_clips, 128)
+
+## How to use
+
+1. Install [Towhee](https://github.com/towhee-io/towhee)
+
+```bash
+$ pip3 install towhee
+```
+
+> You can refer to [Getting Started with Towhee](https://towhee.io/) for more details. If you have any questions, you can [submit an issue to the towhee repository](https://github.com/towhee-io/towhee/issues).
+
+2. Run it with Towhee
+
+```python
+>>> from towhee import pipeline
+
+>>> embedding_pipeline = pipeline('towhee/audio-embedding-vggish')
+>>> embedding = embedding_pipeline('path/to/your/audio')
+```
+
+## How it works
+
+This pipeline includes one main operator: [audio embedding](https://hub.towhee.io/towhee/audio-embedding-operator-template) (implemented as [towhee/tf-vggish-audioset](https://hub.towhee.io/towhee/tf-vggish-audioset)). The audio embedding operator encodes fixed-length clips of the input audio and outputs a set of embedding vectors, one per clip.
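
The README's statement that the operator encodes fixed-length clips and returns an array of shape `(num_clips, 128)` can be made concrete with a small sketch. It is not taken from the pipeline's code. The 0.96-second, non-overlapping clip length is an assumption based on the reference VGGish configuration, the helper name `expected_embedding_shape` is hypothetical, and the Towhee operator may treat a trailing partial clip differently.

```python
import math

# Assumption: clips are non-overlapping and 0.96 s long, as in the reference
# VGGish configuration; the Towhee operator may use different settings.
CLIP_SECONDS = 0.96
EMBEDDING_DIM = 128  # from the README: shape (num_clips, 128)


def expected_embedding_shape(duration_seconds, clip_seconds=CLIP_SECONDS):
    """Estimate the (num_clips, 128) output shape for an audio duration.

    Hypothetical helper for illustration only; a trailing partial clip
    is assumed to be dropped.
    """
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    # The small epsilon guards against floating-point error at exact multiples.
    num_clips = math.floor(duration_seconds / clip_seconds + 1e-9)
    return (num_clips, EMBEDDING_DIM)


# A 10-second file yields 10 full clips under these assumptions.
print(expected_embedding_shape(10.0))  # (10, 128)
```

Comparing this estimate with `embedding[0].shape` from an actual pipeline run is a quick way to check whether the assumed clip length matches the operator's behavior.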