Update README

Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
main
Jael Gu 2 years ago
parent
commit
16a513386e
README.md

@@ -6,6 +6,7 @@
## Description
The audio embedding operator converts an input audio into a dense vector that can be used to represent the audio clip's semantics.
Each vector represents an audio clip with a fixed length of around 0.9s.
This operator is built on top of [VGGish](https://github.com/tensorflow/models/tree/master/research/audioset/vggish) with PyTorch.
The model is a [VGG](https://arxiv.org/abs/1409.1556) variant pre-trained on the large-scale audio dataset [AudioSet](https://research.google.com/audioset).
As suggested, it is suitable for extracting high-level features or warming up a larger model.
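
To make the fixed clip length concrete, here is a minimal sketch of how a waveform could be cut into clips before embedding, so that one vector corresponds to one clip. It is not the operator's own code: `split_into_clips` is a hypothetical helper, and the 0.96 s window and 16 kHz sample rate are assumptions taken from the reference VGGish parameters (the README only says "around 0.9s").

```python
import numpy as np


def split_into_clips(waveform, sample_rate, clip_seconds=0.96):
    """Cut a 1-D waveform into non-overlapping fixed-length clips.

    Hypothetical helper for illustration; clip_seconds=0.96 is an assumed
    default, not a value stated in this README.
    """
    clip_len = int(round(sample_rate * clip_seconds))
    n_clips = len(waveform) // clip_len
    if n_clips == 0:
        # Input shorter than one clip cannot produce an embedding.
        raise ValueError("audio is shorter than one clip")
    # Drop the trailing remainder and stack the clips row by row.
    return waveform[: n_clips * clip_len].reshape(n_clips, clip_len)


sample_rate = 16000  # assumed sample rate
waveform = np.zeros(3 * sample_rate, dtype=np.float32)  # 3 s of silence
clips = split_into_clips(waveform, sample_rate)
print(clips.shape)  # (3, 15360): three clips, one embedding each
```

Under these assumptions a 3 s input yields three clips, and the leftover 0.12 s is discarded. An input shorter than one clip yields none, which is why the input needs to be longer than about 0.9s.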
@@ -81,6 +82,7 @@ An audio embedding operator generates vectors in numpy.ndarray given an audio fi
The audio path or link as a string.
Alternatively, audio input data as Towhee audio frames.
The input data should represent an audio clip longer than 0.9s.
**Returns**:
