
Update README

Signed-off-by: Jael Gu <mengjia.gu@zilliz.com>
main
Jael Gu 2 years ago
parent
commit
16a513386e
README.md (2 changes)

@@ -6,6 +6,7 @@
 ## Desription
 The audio embedding operator converts an input audio into a dense vector which can be used to represent the audio clip's semantics.
+Each vector represents for an audio clip with a fixed length of around 0.9s.
 This operator is built on top of [VGGish](https://github.com/tensorflow/models/tree/master/research/audioset/vggish) with Pytorch.
 The model is a [VGG](https://arxiv.org/abs/1409.1556) variant pre-trained with a large scale of audio dataset [AudioSet](https://research.google.com/audioset).
 As suggested, it is suitable to extract features at high level or warm up a larger model.
@@ -81,6 +82,7 @@ An audio embedding operator generates vectors in numpy.ndarray given an audio fi
 The audio path or link in string.
 Or audio input data in towhee audio frames.
+The input data should represent for an audio longer than 0.9s.
 **Returns**:
