Inference Performance

Test Scripts

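import time

from towhee import ops

# Decode the audio file and keep the first field of each decoded item.
decode = ops.audio_decode.ffmpeg()
audio = [x[0] for x in decode('path/to/test.wav')]

# Default model; uncomment one of the lines below to test TorchScript or ONNX instead.
op = ops.audio_embedding.nnfp()
# op = ops.audio_embedding.nnfp(model_path='path/to/torchscript/model.pt')
# op = ops.audio_embedding.nnfp(model_path='path/to/model.onnx')

n_runs = 100
start = time.perf_counter()
for _ in range(n_runs):
    embs = op(audio)
    # Sanity check on the output shape.
    assert embs.shape == (10, 128)
end = time.perf_counter()

# Average time per inference call, in seconds.
print((end - start) / n_runs)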
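
The loop above times all runs together, so a slow first call is folded into the average. A minimal sketch of a variant with untimed warm-up runs and per-run statistics follows. The `benchmark` helper and its parameters are hypothetical names, not part of Towhee; with Towhee installed, `op` and `audio` from the script above would replace the stand-in workload.

```python
import statistics
import time


def benchmark(fn, *args, n_runs=100, warmup=3):
    """Time fn(*args) n_runs times after `warmup` untimed calls.

    Hypothetical helper (not part of Towhee). Returns statistics in seconds.
    """
    for _ in range(warmup):
        fn(*args)  # untimed, so one-off setup cost is excluded

    timings = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)

    return {
        "mean": statistics.mean(timings),
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
        "runs": len(timings),
    }


if __name__ == "__main__":
    # Stand-in workload; the real call would be benchmark(op, audio).
    stats = benchmark(sum, range(100_000), n_runs=20)
    print(f"mean {stats['mean']:.6f}s, median {stats['median']:.6f}s")
```

Reporting the median next to the mean makes it easier to spot runs distorted by background activity on the test machine.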

Performance (Default model)

  • Device: macOS, 2.3 GHz Quad-Core Intel Core i7, 8 CPUs
  • Input: 10s mono audio, shape (1, ), sr , looped 100 times
Inference method   Memory usage   Avg. time
pytorch            0.3G           0.451s
torchscript        0.3G           0.470s
onnx               0.3G           0.378s
  • Device: macOS, 2.3 GHz Quad-Core Intel Core i7, 8 CPUs
  • Input: 188s stereo audio, shape (2, 8328408), sr 44100, looped 100 times
Inference method   Memory usage   Avg. time
pytorch            2.6G           8.162s
torchscript        2.8G           7.507s
onnx               1.7G           6.769s
  • Device: macOS, 2.3 GHz Quad-Core Intel Core i7, 8 CPUs
  • Input: 600s stereo audio, shape (2, 28800000), sr 48000, looped 20 times
Inference method   Memory usage   Avg. time
pytorch            5G             22.540s
torchscript        4.9G           22.514s
onnx               3.4G           17.874s
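
To compare the three default-model results above, the ONNX timings can be expressed relative to PyTorch. The sketch below only reuses the numbers already reported; the `speedup` helper and the dictionary layout are illustrative.

```python
# Average time per call (seconds), copied from the default-model results above.
RESULTS = {
    "10s mono": {"pytorch": 0.451, "torchscript": 0.470, "onnx": 0.378},
    "188s stereo": {"pytorch": 8.162, "torchscript": 7.507, "onnx": 6.769},
    "600s stereo": {"pytorch": 22.540, "torchscript": 22.514, "onnx": 17.874},
}


def speedup(baseline_s, candidate_s):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


for name, times in RESULTS.items():
    ratio = speedup(times["pytorch"], times["onnx"])
    print(f"{name}: onnx runs at {ratio:.2f}x the speed of pytorch")
```

On these figures ONNX comes out at about 1.19x, 1.21x, and 1.26x the speed of PyTorch for the 10s, 188s, and 600s inputs respectively.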

Performance (Distilled model)

  • Device: macOS, 2.3 GHz Quad-Core Intel Core i7, 8 CPUs
  • Input: 188s stereo audio, shape (2, 8328408), sr 44100, looped 20 times
Inference method   Memory usage   Avg. time
pytorch            2.6G           7.215s
torchscript        2.8G           7.220s
onnx               1G             6.410s
  • Device: macOS, 2.3 GHz Quad-Core Intel Core i7, 8 CPUs
  • Input: 600s stereo audio, shape (2, 28800000), sr 48000, looped 20 times
Inference method   Memory usage   Avg. time
pytorch            4.9G           22.482s
torchscript        5.1G           21.511s
onnx               3.4G           17.709s
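
The results above do not say how the memory figures were obtained. One possible approach, an assumption rather than the method used here, is to read the process's peak resident set size with Python's standard `resource` module (Unix only) once the timed loop has finished. The `peak_rss_gib` helper is a hypothetical name.

```python
import resource
import sys


def peak_rss_gib():
    """Return this process's peak resident set size in GiB.

    ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    """
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    bytes_per_unit = 1 if sys.platform == "darwin" else 1024
    return peak * bytes_per_unit / 1024 ** 3


if __name__ == "__main__":
    # Call this after the inference loop to capture the peak it reached.
    print(f"peak memory: {peak_rss_gib():.2f} GiB")
```

Peak RSS covers the whole process, including the Python interpreter and loaded libraries, so it is best compared across runs of the same script rather than read as the model's footprint alone.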