# Evaluation Method

- Model performance (WIP)
- Pipeline speed
## Model Performance

Build an image classification system based on similarity search across embeddings.

The core ideas in `performance.py`:

- create a new Milvus collection each time
- extract embeddings using a pretrained model, with the model name specified by `--model`
- specify the inference method with `--format`, which accepts `pytorch` or `onnx`
- insert & search embeddings with a Milvus collection, without an index
- measure performance with accuracy at top 1, 5, 10:
    - vote for the prediction from the top-k search results (the most frequent label)
    - compare the final prediction with the ground truth
    - calculate the percentage of correct predictions over all queries
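The voting and accuracy steps above can be sketched in plain Python. This is a minimal illustration, not code from `performance.py`; the helper names and the toy labels are made up, and `neighbor_labels` stands for the labels attached to the top-k search hits returned by Milvus:

```python
from collections import Counter

def vote_topk(neighbor_labels):
    # Predict the most frequent label among the top-k neighbors.
    return Counter(neighbor_labels).most_common(1)[0][0]

def accuracy_at_k(all_neighbor_labels, ground_truths, k):
    # all_neighbor_labels: per-query lists of neighbor labels,
    # sorted by similarity (best match first).
    correct = sum(
        vote_topk(labels[:k]) == truth
        for labels, truth in zip(all_neighbor_labels, ground_truths)
    )
    return correct / len(ground_truths)

# Toy example: 2 queries, top-3 neighbors each
neighbors = [["cat", "cat", "dog"], ["dog", "fish", "fish"]]
truths = ["cat", "dog"]
print(accuracy_at_k(neighbors, truths, 3))  # 0.5: second query votes "fish"
```

Running the same queries at k = 1, 5, and 10 gives the three accuracy numbers mentioned above.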
### Example Usage

```bash
# Option 1:
python performance.py --model MODEL_NAME --format pytorch
python performance.py --model MODEL_NAME --format onnx

# Option 2:
chmod +x performance.sh
./performance.sh
```
## Pipeline Speed

QPS test of the embedding pipeline, which includes the steps below:

- load image from path (`pipe.input`)
- decode image into arrays (`ops.image_decode`)
- generate image embedding (preprocess, model inference, post-process)

There are 3 methods with different pipeline speeds:

- Towhee pipe (regular method)
- ONNX Runtime (local model inference with ONNX)
- Triton server with ONNX enabled (requests sent as a client)
### Example Usage

Please note that `qps_test.py` uses:

- `localhost:8000` to connect the Triton client
- `./towhee/jpeg` as the test image path

```bash
python qps_test.py --model 'resnet50' --pipe --onnx --triton --num 100 --device cuda:0
```
Args:

- `--model`: mandatory, string, model name
- `--pipe`: optional, on/off flag to enable the QPS test for the Towhee pipe
- `--onnx`: optional, on/off flag to enable the QPS test for ONNX
- `--triton`: optional, on/off flag to enable the QPS test for Triton (please make sure that the Triton client is ready)
- `--num`: optional, integer, defaults to 100, batch size in each loop (10 loops in total)
- `--device`: optional, string, defaults to 'cpu'
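The QPS figure itself comes down to a simple timing loop. Below is a minimal sketch in plain Python, not the actual `qps_test.py` harness: `measure_qps` and `fake_embed` are hypothetical names, with the stub standing in for whichever variant (pipe, ONNX, or Triton) is being timed, and the loop structure mirroring the "batch size per loop, 10 loops" description above:

```python
import time

def measure_qps(embed, batch, loops=10):
    # Run `loops` rounds, each embedding every item in `batch`,
    # and return queries-per-second over the whole run.
    start = time.perf_counter()
    for _ in range(loops):
        for item in batch:
            embed(item)
    elapsed = time.perf_counter() - start
    return loops * len(batch) / elapsed

# Stub standing in for pipe / onnx / triton inference on one image path.
def fake_embed(path):
    return [0.0] * 512

qps = measure_qps(fake_embed, ["./towhee/jpeg"] * 100)
print(f"{qps:.1f} images/sec")
```

Swapping `fake_embed` for each of the three inference paths with the same batch is what makes the three QPS numbers comparable.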