# Evaluation Method

- Model performance (WIP)
- Pipeline speed

## Model Performance

Build an image classification system based on similarity search across embeddings. The core ideas in `performance.py`:

1. Create a new Milvus collection each time.
2. Extract embeddings using a pretrained model; the model name is specified by `--model`.
3. Specify the inference method with `--format`, which accepts `pytorch` or `onnx`.
4. Insert & search embeddings with the Milvus collection (no index).
5. Measure performance with accuracy at top 1, 5, and 10:
   1. Vote for the prediction from the top-k search results (the most frequent one).
   2. Compare the final prediction with the ground truth.
   3. Calculate the percentage of correct predictions over all queries.

### Example Usage

```bash
# Option 1:
python performance.py --model MODEL_NAME --format pytorch
python performance.py --model MODEL_NAME --format onnx

# Option 2:
chmod +x performance.sh
./performance.sh
```

## Pipeline Speed

QPS test of the embedding pipeline, which includes the steps below:

1. Load the image from a path (`pipe.input`).
2. Decode the image into arrays (`ops.image_decode`).
3. Generate the image embedding (preprocess, model inference, post-process).

There are three methods with different pipeline speeds:

- Towhee pipe (regular method)
- ONNX Runtime (local model inference using ONNX)
- Triton Inference Server with ONNX enabled (requests sent as a client)

### Example Usage

Please note that `qps_test.py` uses:

- `localhost:8000` to connect the Triton client
- `../towhee/jpeg` as the test image path

```bash
python qps_test.py --model 'resnet50' --pipe --onnx --triton --num 100 --device cuda:0
```

**Args:**

- `--model`: mandatory, string, model name
- `--pipe`: optional, on/off flag to enable the QPS test for the Towhee pipe
- `--onnx`: optional, on/off flag to enable the QPS test for ONNX
- `--triton`: optional, on/off flag to enable the QPS test for Triton (please make sure the Triton server is ready)
- `--num`: optional, integer, defaults to 100, batch size in each loop (10 loops in total)
- `--device`: optional, string, defaults to 'cpu'
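The top-k voting accuracy described under Model Performance can be sketched as below. This is a minimal illustration of the metric, not code from `performance.py`; the function names and sample labels are made up for the example:

```python
from collections import Counter

def vote(topk_labels):
    """Return the most frequent label among the top-k search results."""
    return Counter(topk_labels).most_common(1)[0][0]

def topk_accuracy(results, ground_truths):
    """Fraction of queries whose voted prediction matches the ground truth.

    results: one list of top-k labels per query
    ground_truths: one true label per query
    """
    correct = sum(vote(topk) == truth
                  for topk, truth in zip(results, ground_truths))
    return correct / len(ground_truths)

# 2 of 3 queries voted correctly -> accuracy = 2/3
preds = [["cat", "cat", "dog"], ["dog", "fox", "fox"], ["cat", "dog", "dog"]]
truths = ["cat", "fox", "cat"]
print(topk_accuracy(preds, truths))
```

The same helper works for top-1, top-5, and top-10 by truncating each result list before voting.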
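The QPS measurement in the Pipeline Speed section can be sketched as a simple timing loop. This mirrors the batch-per-loop structure implied by `--num` (batch size, 10 loops in total), but the function below is an illustrative stand-in, not the actual `qps_test.py` implementation:

```python
import time

def measure_qps(run_once, num=100, loops=10):
    """Call run_once() `num` times per loop for `loops` loops and
    return the overall throughput in queries per second."""
    start = time.perf_counter()
    for _ in range(loops):
        for _ in range(num):
            run_once()
    elapsed = time.perf_counter() - start
    return (num * loops) / elapsed

# Trivial stand-in for one embedding-pipeline call
qps = measure_qps(lambda: sum(range(100)), num=100, loops=10)
print(f"QPS: {qps:.1f}")
```

In the real test, `run_once` would be one invocation of the Towhee pipe, an ONNX Runtime session, or a Triton client request, so the three methods can be compared on the same loop structure.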