

text-embedding

Evaluate with Similarity Search

Introduction

This example builds a classification system on top of similarity search across embeddings. The core ideas in run.py are:

  1. Create a new Milvus collection on each run.
  2. Extract embeddings with a pretrained model; the model name is given by --model.
  3. Select the inference method with --format, either pytorch or onnx.
  4. Insert the embeddings into the Milvus collection and search them without building an index.
  5. Measure performance as accuracy at top 1, 5, and 10:
    1. Vote for the prediction from the top-k search results (the most frequent label wins).
    2. Compare the final prediction with the ground truth.
    3. Calculate the percentage of correct predictions over all queries.
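
The voting and accuracy calculation in step 5 can be sketched as follows. This is a minimal illustration and not the code from the repository: the function names `vote` and `accuracy_at_k`, the sample labels, and the tie-breaking rule (ties go to the better-ranked hit) are all assumptions made for the example. It takes the labels of each query's search hits, ordered best first, and assumes they have already been retrieved from Milvus.

```python
from collections import Counter


def vote(ranked_labels, k):
    """Return the most frequent label among the top-k search hits."""
    top = ranked_labels[:k]
    # Assumption: on a tie, the label that appears first (the better-ranked hit) wins.
    return Counter(top).most_common(1)[0][0]


def accuracy_at_k(results, ground_truth, k):
    """Percentage of queries whose voted prediction matches the ground truth."""
    correct = sum(
        vote(hits, k) == truth for hits, truth in zip(results, ground_truth)
    )
    return 100.0 * correct / len(ground_truth)


# Hypothetical search results: labels of the nearest neighbours, best first.
results = [
    ["cat", "dog", "dog", "dog", "cat"],
    ["bird", "bird", "cat", "cat", "cat"],
]
ground_truth = ["cat", "bird"]

for k in (1, 3, 5):
    print(f"accuracy@{k}: {accuracy_at_k(results, ground_truth, k):.1f}%")
```

With this toy data, accuracy falls as k grows because lower-ranked hits outvote the correct nearest neighbour. That is why it is useful to report accuracy at several values of k.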

Example Usage

python evaluate.py --model MODEL_NAME --format pytorch
python evaluate.py --model MODEL_NAME --format onnx