add ploting in readme
ChunjiangZhu committed May 20, 2020
1 parent bb7cc2c commit 321ef63
Showing 1 changed file with 90 additions and 1 deletion.
91 changes: 90 additions & 1 deletion README.md
@@ -52,7 +52,7 @@ Run.py Parameters:
- chembl-1024-jaccard
- molport-1024-jaccard
algorithm: algorithm name (Required)
Options:
Choices:
- Balltree(Sklearn)
- Bruteforce
- Datasketch
@@ -100,6 +100,67 @@ Command Examples (for Singularity only):

python run.py --dataset=chembl-1024-jaccard --algorithm='Hnsw(Nmslib)' --count=100 --sif-dir="./singularity" --batch

# Visualization of Execution Results on a PC
1. Run python plot.py

plot.py Parameters:

dataset: dataset name (Required)
Examples:
- chembl-1024-jaccard
- molport-1024-jaccard
count: the value of K for top-K nearest neighbor search
Default: 10
output/-o: the output file
x-axis/-x: which metric to use on the X-axis
Choices:
- k-nn: Recall for top-K nearest neighbor search (Default)
- range: Recall for range query
- qps: Queries per second (1/s)
- build: Indexing time (s)
- indexsize: Index size (kB)
y-axis/-y: which metric to use on the Y-axis
Choices:
- k-nn: Recall for top-K nearest neighbor search
- range: Recall for range query
- qps: Queries per second (1/s) (Default)
- build: Indexing time (s)
- indexsize: Index size (kB)
x-log/-X: draw the X-axis using a logarithmic scale
Default: False
y-log/-Y: draw the Y-axis using a logarithmic scale
Default: False
raw: show raw results (not just Pareto frontier) in faded colours
Default: False
batch: batch query mode
Default: False
rq: range query / threshold-based query mode
Default: False
radius: the cut-off value used in range query mode. It is a distance, not a similarity: to retrieve all near neighbors with a similarity coefficient larger than 0.8, set it to 0.2.
Default: 0.3
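
The parameter list above can be read as a command-line interface. The following is a minimal sketch of how such an interface could be declared with Python's `argparse`. It is an assumption made for illustration, not the actual contents of plot.py; the function name `build_parser` and the `METRICS` list are hypothetical, and only the flag names, choices, and defaults are taken from the list above.

```python
import argparse

# Metric names as listed in the parameter description above.
METRICS = ["k-nn", "range", "qps", "build", "indexsize"]


def build_parser():
    """Hypothetical parser mirroring the documented plot.py parameters."""
    parser = argparse.ArgumentParser(prog="plot.py")
    parser.add_argument("--dataset", required=True, help="dataset name")
    parser.add_argument("--count", type=int, default=10,
                        help="K for top-K nearest neighbor search")
    parser.add_argument("-o", "--output", help="the output file")
    parser.add_argument("-x", "--x-axis", choices=METRICS, default="k-nn")
    parser.add_argument("-y", "--y-axis", choices=METRICS, default="qps")
    parser.add_argument("-X", "--x-log", action="store_true")
    parser.add_argument("-Y", "--y-log", action="store_true")
    parser.add_argument("--raw", action="store_true")
    parser.add_argument("--batch", action="store_true")
    parser.add_argument("--rq", action="store_true")
    parser.add_argument("--radius", type=float, default=0.3)
    return parser


if __name__ == "__main__":
    # Parse a sample command line instead of sys.argv for demonstration.
    sample = ["--dataset=chembl-1024-jaccard", "-Y", "--count=100"]
    print(build_parser().parse_args(sample))
```

Note that `-x`/`-X` and `-y`/`-Y` are distinct, case-sensitive flags: the lowercase form selects the metric, the uppercase form switches that axis to a logarithmic scale.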


Command Examples:
- Plot results on chembl-1024-jaccard dataset for top-K (K=100) nearest neighbor query to "results/chembl-1024-jaccard-100.png". X-axis: recall. Y-axis: qps, log-scale.

python plot.py --dataset=chembl-1024-jaccard -Y --count=100 -o=results/chembl-1024-jaccard-100
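
Unless `raw` is set, only the Pareto frontier of the results is drawn. For a recall versus qps plot, that is the set of runs for which no other run is at least as good on both metrics. The sketch below shows one common way to compute such a frontier. It is an illustration under the assumption that higher is better on both axes, and it is not taken from plot.py; the function name `pareto_frontier` is hypothetical.

```python
def pareto_frontier(points):
    """Return the (x, y) points not dominated by any other point.

    Assumes higher is better on both axes, e.g. x = recall, y = qps.
    """
    frontier = []
    best_y = float("-inf")
    # Walk from the highest x to the lowest; a point survives only if its
    # y value beats every point with an equal or higher x.
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            frontier.append((x, y))
            best_y = y
    return frontier


if __name__ == "__main__":
    # Made-up (recall, qps) pairs, for demonstration only.
    runs = [(0.5, 1000), (0.7, 800), (0.6, 700), (0.9, 200), (0.95, 100)]
    print(pareto_frontier(runs))
```

For metrics where lower is better, such as indexing time or index size, the comparison on that axis would need to be reversed.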

- Plot results on molport-1024-jaccard dataset for top-K (K=10) nearest neighbor query to "results/molport-1024-jaccard-indexsize-10.png". X-axis: recall. Y-axis: index size, log-scale.

python plot.py --dataset=molport-1024-jaccard -Y -y=indexsize --count=10 -o=results/molport-1024-jaccard-indexsize-10

- Plot results on molport-1024-jaccard dataset for top-K (K=10) nearest neighbor query to "results/molport-1024-jaccard-buildtime-10.png". X-axis: recall. Y-axis: indexing time, log-scale.

python plot.py --dataset=molport-1024-jaccard -Y -y=build --count=10 -o=results/molport-1024-jaccard-buildtime-10

- Plot batch mode results on molport-1024-jaccard dataset for top-K (K=100) nearest neighbor query to "results/molport-1024-jaccard-batch-100.png". X-axis: recall. Y-axis: qps, log-scale.

python plot.py --dataset=molport-1024-jaccard -Y --batch --count=100 -o=results/molport-1024-jaccard-batch-100

- Plot results on chembl-1024-jaccard dataset for range query with similarity cutoff 0.6 (radius 0.4) to "results/chembl-1024-jaccard-0_4.png". X-axis: recall (range query). Y-axis: qps, log-scale.

python plot.py --dataset=chembl-1024-jaccard -Y -x=range --rq --radius=0.4 -o=results/chembl-1024-jaccard-0_4
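
The `radius` value in this example is a Jaccard distance, which is one minus the Jaccard similarity, so a similarity cutoff of 0.6 corresponds to a radius of 0.4. A small helper for the conversion might look as follows; the name `similarity_to_radius` is hypothetical and not part of the repository.

```python
def similarity_to_radius(similarity):
    """Convert a Jaccard similarity cutoff into the matching distance radius."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    # Round to suppress floating-point noise such as 0.19999999999999996.
    return round(1.0 - similarity, 10)


if __name__ == "__main__":
    for cutoff in (0.6, 0.7, 0.8):
        print(f"similarity {cutoff} -> --radius={similarity_to_radius(cutoff)}")
```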

# Executions under an HPC environment

1. Load anaconda module
@@ -143,6 +204,34 @@ An example "run.sh":
python run.py --dataset=chembl-1024-jaccard --algorithm='Hnsw(Nmslib)' --count=100 --sif-dir="./singularity"


# Visualization of Execution Results under an HPC environment
Submit your plotting script through SLURM:

sbatch plot.sh

An example "plot.sh":

#!/bin/bash

#SBATCH --ntasks=1
#SBATCH --exclude=cn[65-69,71-136,325-343,345-353,355-358,360-364,369-398,400-401],gpu[07-10]

module load anaconda/5.1.0

source activate ann_env

module purge

module load gcc/5.4.0

module load singularity/3.1


python plot.py --dataset=chembl-1024-jaccard -Y --count=100 -o=results/chembl-1024-jaccard-100


# Parameter tuning
All algorithmic parameter settings are included in the "./algos.yaml" file.
