Commit

Merge remote-tracking branch 'origin/master' into develop
Leonid Boytsov authored and Leonid Boytsov committed Oct 2, 2018
2 parents 4c003b5 + 7ff9ad8 commit 712be85
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion python_bindings/notebooks/README.md
@@ -7,7 +7,7 @@ We have Python notebooks for the following scenarios:
3. [Sparse cosine similarity (non-optimized index)](search_sparse_cosine.ipynb);
4. [Sparse Jaccard similarity (non-optimized index)](search_generic_sparse_jaccard.ipynb).

-Note that for for the dense space, we have examples of the so-called optimized and non-optimized indices. Except HNSW, all the methods save meta-indices rather than real ones. Meta indices contain only index structure, but not the data. Hence, before a meta-index can be loaded, we need to re-load data. One example is a memory efficient space to search for SIFT vectors.
+Note that for for the dense space, we have examples of the so-called optimized and non-optimized indices. Except HNSW, all the methods save meta-indices rather than real ones. Meta indices contain only index structure, but not the data. Hence, before a meta-index can be loaded, we need to re-load data. One example is a memory efficient space to search for SIFT vectors. For ease of reproduction, example use very small corpora. For this reason, our methods do not necessarily outperform brute-force search. Typically, the larger is the corpora the larger is the improvement in efficiency over the brute-force search.

HNSW, can save real indices, but only for the dense vector spaces: Euclidean and the cosine. When you use these optimized indices, the search does not require reloading all the data. However, reloading the data is **required** if you want to use the function **getDistance**. Furthermore, creation of the optimized index can always be disabled specifying the index-time parameter **skip_optimized_index** (value 1).
This separation into optimized and non-optimized indices is not very convenient. In the future, we will fix this issue.
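
The meta-index behaviour described in the README text above can be illustrated with a small, self-contained sketch. This is not NMSLIB's API: the class `ToyMetaIndex` and its methods are hypothetical names invented for illustration, and the "index structure" is reduced to a list of point ids. It only shows why a saved meta-index is unusable until the data has been reloaded.

```python
import json
import math
import os
import tempfile


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)


class ToyMetaIndex:
    """Hypothetical toy index: saves only structure (point ids), never the vectors."""

    def __init__(self):
        self.data = []  # vectors, held in memory only
        self.ids = []   # stand-in for the index structure

    def add_data(self, vectors):
        self.data = [list(v) for v in vectors]

    def build(self):
        self.ids = list(range(len(self.data)))

    def save(self, path):
        # Only the structure is written; the vectors are not.
        with open(path, "w") as f:
            json.dump({"ids": self.ids}, f)

    def load(self, path):
        # A meta-index refers to data positions, so the data must come first.
        if not self.data:
            raise RuntimeError("reload the data before loading a meta-index")
        with open(path) as f:
            self.ids = json.load(f)["ids"]

    def knn_query(self, query, k=1):
        ranked = sorted(self.ids, key=lambda i: cosine_distance(self.data[i], query))
        return ranked[:k]


vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
path = os.path.join(tempfile.mkdtemp(), "toy.index")

built = ToyMetaIndex()
built.add_data(vectors)
built.build()
built.save(path)

restored = ToyMetaIndex()
restored.add_data(vectors)  # reload the data first ...
restored.load(path)         # ... then the saved structure
nearest = restored.knn_query([0.9, 0.1], k=1)
```

Calling `load` on a fresh `ToyMetaIndex` without `add_data` raises an error, which mirrors the requirement that data be reloaded before a meta-index is loaded. An index that stored the vectors alongside the structure would not need this ordering.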
