Improving docs (interm. commit) #390
searchivarius committed Jun 3, 2019 (1 parent 55f9fc4, commit 7bb63ef)
Showing 21 changed files with 369 additions and 4,372 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,55 @@ | ||
# NMSLIB documentation | ||
|
||
Documentation is split into several parts. | ||
Links to these parts are given below. | ||
They are preceded by a short problem definition. | ||
|
||
# Terminology and Problem Formulation | ||
|
||
NMSLIB provides a fast similarity search. | ||
The search is carried out in a finite database of objects _{o<sub>i</sub>}_ | ||
using a search query _q_ and a dissimilarity measure. | ||
An object is a synonym for a **data point** or simply a **point**. | ||
The dissimilarity measure is typically represented by a **distance** function _d(o<sub>i</sub>, q)_. | ||
The ultimate goal is to answer a query by retrieving a subset of database objects sufficiently similar to the query _q_. | ||
A combination of data points and the distance function is called a **search space**, | ||
or simply a **space**. | ||
|
||
|
||
Note that we use the terms distance and distance function in a broader sense than | ||
is common: | ||
we do not assume that the distance is a true metric. | ||
The distance function may violate the triangle inequality and may even be non-symmetric. | ||
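
To make the point about non-metric distances concrete, here is a minimal sketch in plain Python. The two distances used (KL-divergence and squared Euclidean distance) and the sample values are illustrative choices, not taken from the text above, and the helper names are hypothetical.

```python
import math

def kl_divergence(p, q):
    """KL-divergence of two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def squared_l2(a, b):
    """Squared Euclidean distance between two scalars."""
    return (a - b) ** 2

# Non-symmetry: swapping the arguments of KL-divergence changes the value.
x = [0.8, 0.1, 0.1]
y = [0.4, 0.3, 0.3]
d_xy = kl_divergence(x, y)
d_yx = kl_divergence(y, x)
print("d(x, y) =", d_xy, " d(y, x) =", d_yx)

# Triangle inequality violation: going 0 -> 1 -> 2 is "shorter" than 0 -> 2
# under the squared Euclidean distance.
direct = squared_l2(0.0, 2.0)
detour = squared_l2(0.0, 1.0) + squared_l2(1.0, 2.0)
print("direct =", direct, " detour =", detour)
```

Both functions are still usable as dissimilarity measures in the sense described here, even though neither is a metric.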
|
||
Two retrieval tasks are typically considered: **nearest-neighbor** search and range search. | ||
Currently, we mostly support only the nearest-neighbor search, | ||
which aims to find the object at the smallest distance from the query. | ||
Its direct generalization is the _k_ nearest-neighbor search (the _k_-NN search), | ||
which looks for the _k_ objects with the smallest | ||
distance values to the query _q_. | ||
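
The _k_-NN task can be sketched as an exhaustive (brute-force) scan. This is only a reference formulation of the problem in plain Python, not how NMSLIB answers queries; the function names `knn_search` and `l2` and the sample data are hypothetical.

```python
import heapq

def knn_search(objects, query, k, dist):
    """Return the k (distance, index) pairs with the smallest dist(object, query)."""
    scored = ((dist(o, query), i) for i, o in enumerate(objects))
    return heapq.nsmallest(k, scored)

def l2(a, b):
    """Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

data = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (0.5, 0.0)]
result = knn_search(data, (0.0, 0.1), k=2, dist=l2)
neighbor_ids = [i for _, i in result]
print(neighbor_ids)
```

With `k=1` the same function performs the plain nearest-neighbor search. The scan computes the distance to every object, which is exactly the cost that indexing methods try to avoid.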
|
||
In generic spaces, the distance is not necessarily symmetric. | ||
Thus, two types of queries can be considered. | ||
In a **left** query, the object is the left (first) argument of the distance function, | ||
while the query is the right (second) argument. | ||
In a **right** query, the query _q_ is the left (first) argument and the object is the right (second) argument. | ||
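
The difference between left and right queries can be illustrated with a small sketch. It assumes KL-divergence as the non-symmetric distance and uses made-up distributions; the helper names are hypothetical and the code is not part of the NMSLIB API.

```python
import math

def kl_divergence(p, q):
    """KL-divergence of two discrete distributions with strictly positive entries."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def left_query_nn(objects, query, dist):
    """Left query: the object is the left argument, i.e., rank by dist(object, query)."""
    return min(range(len(objects)), key=lambda i: dist(objects[i], query))

def right_query_nn(objects, query, dist):
    """Right query: the object is the right argument, i.e., rank by dist(query, object)."""
    return min(range(len(objects)), key=lambda i: dist(query, objects[i]))

objects = [
    [0.98, 0.01, 0.01],
    [0.34, 0.33, 0.33],
]
query = [0.50, 0.49, 0.01]

left_nn = left_query_nn(objects, query, kl_divergence)
right_nn = right_query_nn(objects, query, kl_divergence)
print("left query picks object", left_nn)
print("right query picks object", right_nn)
```

For this data the two query types return different nearest neighbors, which is why the query type has to be fixed whenever the distance is non-symmetric.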
|
||
Queries can be answered either exactly, | ||
i.e., by returning a complete result set that contains no erroneous elements, | ||
or approximately, e.g., by finding only some of the neighbors. | ||
Thus, the methods are evaluated in terms of efficiency-effectiveness trade-offs | ||
rather than merely in terms of their efficiency. | ||
One common effectiveness metric is recall, | ||
which is computed as | ||
the average fraction of true neighbors returned by the method (with ties broken arbitrarily). | ||
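
As a rough sketch, recall can be computed as follows. The code assumes that each result is a list of object identifiers and that the exact answer for each query is known (for example, from a brute-force scan); it compares identifiers only and does not handle ties in distance values. The function names and sample data are hypothetical.

```python
def recall(true_ids, returned_ids):
    """Fraction of the true neighbors that appear in the returned result."""
    true_set = set(true_ids)
    return len(true_set & set(returned_ids)) / len(true_set)

def mean_recall(true_lists, returned_lists):
    """Recall averaged over a set of queries."""
    values = [recall(t, r) for t, r in zip(true_lists, returned_lists)]
    return sum(values) / len(values)

# Two queries with k = 3: the first approximate answer misses one true neighbor.
exact_results = [[0, 3, 7], [2, 4, 9]]
approx_results = [[0, 3, 8], [2, 4, 9]]
avg = mean_recall(exact_results, approx_results)
print(avg)
```

An exact method always has a recall of 1 under this definition, so the metric is only informative for approximate methods.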
|
||
# Documentation Links | ||
|
||
* [Python bindings overview](/python_bindings) and [Python bindings API](https://nmslib.github.io/nmslib/index.html) | ||
* [A Brief List of Methods and Parameters](/manual/parameters.md) | ||
* [A brief list of supported spaces/distances](/manual/spaces.md) | ||
* [Building the main library](/manual/build.md) | ||
* [Building and using the query server](/manual/query_server.md) | ||
* [Extending the library](/manual/extensions.md) | ||
* [A more detailed and formal description of methods and spaces (PDF)](/manual/latex/manual.pdf) | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,93 @@ | ||
# Building the main library on Linux/Mac | ||
|
||
## Prerequisites | ||
|
||
1. A modern compiler that supports C++11: G++ 4.7, Intel compiler 14, Clang 3.4, or Visual Studio 14 (version 12 can probably be used as well, but the project files need to be downgraded). | ||
2. **64-bit** Linux is recommended, but most of our code builds on **64-bit** Windows and macOS as well. | ||
3. Only for Linux/macOS: CMake (GNU make is also required). | ||
4. An Intel or AMD processor that supports SSE 4.2 is recommended. | ||
5. The extended version of the library requires development versions of the following libraries: Boost, GNU Scientific Library, and Eigen3. | ||
|
||
To install the additional prerequisite packages on Ubuntu, type the following: | ||
|
||
``` | ||
sudo apt-get install libboost-all-dev libgsl0-dev libeigen3-dev | ||
``` | ||
|
||
## Quick Start on Linux/Mac | ||
|
||
To compile, go to the directory **similarity_search** and type: | ||
``` | ||
cmake . | ||
make | ||
``` | ||
To build the extended version (requires the extra libraries listed in the prerequisites): | ||
``` | ||
cmake . -DWITH_EXTRAS=1 | ||
make | ||
``` | ||
|
||
## Quick Start on Windows | ||
|
||
Building on Windows requires [Visual Studio 2015 Express for Desktop](https://www.visualstudio.com/en-us/downloads/download-visual-studio-vs.aspx) and [CMake for Windows](https://cmake.org/download/). First, generate a Visual Studio solution file for the 64-bit architecture using the CMake **GUI**. You have to specify both the platform and the version of Visual Studio. Then, build the generated solution using Visual Studio. **Attention**: this way of building on Windows is not well tested yet. We suspect that there might be some issues related to building truly 64-bit binaries. | ||
|
||
## Additional Building Details | ||
|
||
Here we cover a few details on choosing the compiler, | ||
selecting the build type (release or debug), and manually pointing to the location | ||
of the Boost libraries (to build with extras). | ||
|
||
The compiler is chosen by setting two environment variables: ``CXX`` and ``CC``. In the case of GNU | ||
C++ (version 8), you may need to type: | ||
``` | ||
export CXX=g++-8 CC=gcc-8 | ||
``` | ||
|
||
To create makefiles for a release version of the code, type: | ||
``` | ||
cmake -DCMAKE_BUILD_TYPE=Release . | ||
``` | ||
|
||
If you have not created any makefiles before, you can use the shortcut: | ||
``` | ||
cmake . | ||
``` | ||
|
||
To create makefiles for a debug version of the code, type: | ||
``` | ||
cmake -DCMAKE_BUILD_TYPE=Debug . | ||
``` | ||
|
||
When the makefiles are created, just type: | ||
|
||
```make``` | ||
|
||
**Important note**: the shortcut command | ||
``cmake .`` | ||
(re)creates makefiles for the previously created build. When you type ``cmake .`` | ||
for the first time, it creates release makefiles. However, if you create debug | ||
makefiles and then type ``cmake .``, this will not lead to the creation of release makefiles! | ||
To prevent this, you need to delete the cmake cache and makefiles before | ||
running cmake. For example, you can do the following (assuming the | ||
current directory is similarity_search): | ||
|
||
``` | ||
rm -rf `find . -name CMakeFiles -o -name CMakeCache.txt` | ||
``` | ||
|
||
Also note that, for some reason, cmake might sometimes ignore the environment | ||
variables ``CXX`` and ``CC``. In this unlikely case, you can specify the compiler directly | ||
through cmake arguments. For example, in the case of GNU C++ and the | ||
release build, this can be done as follows: | ||
|
||
``` | ||
cmake -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_COMPILER=g++-8 \ | ||
-DCMAKE_C_COMPILER=gcc-8 . | ||
``` | ||
|
||
Finally, if cmake cannot find the Boost libraries, their location can be specified | ||
manually as follows: | ||
|
||
``` | ||
export BOOST_ROOT=$HOME/boost_download_dir | ||
``` |
Binary file not shown.
File renamed without changes.