updated sigma tsne talk
jet08013 committed Oct 2, 2019
1 parent cfeae7b commit 96e8b87
Showing 3 changed files with 7 additions and 0 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -251,3 +251,4 @@ TSWLatexianTemp*
# emacs
*~
\#*\#

Binary file modified sigma/sigma.pdf
Binary file not shown.
6 changes: 6 additions & 0 deletions sigma/sigma.tex
@@ -187,9 +187,11 @@ So we may view each image as a vector in $\R^{784}$.
\frametitle{The high-dimensional similarity}
The similarity $P_{j|i}$ constructed above is not symmetric in $i$ and $j$. To rectify this, the t-SNE algorithm symmetrizes it.
\bigskip\noindent

One way to think of this: the relation ``$p$ is one of the $k$ points closest to $q$'' is not symmetric.
Making $P_{j|i}$ symmetric is an analytic way of creating the symmetric relation ``$p$ is one of the $k$ points closest to $q$, or vice versa.''
\bigskip\noindent

This brings outlier points into fuller consideration when constructing the low-dimensional map.
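\bigskip\noindent
As a sketch of the symmetrization, assuming the convention of the original t-SNE paper with $N$ data points (the normalization is not stated on this slide), one sets
$$
p_{ij}=\frac{P_{j|i}+P_{i|j}}{2N},
$$
so that $p_{ij}=p_{ji}$, and $p_{ij}>0$ as soon as either of $P_{j|i}$, $P_{i|j}$ is positive.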
\end{frame}

@@ -288,6 +290,7 @@ There are both repulsive and attractive forces at work.
One can efficiently find the $k$ closest points using, for example, {\it vantage-point trees}, a data structure
designed specifically for this purpose.
\bigskip\noindent

This makes $P$ sparse -- there are only a few non-zero entries in each row.
\end{frame}
\begin{frame}
@@ -298,6 +301,7 @@ For the gradient descent phase, one can use a variant of the Barnes-Hut techniqu
$$
where $q_{ij}Z=(1+\|y_i-y_j\|^2)^{-1}$ takes constant time to compute.
\bigskip\noindent

The first sum involves only the terms corresponding to non-zero entries of $p_{ij}$, which is sparse, so over all points it takes time $O(N)$.
\end{frame}
\begin{frame}
@@ -306,6 +310,7 @@ For the gradient descent phase, one can use a variant of the Barnes-Hut techniqu
that if a bunch of points $y_i$ are close together, one may approximate their contribution to the force by replacing them with their
center of mass.
\bigskip\noindent

The BH algorithm partitions space into cubes that are small enough that the center of mass of the points in each cube is a good summary of the data.
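\bigskip\noindent
As a sketch, assuming the repulsive term has the form $\sum_{j\neq i}(q_{ij}Z)^2(y_i-y_j)$ (the symbols $N_{\mathrm{cell}}$ and $y_{\mathrm{cell}}$ are introduced here only for illustration): a cube far from $y_i$ that contains $N_{\mathrm{cell}}$ points with center of mass $y_{\mathrm{cell}}$ contributes approximately
$$
N_{\mathrm{cell}}\,\bigl(1+\|y_i-y_{\mathrm{cell}}\|^2\bigr)^{-2}\,(y_i-y_{\mathrm{cell}}),
$$
which replaces $N_{\mathrm{cell}}$ separate terms by a single one.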
\end{frame}
\begin{frame}
@@ -321,6 +326,7 @@ For the gradient descent phase, one can use a variant of the Barnes-Hut techniqu

van der Maaten and Hinton, Visualizing Data using t-SNE, Journal of Machine Learning Research, 2008.
\bigskip\noindent

van der Maaten, Accelerating t-SNE using Tree-Based Algorithms, Journal of Machine Learning Research, 2014.
\end{frame}
