Yeshwanth Cherapanamjeri ; Jelani Nelson - Terminal Embeddings in Sublinear Time

theoretics:9167 - TheoretiCS, March 14, 2024, Volume 3 - https://doi.org/10.46298/theoretics.24.6
Terminal Embeddings in Sublinear Time

Authors: Yeshwanth Cherapanamjeri ; Jelani Nelson

    Elkin, Filtser, and Neiman (2017) recently introduced the concept of a {\it terminal embedding} from one metric space $(X,d_X)$ to another $(Y,d_Y)$ with a set of designated terminals $T\subset X$. Such an embedding $f$ is said to have distortion $\rho\ge 1$ if $\rho$ is the smallest value such that there exists a constant $C>0$ satisfying \begin{equation*} \forall x\in T\ \forall q\in X,\ C d_X(x, q) \le d_Y(f(x), f(q)) \le C \rho d_X(x, q) . \end{equation*} When $X,Y$ are both Euclidean metrics with $Y$ being $m$-dimensional, Narayanan and Nelson (2019), following work of Mahabadi, Makarychev, Makarychev, and Razenshteyn (2018), showed that distortion $1+\epsilon$ is achievable via such a terminal embedding with $m = O(\epsilon^{-2}\log n)$ for $n := |T|$. This generalizes the Johnson-Lindenstrauss lemma, which only preserves distances within $T$ and not to $T$ from the rest of space. The downside of prior work is that evaluating the embedding on a point $q\in \mathbb{R}^d$ required solving a semidefinite program with $\Theta(n)$ constraints in $m$ variables, and thus required some superlinear $\mathrm{poly}(n)$ runtime. Our main contribution in this work is a new data structure for computing terminal embeddings. We show how to pre-process $T$ to obtain an almost linear-space data structure that supports computing the terminal embedding image of any $q\in\mathbb{R}^d$ in sublinear time $O^* (n^{1-\Theta(\epsilon^2)} + d)$. To accomplish this, we leverage tools developed in the context of approximate nearest neighbor search.
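
    To make the distortion definition above concrete, the following is a minimal Python sketch, not the data structure from the paper. It estimates the terminal distortion of an arbitrary map on a finite sample of query points by taking the ratio of the largest to the smallest stretch $d_Y(f(x), f(q)) / d_X(x, q)$ over terminal-query pairs; the constant $C$ in the definition corresponds to the smallest stretch. The names `dist`, `terminal_distortion`, and `gaussian_projection` are hypothetical helpers introduced only for illustration, and the plain Gaussian projection is a stand-in map, not the embedding of Narayanan and Nelson.

```python
import math
import random


def dist(u, v):
    """Euclidean distance between two equal-length sequences of numbers."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def terminal_distortion(terminals, queries, f):
    """Empirical terminal distortion of the map f (illustrative helper).

    For every pair (x, q) with x a terminal, q a query, and x != q, compute
    the stretch d_Y(f(x), f(q)) / d_X(x, q). The returned value is
    max_stretch / min_stretch, i.e. the smallest rho for which some C > 0
    satisfies C * d_X <= d_Y <= C * rho * d_X on the sampled pairs.
    """
    stretches = []
    for x in terminals:
        fx = f(x)
        for q in queries:
            d = dist(x, q)
            if d > 0:  # skip coincident points, where the stretch is undefined
                stretches.append(dist(fx, f(q)) / d)
    if not stretches:
        raise ValueError("need at least one terminal-query pair at positive distance")
    lo, hi = min(stretches), max(stretches)
    # If two distinct points collide under f, no finite rho works.
    return math.inf if lo == 0 else hi / lo


def gaussian_projection(d, m, seed=0):
    """Random linear map R^d -> R^m with i.i.d. N(0, 1/m) entries.

    Assumption: used here only as a simple stand-in map to measure.
    """
    rng = random.Random(seed)
    rows = [[rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(d)] for _ in range(m)]
    return lambda v: [sum(r * c for r, c in zip(row, v)) for row in rows]


if __name__ == "__main__":
    rng = random.Random(1)
    d, m, n = 50, 20, 30
    terminals = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(n)]
    queries = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(100)]
    f = gaussian_projection(d, m)
    print("empirical terminal distortion:", terminal_distortion(terminals, queries, f))
```

    Because only finitely many queries are sampled, the value returned is a lower bound on the true distortion, which quantifies over every $q \in X$. Rescaling the map leaves the result unchanged, since the scale is absorbed into $C$.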


    Volume: Volume 3
    Published on: March 14, 2024
    Accepted on: January 28, 2024
    Submitted on: March 4, 2022
    Keywords: Computer Science - Data Structures and Algorithms, Computer Science - Computational Geometry, Computer Science - Machine Learning, Statistics - Machine Learning
    Funding:
      Source : OpenAIRE Graph
    • AF: Small: Collaborative Research: Dynamic data structures for vectors and graphs in sublinear memory; Funder: National Science Foundation; Code: 1951384

