Orthogonal Matrices for MBAT Vector Symbolic Architectures, and a "Soft" VSA Representation for JSON

by   Stephen I. Gallant, et al.

Vector Symbolic Architectures (VSAs) give a way to represent a complex object as a single fixed-length vector, so that similar objects have similar vector representations. These vector representations then become easy to use for machine learning or nearest-neighbor search. We review a previously proposed VSA method, MBAT (Matrix Binding of Additive Terms), which uses multiplication by random matrices for binding related terms. However, multiplying by such matrices introduces instabilities which can harm performance. Making the random matrices be orthogonal matrices provably fixes this problem. With respect to larger scale applications, we see how to apply MBAT vector representations for any data expressed in JSON. JSON is used in numerous programming languages to express complex data, but its native format appears highly unsuited for machine learning. Expressing JSON as a fixed-length vector makes it readily usable for machine learning and nearest-neighbor search. Creating such JSON vectors also shows that a VSA needs to employ binding operations that are non-commutative. VSAs are now ready to try with full-scale practical applications, including healthcare, pharmaceuticals, and genomics. Keywords: MBAT (Matrix Binding of Additive Terms), VSA (Vector Symbolic Architecture), HDC (Hyperdimensional Computing), Distributed Representations, Binding, Orthogonal Matrices, Recurrent Connections, Machine Learning, Search, JSON, VSA Applications


page 1

page 2

page 3

page 4


Exploiting Modern Hardware for High-Dimensional Nearest Neighbor Search

Many multimedia information retrieval or machine learning problems requi...

Anti-sparse coding for approximate nearest neighbor search

This paper proposes a binarization scheme for vectors of high dimension ...

Bolt: Accelerated Data Mining with Fast Vector Compression

Vectors of data are at the heart of machine learning and data mining. Re...

Topics in Random Matrices and Statistical Machine Learning

This thesis consists of two independent parts: random matrices, which fo...

Custom 8-bit floating point value format for reducing shared memory bank conflict in approximate nearest neighbor search

The k-nearest neighbor search is used in various applications such as ma...

Search Efficient Binary Network Embedding

Traditional network embedding primarily focuses on learning a dense vect...

The Unreasonable Effectiveness of Structured Random Orthogonal Embeddings

We examine a class of embeddings based on structured random matrices wit...

Please sign up or login with your details

Forgot password? Click here to reset