Accelerating Markov Random Field Inference with Uncertainty Quantification

08/02/2021
by   Ramin Bashizade, et al.

Statistical machine learning has widespread application in various domains. These methods include probabilistic algorithms, such as Markov Chain Monte Carlo (MCMC), which rely on generating random numbers from probability distributions. These algorithms are computationally expensive on conventional processors, yet, compared to deep learning, their statistical properties, namely interpretability and uncertainty quantification (UQ), make them an attractive alternative. Hardware specialization can therefore be adopted to address the shortcomings of conventional processors in running these applications. In this paper, we propose a high-throughput accelerator for Markov Random Field (MRF) inference, a powerful model for representing a wide range of applications, using MCMC with Gibbs sampling. We propose a tiled architecture that takes advantage of near-memory computing, along with memory optimizations tailored to the semantics of MRF. Additionally, we propose a novel hybrid on-chip/off-chip memory system and logging scheme to efficiently support UQ. This memory system design is not specific to MRF models and is applicable to other applications using probabilistic algorithms. In addition, it dramatically reduces off-chip memory bandwidth requirements. We implemented an FPGA prototype of our proposed architecture using high-level synthesis tools and achieved a 146 MHz frequency for an accelerator with 32 function units on an Intel Arria 10 FPGA. Compared to prior work on FPGA, our accelerator achieves a 26X speedup. Furthermore, our proposed memory system and logging scheme to support UQ reduce off-chip bandwidth by 71% for two applications. ASIC analysis in 15nm shows that our design with 2048 function units running at 3 GHz outperforms GPU implementations of motion estimation and stereo vision on an Nvidia RTX2080Ti by 120X-210X, while occupying only 7.7% of the area.
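To make the core technique concrete, the following is a minimal sketch, not the paper's implementation, of MRF inference via MCMC with Gibbs sampling on a binary (Ising-like) grid model, applied to image denoising. Each pixel is resampled from its conditional distribution given its 4-connected neighbors and the noisy observation, and averaging the post-burn-in samples yields per-pixel marginal probabilities, the kind of uncertainty quantification the proposed accelerator logs. All parameter names and values (`coupling`, `obs_weight`, sweep counts) are illustrative choices, not taken from the paper.

```python
import numpy as np

def gibbs_mrf(noisy, n_sweeps=50, burn_in=10, coupling=1.0, obs_weight=1.5, seed=0):
    """Gibbs sampling for a binary Ising-like MRF (image denoising example).

    Pixels take values in {-1, +1}. Each pixel is resampled from its
    conditional given its 4-neighborhood (smoothness prior) and the
    observed noisy value (data term). Post-burn-in sample averages give
    per-pixel marginal probabilities P(x_ij = +1), i.e. UQ estimates.
    """
    rng = np.random.default_rng(seed)
    h, w = noisy.shape
    x = noisy.copy()
    marginal_sum = np.zeros((h, w))
    kept = 0
    for sweep in range(n_sweeps):
        for i in range(h):
            for j in range(w):
                # Sum of the 4-connected neighboring spins.
                s = 0.0
                if i > 0:     s += x[i - 1, j]
                if i < h - 1: s += x[i + 1, j]
                if j > 0:     s += x[i, j - 1]
                if j < w - 1: s += x[i, j + 1]
                # Local field combines the smoothness prior and data term;
                # the conditional P(x_ij = +1) is a logistic in this field.
                field = coupling * s + obs_weight * noisy[i, j]
                p_plus = 1.0 / (1.0 + np.exp(-2.0 * field))
                x[i, j] = 1 if rng.random() < p_plus else -1
        if sweep >= burn_in:
            marginal_sum += (x + 1) / 2  # accumulate P(x_ij = +1) estimate
            kept += 1
    return x, marginal_sum / kept

# Usage: denoise a small synthetic image (a +1 square on a -1 background
# with 10% of pixels flipped at random).
rng = np.random.default_rng(1)
clean = -np.ones((16, 16))
clean[4:12, 4:12] = 1
noisy = np.where(rng.random(clean.shape) < 0.1, -clean, clean)
denoised, p_plus = gibbs_mrf(noisy)
print("fraction of pixels recovered:", np.mean(denoised == clean))
```

The accelerator described in the abstract parallelizes exactly this kind of per-site update across many function units; the sequential double loop here is the bottleneck that motivates the tiled, near-memory design, and the `p_plus` marginal map is the UQ output whose logging the hybrid memory system supports.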
