
DLAU: A Scalable Deep Learning Accelerator Unit on FPGA

EasyChair Preprint no. 2206

5 pages. Date: December 19, 2019


Deep learning, an emerging field of machine learning, shows excellent ability in solving complex learning problems. However, networks are growing ever larger to meet the demands of practical applications, which poses a significant challenge for constructing high-performance implementations of deep neural networks. To improve performance while keeping power cost low, this paper presents DLAU, a scalable accelerator architecture for large-scale deep learning networks that uses an FPGA as the hardware prototype. The DLAU accelerator employs three pipelined processing units to improve throughput and uses tiling techniques to exploit locality in deep learning applications. Experimental results on a state-of-the-art Xilinx FPGA board demonstrate that the DLAU accelerator achieves up to 36.1x speedup compared with Intel Core2 processors, with a power consumption of 234 mW.
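The tiling idea mentioned in the abstract can be illustrated in software. The following is a minimal Python sketch, not the DLAU hardware design: the function name `tiled_layer`, the tile size, and the choice of a sigmoid activation are all assumptions made for illustration. It computes one fully connected layer by processing the inputs one tile at a time and accumulating partial sums, so that only a small chunk of inputs and weights is needed at any step.

```python
import math


def tiled_layer(weights, inputs, tile=4):
    """Compute sigmoid(W x) one tile of inputs at a time.

    weights: list of rows, one row per output neuron.
    inputs:  list of input activations.
    tile:    hypothetical tile size (inputs processed per step).
    """
    n_out = len(weights)
    partial = [0.0] * n_out  # running weighted sum for each output neuron

    # Walk over the input vector in fixed-size tiles.
    for start in range(0, len(inputs), tile):
        x_tile = inputs[start:start + tile]
        for j in range(n_out):
            # Matching slice of weights for this tile and output neuron.
            w_tile = weights[j][start:start + tile]
            partial[j] += sum(w * x for w, x in zip(w_tile, x_tile))

    # Apply the activation once all tiles have been accumulated.
    # Sigmoid is an assumption here, chosen only for illustration.
    return [1.0 / (1.0 + math.exp(-s)) for s in partial]


if __name__ == "__main__":
    W = [[0.1 * (i + j) for i in range(10)] for j in range(3)]
    x = [0.5 * i for i in range(10)]
    print(tiled_layer(W, x, tile=4))
```

The three stages in this sketch (tile multiply, partial-sum accumulation, activation) run one after another in software; in a pipelined design they could overlap, with one stage working on a tile while the next stage handles the previous result.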

Keyphrases: Accelerator, FPGA, Speedup

BibTeX entry
BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:
@Booklet{EasyChair:2206,
  author = {Vajja Paramesh},
  title = {DLAU: A Scalable Deep Learning Accelerator Unit on FPGA},
  howpublished = {EasyChair Preprint no. 2206},

  year = {EasyChair, 2019}}