EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference
Description
This paper presents EdgeDRNN, a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) accelerator designed for portable edge computing. EdgeDRNN adopts the spiking-neural-network-inspired delta network algorithm to exploit temporal sparsity in RNNs, reducing off-chip memory access by up to 10x with tolerable accuracy loss. Experimental results on a 10 million parameter 2-layer GRU-RNN, with weights stored in DRAM, show that EdgeDRNN computes a full network update in under 0.5 ms. With 2.42 W wall plug power on an entry-level USB-powered FPGA board, it achieves latency comparable with a 92 W Nvidia 1080 GPU and outperforms the NVIDIA Jetson Nano, Jetson TX2 and Intel Neural Compute Stick 2 in latency by 6X. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall plug power efficiency that is over 4X higher than all other platforms.
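For intuition, the delta network algorithm propagates only those input and state changes that exceed a threshold, so weight-matrix columns belonging to unchanged elements need not be fetched from DRAM at all. The Python sketch below illustrates this idea under stated assumptions: the function name delta_matvec, the threshold value, and the toy dimensions are illustrative choices and do not reproduce EdgeDRNN's fixed-point hardware datapath.

```python
import numpy as np

def delta_matvec(W, x, x_last_sent, theta=0.1):
    """Delta-style sparse update of a matrix-vector product.

    Only components of x that changed by more than theta since they were
    last 'sent' contribute, so only the corresponding columns of W would
    need to be fetched from off-chip memory.
    """
    delta = x - x_last_sent
    active = np.abs(delta) >= theta           # temporally sparse update mask
    partial = W[:, active] @ delta[active]    # work scales with changed inputs
    x_new_sent = x_last_sent.copy()
    x_new_sent[active] = x[active]            # remember last transmitted values
    return partial, x_new_sent

# Toy usage: accumulate partial updates so acc tracks W @ x within theta
# per component while most timesteps touch only a few columns of W.
H, N = 128, 64
rng = np.random.default_rng(0)
W = rng.standard_normal((H, N)).astype(np.float32)
x_sent = np.zeros(N, dtype=np.float32)
acc = np.zeros(H, dtype=np.float32)
for t in range(100):
    x = np.sin(0.01 * t + np.arange(N)).astype(np.float32)  # slowly varying input
    partial, x_sent = delta_matvec(W, x, x_sent, theta=0.1)
    acc += partial
```

In a delta-GRU the same skipping rule is applied to the input and hidden-state vectors of each gate's matrix-vector products, which is where the reported reduction in off-chip memory access comes from.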
Additional details
- URL
- https://idus.us.es/handle/11441/143870
- URN
- urn:oai:idus.us.es:11441/143870
- Origin repository
- USE