EdgeDRNN: Enabling Low-latency Recurrent Neural Network Edge Inference

Gao, Chang; Ríos Navarro, José Antonio; Chen, Xi; Delbruck, Tobi; Liu, Shih-Chii

Published April 3, 2023 | Version v1

Publication Metadata-only

Contributors

Other:

Universidad de Sevilla. Departamento de Arquitectura y Tecnología de Computadores

This paper presents EdgeDRNN, a Gated Recurrent Unit (GRU) based recurrent neural network (RNN) accelerator designed for portable edge computing. EdgeDRNN adopts the spiking-neural-network-inspired delta network algorithm to exploit temporal sparsity in RNNs, reducing off-chip memory access by up to 10x with tolerable accuracy loss. Experimental results on a 10-million-parameter, 2-layer GRU-RNN with weights stored in DRAM show that EdgeDRNN computes the network in under 0.5 ms. Drawing 2.42 W of wall-plug power on an entry-level USB-powered FPGA board, it achieves latency comparable to that of a 92 W NVIDIA 1080 GPU, and it outperforms the NVIDIA Jetson Nano, Jetson TX2, and Intel Neural Compute Stick 2 by 6x in latency. For a batch size of 1, EdgeDRNN achieves a mean effective throughput of 20.2 GOp/s and a wall-plug power efficiency over 4x higher than that of all other platforms.
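The delta network idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' hardware design or code: the function name `delta_matvec_step`, the threshold value, and the toy matrix are all assumptions chosen for illustration. The sketch keeps a running matrix-vector product and, at each time step, touches only the weight columns whose input changed by more than a threshold, which is the mechanism by which temporal sparsity can reduce weight fetches.

```python
def delta_matvec_step(W, x, x_ref, acc, threshold):
    """One delta-style update of acc (an approximation of W @ x).

    W         : weight matrix as a list of rows (hypothetical toy data)
    x         : current input vector
    x_ref     : last value of each input that was actually propagated;
                updated in place
    acc       : running matrix-vector product; updated in place
    threshold : changes with magnitude <= threshold are ignored (assumed value)

    Returns the number of weight columns that had to be read.
    """
    columns_fetched = 0
    for j, value in enumerate(x):
        delta = value - x_ref[j]
        if abs(delta) <= threshold:
            continue  # input barely changed: skip column j entirely
        x_ref[j] = value  # remember the value that was propagated
        for i in range(len(W)):
            acc[i] += W[i][j] * delta
        columns_fetched += 1
    return columns_fetched


# Toy usage: two time steps over a 2x3 weight matrix.
W = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x_ref = [0.0, 0.0, 0.0]
acc = [0.0, 0.0]
inputs = [[1.0, 1.0, 1.0],
          [1.0, 1.05, 2.0]]  # second step: only the last input changes much
fetched = [delta_matvec_step(W, x, x_ref, acc, 0.1) for x in inputs]
print(fetched)  # [3, 1]: the second step reads one column instead of three
print(acc)      # [9.0, 21.0], versus the exact product [9.1, 21.25]
```

The small gap between the accumulated and exact products in the second step reflects the accuracy cost of thresholding, which corresponds to the "tolerable accuracy loss" noted in the abstract.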

Additional details

URL: https://idus.us.es/handle//11441/143870
URN: urn:oai:idus.us.es:11441/143870
