A Brief Introduction to The DianNao Project

Before the DianNao Project

In 2010, during a keynote at ISCA, Prof. Temam outlined that a remarkable convergence of trends in technology, applications, and machine learning was pointing to machine-learning accelerators as a very attractive scalability path for micro-architectures.

At ISCA in 2012, Prof. Temam proposed a first machine-learning accelerator design, showing that it was possible to achieve high performance with a small area and power footprint on a large set of neural-network-based applications. The main limitation of that accelerator was memory bandwidth.

The DianNao Project

The goal of the DianNao research project was to develop accelerator architectures for machine learning. The project was an academic collaboration between Prof. Yunji Chen (ICT) and Prof. Olivier Temam (Inria), within a joint ICT/Inria lab.

The academic collaboration between Prof. Temam and Prof. Chen started with the second accelerator, called DianNao (the first member of the DianNao family). This accelerator extended the ISCA 2012 design with local memories in order to capture the locality properties of deep neural networks and overcome memory bandwidth limitations. The design was published at ASPLOS in 2014 and received the Best Paper Award.

The second accelerator of the DianNao family was a multi-chip version of DianNao with two main goals: to show that machine-learning accelerators have excellent scalability properties thanks to the partitioning properties of neural network layers, and to aggregate enough on-chip memory capacity to store the whole machine-learning model, again overcoming memory bandwidth limitations. This design, called DaDianNao, was published at MICRO in 2014 and received the Best Paper Award.

As another way to overcome memory bandwidth limitations in embedded applications, we also showed that such machine-learning accelerators can be connected directly to a sensor, bypassing memory. We applied this approach to a vision sensor, leading to the design called ShiDianNao (the third member of the DianNao family), published at ISCA in 2015.

Finally, we also demonstrated that the application scope of such accelerators can be extended to multiple machine-learning algorithms because they share common primitives; the corresponding design, called PuDianNao (the fourth and final member of the DianNao family), was published at ASPLOS in 2015.

After the DianNao Project

Prof. Chen and his ICT team designed an Instruction Set Architecture (ISA) for a broad range of neural network accelerators, called Cambricon. This ISA design was published at ISCA 2016 and received the highest score in peer review.
