AICAS 2020 Special Session: AI accelerators with Memristive Technology

Low Power In-Memory Implementation of Ternary Neural Networks with Resistive RAM-Based Synapse




<sup>1</sup>C2N, Univ. Paris-Saclay, CNRS, Palaiseau, France <sup>2</sup>IM2NP, Univ. Aix-Marseille et Toulon, CNRS, France <sup>3</sup>CEA, LETI, Grenoble, France

# Memristive technology is promising for Neuromorphic Computing



- Fast, non-volatile memory that can be embedded at the core of CMOS
- Memory state is the electrical resistance of the device (high or low)
- Many variations (oxide, phase change, magnetoresistive)
- In test production in industry (Samsung, TSMC, Intel, STMicroelectronics...)

#### However, device imperfections and variations remain a challenge

# Low precision Neural Networks





Previous work: implementation of Binarized weights






In this work: implementation of Ternarized weights

1. Hybrid CMOS/Resistive RAM experimental implementation of ternarized weights using a precharge sense amplifier in the low supply voltage regime
2. PyTorch simulations demonstrating that Ternarized Neural Networks consistently outperform Binarized Neural Networks
3. Demonstration of the network's **robustness to device imperfections** in the system

# Our kilobit array

Our die





Fabricated 130 nm RRAM/CMOS hybrid chip

HfO<sub>2</sub> RRAM

# **Background: two Resistive RAM devices as one synapse**

#### Peripheral circuit to differentiate resistance states






Hirtzlin et al., Frontiers in Neuroscience, 2020

**Device pairs** programmed in a complementary fashion reduce error rate without computation overhead

#### In this work: encode a third state with HRS/HRS





#### No memory overhead
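The two-device encoding with a third HRS/HRS state can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes that LRS/HRS stands for +1 and HRS/LRS for -1 (the slides do not say which polarity is which) and leaves LRS/LRS unused. The names `encode_weight` and `decode_pair` are hypothetical.

```python
# Illustrative mapping between ternary weights and the states of a
# two-device synapse.
# Assumption: (LRS, HRS) -> +1 and (HRS, LRS) -> -1; the polarity is an
# arbitrary choice made for this sketch.
LRS, HRS = "LRS", "HRS"

PAIR_FOR_WEIGHT = {
    +1: (LRS, HRS),
    -1: (HRS, LRS),
    0: (HRS, HRS),  # third state: both devices in high resistance
}
WEIGHT_FOR_PAIR = {pair: w for w, pair in PAIR_FOR_WEIGHT.items()}


def encode_weight(w):
    """Return the (left, right) device states that store ternary weight w."""
    return PAIR_FOR_WEIGHT[w]


def decode_pair(pair):
    """Return the ternary weight stored by a (left, right) device pair."""
    return WEIGHT_FOR_PAIR[pair]


weights = [+1, 0, -1, 0]
pairs = [encode_weight(w) for w in weights]
decoded = [decode_pair(p) for p in pairs]
```

Each weight still occupies exactly two devices, as in the binarized scheme, which is the sense in which the third state adds no memory overhead.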

# The Sense converges slowly when operated with low supply voltage



This work: leverage the speed of the Sense to store a new value




#### Implementation of ternary weights

Experimental data



No memory overhead and in a single Sense operation
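A toy behavioral model can illustrate how a single read might return three values. It is a sketch under strong assumptions, not the circuit measured here: each branch of the sense amplifier is reduced to an independent RC discharge, all component values are arbitrary, and the polarity (left device in LRS reads as +1) is chosen for illustration. It does not model why convergence slows at low supply voltage. It only shows that a read taken before the slow HRS/HRS case has resolved can be interpreted as a third value.

```python
import math

# Arbitrary illustration values, not measured device parameters.
R_LRS = 10e3     # ohms, low resistance state
R_HRS = 200e3    # ohms, high resistance state


def branch_voltage(v_dd, resistance, capacitance, t):
    """Voltage of a precharged node discharging through one device (RC model)."""
    return v_dd * math.exp(-t / (resistance * capacitance))


def read_ternary(r_left, r_right, v_dd=1.0, capacitance=100e-15,
                 t_read=3e-9, v_th_frac=0.5):
    """Sample both branches once at t_read and map the result to -1, 0 or +1."""
    v_th = v_th_frac * v_dd
    left_fell = branch_voltage(v_dd, r_left, capacitance, t_read) < v_th
    right_fell = branch_voltage(v_dd, r_right, capacitance, t_read) < v_th
    if left_fell and not right_fell:
        return +1
    if right_fell and not left_fell:
        return -1
    # Neither branch has discharged yet (HRS/HRS), or both have (LRS/LRS,
    # unused in this encoding): no winner, read as 0.
    return 0


readings = [
    read_ternary(R_LRS, R_HRS),
    read_ternary(R_HRS, R_LRS),
    read_ternary(R_HRS, R_HRS),
]
```

With these values the LRS branch has a 1 ns time constant and the HRS branch a 20 ns one, so a read at 3 ns separates the three cases.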

# A behavior magnified in the low supply voltage regime



This behavior is magnified in the low supply voltage regime of the Sense, which also benefits energy efficiency

# **Ternarized Neural Networks (TNNs) outperform Binarized Neural Networks**



**PyTorch simulations** 

# Device pairs programmed in the stochastic area create new errors

Experimental Distribution of the LRS and HRS



- SET compliance: 200 µA
- RESET voltage: 2.5 V
- Programming pulses: 100 µs

New type of error: 0 can be read as $\pm 1$
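One way to study this error type in simulation is to corrupt the stored zero weights at random before inference. The sketch below is an assumption about how such an injection could be written, not the procedure used for the results in these slides. The error probability is an arbitrary placeholder, not a measured rate, and `inject_zero_errors` is a hypothetical helper.

```python
import random


def inject_zero_errors(weights, p_error, rng):
    """Return a copy of `weights` in which each 0 is misread as +1 or -1
    with probability p_error. Weights equal to +1 or -1 are left untouched."""
    noisy = []
    for w in weights:
        if w == 0 and rng.random() < p_error:
            noisy.append(rng.choice([-1, +1]))
        else:
            noisy.append(w)
    return noisy


rng = random.Random(0)  # fixed seed so the run is reproducible
weights = [rng.choice([-1, 0, +1]) for _ in range(1000)]
noisy = inject_zero_errors(weights, p_error=0.1, rng=rng)  # placeholder rate
flipped = sum(1 for w, n in zip(weights, noisy) if w != n)
```

Evaluating a trained network with `noisy` in place of `weights` gives an estimate of how accuracy degrades as the misread probability grows.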

#### TNNs are resilient to errors due to device variability





TNNs still outperform BNNs when device errors are taken into account

- Impact of this work: best envisioned for **low-power**, **high-performing** dedicated hardware for **edge intelligence** (wireless sensors, medical applications...)
- Low supply voltage regimes can give room for new functionalities

- Device imperfection should be embraced rather than fought against

# Thank you for your attention!

Funding:





European Research Council Established by the European Commission

# However, device to device variation is a challenge



### Appealing for low precision neural networks

#### **Example of a BNN implementation**



## Where do errors come from?

Process Voltage Temperature variation analysis:

