Brain-inspired Computing based on emerging Non-volatile Memory Devices (IV) - Xin Zhang, Ph.D
1. Device-level characteristics of synaptic devices
1.1 Expected Features
(1) Nonlinearity and asymmetry in weight updating
Linearity in weight updating refers to the linearity of the curve relating device conductance to the number of identical programming pulses. Ideally, this relationship is linear and symmetric, so that the weights in the algorithm map directly onto the conductance of the device. In practice, however, the weight update of resistive synaptic devices is nonlinear: the conductance changes rapidly at the beginning of the process and then tends to saturate. It is also asymmetric, because the trajectory of the weight increase [long-term potentiation (LTP)] differs from that of the weight decrease [long-term depression (LTD)]. Figure 1(a) shows an example of the conductance of a TaOx/TiO2 device under identical programming pulses. This nonlinearity/asymmetry is undesirable because the change in weight (ΔW) depends on the current weight (W); in other words, weight updates are history dependent. Recent research results show that such nonlinearity/asymmetry degrades the learning accuracy of neural networks. Linearity can be improved by optimizing the programming scheme. As shown in Figure 1(b), identical pulse pairs (a larger pulse followed by a smaller pulse of reverse polarity) reduce the nonlinearity of the TaOx/TiO2 device. Non-identical pulses with different pulse widths reduce it further, as shown in Figure 1(c). From the point of view of the peripheral circuit, however, generating non-identical pulses requires substantial design effort, because the amplitude or pulse width must be calibrated by reading the current conductance state before each programming pulse is applied. This challenge makes it impractical to implement non-identical pulse programming schemes on a chip.
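The dependence of the weight change on the current weight can be illustrated with a toy saturating-exponential model of conductance versus pulse number. This is only a sketch under assumed parameters: the functional form, the names `conductance_ltp`, `conductance_ltd`, `ltp_step`, the normalized window `G_MIN`/`G_MAX`, the pulse count `N_PULSES`, and the nonlinearity factor `a` are all illustrative and are not fitted to the TaOx/TiO2 data.

```python
import math

G_MIN, G_MAX = 0.0, 1.0   # normalized conductance window (assumed)
N_PULSES = 100            # pulses needed to traverse the window (assumed)


def conductance_ltp(p, a):
    """Conductance after p identical potentiation pulses.

    a is a hypothetical nonlinearity factor measured in pulses:
    small a gives a fast early change that saturates, large a is nearly linear.
    """
    b = (G_MAX - G_MIN) / (1.0 - math.exp(-N_PULSES / a))
    return G_MIN + b * (1.0 - math.exp(-p / a))


def conductance_ltd(p, a):
    """Conductance after p identical depression pulses, starting from G_MAX.

    Mirror image of the LTP curve: it drops quickly, then saturates at G_MIN.
    """
    return G_MAX - (conductance_ltp(p, a) - G_MIN)


def ltp_step(p, a):
    """Conductance change produced by potentiation pulse number p + 1."""
    return conductance_ltp(p + 1, a) - conductance_ltp(p, a)


for a in (20.0, 1000.0):
    ratio = ltp_step(0, a) / ltp_step(N_PULSES - 1, a)
    print(f"a = {a:g}: first step / last step = {ratio:.2f}")
```

With the strongly nonlinear setting (`a = 20`) the first pulse changes the conductance more than a hundred times as much as the last one, whereas with `a = 1000` the two steps differ by only about 10%. In this toy model that ratio is what "ΔW depends on W" means in practice.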
It is worth noting that the nonlinearity/asymmetry of the weight update is a key problem mainly for online training, which requires smooth and continuous conductance adjustment. Off-line training can mask the nonlinearity through iterative programming with write-verify techniques.
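A write-verify loop of the kind used for off-line programming can be sketched as follows. The `ToyDevice` class, its state-dependent step size `ALPHA`, and the tolerance and pulse budget are hypothetical stand-ins for a real device and its read/program circuitry. Read noise and cycle-to-cycle variation are ignored.

```python
G_MIN, G_MAX = 0.0, 1.0   # normalized conductance window (assumed)
ALPHA = 0.02              # fraction of the remaining range covered per pulse (assumed)


class ToyDevice:
    """Hypothetical synaptic device with a nonlinear, saturating update."""

    def __init__(self, g=G_MIN):
        self.g = g

    def read(self):
        return self.g

    def potentiate(self):
        # Step shrinks as the conductance approaches G_MAX.
        self.g += ALPHA * (G_MAX - self.g)

    def depress(self):
        # Step shrinks as the conductance approaches G_MIN.
        self.g -= ALPHA * (self.g - G_MIN)


def write_verify(device, target, tol=0.02, max_pulses=1000):
    """Pulse the device until its conductance is within tol of target.

    Returns the number of programming pulses applied.
    """
    pulses = 0
    while abs(target - device.read()) > tol:
        if pulses >= max_pulses:
            raise RuntimeError("write-verify did not converge")
        if device.read() < target:
            device.potentiate()
        else:
            device.depress()
        pulses += 1
    return pulses


dev = ToyDevice()
n_up = write_verify(dev, 0.7)
n_down = write_verify(dev, 0.3)
print(f"pulses used: {n_up} up, {n_down} down; final g = {dev.read():.3f}")
```

Because every pulse is followed by a read, the loop reaches the target regardless of how uneven the individual steps are. The cost is extra reads and a variable number of pulses per weight, which is acceptable for a one-time off-line write but not for the frequent updates of online training.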
Figure 1. Weight updating behavior (conductance vs. number of pulses) of TaOx/TiO2 devices under different pulse schemes. Non-identical pulses with different pulse widths can reduce the nonlinearity/asymmetry, but complicate peripheral circuit design. The figure is taken from reference .
(2) Energy consumption of programming
In biological synapses, the estimated energy cost of each synaptic event is about 1-10 fJ (1 fJ = 10^-15 J). Most RRAM (resistive random-access memory) devices have a programming energy of about 100 fJ to 10 pJ (1 pJ = 10^-12 J), while most PCM (phase-change memory) devices have an even higher programming energy of 10 to 100 pJ. The fundamental challenge is that moving ions/defects in solid-state devices is much more difficult (and therefore requires much more energy) than moving calcium ions in the liquid environment of biological synapses. In biological synapses, the spike voltage is ~10 mV, the ionic current is ~1 nA, and the spike duration is ~1 ms, so the energy is about 10 fJ. In resistive synaptic devices, the typical programming voltage is ~1 V and the programming current is generally ~10 μA. Even if the programming pulse is shortened to ~100 ns, the energy is still on the order of 1 pJ. Further device engineering is therefore required to reduce energy consumption by increasing the programming speed to the ~ns range while maintaining analog incremental conductance tuning.
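The estimates above follow from treating each event as a rectangular pulse, E = V × I × t. The short sketch below reproduces the arithmetic with the order-of-magnitude values quoted in the text. The function names are illustrative, and the last calculation assumes that the programming voltage and current stay fixed while only the pulse width is shortened.

```python
def pulse_energy(voltage_v, current_a, duration_s):
    """Energy in joules of a rectangular pulse: E = V * I * t."""
    return voltage_v * current_a * duration_s


def duration_for_energy(energy_j, voltage_v, current_a):
    """Pulse width in seconds that yields energy_j at fixed voltage and current."""
    return energy_j / (voltage_v * current_a)


# Biological spike: ~10 mV, ~1 nA, ~1 ms
bio_j = pulse_energy(10e-3, 1e-9, 1e-3)
# Resistive device: ~1 V, ~10 uA, ~100 ns
rram_j = pulse_energy(1.0, 10e-6, 100e-9)
# Pulse width at which the resistive device would match the biological energy
t_match_s = duration_for_energy(bio_j, 1.0, 10e-6)

print(f"biological synapse: {bio_j / 1e-15:.1f} fJ")
print(f"resistive device:   {rram_j / 1e-12:.1f} pJ")
print(f"matching pulse width at 1 V, 10 uA: {t_match_s / 1e-9:.1f} ns")
```

The result is roughly 10 fJ for the biological event and 1 pJ for the resistive device. Under the fixed-voltage, fixed-current assumption, a pulse of about 1 ns would close the gap, which is why the text points to the ~ns range.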
(3) Retention and Endurance
In online training, the weights are updated frequently, so data retention requirements can be relaxed. Once training is complete, however, resistive synaptic devices should act as long-term memory, with a data retention time of approximately 10 years at the highest chip operating temperature (e.g. 85°C). The required endurance (number of cycles) is highly application dependent, since it is set by how many weight updates are needed during training. For a relatively simple task (e.g. MNIST handwritten digit recognition), 60,000 training images with 50 training epochs (repeated passes over the training set) give a maximum of 3×10^6 weight updates. Not every training cycle actually updates every synapse, so an endurance of ~10^4 cycles is sufficient for training on the MNIST data set. More challenging tasks, however, may require higher endurance. It should be noted that defining endurance cycles for resistive synaptic devices is tricky, because each weight update is usually a small incremental change in analog conductance and therefore differs from a complete switch between the on state and the off state in binary eNVM.
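The upper bound quoted above is a simple product, sketched below. The sketch assumes one update opportunity per training image (the `updates_per_image` parameter is hypothetical). The final ratio is only an arithmetic consequence of the two figures in the text, not a measured update rate.

```python
def max_weight_updates(n_images, n_epochs, updates_per_image=1):
    """Upper bound on updates seen by one synapse during training.

    Assumes every image presentation could update the synapse
    updates_per_image times (an illustrative simplification).
    """
    return n_images * n_epochs * updates_per_image


upper_bound = max_weight_updates(60_000, 50)   # MNIST: 60,000 images, 50 epochs
quoted_endurance = 1e4                         # ~10^4 cycles, as stated in the text
implied_fraction = quoted_endurance / upper_bound

print(f"upper bound on updates: {upper_bound:.1e}")
print(f"fraction of presentations that may update a synapse: {implied_fraction:.2%}")
```

Under these assumptions, an endurance of ~10^4 cycles corresponds to a given synapse being updated on roughly one in every 300 image presentations.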
(4) Uniformity and variability
Poor uniformity (i.e., high variability) of eNVMs is a major obstacle to digital memory applications. In contrast, neural networks are potentially robust to device variation. Two mechanisms allow device variation to be partially tolerated: the numerous (and therefore potentially redundant) connections that the synaptic array establishes between neuron nodes, and the iterative weight updating process of online training. The degree of variation that can be tolerated at the system level depends largely on the network architecture and on the accuracy required by the target application. Recent results show that different neural networks are reasonably robust to device variation. For off-line training with write-verify, however, the uniformity requirement is stricter, because the network cannot adapt itself during inference.
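The first mechanism, redundancy across many connections, can be illustrated with a toy Monte Carlo estimate. The sketch assumes a single neuron that sums `n_synapses` equal weights, each perturbed by independent Gaussian device variation of relative size `sigma`. The function name, the equal-weight setup, and the independence assumption are all illustrative and much simpler than a real network.

```python
import random
import statistics


def relative_error_of_sum(n_synapses, sigma, trials=500, seed=0):
    """Standard deviation of the relative error of a summed input.

    Each synapse has nominal weight 1 and an independent Gaussian
    deviation with standard deviation sigma (assumed variation model).
    """
    rng = random.Random(seed)
    ideal = float(n_synapses)
    errors = []
    for _ in range(trials):
        noisy = sum(1.0 + rng.gauss(0.0, sigma) for _ in range(n_synapses))
        errors.append((noisy - ideal) / ideal)
    return statistics.pstdev(errors)


err_few = relative_error_of_sum(4, sigma=0.1)
err_many = relative_error_of_sum(400, sigma=0.1)
print(f"4 synapses:   relative error ~ {err_few:.4f}")
print(f"400 synapses: relative error ~ {err_many:.4f}")
```

For independent variations the relative error of the sum shrinks roughly as sigma divided by the square root of the number of synapses, so the neuron with 400 inputs sees about one tenth of the error of the neuron with 4 inputs. Correlated variation, or variation in a few dominant weights, would not average out in this way.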
1.2 Representative material systems and device prototypes
Over the past few years, many candidate resistive synaptic devices with tens to hundreds of conductance states have been demonstrated at the single-device level. In addition to analog conductance tuning, biologically plausible behaviors such as short-term memory, paired-pulse facilitation, and spike-timing-dependent plasticity have been emulated in a variety of devices, including Ag/Ag2S-based CBRAM, Cu/Cu2S-based CBRAM, Ag/GeS2-based CBRAM, Ag/Ge30Se70-based CBRAM, Ag/SiOxNy-based CBRAM, TiOx-based OxRAM, HfOx-based OxRAM, WOx-based OxRAM, TaOx-based OxRAM, etc. So far, however, it is not clear how these biologically plausible characteristics benefit system-level computation, so the discussion that follows considers only the analog weight updating properties of devices that have been reported for implementing artificial neural networks. Figure 2 shows some representative key vendors that are developing eNVM-based memory technologies and preparing for "in-memory computing" (brain-like computing) applications.
Figure 2. Key vendors developing emerging memory technologies and preparing for "in-memory computing" (brain-like computing) applications.
P.-Y. Chen et al., "Mitigating effects of non-ideal synaptic device characteristics for on-chip learning," in Proc. IEEE/ACM Int. Conf. Comput.-Aided Design (ICCAD), Nov. 2015, pp. 194-199.
G. W. Burr, R. M. Shelby, C. D. Nolfo, J. W. Jang, R. S. Shenoy, and P. Narayanan, "Experimental demonstration and tolerancing of a large-scale neural network (165,000 synapses), using phase-change memory as the synaptic weight element," in IEDM Tech. Dig., 2014.
MNIST Handwritten Digits Dataset. [Online]. Available: http://yann.lecun.com/exdb/mnist/
D. Garbin et al., "Variability-tolerant convolutional neural network for pattern recognition applications based on OxRAM synapses," in IEDM Tech. Dig., 2014.
T. Ohno, T. Hasegawa, T. Tsuruoka, K. Terabe, J. K. Gimzewski, and M. Aono, "Short-term plasticity and long-term potentiation mimicked in single inorganic synapses," Nature Mater., vol. 10, pp. 591-595, Aug. 2011.
A. Nayak et al., "Controlling the synaptic plasticity of a Cu2S gap-type atomic switch," Adv. Funct. Mater., vol. 22, pp. 3606-3613, 2012.
M. Suri et al., "CBRAM devices as binary synapses for low-power stochastic neuromorphic systems: Auditory (cochlea) and visual (retina) cognitive processing applications," in IEDM Tech. Dig., 2012.
D. Mahalanabis, H. J. Barnaby, Y. Gonzalez-Velo, M. N. Kozicki, S. Vrudhula, and P. Dandamudi, "Incremental resistance programming of programmable metallization cells for use as electronic synapses," Solid-State Electron., vol. 100, pp. 39-44, Oct. 2014.
Z. Wang et al., "Memristors with diffusive dynamics as synaptic emulators for neuromorphic computing," Nature Mater., vol. 16, pp. 101-108, Mar. 2017.
K. Seo et al., "Analog memory and spike-timing-dependent plasticity characteristics of a nanoscale titanium oxide bilayer resistive switching device," Nanotechnology, vol. 22, no. 25, p. 254023, Jun. 2011.
S. Yu, Y. Wu, R. Jeyasingh, D. Kuzum, and H.-S. P. Wong, "An electronic synapse device based on metal oxide resistive switching memory for neuromorphic computation," IEEE Trans. Electron Devices, vol. 58, no. 8, pp. 2729-2737, Aug. 2011.
S. Ambrogio et al., "Neuromorphic learning and recognition with one-transistor-one-resistor synapses and bistable metal oxide RRAM," IEEE Trans. Electron Devices, vol. 63, no. 4, pp. 1508-1515, Apr. 2016.
T. Chang, S.-H. Jo, and W. Lu, "Short-term memory to long-term memory transition in a nanoscale memristor," ACS Nano, vol. 5, pp. 7669-7676, Sep. 2011.
C. Du, W. Ma, T. Chang, P. Sheridan, and W. D. Lu, "Biorealistic implementation of synaptic functions with oxide memristors through internal ionic dynamics," Adv. Funct. Mater., vol. 25, pp. 4290-4299, Jun. 2015.
S. Kim, C. Du, P. Sheridan, W. Ma, S. Choi, and W. D. Lu, "Experimental demonstration of a second-order memristor and its ability to biorealistically implement synaptic plasticity," Nano Lett., vol. 15, no. 3, pp. 2203-2211, 2015.