This application claims foreign priority to European Patent Application No. EP 16199877.8, filed Nov. 21, 2016, the content of which is incorporated by reference herein in its entirety.
The disclosed technology generally relates to machine learning, and more particularly to integration of basic machine learning kernels in a semiconductor device.
Neural networks (NNs) are classification techniques used in the machine learning domain. Typical examples of such classifiers include multi-layer perceptrons (MLPs) or convolutional neural networks (CNNs).
Neural network (NN) architectures comprise layers of “neurons” (which are basically multiply-accumulate units), the weights that interconnect them, and particular layers used for various operations, such as normalization or pooling. As such, the algorithmic foundations for these machine learning objects are well established.
The computation involved in training or running these classifiers has been facilitated using graphics processing units (GPUs) or customized application-specific integrated circuits (ASICs), for which dedicated software flows have been extensively developed.
Some software approaches have suggested the use of NNs, e.g., MLPs or CNNs, with binary weights and activations, showing minimal accuracy degradation on state-of-the-art classification benchmarks. The goal of such approaches is to enable neural network GPU kernels with a smaller memory footprint and higher performance, given that the data structures exchanged to and from the GPU are aggressively reduced. However, these approaches have not demonstrated that they can efficiently reduce the high energy involved in each classification run on a GPU, e.g., the leakage energy component associated with the storage of the NN weights. A benefit of restricting weights and activations to two possible values each (either +1 or −1) is that the multiply-accumulate operation (i.e., dot-product) that is typically encountered in NNs boils down to a popcount of element-wise XNOR or XOR operations.
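By way of illustration only (the following Python sketch is not part of the original disclosure, and its data and vector length are arbitrary), the reduction of a ±1 dot-product to a popcount of element-wise XNOR results can be verified as follows:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1024
a = rng.choice([-1, 1], size=n)   # binary activations
w = rng.choice([-1, 1], size=n)   # binary weights

a_bits = (a > 0).astype(np.uint8)   # map -1 -> 0, +1 -> 1
w_bits = (w > 0).astype(np.uint8)
xnor = 1 - (a_bits ^ w_bits)        # XNOR = NOT XOR, per element
popcount = int(xnor.sum())          # number of matching positions

# Each match contributes +1 to the dot-product, each mismatch -1:
assert int(np.dot(a, w)) == 2 * popcount - n
```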
As used herein, a dot-product or a scalar product is an algebraic operation that takes two equal-length sequences of numbers and returns a single number. A dot-product is very frequently used as a basic mathematical NN operation. At least at the inference phase (i.e., not during training), a wide range of machine learning implementations (e.g., MLPs or CNNs) can be decomposed to layers of dot-product operators, interleaved with simple arithmetic operations. Most of these implementations pertain to the classification of raw data (e.g., the assignment of a label to a raw data frame).
Dot-product operations are typically performed between values that depend on the NN input (e.g., a frame to be classified) and constant operands. The input-dependent operands are sometimes referred to as “activations.” For the case of MLPs, the constant operands are the weights that interconnect two MLP layers. For the case of CNNs, the constant operands are the filters that are convolved with the input activations, or the weights of the final fully connected layer. A similar observation holds for the simple arithmetic operations that are interleaved with the dot-products in the classifier: for example, normalization is a mathematical operation between the outputs of a hidden layer and constant terms that are fixed after training of the classifier.
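As a minimal sketch of this decomposition (the function and parameter names below are assumptions for illustration, not a disclosed implementation), an MLP inference pass can be written as alternating dot-product stages and simple arithmetic with constants fixed after training:

```python
import numpy as np

def mlp_inference(x, layers):
    # layers: list of (W, mu, gamma, sigma_inv, beta) tuples; every entry
    # is a constant fixed after training (hypothetical parameter names)
    for W, mu, gamma, sigma_inv, beta in layers:
        x = W @ x                                # dot-product stage
        x = gamma * (x - mu) * sigma_inv + beta  # interleaved normalization
        x = np.where(x >= 0, 1, -1)              # binarizing non-linearity
    return x  # e.g., the index of the maximum output gives the label
```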
It is an object of the disclosed technology to reduce energy requirements of classification operations.
The above objective is accomplished by a semiconductor cell, an array of semiconductor cells and a method of using at least one array of semiconductor cells, according to embodiments of the disclosed technology.
In a first aspect, the disclosed technology provides a semiconductor cell for performing a logic XNOR or XOR operation. The semiconductor cell comprises: a memory unit for storing a first operand; an input port for receiving a second operand; a switching unit, communicatively coupled to the memory unit and the input port, for implementing the XNOR or XOR operation on the first and second operands; and a readout port for transferring an output of the XNOR or XOR operation.
In a semiconductor cell according to embodiments of the disclosed technology, the switching unit may be arranged for being provided with both the stored first operand and a complement of the stored first operand and further with the received second operand and a complement of the received second operand to perform the logic operation. In such embodiments, the memory unit may comprise a first memory element and a second memory element, for storing the first operand and for storing the complement of the first operand, respectively.
In a semiconductor cell according to embodiments of the disclosed technology, the switching unit may comprise a first switch and a second switch for being controlled by the received second operand and the complement of the received second operand, respectively. Furthermore, each of the stored first operand and the complement of the stored first operand may be switchably connected through one of the first or second switch to a common node that is coupled to the readout port.
In a semiconductor cell according to embodiments of the disclosed technology, the memory unit may be a non-volatile memory unit. In particular embodiments, the non-volatile memory unit may comprise non-volatile memory elements supporting multi-level readout.
In a semiconductor cell according to embodiments of the disclosed technology, the switching unit may be implemented using vertical transistors, i.e., transistors which have a channel perpendicular to the wafer substrate, such as vertical field effect transistors (vFETs), vertical nanowires, or vertical nanosheets.
In a second aspect, the disclosed technology provides an array of cells logically organized in rows and columns, wherein the cells are semiconductor cells according to embodiments of the first aspect of the disclosed technology.
In embodiments of the disclosed technology, the array may furthermore comprise word lines and read bit lines, wherein the word lines are configured for delivering second operands to the input ports of the semiconductor cells, and wherein each read bit line is configured for receiving the outputs of the XNOR or XOR operations from the readout ports of the cells in the array connected to that read bit line.
An array according to embodiments of the disclosed technology may furthermore comprise a sensing unit shared between different cells of the array, for instance a sensing unit shared between different cells of a column of the array, such as between all cells of a column of the array.
An array according to embodiments of the disclosed technology may furthermore comprise a pre-processing unit for creating the second operand for at least one of the semiconductor cells in the array, e.g., for receiving a signal, and for creating therefrom the second operand.
In embodiments of the disclosed technology, the readout port of at least one semiconductor cell from at least one row and at least one column of the array may be read by at least one sensing unit configured to distinguish between at least two levels of a readout signal at the readout port of the at least one read semiconductor cell. The distinguishing between a plurality of levels of the readout signal may for instance be done by comparing the level of the readout signal with a plurality of reference signals.
An array according to embodiments of the disclosed technology may furthermore comprise at least one post-processing unit, for implementing at least one logical operation on at least one value read out of the array.
An array according to embodiments of the disclosed technology may furthermore comprise allocation units for allocating subsets of the array to nodes of a directed graph.
In a third aspect, the disclosed technology provides a set comprising a plurality of arrays according to embodiments of the second aspect, wherein the arrays are connected to one another in a directed graph. The arrays form the nodes of the directed graph.
In a set according to embodiments of the disclosed technology, the arrays may be statically connected according to a directed graph. Alternatively, the arrays may be dynamically reconfigurable, in which case the set may furthermore comprise intermediate routing units for reconfiguring connectivity between the arrays in the directed graph.
In a fourth aspect, the disclosed technology provides a 3D-array comprising at least two arrays according to any embodiment of the disclosed technology, wherein the semiconductor cells of the respective arrays are physically stacked in layers one on top of the other. Different ways of stacking are possible, such as for example wafer stacking, monolithic processing of transistors on the same wafer, provision of an interposer, etc.
In a fifth aspect, the disclosed technology provides a method of using at least one array of semiconductor cells according to embodiments of the second aspect, for the implementation of a neural network. The method comprises storing layer weights as the first operands of each of the semiconductor cells, and providing layer activations as the second operands of each of the semiconductor cells.
In a specific method according to embodiments of the disclosed technology, for the implementation of an MLP, the first operands are weights that interconnect two MLP layers and the second operands are input-dependent activations.
In a specific method according to embodiments of the disclosed technology, for the implementation of a CNN, the first operands are filters that are convolved with the second operands, which are input-dependent activations.
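As a behavioral sketch of this mapping (the class and method names below are illustrative assumptions, not a disclosed hardware interface), the weights can be viewed as programmed once into an array while activations are streamed in per classification:

```python
import numpy as np

class XnorArray:
    """Hypothetical model: first operands (weights) stored in the cells,
    second operands (activations) applied via the word lines."""
    def __init__(self, weights):  # weights in {-1, +1}, shape (rows, cols)
        self.w_bits = (weights > 0).astype(np.uint8)  # programmed once

    def dot_products(self, activations):  # activations in {-1, +1}, length rows
        a_bits = (activations > 0).astype(np.uint8)[:, None]
        xnor = 1 - (self.w_bits ^ a_bits)   # in-cell XNOR, one value per cell
        popcount = xnor.sum(axis=0)         # per-column popcount (read bit lines)
        return 2 * popcount - self.w_bits.shape[0]  # integer dot-products
```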
A method according to embodiments of the disclosed technology may use, for the implementation of the neural network, arrays of semiconductor cells forming at least an input layer, an output layer, and at least one intermediate layer. The method may further comprise performing one or more algebraic operations on values of the at least one intermediate layer of the implemented NN, for instance including, but not limited to, normalization, pooling, and non-linearity operations.
In a sixth aspect, the disclosed technology provides a method of operating a neural network, implemented by at least one array of semiconductor cells according to embodiments of the second aspect of the disclosed technology, wherein operating the neural network is done in a clocked regime, the XNOR or XOR operation within a semiconductor cell of the at least one array being completed within one or more clock cycles.
Particular and preferred aspects of the invention are set out in the accompanying independent and dependent claims. Features from the dependent claims may be combined with features of the independent claims and with features of other dependent claims as appropriate and not merely as explicitly set out in the claims.
For purposes of summarizing the invention and the advantages achieved over the prior art, certain objects and advantages of the invention have been described herein above. Of course, it is to be understood that not necessarily all such objects or advantages may be achieved in accordance with any particular embodiment of the invention. Thus, for example, those skilled in the art will recognize that the invention may be embodied or carried out in a manner that achieves or optimizes one advantage or group of advantages as taught herein without necessarily achieving other objects or advantages as may be taught or suggested herein.
The above and other aspects of the invention will be apparent from and elucidated with reference to the embodiment(s) described hereinafter.
The invention will now be described further, by way of example, with reference to the accompanying drawings, in which:
The drawings are only schematic and are non-limiting. In the drawings, the size of some of the elements may be exaggerated and not drawn to scale for illustrative purposes. The dimensions and the relative dimensions do not necessarily correspond to actual reductions to practice of the invention.
Any reference signs in the claims shall not be construed as limiting the scope.
In the different drawings, the same reference signs refer to the same or analogous elements.
The disclosed technology will be described with respect to particular embodiments and with reference to certain drawings, but the invention is not limited thereto; it is limited only by the claims.
The terms first, second and the like in the description and in the claims, are used for distinguishing between similar elements and not necessarily for describing a sequence, either temporally, spatially, in ranking or in any other manner. It is to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other sequences than described or illustrated herein.
Moreover, directional terminology such as top, bottom, front, back, leading, trailing, under, over and the like in the description and the claims is used for descriptive purposes with reference to the orientation of the drawings being described, and not necessarily for describing relative positions. Because components of embodiments of the disclosed technology can be positioned in a number of different orientations, the directional terminology is used for purposes of illustration only, and is in no way intended to be limiting, unless otherwise indicated. It is, hence, to be understood that the terms so used are interchangeable under appropriate circumstances and that the embodiments of the invention described herein are capable of operation in other orientations than described or illustrated herein.
It is to be noticed that the term “comprising”, used in the claims, should not be interpreted as being restricted to the means listed thereafter; it does not exclude other elements or steps. It is thus to be interpreted as specifying the presence of the stated features, integers, steps or components as referred to, but does not preclude the presence or addition of one or more other features, integers, steps or components, or groups thereof. Thus, the scope of the expression “a device comprising means A and B” should not be limited to devices consisting only of components A and B. It means that with respect to the disclosed technology, the only relevant components of the device are A and B.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosed technology. Thus, appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment, but may be. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to one of ordinary skill in the art from this disclosure, in one or more embodiments.
Similarly it should be appreciated that in the description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Furthermore, while some embodiments described herein include some but not other features included in other embodiments, combinations of features of different embodiments are meant to be within the scope of the invention, and form different embodiments, as would be understood by those in the art. For example, in the following claims, any of the claimed embodiments can be used in any combination.
It should be noted that the use of particular terminology when describing certain features or aspects of the invention should not be taken to imply that the terminology is being re-defined herein to be restricted to include any specific characteristics of the features or aspects of the invention with which that terminology is associated.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the invention may be practiced without these specific details. In other instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
In embodiments of the disclosed technology, semiconductor cells are logically organized in rows and columns. Throughout this description, the terms “horizontal” and “vertical” (related to the terms “row” and “column”, respectively) are used to provide a co-ordinate system and for ease of explanation only. They do not need to, but may, refer to an actual physical direction of the device. Furthermore, the terms “column” and “row” are used to describe sets of array elements, in particular in the disclosed technology semiconductor cells, which are linked together. The linking can be in the form of a Cartesian array of rows and columns; however, the disclosed technology is not limited thereto. As will be understood by those skilled in the art, columns and rows can be easily interchanged and it is intended in this disclosure that these terms be interchangeable. Also, non-Cartesian arrays may be constructed and are included within the scope of the invention. Accordingly, the terms “row” and “column” should be interpreted widely. To facilitate this wide interpretation, the claims refer to cells being logically organized in rows and columns. By this is meant that sets of semiconductor cells are linked together in a topologically linear intersecting manner; however, the physical or topographical arrangement need not be so. For example, the rows may be circles and the columns radii of these circles, and the circles and radii are described in this invention as “logically organized” rows and columns. Also, specific names of the various lines, e.g., word line and bit line, are intended to be generic names used to facilitate the explanation and to refer to a particular function, and this specific choice of words is not intended to in any way limit the invention. It should be understood that all these terms are used only to facilitate a better understanding of the specific structure being described, and are in no way intended to limit the invention.
For the technical description of embodiments of the disclosed technology, the design enablement may be described in the context of a multi-layer perceptron (MLP) with binary weights and activations. It will be appreciated, however, that a similar description is valid, although it may not be written out in detail, for convolutional neural networks (CNNs), with an appropriate reordering of logic units and with the memory unit storing binary filter values instead of binary weight values.
In the following, various embodiments relating to a semiconductor cell for performing one or more logic operations, e.g., an XNOR and/or an XOR operation, between a first and a second operand, are disclosed. While some embodiments may be described with respect to a discrete cell, it will be appreciated that they can be implemented in an array of semiconductor cells, in a set comprising a plurality of such arrays, and in a method of using at least one array of semiconductor cells for the implementation of a neural network.
In a first aspect, the disclosed technology relates to a semiconductor cell 100, as illustrated in the accompanying drawings. The semiconductor cell 100 comprises a memory unit 101 for storing a first operand, and an input port unit 102 for receiving a second operand.
A semiconductor cell 100 according to embodiments of the disclosed technology further comprises a switch unit 103, communicatively coupled to the memory unit 101 and the input port unit 102, configured for implementing the XNOR and/or the XOR operation on the stored first operand and the received second operand, and a readout port 104 for transferring an output of the XNOR or XOR operation.
The signal at the readout port 104 can be buffered and/or inverted to achieve the desired logic function (XOR instead of XNOR, or vice versa, by inverting).
In embodiments of the disclosed technology, the memory unit 101 can be a non-volatile memory unit, comprising one or more non-volatile memory elements, such as for instance, but not limited thereto, magnetic tunneling junction (MTJ), magnetic random access memory (MRAM), oxide-based resistive random access memory (OxRAM), vacancy-modulated conductive oxide (VMCO) memory, phase change memory (PCM) or conductive bridge random access memory (CBRAM) memory elements, to name a few. In alternative embodiments, the memory unit 101 can be a volatile memory unit, comprising one or more volatile memory elements, such as for instance, but not limited thereto, MOS-type memory elements, e.g., CMOS-type memory elements.
In the embodiment illustrated in the accompanying drawings, the memory unit 101 comprises a first memory element 105 and a second memory element 106, for storing the first operand and the complement of the first operand, respectively.
The switch unit 103 is a logic component which, in the embodiment illustrated, comprises a first switch 107 for being controlled by the received second operand A, and a second switch 108 for being controlled by a complement Abar of the received second operand. Both the second operand A and the complement Abar may be received. Alternatively, the second operand A may be received, and the complement Abar may be generated therefrom. The second operand may be an external binary activation. The first and second switches 107, 108 may be transistors, for instance field effect transistors (FETs). In particular embodiments, the switches may be vertical transistors, such as for instance vertical FETs. As described herein, vertical FETs refer to FETs in which the channel current flows in a vertical direction, i.e., normal to the substrate. By means of the first and second switches 107, 108, each of the stored first operand and the complement of the stored first operand is switchably connected to a common node that is coupled to the readout port 104, 404. The input-dependent binary activation A and its complement Abar are applied accordingly as voltage pulses at the transistor gate nodes. This implements the XOR or XNOR function.
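The switching behavior can be summarized in a small behavioral model (a sketch only; which switch gates the stored operand versus its complement determines whether XNOR or XOR is obtained, in line with the buffering/inversion remark above):

```python
def cell_output(w, a):
    """Behavioral sketch: the switch driven by A connects the stored
    operand w, and the switch driven by Abar connects its complement,
    to the common readout node. The opposite pairing, or inverting the
    readout signal, yields XOR instead of XNOR."""
    w_bar = 1 - w
    return w if a == 1 else w_bar  # exactly one switch conducts

# Exhaustive check: the common node carries XNOR(w, a) = NOT (w XOR a)
for w in (0, 1):
    for a in (0, 1):
        assert cell_output(w, a) == 1 - (w ^ a)
```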
In particular embodiments, the first and second switches 107, 108 of the semiconductor cells 100, 400 may be vertical FETs. The memory elements 105, 106 may be formed vertically above the vertical FETs, as illustrated in the accompanying drawings.
In some embodiments, the first and second switches 107, 108 may be n-type transistors, of which the sources may be connected to a conductive plane 901 that is grounded, as illustrated in the accompanying drawings.
Using a sense unit 201, as illustrated in the accompanying drawings, the signal at the readout port 104 can be read out.
In particular embodiments, the signal is a current signal, and a load resistance 209 may be used to enable readout of the XNOR signal as a voltage signal. This voltage can be measured at the readout port 104 and sensed in any suitable way; for instance, by using a sense amplifier 210, the output can be latched by any suitable latch element 211 to a final output node 212. The load resistance 209 can be any suitable type of resistance, such as for instance a pull-up resistance, a pull-down resistance, an active resistor, or a passive resistor.
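For intuition only, the readout through a load resistance can be modeled as a resistive divider; the supply voltage and resistance values below are assumptions for illustration and do not correspond to any measured device:

```python
# Hypothetical numbers: readout modeled as a resistive divider between
# the load resistance 209 and the selected memory element (e.g., an MTJ).
VDD = 0.8                  # supply voltage in volts (assumed)
R_LOAD = 10e3              # load resistance in ohms (assumed)
R_LRS, R_HRS = 5e3, 25e3   # illustrative low/high resistive states

def readout_voltage(r_cell):
    # Voltage at the readout port for the conducting branch
    return VDD * r_cell / (r_cell + R_LOAD)

v_low, v_high = readout_voltage(R_LRS), readout_voltage(R_HRS)
print(f"LRS -> {v_low:.3f} V, HRS -> {v_high:.3f} V")
# A sense amplifier threshold placed between the two levels recovers the bit.
```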
Alternatively, rather than a voltage, a current can be measured at the readout port 104 and sensed in any suitable way, for instance by using a transimpedance amplifier. The current signal at the readout port 104 can be brought to a final output node 212, where it can be converted into a voltage signal.
It is an advantage of embodiments of the disclosed technology that a “wired OR” operation is present in the non-volatile implementation of the semiconductor cells according to the disclosed technology, for instance in the non-volatile memory case illustrated in the accompanying drawings.
In other embodiments, as illustrated in the accompanying drawings, the semiconductor cell 400 comprises a volatile memory unit, for instance comprising CMOS-type memory elements.
Semiconductor cells 100, 400 according to embodiments of the disclosed technology can be used in the implementation of a neural network (NN). To this end, the semiconductor cells 100, 400 are organized in an array, in which they are logically organized in rows and columns. The array may comprise word lines and bit lines, wherein the word lines, for instance running horizontally, are configured for delivering second operands to the input ports of the semiconductor cells, and wherein the bit lines, for instance running vertically, are configured for receiving the outputs of the XNOR or XOR operations from the readout ports. Preferably, the array comprises more than one column and more than one row of semiconductor cells.
It is an advantage of an array of semiconductor cells according to embodiments of the disclosed technology that it reduces energy consumption of classification operations, by letting input-dependent values (NN activations) flow through arrays of pre-trained binary weights, with arithmetic operations performed as close to their operands as possible.
A sense unit 201, for instance comprising a load resistance 209, may be provided in each semiconductor cell 100, 400 for readout of the logic operation implemented in the cell. Alternatively, not illustrated in the drawings, a sense unit, for instance comprising a load resistance, may be shared between a number of semiconductor cells 100 defined at design time (e.g., but not limited thereto, among all cells in a column).
The signal, e.g., current or voltage, at the readout port 104 can be sensed using a sense unit 201, such as for instance, but not limited thereto, the one disclosed in S. Cosemans, W. Dehaene and F. Catthoor, “A 3.6 pJ/access 480 MHz, 128 Kbit on-chip SRAM with 850 MHz boost mode in 90 nm CMOS with tunable sense amplifiers to cope with variability,” in Proceedings of the 34th European Solid-State Circuits Conference (ESSCIRC 2008), 2008. The relevant disclosure associated with the sense amplifier in Cosemans et al. is incorporated herein in its entirety. A representative schematic is illustrated in the accompanying drawings.
Generally, sensing units 201 may be shared among multiple semiconductor cells 100. For instance, in a typical memory, multiple columns use the same sense amplifier. This can be configured at design time, based on the semiconductor cell array dimensions.
In particular embodiments of an array of the disclosed technology, as illustrated in the accompanying drawings, at least two arrays of semiconductor cells are physically stacked in layers one on top of the other, forming a 3D structure.
The semiconductor cells of each array in the 3D structure comprise memory units which may be laid out in a memory unit layer, and switch units which may be laid out in a switch layer, e.g., a FET layer, according to embodiments. The sequence of layers in a 3D structure can be, but does not need to be, as illustrated in the accompanying drawings.
As an example, a binarized neural network (BNN) software implementation (Courbariaux et al., CoRR 2016, https://arxiv.org/abs/1602.02830) is considered. Multiplication between a binary activation x and a binary weight w is performed on the cell of the disclosed technology as a logic XNOR operation.
The semiconductor cell 100 suitable for implementing a binary multiplication leverages the equivalence between the numerical values of the BNN software assumptions as in the Courbariaux paper mentioned above (−1/+1), the logical values of digital logic (0/1), the resistance values of the MTJs (low resistive state (LRS)/high resistive state (HRS)) and the angle of the (out-of-plane) magnetization of the MTJ's free layer. The two MTJs 105, 106 of the cell 100 hold the binary weight value w and its complement wbar, respectively.
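This equivalence can be checked exhaustively. The mapping below is a sketch; the correspondence with the MTJ resistance states is stated in the comments rather than modeled electrically:

```python
# Correspondence leveraged by the cell:
#   numeric BNN values (-1/+1)  <->  logic values (0/1)
#   logic values (0/1)          <->  MTJ states (LRS/HRS), by convention
to_logic = {-1: 0, +1: 1}
for w in (-1, +1):
    for x in (-1, +1):
        product = w * x                           # BNN binary multiplication
        xnor = 1 - (to_logic[w] ^ to_logic[x])    # in-cell logic operation
        assert to_logic[product] == xnor
```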
The respective SPICE simulation output can be seen in the accompanying drawings.
From the above example, it can be seen how the XNOR cell 100 can operate within well-established memory designs. It will be appreciated that the complementarity of the activation signals x1 and x1bar ensures that exactly one of the two stored values is connected to the readout port at any time, which is what implements the XNOR function.
With proper signaling of the word lines 1350, it is possible to route multiple readout values (from more than one read semiconductor cell) to the sense unit 1320, which should be designed to distinguish between the applicable input combinations.
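A behavioral sketch of such multi-level sensing follows (with an assumed unit read current and idealized reference levels; an actual sense-amplifier design is not modeled):

```python
# Several cells share one read bit line; their XNOR outputs accumulate,
# and the sense unit compares the aggregate against reference levels.
I_UNIT = 1.0  # assumed per-cell read current for a logic 1 (arbitrary units)

def sense_multilevel(xnor_outputs):
    i_total = I_UNIT * sum(xnor_outputs)  # summed bit-line current
    references = [I_UNIT * (k + 0.5) for k in range(len(xnor_outputs))]
    return sum(i_total > ref for ref in references)  # recovered popcount

assert sense_multilevel([1, 0, 1, 1]) == 3
```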
A NN-style classifier has a wide range of operands that remain constant during inference (classification). It is hence an advantage of semiconductor cells 100, 400 according to embodiments of the disclosed technology, and more particularly of such semiconductor cells 100, 400 arranged in an array 500, that such operands can be stored locally (in the memory unit 101, 401), while input-dependent activations can be routed to specific points of the classifier implementation, where computation takes place. Additionally, novel algorithmic flavors of NN-style classifiers are based on binary weights/filters and activations, further reducing the memory requirements of a software classifier implementation. In accordance with this trend, embodiments of the disclosed technology propose in-place operations for the dot-product stages of a classifier and post-processing units, such as for instance simple logic, to interconnect classifier layers with simple math operations, as graphically illustrated in the accompanying drawings.
In other embodiments, a traditional latching circuit may be used. In other embodiments, the dot-product layers can be mapped on an array of memory elements, whereby the control of each layer and any required mathematical operation is implemented outside the array in dedicated control units. In particular uses of a system according to embodiments of the disclosed technology, dot-product layers can be used to implement partial products of an extended mathematical operation, the partial products being reconciled in the peripheral control units of the memory element array.
One idea is to use the present system during inference, with weights and hyperparameters (such as μ, γ, σ′, and β) fixed after an offline training session. In the implementation illustrated in the accompanying drawings, the binary weights are stored in the memory units of the semiconductor cells of an array, while the input-dependent activations are applied via the word lines.
The basic advantage of an implementation such as the above is that each semiconductor cell 100, 400 according to embodiments of the disclosed technology in a column produces one of the addends of the dot-product, namely an individual binary multiplication. Assuming that binary weights and activations take the values +1 and −1, and given their logical mapping to 1 and 0, the dot-product requires a popcount of the +1 (1 in logic) values across the semiconductor cells that contribute to the dot-product. This results in an integer value, which is the scalar activation of the respective neural network neuron. In these classifiers, neuron inputs are generally normalized and pass through a final nonlinearity (computing a non-linear activation function f(x), where x is the sum of XNOR operations of one or more columns of the array of cells) before being forwarded to the next layer of the neural network (either MLP or CNN). Examples of non-linear functions used in machine learning are, without being limited thereto, sigmoid, tanh, and the rectified linear unit (ReLU), among others.
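Putting the per-column steps together, a single neuron can be sketched as follows (an algorithmic model only, not the hardware; the parameter names follow the normalization constants introduced above):

```python
import numpy as np

def neuron(xnor_column, mu, gamma, sigma_inv, beta):
    """Sketch: popcount of one column's XNOR outputs -> integer
    activation -> normalization -> binarizing non-linearity."""
    n = len(xnor_column)
    x = 2 * int(np.sum(xnor_column)) - n      # scalar dot-product value
    y = gamma * (x - mu) * sigma_inv + beta   # normalization (trained constants)
    return 1 if y >= 0 else -1                # sign non-linearity
```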
A logic unit according to embodiments of the disclosed technology may implement the normalization, using trained parameters μ, γ, σ′, and β. Generally, the operation applied to the popcount output is of a double-precision type and implements the following calculation, where x is the dot-product output:

y = γ·(x − μ)·σ′ + β
In accordance with embodiments of the disclosed technology, the following data type refinements may be implemented in order to reduce the complexity of the logic units that stand between neural network layers. These are organized according to the accompanying figures.
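As one example of such a refinement (a well-known simplification for binarized networks, given here as a sketch rather than as the disclosed refinement), the double-precision normalization followed by a sign non-linearity collapses into a single integer threshold comparison:

```python
# Since x is an integer and only the sign of the normalized value is
# kept, y = gamma*(x - mu)*sigma_inv + beta >= 0 reduces to comparing
# x against a precomputed threshold tau.
def refine(mu, gamma, sigma_inv, beta):
    tau = mu - beta / (gamma * sigma_inv)  # assumes gamma*sigma_inv != 0
    if gamma * sigma_inv > 0:
        return lambda x: 1 if x >= tau else -1
    return lambda x: 1 if x <= tau else -1  # inequality flips for negative scale

# Check against the full double-precision formula (illustrative constants)
mu, gamma, sigma_inv, beta = 3.2, 0.9, 0.5, -0.7
f = refine(mu, gamma, sigma_inv, beta)
for x in range(-8, 9):
    full = gamma * (x - mu) * sigma_inv + beta
    assert f(x) == (1 if full >= 0 else -1)
```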
As such, this approach aims at optimizing inference using NNs (MLPs or CNNs), assuming pre-trained binary weights and hyperparameters. That way, NN classification models can be deployed in the field with low energy consumption and state-of-the-art performance, with the option of non-volatile storage of trained weights and hyperparameters, thus enabling rapid reboot times of the respective NN classification hardware modules.
The above technical description details a hardware implementation of an MLP, using binary NVM memory elements in memory units that locally perform an XNOR operation between the stored binary weight and a binary activation input. These XNOR outputs are then sensed by a sensing unit 504 and routed to a logic unit 503, where they are counted at the bottom of each column, in an implementation as illustrated in the accompanying drawings.
The same building blocks, namely the dot-product engine and post-processing units, like the logic units performing simple arithmetic operations such as normalization and the binarization non-linearity, can be extended or rearranged to create CNN building blocks. These include dot-product kernels (to perform convolution between input activations and filters), batch normalization, pooling (which is effectively an aggregation operation), and binarization.
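Two of these CNN building blocks are sketched below (the convolution kernel as sliding XNOR popcounts, and pooling followed by binarization; batch normalization is omitted for brevity, and the function names are illustrative assumptions):

```python
import numpy as np

def binary_conv2d(a_bits, f_bits):
    """Binary convolution as sliding XNOR popcounts.
    a_bits: HxW activations in {0, 1}; f_bits: kxk filter in {0, 1}."""
    H, W = a_bits.shape
    k = f_bits.shape[0]
    n = k * k
    out = np.empty((H - k + 1, W - k + 1), dtype=int)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            xnor = 1 - (a_bits[i:i + k, j:j + k] ^ f_bits)
            out[i, j] = 2 * int(xnor.sum()) - n  # integer dot-product value
    return out

def pool_and_binarize(x, p=2):
    """Max-pooling (the aggregation operation) followed by binarization."""
    H, W = x.shape
    blocks = x[:H - H % p, :W - W % p].reshape(H // p, p, W // p, p)
    return (blocks.max(axis=(1, 3)) >= 0).astype(np.uint8)  # back to {0, 1}
```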
One way to organize the layers of the dot-product arrays and the interleaving logic is the meandric layout view illustrated in the accompanying drawings.
An alternative to this solution is a single, large array 700 of semiconductor cells according to embodiments of the disclosed technology that enables in-place binary products. On this large area, different sizes of dot-product layers are allocated, and any layer interconnection, along with the associated normalization logic, is implemented in peripheral controllers. An illustrative view of this arrangement can be seen in the accompanying drawings.
Binary weights that connect neuron layers of the entire NN are allocated to different regions of the large semiconductor cell array 700, and the dot-product output is aggregated in associated control units 705, 706 that are situated in the periphery of the semiconductor cell array 700. These units 705, 706 additionally perform normalization and forward the activations to the next NN layer, namely to the respective peripheral control unit.
Still alternatively, a hybrid solution between an embodiment with a meandric layout, as for example illustrated in the accompanying drawings, and an embodiment with a single large array 700 may be used.
While the invention has been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. The foregoing description details certain embodiments of the invention. It will be appreciated, however, that no matter how detailed the foregoing appears in text, the invention may be practiced in many ways. The invention is not limited to the disclosed embodiments.