LSTM class
keras.layers.LSTM(
units,
activation="tanh",
recurrent_activation="sigmoid",
use_bias=True,
kernel_initializer="glorot_uniform",
recurrent_initializer="orthogonal",
bias_initializer="zeros",
unit_forget_bias=True,
kernel_regularizer=None,
recurrent_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
recurrent_constraint=None,
bias_constraint=None,
dropout=0.0,
recurrent_dropout=0.0,
seed=None,
return_sequences=False,
return_state=False,
go_backwards=False,
stateful=False,
unroll=False,
use_cudnn="auto",
**kwargs
)
Long Short-Term Memory layer - Hochreiter 1997.
Based on available runtime hardware and constraints, this layer will choose different implementations (cuDNN-based or backend-native) to maximize performance. If a GPU is available and all the arguments to the layer meet the requirements of the cuDNN kernel (see below for details), the layer will use a fast cuDNN implementation when using the TensorFlow backend. The requirements to use the cuDNN implementation are:
- activation == tanh
- recurrent_activation == sigmoid
- recurrent_dropout == 0
- unroll is False
- use_bias is True

For example:
>>> inputs = np.random.random((32, 10, 8))
>>> lstm = keras.layers.LSTM(4)
>>> output = lstm(inputs)
>>> output.shape
(32, 4)
>>> lstm = keras.layers.LSTM(
... 4, return_sequences=True, return_state=True)
>>> whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
>>> whole_seq_output.shape
(32, 10, 4)
>>> final_memory_state.shape
(32, 4)
>>> final_carry_state.shape
(32, 4)
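The fast path is governed by the argument values listed above. The sketch below is an illustrative example (assuming the TensorFlow backend and an available GPU) that contrasts a configuration meeting the cuDNN requirements with one that falls back to the backend-native implementation because recurrent_dropout is non-zero:

import numpy as np
import keras

inputs = np.random.random((32, 10, 8)).astype("float32")

# Default arguments (activation="tanh", recurrent_activation="sigmoid",
# recurrent_dropout=0, unroll=False, use_bias=True) meet the cuDNN
# requirements, so use_cudnn="auto" may select the fast kernel on a GPU.
fast_lstm = keras.layers.LSTM(4)
fast_out = fast_lstm(inputs)

# A non-zero recurrent_dropout violates the requirements, so this layer
# uses the backend-native implementation instead.
generic_lstm = keras.layers.LSTM(4, recurrent_dropout=0.2)
generic_out = generic_lstm(inputs, training=True)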
Arguments
units: Positive integer, dimensionality of the output space.
activation: Activation function to use. Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (ie. "linear" activation: a(x) = x).
recurrent_activation: Activation function to use for the recurrent step. Default: sigmoid (sigmoid). If you pass None, no activation is applied (ie. "linear" activation: a(x) = x).
use_bias: Boolean (default True), whether the layer should use a bias vector.
kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs. Default: "glorot_uniform".
recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. Default: "orthogonal".
bias_initializer: Initializer for the bias vector. Default: "zeros".
unit_forget_bias: Boolean (default True). If True, add 1 to the bias of the forget gate at initialization. Setting it to True will also force bias_initializer="zeros". This is recommended in Jozefowicz et al.
kernel_regularizer: Regularizer function applied to the kernel weights matrix. Default: None.
recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix. Default: None.
bias_regularizer: Regularizer function applied to the bias vector. Default: None.
activity_regularizer: Regularizer function applied to the output of the layer (its "activation"). Default: None.
kernel_constraint: Constraint function applied to the kernel weights matrix. Default: None.
recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix. Default: None.
bias_constraint: Constraint function applied to the bias vector. Default: None.
dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Default: 0.
recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Default: 0.
seed: Random seed for dropout.
return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.
return_state: Boolean. Whether to return the last state in addition to the output. Default: False.
go_backwards: Boolean (default: False). If True, process the input sequence backwards and return the reversed sequence.
stateful: Boolean (default: False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
unroll: Boolean (default: False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed-up a RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
use_cudnn: Whether to use a cuDNN-backed implementation. "auto" will attempt to use cuDNN when feasible, and will fallback to the default implementation if not.

Call arguments
inputs: A 3D tensor, with shape (batch, timesteps, feature).
mask: Binary tensor of shape (samples, timesteps) indicating whether a given timestep should be masked (optional). An individual True entry indicates that the corresponding timestep should be utilized, while a False entry indicates that the corresponding timestep should be ignored. Defaults to None.
training: Python boolean indicating whether the layer should behave in training mode or in inference mode. This argument is only relevant if dropout or recurrent_dropout is used (optional). Defaults to None.
initial_state: List of initial state tensors to be passed to the first call of the cell (optional, None causes creation of zero-filled initial state tensors). Defaults to None.
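For illustration, here is a short sketch of how the call arguments above can be supplied together; the shapes, the right-padded mask, and the zero-valued initial states are arbitrary choices for this example:

import numpy as np
import keras

batch, timesteps, features, units = 32, 10, 8, 4
inputs = np.random.random((batch, timesteps, features)).astype("float32")

# mask: True entries are used, False entries are ignored (right-padded here).
mask = np.ones((batch, timesteps), dtype=bool)
mask[:, 7:] = False

# initial_state: [memory state, carry state], each of shape (batch, units).
initial_h = np.zeros((batch, units), dtype="float32")
initial_c = np.zeros((batch, units), dtype="float32")

lstm = keras.layers.LSTM(units, return_state=True)
output, final_memory_state, final_carry_state = lstm(
    inputs, mask=mask, initial_state=[initial_h, initial_c]
)
# output.shape == (32, 4); both final states also have shape (32, 4).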