LSTM class
tf_keras.layers.LSTM(
units,
activation="tanh",
recurrent_activation="sigmoid",
use_bias=True,
kernel_initializer="glorot_uniform",
recurrent_initializer="orthogonal",
bias_initializer="zeros",
unit_forget_bias=True,
kernel_regularizer=None,
recurrent_regularizer=None,
bias_regularizer=None,
activity_regularizer=None,
kernel_constraint=None,
recurrent_constraint=None,
bias_constraint=None,
dropout=0.0,
recurrent_dropout=0.0,
return_sequences=False,
return_state=False,
go_backwards=False,
stateful=False,
time_major=False,
unroll=False,
**kwargs
)
Long Short-Term Memory layer - Hochreiter 1997.
See the TF-Keras RNN API guide for details about the usage of RNN API.
Based on available runtime hardware and constraints, this layer will choose different implementations (cuDNN-based or pure-TensorFlow) to maximize performance. If a GPU is available and all the arguments to the layer meet the requirements of the cuDNN kernel (see below for details), the layer will use a fast cuDNN implementation.
The requirements to use the cuDNN implementation are:
1. activation == tanh
2. recurrent_activation == sigmoid
3. recurrent_dropout == 0
4. unroll is False
5. use_bias is True
For example:
>>> inputs = tf.random.normal([32, 10, 8])
>>> lstm = tf.keras.layers.LSTM(4)
>>> output = lstm(inputs)
>>> print(output.shape)
(32, 4)
>>> lstm = tf.keras.layers.LSTM(4, return_sequences=True, return_state=True)
>>> whole_seq_output, final_memory_state, final_carry_state = lstm(inputs)
>>> print(whole_seq_output.shape)
(32, 10, 4)
>>> print(final_memory_state.shape)
(32, 4)
>>> print(final_carry_state.shape)
(32, 4)
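A minimal sketch of the fallback behavior (the input shapes here are arbitrary): a non-zero recurrent_dropout is one setting that disqualifies the cuDNN kernel, so the layer uses the generic TensorFlow implementation instead; the output shape is the same either way.
>>> inputs = tf.random.normal([32, 10, 8])
>>> fast_lstm = tf.keras.layers.LSTM(4)  # defaults meet the cuDNN requirements
>>> generic_lstm = tf.keras.layers.LSTM(4, recurrent_dropout=0.2)  # falls back to the generic kernel
>>> print(fast_lstm(inputs).shape)
(32, 4)
>>> print(generic_lstm(inputs).shape)
(32, 4)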
Arguments
units: Positive integer, dimensionality of the output space.
activation: Activation function to use. Default: hyperbolic tangent (tanh). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
recurrent_activation: Activation function to use for the recurrent step. Default: sigmoid (sigmoid). If you pass None, no activation is applied (i.e. "linear" activation: a(x) = x).
use_bias: Boolean (default True), whether the layer uses a bias vector.
kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs. Default: glorot_uniform.
recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state. Default: orthogonal.
bias_initializer: Initializer for the bias vector. Default: zeros.
unit_forget_bias: Boolean (default True). If True, add 1 to the bias of the forget gate at initialization. Setting it to True will also force bias_initializer="zeros". This is recommended in Jozefowicz et al.
kernel_regularizer: Regularizer function applied to the kernel weights matrix. Default: None.
recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix. Default: None.
bias_regularizer: Regularizer function applied to the bias vector. Default: None.
activity_regularizer: Regularizer function applied to the output of the layer (its "activation"). Default: None.
kernel_constraint: Constraint function applied to the kernel weights matrix. Default: None.
recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix. Default: None.
bias_constraint: Constraint function applied to the bias vector. Default: None.
dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs. Default: 0.
recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state. Default: 0.
return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.
return_state: Boolean. Whether to return the last state in addition to the output. Default: False.
go_backwards: Boolean (default False). If True, process the input sequence backwards and return the reversed sequence.
stateful: Boolean (default False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
time_major: The shape format of the inputs and outputs tensors. If True, the inputs and outputs will be in shape [timesteps, batch, feature], whereas in the False case, it will be [batch, timesteps, feature]. Using time_major = True is a bit more efficient because it avoids transposes at the beginning and end of the RNN calculation. However, most TensorFlow data is batch-major, so by default this function accepts input and emits output in batch-major form.
unroll: Boolean (default False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
Call arguments
inputs: A 3D tensor with shape [batch, timesteps, feature].
mask: Binary tensor of shape [batch, timesteps] indicating whether a given timestep should be masked (optional). An individual True entry indicates that the corresponding timestep should be utilized, while a False entry indicates that the corresponding timestep should be ignored. Defaults to None.
training: Python boolean indicating whether the layer should behave in training mode or in inference mode. This argument is passed to the cell when calling it. This is only relevant if dropout or recurrent_dropout is used (optional). Defaults to None.
initial_state: List of initial state tensors to be passed to the first call of the cell (optional, None causes creation of zero-filled initial state tensors). Defaults to None.
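A usage sketch for the constructor arguments (the shapes, dropout rate, and regularizer below are arbitrary illustrative choices, not defaults): return_sequences=True yields the full sequence of hidden states, and dropout is only applied when the layer is called in training mode.
>>> inputs = tf.random.normal([32, 10, 8])
>>> lstm = tf.keras.layers.LSTM(
...     4,
...     return_sequences=True,
...     dropout=0.1,
...     kernel_regularizer=tf.keras.regularizers.l2(1e-4))
>>> output = lstm(inputs, training=True)  # dropout active only in training mode
>>> print(output.shape)
(32, 10, 4)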
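And a sketch of the call arguments (the mask below marks the first 6 timesteps of every sample as valid, i.e. right-padded input, and the explicit zero initial state matches what passing None would create):
>>> inputs = tf.random.normal([32, 10, 8])
>>> mask = tf.sequence_mask(tf.fill([32], 6), maxlen=10)  # boolean mask of shape [32, 10]
>>> initial_state = [tf.zeros([32, 4]), tf.zeros([32, 4])]  # [hidden state, cell state]
>>> lstm = tf.keras.layers.LSTM(4)
>>> output = lstm(inputs, mask=mask, initial_state=initial_state)
>>> print(output.shape)
(32, 4)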