ConvLSTM2D class

keras.layers.ConvLSTM2D(
    filters,
    kernel_size,
    strides=1,
    padding="valid",
    data_format=None,
    dilation_rate=1,
    activation="tanh",
    recurrent_activation="sigmoid",
    use_bias=True,
    kernel_initializer="glorot_uniform",
    recurrent_initializer="orthogonal",
    bias_initializer="zeros",
    unit_forget_bias=True,
    kernel_regularizer=None,
    recurrent_regularizer=None,
    bias_regularizer=None,
    activity_regularizer=None,
    kernel_constraint=None,
    recurrent_constraint=None,
    bias_constraint=None,
    dropout=0.0,
    recurrent_dropout=0.0,
    seed=None,
    return_sequences=False,
    return_state=False,
    go_backwards=False,
    stateful=False,
    **kwargs
)
2D Convolutional LSTM.
Similar to an LSTM layer, but the input transformations and recurrent transformations are both convolutional.
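A minimal usage sketch (the input and filter sizes below are illustrative, not part of the API):

```python
import numpy as np
import keras

# A batch of 4 sequences of 10 frames, each frame a 32x32 single-channel
# image (channels_last layout: (samples, time, rows, cols, channels)).
x = np.random.random((4, 10, 32, 32, 1)).astype("float32")

layer = keras.layers.ConvLSTM2D(filters=16, kernel_size=(3, 3), padding="same")
y = layer(x)
print(y.shape)  # (4, 32, 32, 16) -- last timestep only, since return_sequences=False
```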
Arguments

- filters: int, the dimension of the output space (the number of filters in the convolution).
- kernel_size: int or tuple/list of 2 integers, specifying the size of the convolution window.
- strides: int or tuple/list of 2 integers, specifying the stride length of the convolution. strides > 1 is incompatible with dilation_rate > 1.
- padding: string, "valid" or "same" (case-insensitive). "valid" means no padding. "same" results in padding evenly to the left/right or up/down of the input such that the output has the same height/width dimension as the input.
- data_format: string, either "channels_last" or "channels_first". The ordering of the dimensions in the inputs. "channels_last" corresponds to inputs with shape (samples, time, rows, cols, channels) while "channels_first" corresponds to inputs with shape (samples, time, channels, rows, cols). It defaults to the image_data_format value found in your Keras config file at ~/.keras/keras.json. If you never set it, then it will be "channels_last".
- dilation_rate: int or tuple/list of 2 integers, specifying the dilation rate to use for dilated convolution.
- activation: Activation function to use. By default hyperbolic tangent is applied (tanh(x)).
- recurrent_activation: Activation function to use for the recurrent step.
- use_bias: Boolean, whether the layer uses a bias vector.
- kernel_initializer: Initializer for the kernel weights matrix, used for the linear transformation of the inputs.
- recurrent_initializer: Initializer for the recurrent_kernel weights matrix, used for the linear transformation of the recurrent state.
- bias_initializer: Initializer for the bias vector.
- unit_forget_bias: Boolean. If True, add 1 to the bias of the forget gate at initialization. Use in combination with bias_initializer="zeros". This is recommended in Jozefowicz et al., 2015.
- kernel_regularizer: Regularizer function applied to the kernel weights matrix.
- recurrent_regularizer: Regularizer function applied to the recurrent_kernel weights matrix.
- bias_regularizer: Regularizer function applied to the bias vector.
- activity_regularizer: Regularizer function applied to the output of the layer.
- kernel_constraint: Constraint function applied to the kernel weights matrix.
- recurrent_constraint: Constraint function applied to the recurrent_kernel weights matrix.
- bias_constraint: Constraint function applied to the bias vector.
- dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the inputs.
- recurrent_dropout: Float between 0 and 1. Fraction of the units to drop for the linear transformation of the recurrent state.
- seed: Random seed for dropout.
- return_sequences: Boolean. Whether to return the last output in the output sequence, or the full sequence. Default: False.
- return_state: Boolean. Whether to return the last state in addition to the output. Default: False.
- go_backwards: Boolean (default: False). If True, process the input sequence backwards and return the reversed sequence.
- stateful: Boolean (default: False). If True, the last state for each sample at index i in a batch will be used as initial state for the sample of index i in the following batch.
- unroll: Boolean (default: False). If True, the network will be unrolled, else a symbolic loop will be used. Unrolling can speed up an RNN, although it tends to be more memory-intensive. Unrolling is only suitable for short sequences.
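A sketch of how return_sequences and return_state change what the layer returns (shape values illustrative):

```python
import numpy as np
import keras

x = np.random.random((2, 5, 16, 16, 3)).astype("float32")

layer = keras.layers.ConvLSTM2D(
    filters=8,
    kernel_size=3,
    padding="same",
    return_sequences=True,
    return_state=True,
)
# With return_state=True the layer returns the output plus the final
# hidden state h and cell state c.
outputs, h, c = layer(x)
print(outputs.shape)     # (2, 5, 16, 16, 8) -- full sequence (return_sequences=True)
print(h.shape, c.shape)  # (2, 16, 16, 8) each
```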
Call arguments

- inputs: A 5D tensor.
- mask: Binary tensor of shape (samples, timesteps) indicating whether a given timestep should be masked.
- training: Python boolean indicating whether the layer should behave in training mode or in inference mode. This is only relevant if dropout or recurrent_dropout are set.
- initial_state: List of initial state tensors to be passed to the first call of the cell.
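A sketch of passing initial_state at call time; it assumes channels_last layout with padding="same" and strides=1, under which each state is a 4D tensor of shape (samples, rows, cols, filters):

```python
import numpy as np
import keras

x = np.random.random((2, 5, 16, 16, 3)).astype("float32")
layer = keras.layers.ConvLSTM2D(filters=8, kernel_size=3, padding="same")

# Initial hidden and cell states, ordered [h, c]. With padding="same" and
# strides=1 each state has shape (samples, rows, cols, filters).
h0 = np.zeros((2, 16, 16, 8), dtype="float32")
c0 = np.zeros((2, 16, 16, 8), dtype="float32")

y = layer(x, initial_state=[h0, c0])
print(y.shape)  # (2, 16, 16, 8)
```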
Input shape

- If data_format='channels_first': 5D tensor with shape (samples, time, channels, rows, cols).
- If data_format='channels_last': 5D tensor with shape (samples, time, rows, cols, channels).

Output shape
- If return_state: a list of tensors. The first tensor is the output. The remaining tensors are the last states, each a 4D tensor with shape (samples, filters, new_rows, new_cols) if data_format='channels_first', or (samples, new_rows, new_cols, filters) if data_format='channels_last'. rows and cols values might have changed due to padding.
- If return_sequences: 5D tensor with shape (samples, timesteps, filters, new_rows, new_cols) if data_format='channels_first', or (samples, timesteps, new_rows, new_cols, filters) if data_format='channels_last'.
- Otherwise: 4D tensor with shape (samples, filters, new_rows, new_cols) if data_format='channels_first', or (samples, new_rows, new_cols, filters) if data_format='channels_last'.
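To illustrate how padding can change new_rows and new_cols (sizes illustrative):

```python
import numpy as np
import keras

x = np.random.random((1, 4, 20, 20, 1)).astype("float32")

y_same = keras.layers.ConvLSTM2D(2, kernel_size=3, padding="same")(x)
y_valid = keras.layers.ConvLSTM2D(2, kernel_size=3, padding="valid")(x)
print(y_same.shape)   # (1, 20, 20, 2) -- rows/cols preserved
print(y_valid.shape)  # (1, 18, 18, 2) -- rows/cols shrink by kernel_size - 1
```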
References

- Shi et al., 2015 (the current implementation does not include the feedback loop on the cells output).