Initializers define the way to set the initial random weights of Keras layers.
The keyword arguments used for passing initializers to layers depend on the layer.
Usually, they are simply kernel_initializer and bias_initializer:
from tensorflow.keras import layers
from tensorflow.keras import initializers

layer = layers.Dense(
    units=64,
    kernel_initializer=initializers.RandomNormal(stddev=0.01),
    bias_initializer=initializers.Zeros()
)
All built-in initializers can also be passed via their string identifier:
layer = layers.Dense(
    units=64,
    kernel_initializer='random_normal',
    bias_initializer='zeros'
)
The following built-in initializers are available as part of the tf.keras.initializers module:
RandomNormal
class tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.05, seed=None)
Initializer that generates tensors with a normal distribution.
Also available via the shortcut function tf.keras.initializers.random_normal.
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.RandomNormal(mean=0., stddev=1.)
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.RandomNormal(mean=0., stddev=1.)
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
RandomUniform
class tf.keras.initializers.RandomUniform(minval=-0.05, maxval=0.05, seed=None)
Initializer that generates tensors with a uniform distribution.
Also available via the shortcut function tf.keras.initializers.random_uniform.
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.RandomUniform(minval=0., maxval=1.)
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.RandomUniform(minval=0., maxval=1.)
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
TruncatedNormal
class tf.keras.initializers.TruncatedNormal(mean=0.0, stddev=0.05, seed=None)
Initializer that generates a truncated normal distribution.
Also available via the shortcut function tf.keras.initializers.truncated_normal.
The values generated are similar to values from a tf.keras.initializers.RandomNormal initializer, except that values more than two standard deviations from the mean are discarded and re-drawn.
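The discard-and-redraw rule can be sketched in plain Python, without TensorFlow. The helper name truncated_normal_sample is hypothetical, and the loop only illustrates the rule described above; it is not a claim about how Keras implements it.

```python
import random


def truncated_normal_sample(mean=0.0, stddev=0.05, rng=random):
    """Draw one value, redrawing until it lies within two stddevs of the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [truncated_normal_sample(0.0, 1.0, rng) for _ in range(1000)]
```

Every entry of samples lies within two standard deviations of the mean, which is the property that distinguishes this distribution from an ordinary normal one.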
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.TruncatedNormal(mean=0., stddev=1.)
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.TruncatedNormal(mean=0., stddev=1.)
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
Zeros
class tf.keras.initializers.Zeros()
Initializer that generates tensors initialized to 0.
Also available via the shortcut function tf.keras.initializers.zeros.
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.Zeros()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.Zeros()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
Ones
class tf.keras.initializers.Ones()
Initializer that generates tensors initialized to 1.
Also available via the shortcut function tf.keras.initializers.ones.
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.Ones()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.Ones()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
GlorotNormal
class tf.keras.initializers.GlorotNormal(seed=None)
The Glorot normal initializer, also called Xavier normal initializer.
Also available via the shortcut function tf.keras.initializers.glorot_normal.
Draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / (fan_in + fan_out)), where fan_in is the number of input units in the weight tensor and fan_out is the number of output units in the weight tensor.
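As a worked example of the formula, the sketch below computes the standard deviation for a hypothetical Dense kernel of shape (784, 128), assuming that for such a 2D kernel fan_in is the first dimension and fan_out the second. The function name glorot_normal_stddev is illustrative only.

```python
import math


def glorot_normal_stddev(fan_in, fan_out):
    """Standard deviation from the formula above: sqrt(2 / (fan_in + fan_out))."""
    return math.sqrt(2.0 / (fan_in + fan_out))


# Hypothetical Dense kernel of shape (784, 128): fan_in=784, fan_out=128.
stddev = glorot_normal_stddev(784, 128)
print(round(stddev, 4))  # 0.0468
```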
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.GlorotNormal()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.GlorotNormal()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
GlorotUniform
class tf.keras.initializers.GlorotUniform(seed=None)
The Glorot uniform initializer, also called Xavier uniform initializer.
Also available via the shortcut function tf.keras.initializers.glorot_uniform.
Draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / (fan_in + fan_out)) (fan_in is the number of input units in the weight tensor and fan_out is the number of output units).
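The factor of 6 follows from the variance of a uniform distribution: values drawn uniformly from [-limit, limit] have variance limit**2 / 3. The sketch below checks this arithmetic for hypothetical layer sizes; glorot_uniform_limit is an illustrative name, not a Keras function.

```python
import math


def glorot_uniform_limit(fan_in, fan_out):
    """Bound from the formula above: sqrt(6 / (fan_in + fan_out))."""
    return math.sqrt(6.0 / (fan_in + fan_out))


fan_in, fan_out = 784, 128  # hypothetical layer sizes
limit = glorot_uniform_limit(fan_in, fan_out)

# A uniform distribution on [-limit, limit] has variance limit**2 / 3,
# which here works out to 2 / (fan_in + fan_out).
variance = limit ** 2 / 3.0
```

The resulting variance, 2 / (fan_in + fan_out), is the square of the stddev used by the Glorot normal initializer, so both variants target the same spread.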
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.GlorotUniform()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.GlorotUniform()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
HeNormal
class tf.keras.initializers.HeNormal(seed=None)
He normal initializer.
Also available via the shortcut function tf.keras.initializers.he_normal.
It draws samples from a truncated normal distribution centered on 0 with stddev = sqrt(2 / fan_in), where fan_in is the number of input units in the weight tensor.
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.HeNormal()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.HeNormal()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
HeUniform
class tf.keras.initializers.HeUniform(seed=None)
He uniform variance scaling initializer.
Also available via the shortcut function tf.keras.initializers.he_uniform.
Draws samples from a uniform distribution within [-limit, limit], where limit = sqrt(6 / fan_in) (fan_in is the number of input units in the weight tensor).
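To make the sampling step concrete, here is a minimal pure-Python sketch that fills a small kernel with draws from [-limit, limit]. The function he_uniform_kernel and the nested-list representation are assumptions made for illustration; Keras itself returns a tensor.

```python
import math
import random


def he_uniform_kernel(fan_in, fan_out, seed=0):
    """Fill a fan_in x fan_out nested list with draws from [-limit, limit]."""
    limit = math.sqrt(6.0 / fan_in)
    rng = random.Random(seed)
    kernel = [[rng.uniform(-limit, limit) for _ in range(fan_out)]
              for _ in range(fan_in)]
    return kernel, limit


kernel, limit = he_uniform_kernel(fan_in=6, fan_out=3)
# With fan_in=6 the bound is sqrt(6 / 6) = 1.0.
```

Note that the bound depends only on fan_in, so changing fan_out alters the kernel's shape but not the range of its values.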
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.HeUniform()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.HeUniform()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
Identity
class tf.keras.initializers.Identity(gain=1.0)
Initializer that generates the identity matrix.
Also available via the shortcut function tf.keras.initializers.identity.
Only usable for generating 2D matrices.
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.Identity()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.Identity()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
Orthogonal
class tf.keras.initializers.Orthogonal(gain=1.0, seed=None)
Initializer that generates an orthogonal matrix.
Also available via the shortcut function tf.keras.initializers.orthogonal.
If the shape of the tensor to initialize is two-dimensional, it is initialized with an orthogonal matrix obtained from the QR decomposition of a matrix of random numbers drawn from a normal distribution. If the matrix has fewer rows than columns then the output will have orthogonal rows. Otherwise, the output will have orthogonal columns.
If the shape of the tensor to initialize is more than two-dimensional, a matrix of shape (shape[0] * ... * shape[n - 2], shape[n - 1]) is initialized, where n is the length of the shape vector.
The matrix is subsequently reshaped to give a tensor of the desired shape.
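The flattening rule for higher-dimensional shapes can be illustrated with a few lines of arithmetic. The helper orthogonal_matrix_shape is a hypothetical name; it only computes the intermediate 2D shape and does not perform the QR decomposition.

```python
import math


def orthogonal_matrix_shape(shape):
    """Collapse all leading dimensions into rows; keep the last as columns."""
    if len(shape) < 2:
        raise ValueError("expected a shape with at least two dimensions")
    return (math.prod(shape[:-1]), shape[-1])


# Hypothetical 3x3 convolution kernel with 16 input and 32 output channels.
print(orthogonal_matrix_shape((3, 3, 16, 32)))  # (144, 32)
```

Here the intermediate matrix has more rows than columns, so by the rule above its columns would be the orthogonal ones.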
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.Orthogonal()
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.Orthogonal()
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
Constant
class tf.keras.initializers.Constant(value=0)
Initializer that generates tensors with constant values.
Also available via the shortcut function tf.keras.initializers.constant.
Only scalar values are allowed. The constant value provided must be convertible to the dtype requested when calling the initializer.
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.Constant(3.)
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.Constant(3.)
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
VarianceScaling
class tf.keras.initializers.VarianceScaling(
    scale=1.0, mode="fan_in", distribution="truncated_normal", seed=None
)
Initializer capable of adapting its scale to the shape of weights tensors.
Also available via the shortcut function tf.keras.initializers.variance_scaling.
With distribution="truncated_normal" or "untruncated_normal", samples are drawn from a truncated/untruncated normal distribution with a mean of zero and a standard deviation (after truncation, if used) stddev = sqrt(scale / n), where n is:
mode="fan_in": the number of input units in the weight tensor
mode="fan_out": the number of output units in the weight tensor
mode="fan_avg": the average of the numbers of input and output units
With distribution="uniform", samples are drawn from a uniform distribution within [-limit, limit], where limit = sqrt(3 * scale / n).
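The following sketch evaluates these formulas in plain Python. It assumes that "fan_avg" means the average of fan_in and fan_out; the function name variance_scaling_spread is hypothetical and the code only reproduces the arithmetic, not the sampling.

```python
import math


def variance_scaling_spread(fan_in, fan_out, scale=1.0, mode="fan_in",
                            distribution="truncated_normal"):
    """Return stddev (normal variants) or limit (uniform) for the given fans."""
    if mode == "fan_in":
        n = fan_in
    elif mode == "fan_out":
        n = fan_out
    elif mode == "fan_avg":
        n = (fan_in + fan_out) / 2.0
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    if distribution == "uniform":
        return math.sqrt(3.0 * scale / n)
    if distribution in ("truncated_normal", "untruncated_normal"):
        return math.sqrt(scale / n)
    raise ValueError(f"unknown distribution: {distribution!r}")


# scale=1.0, fan_avg, uniform gives sqrt(6 / (fan_in + fan_out)).
glorot_like = variance_scaling_spread(784, 128, scale=1.0, mode="fan_avg",
                                      distribution="uniform")
# scale=2.0, fan_in, normal gives sqrt(2 / fan_in).
he_like = variance_scaling_spread(784, 128, scale=2.0, mode="fan_in")
```

With these settings the results match the Glorot uniform limit and the He normal stddev given earlier, which shows how those formulas arise as special cases of the general one.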
Examples
>>> # Standalone usage:
>>> initializer = tf.keras.initializers.VarianceScaling(
... scale=0.1, mode='fan_in', distribution='uniform')
>>> values = initializer(shape=(2, 2))
>>> # Usage in a Keras layer:
>>> initializer = tf.keras.initializers.VarianceScaling(
... scale=0.1, mode='fan_in', distribution='uniform')
>>> layer = tf.keras.layers.Dense(3, kernel_initializer=initializer)
You can pass a custom callable as an initializer. It must take the arguments shape (shape of the variable to initialize) and dtype (dtype of generated values):
import tensorflow as tf

def my_init(shape, dtype=None):
    return tf.random.normal(shape, dtype=dtype)

layer = tf.keras.layers.Dense(64, kernel_initializer=my_init)
Initializer subclasses
If you need to configure your initializer via various arguments (e.g. the stddev argument in RandomNormal), you should implement it as a subclass of tf.keras.initializers.Initializer.
Initializers should implement a __call__ method with the following signature:
def __call__(self, shape, dtype=None):
    # returns a tensor of shape `shape` and dtype `dtype`
    # containing values drawn from a distribution of your choice.
Optionally, you can also implement the method get_config and the class method from_config in order to support serialization, just like with any Keras object.
Here's a simple example: a random normal initializer.
import tensorflow as tf

class ExampleRandomNormal(tf.keras.initializers.Initializer):

    def __init__(self, mean, stddev):
        self.mean = mean
        self.stddev = stddev

    def __call__(self, shape, dtype=None):
        return tf.random.normal(
            shape, mean=self.mean, stddev=self.stddev, dtype=dtype)

    def get_config(self):  # To support serialization
        return {'mean': self.mean, 'stddev': self.stddev}
Note that we don't have to implement from_config in the example above, since the constructor arguments of the class and the keys in the config returned by get_config are the same. In this case, the default from_config works fine.
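Why matching names are enough can be seen in a TensorFlow-free sketch. It assumes the default from_config simply passes the config dictionary to the constructor as keyword arguments. SketchInitializer and SketchRandomNormal are hypothetical stand-ins, not the Keras classes.

```python
class SketchInitializer:
    """Hypothetical stand-in for the base class; not the Keras implementation."""

    @classmethod
    def from_config(cls, config):
        # Assumed default behavior: pass the config dict to the constructor.
        return cls(**config)


class SketchRandomNormal(SketchInitializer):
    def __init__(self, mean, stddev):
        self.mean = mean
        self.stddev = stddev

    def get_config(self):
        # Keys match the constructor argument names exactly.
        return {'mean': self.mean, 'stddev': self.stddev}


original = SketchRandomNormal(mean=0.0, stddev=0.05)
restored = SketchRandomNormal.from_config(original.get_config())
```

If get_config returned a key that the constructor does not accept, this round trip would raise a TypeError under the stated assumption, and a custom from_config would be needed.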