Optimizers


Available optimizers

Usage with compile() & fit()

An optimizer is one of the two arguments required for compiling a Keras model:

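import keras
from keras import layers

# Declare the input shape with keras.Input rather than the input_shape argument.
model = keras.Sequential()
model.add(keras.Input(shape=(10,)))
model.add(layers.Dense(64, kernel_initializer='uniform'))
# categorical_crossentropy expects probabilities by default, so apply softmax.
model.add(layers.Activation('softmax'))

# Instantiate the optimizer explicitly to control its hyperparameters.
opt = keras.optimizers.Adam(learning_rate=0.01)
model.compile(loss='categorical_crossentropy', optimizer=opt)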

You can either instantiate an optimizer before passing it to model.compile(), as in the example above, or pass it by its string identifier. In the latter case, the optimizer's default parameters are used.

# pass optimizer by name: default parameters will be used
model.compile(loss='categorical_crossentropy', optimizer='adam')

Learning rate decay / scheduling

You can use a learning rate schedule to modulate how the learning rate of your optimizer changes over time:

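# Argument values are illustrative; choose them for your own training run.
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-2, decay_steps=10000, decay_rate=0.9)
optimizer = keras.optimizers.SGD(learning_rate=lr_schedule)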

Check out the learning rate schedule API documentation for a list of available schedules.
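
As a rough illustration of what an exponential decay schedule computes, the sketch below reimplements the decay rule in plain Python: the rate at a given step is initial_learning_rate * decay_rate ** (step / decay_steps), and the exponent uses integer division when the staircase option is on. The helper name exponential_decay and all argument values are assumptions chosen for illustration; this is not the Keras implementation.

```python
def exponential_decay(step, initial_learning_rate=1e-2, decay_steps=10000,
                      decay_rate=0.9, staircase=False):
    """Plain-Python sketch of the exponential decay rule (hypothetical helper)."""
    exponent = step / decay_steps
    if staircase:
        # Hold the rate constant within each decay period.
        exponent = step // decay_steps
    return initial_learning_rate * decay_rate ** exponent


# Learning rate at the first step, after one full period, and mid-period
# with the staircase option enabled.
lr_at_start = exponential_decay(0)
lr_after_one_period = exponential_decay(10000)
lr_mid_staircase = exponential_decay(15000, staircase=True)
print(lr_at_start, lr_after_one_period, lr_mid_staircase)
```

With staircase enabled, the rate drops in discrete jumps every decay_steps steps rather than shrinking a little on every step.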

Core Optimizer API

These methods and attributes are common to all Keras optimizers.


Optimizer class


A class for TensorFlow-specific optimizer logic.

The major behavior change in this class concerns tf.distribute.

It overrides the methods of the base Keras core Optimizer that provide distribution-specific functionality, such as variable creation and loss reduction.


apply_gradients method


variables property