compile method
Model.compile(
optimizer="rmsprop",
loss=None,
metrics=None,
loss_weights=None,
weighted_metrics=None,
run_eagerly=None,
steps_per_execution=None,
jit_compile=None,
pss_evaluation_shards=0,
**kwargs
)
Configures the model for training.
Example
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
              loss=tf.keras.losses.BinaryCrossentropy(),
              metrics=[tf.keras.metrics.BinaryAccuracy(),
                       tf.keras.metrics.FalseNegatives()])
Arguments
- optimizer: String (name of optimizer) or optimizer instance. See tf.keras.optimizers.
- loss: Loss function. May be a string (name of loss function), or a tf.keras.losses.Loss instance. See tf.keras.losses. A loss function is any callable with the signature loss = fn(y_true, y_pred), where y_true are the ground truth values, and y_pred are the model's predictions. y_true should have shape (batch_size, d0, .. dN) (except in the case of sparse loss functions such as sparse categorical crossentropy, which expects integer arrays of shape (batch_size, d0, .. dN-1)). y_pred should have shape (batch_size, d0, .. dN). The loss function should return a float tensor. If a custom Loss instance is used and reduction is set to None, the return value has shape (batch_size, d0, .. dN-1), i.e. per-sample or per-timestep loss values; otherwise, it is a scalar. If the model has multiple outputs, you can use a different loss on each output by passing a dictionary or a list of losses. The loss value that will be minimized by the model will then be the sum of all individual losses, unless loss_weights is specified.
- metrics: List of metrics to be evaluated by the model during training and testing. Each of these can be a string (name of a built-in function), a function, or a tf.keras.metrics.Metric instance. See tf.keras.metrics. Typically you will use metrics=['accuracy']. A function is any callable with the signature result = fn(y_true, y_pred). To specify different metrics for different outputs of a multi-output model, you could also pass a dictionary, such as metrics={'output_a':'accuracy', 'output_b':['accuracy', 'mse']}. You can also pass a list to specify a metric or a list of metrics for each output, such as metrics=[['accuracy'], ['accuracy', 'mse']] or metrics=['accuracy', ['accuracy', 'mse']]. When you pass the strings 'accuracy' or 'acc', we convert this to one of tf.keras.metrics.BinaryAccuracy, tf.keras.metrics.CategoricalAccuracy, tf.keras.metrics.SparseCategoricalAccuracy based on the shapes of the targets and of the model output. We do a similar conversion for the strings 'crossentropy' and 'ce' as well. The metrics passed here are evaluated without sample weighting; if you would like sample weighting to apply, you can specify your metrics via the weighted_metrics argument instead.
- loss_weights: Optional list or dictionary specifying scalar coefficients (Python floats) to weight the loss contributions of different model outputs. The loss value that will be minimized by the model will then be the weighted sum of all individual losses, weighted by the loss_weights coefficients. If a list, it is expected to have a 1:1 mapping to the model's outputs. If a dict, it is expected to map output names (strings) to scalar coefficients.
- weighted_metrics: List of metrics to be evaluated and weighted by sample_weight or class_weight during training and testing.
- run_eagerly: Bool. If True, this Model's logic will not be wrapped in a tf.function. Recommended to leave this as None unless your Model cannot be run inside a tf.function. run_eagerly=True is not supported when using tf.distribute.experimental.ParameterServerStrategy. Defaults to False.
- steps_per_execution: Int or 'auto'. The number of batches to run during each tf.function call. If set to "auto", keras will automatically tune steps_per_execution during runtime. Running multiple batches inside a single tf.function call can greatly improve performance on TPUs, when used with distributed strategies such as ParameterServerStrategy, or with small models with a large Python overhead. At most, one full epoch will be run each execution. If a number larger than the size of the epoch is passed, the execution will be truncated to the size of the epoch. Note that if steps_per_execution is set to N, Callback.on_batch_begin and Callback.on_batch_end methods will only be called every N batches (i.e. before/after each tf.function execution). Defaults to 1.
- jit_compile: If True, compile the model training step with XLA. XLA is an optimizing compiler for machine learning. jit_compile is not enabled by default. Note that jit_compile=True may not work for all models. For more information on supported operations please refer to the XLA documentation. Also refer to known XLA issues for more details.
- pss_evaluation_shards: Integer or 'auto'. Used for tf.distribute.ParameterServerStrategy training only. This arg sets the number of shards to split the dataset into, to enable an exact visitation guarantee for evaluation, meaning the model will be applied to each dataset element exactly once, even if workers fail. The dataset must be sharded to ensure separate workers do not process the same data. The number of shards should be at least the number of workers for good performance. A value of 'auto' turns on exact evaluation and uses a heuristic for the number of shards based on the number of workers. A value of 0 means no visitation guarantee is provided. NOTE: Custom implementations of Model.test_step will be ignored when doing exact evaluation. Defaults to 0.
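
As a rough illustration of the loss, loss_weights, and metrics arguments for a multi-output model, the sketch below compiles a small two-output model with one built-in loss and one custom loss. The model architecture, the layer names, and the mean_abs_error helper are assumptions made for this example; it also assumes TensorFlow is installed and exposes Keras as tf.keras.

```python
import tensorflow as tf


def mean_abs_error(y_true, y_pred):
    """Custom loss with the documented signature loss = fn(y_true, y_pred)."""
    y_true = tf.cast(y_true, y_pred.dtype)
    # Reduce over the last axis so that one loss value is returned per sample.
    return tf.reduce_mean(tf.abs(y_true - y_pred), axis=-1)


# A toy functional model with two outputs (names are illustrative only).
inputs = tf.keras.Input(shape=(8,))
hidden = tf.keras.layers.Dense(16, activation="relu")(inputs)
output_a = tf.keras.layers.Dense(1, activation="sigmoid", name="output_a")(hidden)
output_b = tf.keras.layers.Dense(1, name="output_b")(hidden)
model = tf.keras.Model(inputs=inputs, outputs=[output_a, output_b])

model.compile(
    optimizer="rmsprop",
    # One loss per output, in the same order as the model outputs.
    loss=["binary_crossentropy", mean_abs_error],
    # The minimized value is 1.0 * loss_a + 0.5 * loss_b.
    loss_weights=[1.0, 0.5],
    # One list of metrics per output.
    metrics=[["accuracy"], ["mse"]],
)
```

Lists are used here because they map 1:1 to the model outputs; dictionaries keyed by output name are the alternative described in the argument list.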

fit method
Model.fit(
x=None,
y=None,
batch_size=None,
epochs=1,
verbose="auto",
callbacks=None,
validation_split=0.0,
validation_data=None,
shuffle=True,
class_weight=None,
sample_weight=None,
initial_epoch=0,
steps_per_epoch=None,
validation_steps=None,
validation_batch_size=None,
validation_freq=1,
max_queue_size=10,
workers=1,
use_multiprocessing=False,
)
Trains the model for a fixed number of epochs (dataset iterations).
Arguments
- x: Input data. Besides Numpy arrays and TensorFlow tensors, it could be:
  - A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).
  - A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights).
  - A tf.keras.utils.experimental.DatasetCreator, which wraps a callable that takes a single argument of type tf.distribute.InputContext, and returns a tf.data.Dataset. DatasetCreator should be used when users prefer to specify the per-replica batching and sharding logic for the Dataset. See tf.keras.utils.experimental.DatasetCreator doc for more information.
  A more detailed description of unpacking behavior for iterator types (Dataset, generator, Sequence) is given below. If these include sample_weights as a third component, note that sample weighting applies to the weighted_metrics argument but not the metrics argument in compile(). If using tf.distribute.experimental.ParameterServerStrategy, only DatasetCreator type is supported for x.
- y: Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, generator, or keras.utils.Sequence instance, y should not be specified (since targets will be obtained from x).
- batch_size: Integer or None. Number of samples per gradient update. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of datasets, generators, or keras.utils.Sequence instances (since they generate batches).
- epochs: Integer. Number of epochs to train the model. An epoch is an iteration over the entire x and y data provided (unless the steps_per_epoch flag is set to something other than None). Note that in conjunction with initial_epoch, epochs is to be understood as "final epoch". The model is not trained for a number of iterations given by epochs, but merely until the epoch of index epochs is reached.
- verbose: 'auto', 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = one line per epoch. 'auto' becomes 1 for most cases, but 2 when used with ParameterServerStrategy. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended when not running interactively (e.g. in a production environment). Defaults to 'auto'.
- callbacks: List of keras.callbacks.Callback instances. List of callbacks to apply during training. See tf.keras.callbacks. Note that tf.keras.callbacks.ProgbarLogger and tf.keras.callbacks.History callbacks are created automatically and need not be passed into model.fit. Whether tf.keras.callbacks.ProgbarLogger is created depends on the verbose argument to model.fit. Callbacks with batch-level calls are currently unsupported with tf.distribute.experimental.ParameterServerStrategy, and users are advised to implement epoch-level calls instead with an appropriate steps_per_epoch value.
- validation_split: Float between 0 and 1. Fraction of the training data to be used as validation data. The model will set apart this fraction of the training data, will not train on it, and will evaluate the loss and any model metrics on this data at the end of each epoch. The validation data is selected from the last samples in the x and y data provided, before shuffling. This argument is not supported when x is a dataset, generator or keras.utils.Sequence instance. If both validation_data and validation_split are provided, validation_data will override validation_split. validation_split is not yet supported with tf.distribute.experimental.ParameterServerStrategy.
- validation_data: Data on which to evaluate the loss and any model metrics at the end of each epoch. The model will not be trained on this data. Thus, note the fact that the validation loss of data provided using validation_split or validation_data is not affected by regularization layers like noise and dropout. validation_data will override validation_split. validation_data could be:
  - A tuple (x_val, y_val) of Numpy arrays or tensors.
  - A tuple (x_val, y_val, val_sample_weights) of NumPy arrays.
  - A tf.data.Dataset.
  - A Python generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights).
  validation_data is not yet supported with tf.distribute.experimental.ParameterServerStrategy.
- shuffle: Boolean (whether to shuffle the training data before each epoch) or str (for 'batch'). This argument is ignored when x is a generator or an object of tf.data.Dataset. 'batch' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None.
- class_weight: Optional dictionary mapping class indices (integers) to a weight (float) value, used for weighting the loss function (during training only). This can be useful to tell the model to "pay more attention" to samples from an under-represented class. When class_weight is specified and targets have a rank of 2 or greater, either y must be one-hot encoded, or an explicit final dimension of 1 must be included for sparse class labels.
- sample_weight: Optional Numpy array of weights for the training samples, used for weighting the loss function (during training only). You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. This argument is not supported when x is a dataset, generator, or keras.utils.Sequence instance; instead, provide the sample_weights as the third element of x. Note that sample weighting does not apply to metrics specified via the metrics argument in compile(). To apply sample weighting to your metrics, you can specify them via the weighted_metrics argument in compile() instead.
- initial_epoch: Integer. Epoch at which to start training (useful for resuming a previous training run).
- steps_per_epoch: Integer or None. Total number of steps (batches of samples) before declaring one epoch finished and starting the next epoch. When training with input tensors such as TensorFlow data tensors, the default None is equal to the number of samples in your dataset divided by the batch size, or 1 if that cannot be determined. If x is a tf.data dataset, and steps_per_epoch is None, the epoch will run until the input dataset is exhausted. When passing an infinitely repeating dataset, you must specify the steps_per_epoch argument. If steps_per_epoch=-1, the training will run indefinitely with an infinitely repeating dataset. This argument is not supported with array inputs. When using tf.distribute.experimental.ParameterServerStrategy, steps_per_epoch=None is not supported.
- validation_steps: Only relevant if validation_data is provided and is a tf.data dataset. Total number of steps (batches of samples) to draw before stopping when performing validation at the end of every epoch. If validation_steps is None, validation will run until the validation_data dataset is exhausted. In the case of an infinitely repeated dataset, it will run into an infinite loop. If validation_steps is specified and only part of the dataset will be consumed, the evaluation will start from the beginning of the dataset at each epoch. This ensures that the same validation samples are used every time.
- validation_batch_size: Integer or None. Number of samples per validation batch. If unspecified, will default to batch_size. Do not specify the validation_batch_size if your data is in the form of datasets, generators, or keras.utils.Sequence instances (since they generate batches).
- validation_freq: Only relevant if validation data is provided. Integer or collections.abc.Container instance (e.g. list, tuple, etc.). If an integer, specifies how many training epochs to run before a new validation run is performed, e.g. validation_freq=2 runs validation every 2 epochs. If a Container, specifies the epochs on which to run validation, e.g. validation_freq=[1, 2, 10] runs validation at the end of the 1st, 2nd, and 10th epochs.
- max_queue_size: Integer. Used for generator or keras.utils.Sequence input only. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10.
- workers: Integer. Used for generator or keras.utils.Sequence input only. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1.
- use_multiprocessing: Boolean. Used for generator or keras.utils.Sequence input only. If True, use process-based threading. If unspecified, use_multiprocessing will default to False. Note that because this implementation relies on multiprocessing, you should not pass non-pickleable arguments to the generator as they can't be passed easily to children processes.
Unpacking behavior for iterator-like inputs:
A common pattern is to pass a tf.data.Dataset, generator, or tf.keras.utils.Sequence to the x argument of fit, which will in fact yield not only features (x) but optionally targets (y) and sample weights. TF-Keras requires that the output of such iterator-likes be unambiguous. The iterator should return a tuple of length 1, 2, or 3, where the optional second and third elements will be used for y and sample_weight respectively. Any other type provided will be wrapped in a length one tuple, effectively treating everything as 'x'. When yielding dicts, they should still adhere to the top-level tuple structure, e.g. ({"x0": x0, "x1": x1}, y). TF-Keras will not attempt to separate features, targets, and weights from the keys of a single dict.
A notable unsupported data type is the namedtuple. The reason is that it behaves like both an ordered datatype (tuple) and a mapping datatype (dict). So given a namedtuple of the form:
namedtuple("example_tuple", ["y", "x"])
it is ambiguous whether to reverse the order of the elements when interpreting the value. Even worse is a tuple of the form:
namedtuple("other_tuple", ["x", "y", "z"])
where it is unclear if the tuple was intended to be unpacked into x, y, and sample_weight or passed through as a single element to x. As a result the data processing code will simply raise a ValueError if it encounters a namedtuple (along with instructions to remedy the issue).
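
The unpacking rules described above can be sketched in plain Python. The unpack_batch helper below is hypothetical and only mirrors the behavior stated in the text (tuples of length 1 to 3, everything else treated as x, namedtuples rejected); it is not the library's own implementation.

```python
from collections import namedtuple


def unpack_batch(data):
    """Hypothetical helper: return an (x, y, sample_weight) triple."""
    # namedtuples are tuples that carry a `_fields` attribute; reject them
    # because their intended interpretation is ambiguous.
    if isinstance(data, tuple) and hasattr(data, "_fields"):
        raise ValueError("namedtuple inputs are ambiguous; use a plain tuple.")
    # Any non-tuple value is wrapped in a length-one tuple, i.e. treated as x.
    if not isinstance(data, tuple):
        data = (data,)
    if not 1 <= len(data) <= 3:
        raise ValueError("Expected a tuple of length 1, 2, or 3.")
    # Pad the missing optional elements (y, sample_weight) with None.
    x, y, sample_weight = data + (None,) * (3 - len(data))
    return x, y, sample_weight


# A dict of features stays intact as x when it sits inside the top-level tuple.
features = {"x0": [1.0], "x1": [2.0]}
x, y, sample_weight = unpack_batch((features, [0]))
```

Passing a namedtuple such as namedtuple("example_tuple", ["y", "x"])(0, 1) to this helper raises ValueError, matching the behavior described for the real data processing code.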
Returns
A History object. Its History.history attribute is a record of training loss values and metrics values at successive epochs, as well as validation loss values and validation metrics values (if applicable).
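
A minimal sketch of reading the returned History object, and of how epochs acts as the final epoch index when combined with initial_epoch. The synthetic data and the one-layer model are assumptions made for illustration, and the example assumes TensorFlow is installed with Keras available as tf.keras.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data: 64 samples with 4 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4)).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

# Hold out the last 25% of the samples for validation.
history = model.fit(x, y, batch_size=16, epochs=2, validation_split=0.25, verbose=0)
print(sorted(history.history.keys()))  # contains "loss" and "val_loss"
print(history.history["loss"])         # one value per epoch

# Resume training: epochs is the index of the final epoch, not a count,
# so this call runs the epochs with index 2 and 3.
resumed = model.fit(x, y, batch_size=16, epochs=4, initial_epoch=2, verbose=0)
print(resumed.epoch)
```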
Raises
- RuntimeError: If model.fit is wrapped in tf.function.

evaluate method
Model.evaluate(
x=None,
y=None,
batch_size=None,
verbose="auto",
sample_weight=None,
steps=None,
callbacks=None,
max_queue_size=10,
workers=1,
use_multiprocessing=False,
return_dict=False,
**kwargs
)
Returns the loss value & metrics values for the model in test mode.
Computation is done in batches (see the batch_size arg).
Arguments
- x: Input data. Besides Numpy arrays and TensorFlow tensors, it could be:
  - A tf.data dataset. Should return a tuple of either (inputs, targets) or (inputs, targets, sample_weights).
  - A generator or keras.utils.Sequence returning (inputs, targets) or (inputs, targets, sample_weights).
  A more detailed description of unpacking behavior for iterator types (Dataset, generator, Sequence) is given in the Unpacking behavior for iterator-like inputs section of Model.fit.
- y: Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely). If x is a dataset, generator or keras.utils.Sequence instance, y should not be specified (since targets will be obtained from the iterator/dataset).
- batch_size: Integer or None. Number of samples per batch of computation. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of a dataset, generators, or keras.utils.Sequence instances (since they generate batches).
- verbose: "auto", 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = single line. "auto" becomes 1 for most cases, and 2 when used with ParameterServerStrategy. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended when not running interactively (e.g. in a production environment). Defaults to 'auto'.
- sample_weight: Optional Numpy array of weights for the test samples, used for weighting the loss function. You can either pass a flat (1D) Numpy array with the same length as the input samples (1:1 mapping between weights and samples), or in the case of temporal data, you can pass a 2D array with shape (samples, sequence_length), to apply a different weight to every timestep of every sample. This argument is not supported when x is a dataset; instead, pass sample weights as the third element of x.
- steps: Integer or None. Total number of steps (batches of samples) before declaring the evaluation round finished. Ignored with the default value of None. If x is a tf.data dataset and steps is None, evaluate will run until the dataset is exhausted. This argument is not supported with array inputs.
- callbacks: List of keras.callbacks.Callback instances. List of callbacks to apply during evaluation. See callbacks.
- max_queue_size: Integer. Used for generator or keras.utils.Sequence input only. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10.
- workers: Integer. Used for generator or keras.utils.Sequence input only. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1.
- use_multiprocessing: Boolean. Used for generator or keras.utils.Sequence input only. If True, use process-based threading. If unspecified, use_multiprocessing will default to False. Note that because this implementation relies on multiprocessing, you should not pass non-pickleable arguments to the generator as they can't be passed easily to children processes.
- return_dict: If True, loss and metric results are returned as a dict, with each key being the name of the metric. If False, they are returned as a list.
See the discussion of Unpacking behavior for iterator-like inputs for Model.fit.
Returns
Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.
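
To make the two return shapes concrete, the sketch below evaluates the same compiled model once with the default list output and once with return_dict=True. The synthetic data and the tiny classifier are assumptions for illustration; the example assumes TensorFlow is installed with Keras available as tf.keras.

```python
import numpy as np
import tensorflow as tf

# Synthetic binary classification data: 32 samples with 4 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4)).astype("float32")
y = (x.sum(axis=1) > 0).astype("float32").reshape(-1, 1)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.BinaryAccuracy()],
)

# Default: a list holding the loss followed by the metric values.
as_list = model.evaluate(x, y, batch_size=8, verbose=0)

# return_dict=True: the same values keyed by name.
as_dict = model.evaluate(x, y, batch_size=8, verbose=0, return_dict=True)
print(as_list)
print(as_dict)
```

The dict form is usually easier to log, since it does not depend on remembering the order of the metrics.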
Raises
- RuntimeError: If model.evaluate is wrapped in a tf.function.

predict method
Model.predict(
x,
batch_size=None,
verbose="auto",
steps=None,
callbacks=None,
max_queue_size=10,
workers=1,
use_multiprocessing=False,
)
Generates output predictions for the input samples.
Computation is done in batches. This method is designed for batch processing of large numbers of inputs. It is not intended for use inside of loops that iterate over your data and process small numbers of inputs at a time.
For small numbers of inputs that fit in one batch, directly use __call__() for faster execution, e.g., model(x), or model(x, training=False) if you have layers such as tf.keras.layers.BatchNormalization that behave differently during inference. You may pair the individual model call with a tf.function for additional performance inside your inner loop. If you need access to numpy array values instead of tensors after your model call, you can use tensor.numpy() to get the numpy array value of an eager tensor.
Also, note the fact that test loss is not affected by regularization layers like noise and dropout.
Note: See this FAQ entry for more details about the difference between Model methods predict() and __call__().
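
The distinction between predict() and a direct model call can be sketched as follows. The model and the random inputs are assumptions made for illustration, and the example assumes TensorFlow is installed with Keras available as tf.keras.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1),
])

large_input = np.random.default_rng(0).normal(size=(1000, 4)).astype("float32")
small_input = large_input[:3]

# Many samples: predict() iterates over the data in batches and returns Numpy.
batched_predictions = model.predict(large_input, batch_size=128, verbose=0)

# A handful of samples: call the model directly, in inference mode so that
# BatchNormalization uses its moving statistics.
direct_predictions = model(small_input, training=False)

# Convert the eager tensor to a Numpy array when Numpy values are needed.
direct_as_numpy = direct_predictions.numpy()
```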
Arguments
- x: Input samples. Besides Numpy arrays and TensorFlow tensors, it could be:
  - A tf.data dataset.
  - A generator or keras.utils.Sequence instance.
  A more detailed description of unpacking behavior for iterator types (Dataset, generator, Sequence) is given in the Unpacking behavior for iterator-like inputs section of Model.fit.
- batch_size: Integer or None. Number of samples per batch. If unspecified, batch_size will default to 32. Do not specify the batch_size if your data is in the form of a dataset, generators, or keras.utils.Sequence instances (since they generate batches).
- verbose: "auto", 0, 1, or 2. Verbosity mode. 0 = silent, 1 = progress bar, 2 = single line. "auto" becomes 1 for most cases, and 2 when used with ParameterServerStrategy. Note that the progress bar is not particularly useful when logged to a file, so verbose=2 is recommended when not running interactively (e.g. in a production environment). Defaults to 'auto'.
- steps: Total number of steps (batches of samples) before declaring the prediction round finished. Ignored with the default value of None. If x is a tf.data dataset and steps is None, predict() will run until the input dataset is exhausted.
- callbacks: List of keras.callbacks.Callback instances. List of callbacks to apply during prediction. See callbacks.
- max_queue_size: Integer. Used for generator or keras.utils.Sequence input only. Maximum size for the generator queue. If unspecified, max_queue_size will default to 10.
- workers: Integer. Used for generator or keras.utils.Sequence input only. Maximum number of processes to spin up when using process-based threading. If unspecified, workers will default to 1.
- use_multiprocessing: Boolean. Used for generator or keras.utils.Sequence input only. If True, use process-based threading. If unspecified, use_multiprocessing will default to False. Note that because this implementation relies on multiprocessing, you should not pass non-pickleable arguments to the generator as they can't be passed easily to children processes.
See the discussion of Unpacking behavior for iterator-like inputs for Model.fit. Note that Model.predict uses the same interpretation rules as Model.fit and Model.evaluate, so inputs must be unambiguous for all three methods.
Returns
Numpy array(s) of predictions.
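
As an illustration of the dataset case, the sketch below runs predict() on a batched tf.data dataset, first until it is exhausted and then for a fixed number of steps. The model and the feature array are assumptions for this example, which assumes TensorFlow is installed with Keras available as tf.keras.

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(2)])

# 10 samples with 4 features each.
features = np.arange(40, dtype="float32").reshape(10, 4)

# The dataset yields batches itself, so batch_size is left unspecified.
dataset = tf.data.Dataset.from_tensor_slices(features).batch(4)

# steps=None (the default): run until the dataset is exhausted.
all_predictions = model.predict(dataset, verbose=0)

# steps=2: stop after two batches, i.e. after the first 8 samples.
first_two_batches = model.predict(dataset, steps=2, verbose=0)

print(all_predictions.shape, first_two_batches.shape)
```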
Raises
- RuntimeError: If model.predict is wrapped in a tf.function.

train_on_batch method
Model.train_on_batch(
x,
y=None,
sample_weight=None,
class_weight=None,
reset_metrics=True,
return_dict=False,
)
Runs a single gradient update on a single batch of data.
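Arguments
- x: Input data.
- y: Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s).
- class_weight: Optional dictionary mapping class indices (integers) to a weight (float) to apply to the model's loss for the samples from this class during training. When class_weight is specified and targets have a rank of 2 or greater, either y must be one-hot encoded, or an explicit final dimension of 1 must be included for sparse class labels.
- reset_metrics: If True, the metrics returned will be only for this batch. If False, the metrics will be statefully accumulated across batches.
- return_dict: If True, loss and metric results are returned as a dict, with each key being the name of the metric. If False, they are returned as a list.
Returns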
Scalar training loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.
Raises
- RuntimeError: If model.train_on_batch is wrapped in a tf.function.

test_on_batch method
Model.test_on_batch(
x, y=None, sample_weight=None, reset_metrics=True, return_dict=False
)
Test the model on a single batch of samples.
Arguments
- x: Input data.
- y: Target data. Like the input data x, it could be either Numpy array(s) or TensorFlow tensor(s). It should be consistent with x (you cannot have Numpy inputs and tensor targets, or inversely).
- reset_metrics: If True, the metrics returned will be only for this batch. If False, the metrics will be statefully accumulated across batches.
- return_dict: If True, loss and metric results are returned as a dict, with each key being the name of the metric. If False, they are returned as a list.
Returns
Scalar test loss (if the model has a single output and no metrics) or list of scalars (if the model has multiple outputs and/or metrics). The attribute model.metrics_names will give you the display labels for the scalar outputs.
Raises
- RuntimeError: If model.test_on_batch is wrapped in a tf.function.

predict_on_batch method
Model.predict_on_batch(x)
Returns predictions for a single batch of samples.
Arguments
Returns
Numpy array(s) of predictions.
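
The three single-batch methods are typically combined in a hand-written loop. The sketch below is one possible arrangement under stated assumptions: the synthetic data, the one-layer model, and the batch size are illustrative, and TensorFlow is assumed to be installed with Keras available as tf.keras.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data: 64 samples with 4 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4)).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse", metrics=["mae"])

batch_size = 16
for start in range(0, len(x), batch_size):
    x_batch = x[start:start + batch_size]
    y_batch = y[start:start + batch_size]
    # One gradient update; results come back as a dict keyed by metric name.
    train_logs = model.train_on_batch(x_batch, y_batch, return_dict=True)

# Evaluate and predict on a single batch without updating the weights.
test_logs = model.test_on_batch(x[:batch_size], y[:batch_size], return_dict=True)
batch_predictions = model.predict_on_batch(x[:batch_size])
print(train_logs, test_logs, batch_predictions.shape)
```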
Raises
- RuntimeError: If model.predict_on_batch is wrapped in a tf.function.

run_eagerly property
tf_keras.Model.run_eagerly
Settable attribute indicating whether the model should run eagerly.
Running eagerly means that your model will be run step by step, like Python code. Your model might run slower, but it should become easier for you to debug it by stepping into individual layer calls.
By default, we will attempt to compile your model to a static graph to deliver the best execution performance.
Returns
Boolean, whether the model should run eagerly.
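
A small sketch of switching a model to eager execution for debugging by setting the attribute before training starts. The data and model are assumptions for illustration; the example assumes TensorFlow is installed with Keras available as tf.keras, and it sets the attribute before the first fit() call so that no previously built training function is reused.

```python
import numpy as np
import tensorflow as tf

# Synthetic regression data: 16 samples with 4 features each.
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 4)).astype("float32")
y = x.sum(axis=1, keepdims=True)

model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
model.compile(optimizer="sgd", loss="mse")

# Run step by step, like Python code; slower, but easier to step through.
model.run_eagerly = True
print(model.run_eagerly)

history = model.fit(x, y, epochs=1, batch_size=8, verbose=0)
```

Passing run_eagerly=True to compile() is the equivalent way to request the same behavior at compile time.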