set_random_seed function
tf_keras.utils.set_random_seed(seed)
Sets all random seeds for the program (Python, NumPy, and TensorFlow).
You can use this utility to make almost any TF-Keras program fully deterministic. Some limitations apply when network communications are involved (e.g. parameter server distribution), which create additional sources of randomness, or when certain non-deterministic cuDNN ops are involved.
Calling this utility is equivalent to the following:
import random
import numpy as np
import tensorflow as tf
random.seed(seed)
np.random.seed(seed)
tf.random.set_seed(seed)
Arguments
seed: Integer, the random seed to use.
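As a quick check of what seeding provides, the sketch below reseeds and redraws from the Python and NumPy global generators and confirms that the draws repeat. The helper name seed_python_and_numpy is hypothetical; it deliberately covers only the first two calls of the equivalent code above and skips tf.random.set_seed so that it runs without TensorFlow installed.

```python
import random

import numpy as np


def seed_python_and_numpy(seed):
    """Hypothetical helper: seeds only Python's and NumPy's global RNGs.

    The real utility additionally calls tf.random.set_seed(seed).
    """
    random.seed(seed)
    np.random.seed(seed)


def draw():
    # One draw from each global generator.
    return random.random(), float(np.random.rand())


seed_python_and_numpy(42)
first = draw()
seed_python_and_numpy(42)
second = draw()

print(first == second)  # True: the same seed reproduces the same draws
```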
split_dataset function
tf_keras.utils.split_dataset(
    dataset, left_size=None, right_size=None, shuffle=False, seed=None
)
Split a dataset into a left half and a right half (e.g. train / test).
Arguments
dataset: A tf.data.Dataset object, or a list/tuple of arrays with the same length.
left_size: If float (in the range [0, 1]), it signifies the fraction of the data to pack in the left dataset. If integer, it signifies the number of samples to pack in the left dataset. If None, it uses the complement to right_size. Defaults to None.
right_size: If float (in the range [0, 1]), it signifies the fraction of the data to pack in the right dataset. If integer, it signifies the number of samples to pack in the right dataset. If None, it uses the complement to left_size. Defaults to None.
shuffle: Boolean, whether to shuffle the data before splitting it.
seed: A random seed for shuffling.
Returns
A tuple of two tf.data.Dataset objects: the left and right splits.
Example
>>> data = np.random.random(size=(1000, 4))
>>> left_ds, right_ds = tf.keras.utils.split_dataset(data, left_size=0.8)
>>> int(left_ds.cardinality())
800
>>> int(right_ds.cardinality())
200
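The size arguments can be read as simple arithmetic on the dataset length: a float is a fraction, an integer is a count, and a missing size is the complement of the other. The sketch below is an illustration only, not the library's code. resolve_split_sizes is a hypothetical helper, and both the use of round() for fractional sizes and the error conditions are assumptions.

```python
def resolve_split_sizes(total, left_size=None, right_size=None):
    """Hypothetical helper: turn left_size/right_size into sample counts."""
    if left_size is None and right_size is None:
        raise ValueError("Specify at least one of left_size or right_size.")

    def to_count(size):
        if size is None:
            return None
        if isinstance(size, float):
            # A float is a fraction of the data (rounding is an assumption).
            return round(size * total)
        return int(size)  # An integer is an absolute number of samples.

    left, right = to_count(left_size), to_count(right_size)
    # A missing size is the complement of the other one.
    if left is None:
        left = total - right
    if right is None:
        right = total - left
    if left + right > total:
        raise ValueError("left_size + right_size exceeds the dataset length.")
    return left, right


print(resolve_split_sizes(1000, left_size=0.8))   # (800, 200)
print(resolve_split_sizes(1000, right_size=250))  # (750, 250)
```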
get_file function
tf_keras.utils.get_file(
    fname=None,
    origin=None,
    untar=False,
    md5_hash=None,
    file_hash=None,
    cache_subdir="datasets",
    hash_algorithm="auto",
    extract=False,
    archive_format="auto",
    cache_dir=None,
)
Downloads a file from a URL if it is not already in the cache.
By default the file at the URL origin is downloaded to the cache_dir ~/.keras, placed in the cache_subdir datasets, and given the filename fname. The final location of a file example.txt would therefore be ~/.keras/datasets/example.txt.
Files in tar, tar.gz, tar.bz, and zip formats can also be extracted.
Passing a hash will verify the file after download. The command line programs shasum and sha256sum can compute the hash.
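The same hash can also be computed from Python with the standard hashlib module, for example to obtain a value to pass as file_hash. This is a minimal sketch: sha256_of_file is a hypothetical helper, not part of the library, and the temporary file stands in for a real download.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hypothetical helper: hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Stand-in for a downloaded file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    tmp_path = tmp.name

file_hash = sha256_of_file(tmp_path)
os.remove(tmp_path)
print(len(file_hash))  # 64 hex characters
```

The resulting string should match what sha256sum prints for the same file.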
Example
path_to_downloaded_file = tf.keras.utils.get_file(
    origin="https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz",
    extract=True,
)
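Arguments
fname: Name of the file. If an absolute path /path/to/file.txt is specified, the file will be saved at that location. If None, the name of the file at origin will be used.
origin: Original URL of the file.
untar: Deprecated in favor of the extract argument. Boolean, whether the file should be decompressed.
md5_hash: Deprecated in favor of the file_hash argument. MD5 hash of the file for verification.
file_hash: The expected hash string of the file after download. The sha256 and md5 hash algorithms are both supported.
cache_subdir: Subdirectory under the Keras cache dir where the file is saved. If an absolute path /path/to/folder is specified, the file will be saved at that location.
hash_algorithm: Select the hash algorithm to verify the file. Options are 'md5', 'sha256', and 'auto'. The default 'auto' detects the hash algorithm in use.
extract: True tries extracting the file as an archive, like tar or zip.
archive_format: Archive format to try for extracting the file. Options are 'auto', 'tar', 'zip', and None. 'tar' includes tar, tar.gz, and tar.bz files. The default 'auto' corresponds to ['tar', 'zip']. None or an empty list will return no matches found.
cache_dir: Location to store cached files. When None, it defaults to ~/.keras/.
Returns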
Path to the downloaded file.
⚠️ Warning on malicious downloads ⚠️
Downloading something from the Internet carries a risk.
NEVER download a file/archive if you do not trust the source.
We recommend that you specify the file_hash argument (if the hash of the source file is known) to make sure that the file you are getting is the one you expect.
Progbar class
tf_keras.utils.Progbar(
    target, width=30, verbose=1, interval=0.05, stateful_metrics=None, unit_name="step"
)
Displays a progress bar.
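Arguments
target: Total number of steps expected, None if unknown.
width: Progress bar width on screen.
verbose: Verbosity mode, 0 (silent), 1 (verbose), 2 (semi-verbose).
interval: Minimum visual progress update interval (in seconds).
stateful_metrics: Iterable of string names of metrics that should not be averaged over time. Metrics in this list will be displayed as-is. All others will be averaged by the progbar before display.
unit_name: Display name for step counts (usually "step" or "sample").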
Sequence class
tf_keras.utils.Sequence()
Base object for fitting to a sequence of data, such as a dataset.
Every Sequence must implement the __getitem__ and the __len__ methods. If you want to modify your dataset between epochs, you may implement on_epoch_end. The method __getitem__ should return a complete batch.
Notes:
Sequence is a safer way to do multiprocessing. This structure guarantees that the network will only train once on each sample per epoch, which is not the case with generators.
Examples
from skimage.io import imread
from skimage.transform import resize
import numpy as np
import tensorflow as tf
import math

# Here, `x_set` is a list of paths to the images
# and `y_set` are the associated classes.

class CIFAR10Sequence(tf.keras.utils.Sequence):

    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        low = idx * self.batch_size
        # Cap upper bound at array length; the last batch may be smaller
        # if the total number of items is not a multiple of batch size.
        high = min(low + self.batch_size, len(self.x))
        batch_x = self.x[low:high]
        batch_y = self.y[low:high]

        # Load and resize each image, then stack into one batch array.
        return np.array([
            resize(imread(file_name), (200, 200))
            for file_name in batch_x]), np.array(batch_y)
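The batch arithmetic in __len__ and __getitem__, together with an on_epoch_end hook, can be tried without any image files. The sketch below is a plain Python class that follows the same protocol on in-memory arrays. ShuffledBatches and its arguments are hypothetical names, and a real implementation would subclass the Sequence base class as in the example above.

```python
import math

import numpy as np


class ShuffledBatches:
    """Hypothetical stand-in that follows the Sequence protocol."""

    def __init__(self, x, y, batch_size, seed=0):
        self.x, self.y = np.asarray(x), np.asarray(y)
        self.batch_size = batch_size
        self.rng = np.random.default_rng(seed)
        self.order = np.arange(len(self.x))

    def __len__(self):
        # Number of batches per epoch; the last one may be partial.
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        low = idx * self.batch_size
        high = min(low + self.batch_size, len(self.x))
        picked = self.order[low:high]
        return self.x[picked], self.y[picked]

    def on_epoch_end(self):
        # Reshuffle so the next epoch visits samples in a new order.
        self.rng.shuffle(self.order)


batches = ShuffledBatches(np.arange(10), np.arange(10) % 2, batch_size=3)
print(len(batches))         # 4
print(batches[3][0].shape)  # (1,)
```

Because every index appears exactly once in the shuffled order, each sample is still visited once per epoch after reshuffling.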
to_categorical function
tf_keras.utils.to_categorical(y, num_classes=None, dtype="float32")
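Converts a class vector (integers) to a binary class matrix.
E.g. for use with categorical_crossentropy.
Arguments
y: Array-like with class values to be converted into a matrix (integers from 0 to num_classes - 1).
num_classes: Total number of classes. If None, this would be inferred as max(y) + 1.
dtype: The data type expected by the input. Default: 'float32'.
Returns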
A binary matrix representation of the input as a NumPy array. The class axis is placed last.
Example
>>> a = tf.keras.utils.to_categorical([0, 1, 2, 3], num_classes=4)
>>> print(a)
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]]
>>> b = tf.constant([.9, .04, .03, .03,
...                  .3, .45, .15, .13,
...                  .04, .01, .94, .05,
...                  .12, .21, .5, .17],
...                 shape=[4, 4])
>>> loss = tf.keras.backend.categorical_crossentropy(a, b)
>>> print(np.around(loss, 5))
[0.10536 0.82807 0.1011 1.77196]
>>> loss = tf.keras.backend.categorical_crossentropy(a, a)
>>> print(np.around(loss, 5))
[0. 0. 0. 0.]
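For intuition, the same one-hot encoding can be reproduced in plain NumPy by indexing an identity matrix. This is a sketch for 1-D integer labels only, not the library's implementation, and one_hot is a hypothetical name.

```python
import numpy as np


def one_hot(y, num_classes=None, dtype="float32"):
    """Hypothetical NumPy sketch of one-hot encoding for 1-D integer labels."""
    y = np.asarray(y, dtype="int64")
    if num_classes is None:
        num_classes = int(y.max()) + 1  # inferred as max(y) + 1
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=dtype)[y]


print(one_hot([0, 1, 2, 3]))
```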
to_ordinal function
tf_keras.utils.to_ordinal(y, num_classes=None, dtype="float32")
Converts a class vector (integers) to an ordinal regression matrix.
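This utility encodes a class vector as an ordinal regression/classification matrix in which each row represents a sample and the number of ones in that row indicates the sample's rank.
Arguments
y: Array-like with class values to be converted into a matrix (integers from 0 to num_classes - 1).
num_classes: Total number of classes. If None, this would be inferred as max(y) + 1.
dtype: The data type expected by the input. Default: 'float32'.
Returns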
An ordinal regression matrix representation of the input as a NumPy array. The class axis is placed last.
Example
>>> a = tf.keras.utils.to_ordinal([0, 1, 2, 3], num_classes=4)
>>> print(a)
[[0. 0. 0.]
 [1. 0. 0.]
 [1. 1. 0.]
 [1. 1. 1.]]
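The encoding can be reproduced in plain NumPy by comparing each label against the thresholds 0 to num_classes - 2. This is a sketch for 1-D integer labels only, not the library's implementation, and ordinal_encode is a hypothetical name.

```python
import numpy as np


def ordinal_encode(y, num_classes=None, dtype="float32"):
    """Hypothetical NumPy sketch of ordinal encoding for 1-D integer labels."""
    y = np.asarray(y, dtype="int64")
    if num_classes is None:
        num_classes = int(y.max()) + 1  # inferred as max(y) + 1
    # Column j is 1 when the label is greater than j, so a label of
    # rank k yields k leading ones across num_classes - 1 columns.
    thresholds = np.arange(num_classes - 1)
    return (thresholds[None, :] < y[:, None]).astype(dtype)


print(ordinal_encode([0, 1, 2, 3], num_classes=4))
```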
normalize function
tf_keras.utils.normalize(x, axis=-1, order=2)
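Normalizes a NumPy array.
Arguments
x: NumPy array to normalize.
axis: Axis along which to normalize.
order: Normalization order (e.g. order=2 for L2 norm).
Returns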
A normalized copy of the array.
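As an illustration of what normalizing along an axis means, the sketch below divides each slice by its norm using numpy.linalg.norm. normalize_sketch is a hypothetical name, and leaving all-zero slices unchanged (instead of dividing by zero) is an assumption of this sketch.

```python
import numpy as np


def normalize_sketch(x, axis=-1, order=2):
    """Hypothetical NumPy sketch: divide by the norm taken along `axis`."""
    x = np.asarray(x, dtype="float64")
    norms = np.linalg.norm(x, ord=order, axis=axis, keepdims=True)
    norms[norms == 0] = 1.0  # assumption: leave all-zero slices unchanged
    return x / norms


out = normalize_sketch([[3.0, 4.0], [0.0, 0.0]])
print(out)
```

With the default order=2 and axis=-1, each non-zero row of the result has unit L2 norm.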