Python & NumPy utilities

to_categorical function

tf.keras.utils.to_categorical(y, num_classes=None, dtype="float32")

Converts a class vector (integers) to a binary class matrix.

E.g. for use with categorical_crossentropy.

Arguments

  • y: class vector to be converted into a matrix (integers from 0 to num_classes - 1).
  • num_classes: total number of classes. If None, it is inferred as (largest number in y) + 1.
  • dtype: The data type of the returned matrix. Default: 'float32'.

Returns

A binary matrix representation of the input. The classes axis is placed last.

Example

>>> a = tf.keras.utils.to_categorical([0, 1, 2, 3], num_classes=4)
>>> a = tf.constant(a, shape=[4, 4])
>>> print(a)
tf.Tensor(
  [[1. 0. 0. 0.]
   [0. 1. 0. 0.]
   [0. 0. 1. 0.]
   [0. 0. 0. 1.]], shape=(4, 4), dtype=float32)

>>> b = tf.constant([.9, .04, .03, .03,
...                  .3, .45, .15, .13,
...                  .04, .01, .94, .05,
...                  .12, .21, .5, .17],
...                 shape=[4, 4])
>>> loss = tf.keras.backend.categorical_crossentropy(a, b)
>>> print(np.around(loss, 5))
[0.10536 0.82807 0.1011  1.77196]

>>> loss = tf.keras.backend.categorical_crossentropy(a, a)
>>> print(np.around(loss, 5))
[0. 0. 0. 0.]
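The inference of num_classes and the placement of the classes axis can be illustrated without TensorFlow. The following is a minimal NumPy sketch of the conversion, not the library's implementation. The helper name one_hot_sketch is hypothetical, and the sketch does not reproduce any special handling the library may apply to particular input shapes.

```python
import numpy as np


def one_hot_sketch(y, num_classes=None, dtype="float32"):
    """Hypothetical helper: one-hot encode integer labels with NumPy."""
    y = np.asarray(y, dtype="int64")
    input_shape = y.shape
    flat = y.ravel()
    if num_classes is None:
        # Inferred as (largest number in y) + 1.
        num_classes = int(flat.max()) + 1
    out = np.zeros((flat.size, num_classes), dtype=dtype)
    # Set a single 1 per row, in the column given by the label.
    out[np.arange(flat.size), flat] = 1
    # The classes axis is placed last.
    return out.reshape(input_shape + (num_classes,))


labels = [0, 2, 1, 2]
encoded = one_hot_sketch(labels)            # num_classes inferred as 3
nested = one_hot_sketch([[0, 1], [2, 1]])   # 2-D input gains a trailing axis
print(encoded.shape, nested.shape)
```

Because the labels run from 0 to num_classes - 1, a largest label of 2 yields three columns.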

Raises

  • ValueError: If the input contains string values.

normalize function

tf.keras.utils.normalize(x, axis=-1, order=2)

Normalizes a NumPy array.

Arguments

  • x: NumPy array to normalize.
  • axis: axis along which to normalize.
  • order: Normalization order (e.g. order=2 for L2 norm).

Returns

A normalized copy of the array.
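To make the roles of axis and order concrete, here is a minimal NumPy sketch that divides each slice along axis by its norm of the given order. It is an illustration under stated assumptions rather than the library's code: the helper name normalize_sketch is hypothetical, and leaving zero-norm slices unchanged is an assumption of this sketch.

```python
import numpy as np


def normalize_sketch(x, axis=-1, order=2):
    """Hypothetical helper: scale slices along `axis` to unit norm."""
    x = np.asarray(x, dtype="float64")
    norms = np.atleast_1d(np.linalg.norm(x, ord=order, axis=axis))
    # Assumption: avoid division by zero by leaving zero-norm slices as-is.
    norms[norms == 0] = 1.0
    return x / np.expand_dims(norms, axis)


x = np.array([[3.0, 4.0], [0.0, 0.0]])
l2_result = normalize_sketch(x)            # rows scaled by their L2 norm
l1_result = normalize_sketch(x, order=1)   # rows scaled by their L1 norm
print(l2_result)
print(l1_result)
```

The input array is not modified; a new array is returned, matching the "normalized copy" described above.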


get_file function

tf.keras.utils.get_file(
    fname,
    origin,
    untar=False,
    md5_hash=None,
    file_hash=None,
    cache_subdir="datasets",
    hash_algorithm="auto",
    extract=False,
    archive_format="auto",
    cache_dir=None,
)

Downloads a file from a URL if it is not already in the cache.

By default the file at the url origin is downloaded to the cache_dir ~/.keras, placed in the cache_subdir datasets, and given the filename fname. The final location of a file example.txt would therefore be ~/.keras/datasets/example.txt.
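The way cache_dir, cache_subdir, and fname combine into a final location can be sketched with the standard library. This is only an illustration of the path arithmetic described above. The helper name resolve_cache_path is hypothetical, it performs no download, and it does not cover any fallback behavior the library may have.

```python
import os


def resolve_cache_path(fname, cache_subdir="datasets", cache_dir=None):
    """Hypothetical helper: where a cached file would be expected to land."""
    if cache_dir is None:
        # Default cache directory as described in the text: ~/.keras
        cache_dir = os.path.join(os.path.expanduser("~"), ".keras")
    # os.path.join discards earlier components when a later one is absolute,
    # so an absolute fname or cache_subdir overrides the default location.
    return os.path.join(cache_dir, cache_subdir, fname)


default_path = resolve_cache_path("example.txt")
custom_path = resolve_cache_path("example.txt", cache_dir="my_cache")
print(default_path)
print(custom_path)
```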

Files in tar, tar.gz, tar.bz, and zip formats can also be extracted. Passing a hash will verify the file after download. The command line programs shasum and sha256sum can compute the hash.
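As an alternative to the command line programs, the hash can be computed in Python with hashlib. The sketch below writes a small temporary file and hashes it in chunks. The helper name sha256_of_file and the sample file contents are assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Hypothetical helper: hex SHA-256 digest of a file, read in chunks."""
    hasher = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "example.txt")
    with open(sample_path, "wb") as f:
        f.write(b"hello")
    digest = sha256_of_file(sample_path)

print(digest)
```

A digest computed this way for the real archive is the kind of string that would be passed as file_hash.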

Example

path_to_downloaded_file = tf.keras.utils.get_file(
    "flower_photos",
    "https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz",
    untar=True)

Arguments

  • fname: Name of the file. If an absolute path /path/to/file.txt is specified the file will be saved at that location.
  • origin: Original URL of the file.
  • untar: Deprecated in favor of the extract argument. Boolean, whether the file should be decompressed.
  • md5_hash: Deprecated in favor of the file_hash argument. MD5 hash of the file for verification.
  • file_hash: The expected hash string of the file after download. The sha256 and md5 hash algorithms are both supported.
  • cache_subdir: Subdirectory under the Keras cache dir where the file is saved. If an absolute path /path/to/folder is specified the file will be saved at that location.
  • hash_algorithm: Select the hash algorithm to verify the file. Options are 'md5', 'sha256', and 'auto'. The default 'auto' detects the hash algorithm in use.
  • extract: If True, tries to extract the file as an archive, such as tar or zip.
  • archive_format: Archive format to try for extracting the file. Options are 'auto', 'tar', 'zip', and None. 'tar' includes tar, tar.gz, and tar.bz files. The default 'auto' corresponds to ['tar', 'zip']. None or an empty list will return no matches found.
  • cache_dir: Location to store cached files. When None, it defaults to ~/.keras/.

Returns

Path to the downloaded file.


Progbar class

tf.keras.utils.Progbar(
    target, width=30, verbose=1, interval=0.05, stateful_metrics=None, unit_name="step"
)

Displays a progress bar.

Arguments

  • target: Total number of steps expected, None if unknown.
  • width: Progress bar width on screen.
  • verbose: Verbosity mode: 0 (silent), 1 (verbose), or 2 (semi-verbose).
  • stateful_metrics: Iterable of string names of metrics that should not be averaged over time. Metrics in this list will be displayed as-is. All others will be averaged by the progbar before display.
  • interval: Minimum visual progress update interval (in seconds).
  • unit_name: Display name for step counts (usually "step" or "sample").
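The difference between stateful metrics and averaged metrics can be illustrated with a small pure-Python sketch. This is a simplified model and not Progbar itself: the class name MetricTracker is hypothetical, and it assumes every update carries equal weight in the average.

```python
class MetricTracker:
    """Hypothetical stand-in for the display logic described above."""

    def __init__(self, stateful_metrics=None):
        self.stateful = set(stateful_metrics or [])
        self.totals = {}   # running sums for averaged metrics
        self.counts = {}   # number of updates per averaged metric
        self.latest = {}   # most recent value of each stateful metric

    def update(self, values):
        for name, value in values:
            if name in self.stateful:
                # Stateful metrics are shown as-is.
                self.latest[name] = value
            else:
                # Other metrics are accumulated for averaging.
                self.totals[name] = self.totals.get(name, 0.0) + value
                self.counts[name] = self.counts.get(name, 0) + 1

    def displayed(self):
        shown = dict(self.latest)
        for name, total in self.totals.items():
            shown[name] = total / self.counts[name]
        return shown


tracker = MetricTracker(stateful_metrics=["acc"])
tracker.update([("loss", 4.0), ("acc", 0.5)])
tracker.update([("loss", 2.0), ("acc", 0.75)])
shown = tracker.displayed()
print(shown)
```

Here "loss" is reported as the mean of the two updates, while "acc" is reported as its latest value.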

Sequence class

tf.keras.utils.Sequence()

Base object for fitting to a sequence of data, such as a dataset.

Every Sequence must implement the __getitem__ and the __len__ methods. If you want to modify your dataset between epochs you may implement on_epoch_end. The method __getitem__ should return a complete batch.

Notes:

Sequences are a safer way to do multiprocessing. This structure guarantees that the network will only train once on each sample per epoch, which is not the case with generators.
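The batching contract can be checked without TensorFlow or image data. The sketch below is a plain-Python stand-in that batches sample indices and shuffles them between epochs. The class name IndexBatches and the seed parameter are hypothetical, and the class deliberately does not subclass Sequence.

```python
import math
import random


class IndexBatches:
    """Hypothetical stand-in exposing the methods a Sequence implements."""

    def __init__(self, num_samples, batch_size, seed=0):
        self.indices = list(range(num_samples))
        self.batch_size = batch_size
        self.rng = random.Random(seed)

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.indices) / self.batch_size)

    def __getitem__(self, idx):
        # Return one complete batch of sample indices.
        start = idx * self.batch_size
        return self.indices[start:start + self.batch_size]

    def on_epoch_end(self):
        # Modify the dataset between epochs: here, reshuffle the order.
        self.rng.shuffle(self.indices)


batches = IndexBatches(num_samples=5, batch_size=2)
epochs_seen = []
for epoch in range(2):
    seen = [i for b in range(len(batches)) for i in batches[b]]
    epochs_seen.append(sorted(seen))
    batches.on_epoch_end()
print(len(batches), epochs_seen)
```

Each epoch visits every sample index exactly once, regardless of the shuffle applied in on_epoch_end.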

Examples

from skimage.io import imread
from skimage.transform import resize
from tensorflow.keras.utils import Sequence
import numpy as np
import math

# Here, `x_set` is a list of paths to the images
# and `y_set` are the associated classes.

class CIFAR10Sequence(Sequence):

    def __init__(self, x_set, y_set, batch_size):
        self.x, self.y = x_set, y_set
        self.batch_size = batch_size

    def __len__(self):
        return math.ceil(len(self.x) / self.batch_size)

    def __getitem__(self, idx):
        # Slice out the samples that belong to batch number `idx`.
        start = idx * self.batch_size
        end = (idx + 1) * self.batch_size
        batch_x = self.x[start:end]
        batch_y = self.y[start:end]

        # Load and resize each image, then stack them into one batch.
        return np.array([
            resize(imread(file_name), (200, 200))
            for file_name in batch_x]), np.array(batch_y)