Quantizer utilities

[source]

abs_max_quantize function

keras.quantizers.abs_max_quantize(
    inputs, axis, value_range=(-127, 127), dtype="int8", epsilon=1e-07, to_numpy=False
)
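
This page gives only the signature for this function. As a rough illustration of the general abs-max technique the name refers to, the plain-Python sketch below quantizes one row of floats by scaling its largest absolute value onto the top of `value_range`. The helper name `abs_max_quantize_sketch` is hypothetical, and the rounding, clipping, and `(quantized, scale)` return convention are assumptions rather than confirmed behavior of `keras.quantizers.abs_max_quantize`.

```python
def abs_max_quantize_sketch(row, value_range=(-127, 127), epsilon=1e-7):
    """Symmetric abs-max quantization of one row (illustrative only)."""
    low, high = value_range
    # The largest magnitude in the row maps to the top of the integer range.
    abs_max = max(abs(v) for v in row)
    # epsilon keeps the division finite when the row is all zeros.
    scale = high / (abs_max + epsilon)
    # Scale, round to the nearest integer, then clip into the target range.
    quantized = [min(max(round(v * scale), low), high) for v in row]
    return quantized, scale


quantized, scale = abs_max_quantize_sketch([0.5, -1.0, 0.25])
print(quantized)  # [63, -127, 32]

# Dequantizing divides by the same scale and recovers the inputs approximately.
restored = [q / scale for q in quantized]
```

Under these assumptions the quantization error of each element is at most half a quantization step, that is `0.5 / scale`.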

[source]

compute_float8_amax_history function

keras.quantizers.compute_float8_amax_history(x, amax_history)

[source]

compute_float8_scale function

keras.quantizers.compute_float8_scale(amax, scale, dtype_max, margin=0)

[source]

deserialize function

keras.quantizers.deserialize(config, custom_objects=None)

Return a Keras quantizer object via its config.


[source]

fake_quant_with_min_max_vars function

keras.quantizers.fake_quant_with_min_max_vars(
    inputs, min_vals, max_vals, num_bits=8, narrow_range=False, axis=None
)

Perform per-tensor or per-channel fake quantization.

[min_vals, max_vals] define the clamping range for the inputs.

The inputs are quantized into the quantization range:

  • [0, 2^num_bits - 1] when narrow_range=False
  • [1, 2^num_bits - 1] when narrow_range=True

After quantization, the values are dequantized and output as floats within the [min_vals, max_vals] interval.

This operation supports gradient computation, allowing min_vals and max_vals to be trained.

Arguments

  • inputs: Input Keras tensor of float dtype.
  • min_vals: A global minimum scalar or a per-channel minimum tensor.
  • max_vals: A global maximum scalar or a per-channel maximum tensor.
  • num_bits: Quantization bit width (e.g., 8 for int8). Defaults to 8.
  • narrow_range: Whether to use narrow quantization range. Defaults to False.
  • axis: Axis along which to perform per-channel quantization. If None, per-tensor quantization is performed. Defaults to None.

Returns

  • Tensor: A Keras tensor with fake quantization applied.
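
The clamp, quantize, dequantize sequence described above can be sketched in plain Python for the per-tensor case. This is a simplified illustration, not the library's implementation: the helper name `fake_quant_sketch` is hypothetical, rounding is assumed to be round-half-up, the zero-point nudging that some fake-quantization implementations apply is left out, and gradients are not modeled.

```python
import math


def fake_quant_sketch(values, min_val, max_val, num_bits=8, narrow_range=False):
    """Per-tensor fake quantization of a list of floats (illustrative only)."""
    quant_min = 1 if narrow_range else 0
    quant_max = 2**num_bits - 1
    # Width of one quantization step in float units.
    scale = (max_val - min_val) / (quant_max - quant_min)
    outputs = []
    for v in values:
        # Clamp into [min_val, max_val].
        clamped = min(max(v, min_val), max_val)
        # Offset from quant_min of the nearest level (round half up).
        level = math.floor((clamped - min_val) / scale + 0.5)
        # Dequantize back to a float inside [min_val, max_val].
        outputs.append(min_val + level * scale)
    return outputs


print(fake_quant_sketch([3.3, 300.0, -4.0], 0.0, 255.0))  # [3.0, 255.0, 0.0]
```

With `narrow_range=True` one fewer level is available, so the same `[min_val, max_val]` interval is covered with a slightly larger step.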

[source]

get function

keras.quantizers.get(identifier, **kwargs)

Retrieve a Keras quantizer object via an identifier.


[source]

pack_int4 function

keras.quantizers.pack_int4(arr, axis=0, dtype="int8")

Pack an int4 tensor into an int8 tensor with packed nibbles.

The input values must already be int8 in the signed range `[-8, 7]` and
represent the desired int4 values. Packing is performed along the specified
axis (default is 0).

For every two consecutive rows, the **low nibble** of the output byte
stores the value from the first row, and the **high nibble** stores
the value from the second row.

Arguments

  • arr: An `int8` or `uint8` tensor containing int4 values in the range `[-8, 7]`.
  • axis: The axis along which to pack the tensor. Defaults to 0.
  • dtype: The data type of the input and packed tensor. Can be `"int8"` or `"uint8"`. Defaults to `"int8"`.

Returns

  • tuple: A tuple `(packed, packed_shape, orig_rows)` where `packed` is the packed int8 tensor with int4 values stored in nibbles, `packed_shape` is the shape of the packed tensor, and `orig_rows` is the original (unpacked) row count prior to any padding that may have been inserted when an odd number of rows is supplied.
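
The nibble layout described above (first value in the low nibble, second value in the high nibble, zero padding for an odd count) can be illustrated on a flat list with plain Python integers. This is a minimal sketch under those stated assumptions: the helper name `pack_nibbles` is hypothetical, and it handles a single 1-D sequence rather than an arbitrary axis of a tensor.

```python
def pack_nibbles(values):
    """Pack int4 values (each in [-8, 7]) pairwise into signed bytes."""
    orig_len = len(values)
    # Pad with one zero when the count is odd so every value has a partner.
    padded = list(values) + [0] * (orig_len % 2)
    packed = []
    for low, high in zip(padded[0::2], padded[1::2]):
        # Keep the 4-bit two's-complement pattern of each value.
        byte = (low & 0x0F) | ((high & 0x0F) << 4)
        # Reinterpret the unsigned byte as a signed int8.
        packed.append(byte - 256 if byte > 127 else byte)
    return packed, orig_len


packed, orig_len = pack_nibbles([-3, 2, 1])
print(packed, orig_len)  # [45, 1] 3
```

Here `-3` has the 4-bit pattern `0xD` and `2` has `0x2`, so the first byte is `0x2D`, which is 45.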

Example

```python
>>> import numpy as np
>>> from keras.quantizers import pack_int4, unpack_int4

# Example with axis=0
# Original array has shape (3, 2)
>>> original_array = np.array([[-3, 7], [2, -8], [1, 0]], dtype=np.int8)

# Pack the array along axis 0. Since the length of axis 0 (3) is
# odd, it will be padded to a length of 4. The packed array will
# have a shape of (ceil(3/2), 2) = (2, 2).
>>> packed, packed_shape, orig_len = pack_int4(original_array, axis=0)
>>> print("Packed array:\n", packed)
Packed array:
 [[  45 -121]
 [   1    0]]

# Now, unpack the array back to its original form
>>> unpacked = unpack_int4(packed, orig_len, axis=0)
>>> print("Unpacked array:\n", unpacked)
Unpacked array:
 [[-3  7]
 [ 2 -8]
 [ 1  0]]
>>> np.allclose(original_array, unpacked)
True

# Example with axis=1
# Original array has shape (2, 3)
>>> original_array = np.array([[-3, 7, 2], [-8, 1, 0]], dtype=np.int8)

# Pack along axis 1. Length of axis 1 (3) is padded to 4.
# The new shape is (2, ceil(3/2)) = (2, 2).
>>> packed, packed_shape, orig_len = pack_int4(original_array, axis=1)
>>> print("Packed array:\n", packed)
Packed array:
 [[125   2]
 [ 24   0]]

# Unpack the array
>>> unpacked = unpack_int4(packed, orig_len, axis=1)
>>> print("Unpacked array:\n", unpacked)
Unpacked array:
 [[-3  7  2]
 [-8  1  0]]
>>> np.allclose(original_array, unpacked)
True
```


[source]

quantize_and_dequantize function

keras.quantizers.quantize_and_dequantize(
    inputs, scale, quantized_dtype, compute_dtype
)

[source]

serialize function

keras.quantizers.serialize(initializer)

[source]

unpack_int4 function

keras.quantizers.unpack_int4(packed, orig_len, axis=0, dtype="int8")

Unpack a packed int4 tensor back to an int8 tensor with values in the range [-8, 7].

This function reverses the packing performed by `pack_int4`, restoring
the original int8 tensor (values in the range [-8, 7]) from a packed int8
tensor where each element contains two int4 values (one in the lower nibble,
one in the upper nibble).

The function restores the original axis order and removes any
padding that was added during packing.

Arguments

  • packed: An int8 tensor containing packed int4 values along the specified axis. Each int8 value encodes two int4 values.
  • orig_len: The original (unpadded) length of the axis that was packed. This is used to remove any padding that may have been added during packing to ensure an even number of rows.
  • axis: The axis along which the tensor was packed. Defaults to 0.
  • dtype: The data type of the input and unpacked tensor. Can be `"int8"` or `"uint8"`. Defaults to `"int8"`.

Returns

  • unpacked: An int8 tensor with the same shape as the original (unpacked) tensor, with values in the range [-8, 7].
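
The reverse step can be sketched the same way: split each byte into its low and high nibble, sign-extend each nibble to the range [-8, 7], and trim the result to `orig_len` to drop any padding. This is an illustrative plain-Python sketch for a flat list, not the library's implementation; the helper name `unpack_nibbles` is hypothetical.

```python
def unpack_nibbles(packed, orig_len):
    """Unpack signed bytes into int4 values and drop trailing padding."""
    values = []
    for byte in packed:
        # View the signed int8 as an unsigned byte before splitting it.
        unsigned = byte & 0xFF
        # The low nibble holds the first value, the high nibble the second.
        for nibble in (unsigned & 0x0F, unsigned >> 4):
            # Sign-extend the 4-bit pattern: 8..15 represent -8..-1.
            values.append(nibble - 16 if nibble >= 8 else nibble)
    # Remove the zero that was appended for an odd-length input.
    return values[:orig_len]


print(unpack_nibbles([45, 1], 3))    # [-3, 2, 1]
print(unpack_nibbles([-121, 0], 3))  # [7, -8, 0]
```

The two inputs correspond to the two columns of the packed array in the axis=0 example on this page.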

Example

```python
>>> import numpy as np
>>> from keras.quantizers import pack_int4, unpack_int4

# Example with axis=0
# Original array has shape (3, 2)
>>> original_array = np.array([[-3, 7], [2, -8], [1, 0]], dtype=np.int8)

# Pack the array along axis 0. Since the length of axis 0 (3) is
# odd, it will be padded to a length of 4. The packed array will
# have a shape of (ceil(3/2), 2) = (2, 2).
>>> packed, packed_shape, orig_len = pack_int4(original_array, axis=0)
>>> print("Packed array:\n", packed)
Packed array:
 [[  45 -121]
 [   1    0]]

# Now, unpack the array back to its original form
>>> unpacked = unpack_int4(packed, orig_len, axis=0)
>>> print("Unpacked array:\n", unpacked)
Unpacked array:
 [[-3  7]
 [ 2 -8]
 [ 1  0]]
>>> np.allclose(original_array, unpacked)
True

# Example with axis=1
# Original array has shape (2, 3)
>>> original_array = np.array([[-3, 7, 2], [-8, 1, 0]], dtype=np.int8)

# Pack along axis 1. Length of axis 1 (3) is padded to 4.
# The new shape is (2, ceil(3/2)) = (2, 2).
>>> packed, packed_shape, orig_len = pack_int4(original_array, axis=1)
>>> print("Packed array:\n", packed)
Packed array:
 [[125   2]
 [ 24   0]]

# Unpack the array
>>> unpacked = unpack_int4(packed, orig_len, axis=1)
>>> print("Unpacked array:\n", unpacked)
Unpacked array:
 [[-3  7  2]
 [-8  1  0]]
>>> np.allclose(original_array, unpacked)
True
```