average_pool function
keras.ops.average_pool(
inputs, pool_size, strides=None, padding="valid", data_format=None
)
Average pooling operation.
Arguments
inputs: Tensor of rank N+2. inputs has shape
(batch_size,) + inputs_spatial_shape + (num_channels,) if
data_format="channels_last", or
(batch_size, num_channels) + inputs_spatial_shape if
data_format="channels_first". Pooling happens over the spatial
dimensions only.
pool_size: int or tuple/list of integers of size
len(inputs_spatial_shape), specifying the size of the pooling
window for each spatial dimension of the input tensor. If
pool_size is int, then every spatial dimension shares the same
pool_size.
strides: int or tuple/list of integers of size
len(inputs_spatial_shape). The stride of the sliding window for
each spatial dimension of the input tensor. If strides is int,
then every spatial dimension shares the same strides.
padding: string, either "valid" or "same". "valid" means no
padding is applied, and "same" results in padding evenly to the
left/right or up/down of the input such that output has the
same height/width dimension as the input when strides=1.
data_format: A string, either "channels_last" or "channels_first".
data_format determines the ordering of the dimensions in the
inputs. If data_format="channels_last", inputs is of shape
(batch_size, ..., channels) while if
data_format="channels_first", inputs is of shape
(batch_size, channels, ...).
Returns
A tensor of rank N+2, the result of the average pooling operation.
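For illustration, a minimal usage sketch (not part of the upstream reference; the input shape is hypothetical, channels_last format):
>>> x = keras.random.normal((1, 4, 4, 3))
>>> keras.ops.average_pool(x, pool_size=2, strides=2, padding="valid").shape
(1, 2, 2, 3)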
batch_normalization function
keras.ops.batch_normalization(
x, mean, variance, axis, offset=None, scale=None, epsilon=0.001
)
Normalizes x by mean and variance.
This op is typically used by the batch normalization step in a neural network. It normalizes the input tensor along the given axis.
Arguments
x: Input tensor.
mean: A mean vector of the same length as the axis dimension of the
input tensor.
variance: A variance vector of the same length as the axis dimension
of the input tensor.
axis: Integer, the axis that should be normalized (typically the
features axis).
offset: An offset vector of the same length as the axis dimension of
the input tensor. If not None, offset is added to the normalized
tensor. Defaults to None.
scale: A scale vector of the same length as the axis dimension of the
input tensor. If not None, the normalized tensor is multiplied by
scale. Defaults to None.
epsilon: Small float added to variance to avoid dividing by zero.
Defaults to 1e-3.
Returns
The normalized tensor.
Example
>>> x = keras.ops.convert_to_tensor(
... [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9]]
... )
>>> keras.ops.batch_normalization(
... x,
... mean=[0.4, 0.5, 0.6],
... variance=[0.67, 0.67, 0.67],
... axis=-1
... )
array([[-3.6624e-01, -3.6624e-01, -3.6624e-01],
[-4.6445e-09, 0.0000e+00, -1.8578e-08],
[ 3.6624e-01, 3.6624e-01, 3.6624e-01]])
binary_crossentropy function
keras.ops.binary_crossentropy(target, output, from_logits=False)
Computes binary cross-entropy loss between target and output tensor.
The binary cross-entropy loss is commonly used in binary classification tasks where each input sample belongs to one of the two classes. It measures the dissimilarity between the target and output probabilities or logits.
Arguments
target: The target tensor representing the true binary labels. Its
shape should match the shape of the output tensor.
output: The output tensor representing the predicted probabilities or
logits. Its shape should match the shape of the target tensor.
from_logits: (optional) Whether output is a tensor of logits or
probabilities.
Set it to True if output represents logits; otherwise,
set it to False if output represents probabilities.
Defaults to False.
Returns
Integer tensor: The computed binary cross-entropy loss between
target and output.
Example
>>> target = keras.ops.convert_to_tensor([0, 1, 1, 0])
>>> output = keras.ops.convert_to_tensor([0.1, 0.9, 0.8, 0.2])
>>> binary_crossentropy(target, output)
array([0.10536054 0.10536054 0.22314355 0.22314355],
shape=(4,), dtype=float32)
categorical_crossentropy function
keras.ops.categorical_crossentropy(target, output, from_logits=False, axis=-1)
Computes categorical cross-entropy loss between target and output tensor.
The categorical cross-entropy loss is commonly used in multi-class classification tasks where each input sample can belong to one of multiple classes. It measures the dissimilarity between the target and output probabilities or logits.
Arguments
target: The target tensor representing the true categorical labels.
Its shape should match the shape of the output tensor
except for the last dimension.
output: The output tensor representing the predicted probabilities or
logits. Its shape should match the shape of the target
tensor except for the last dimension.
from_logits: (optional) Whether output is a tensor of logits or
probabilities.
Set it to True if output represents logits; otherwise,
set it to False if output represents probabilities.
Defaults to False.
axis: (optional) The axis along which the categorical cross-entropy
is computed. Defaults to -1, which corresponds to the last dimension of
the tensors.
Returns
Integer tensor: The computed categorical cross-entropy loss between
target and output.
Example
>>> target = keras.ops.convert_to_tensor(
... [[1, 0, 0],
... [0, 1, 0],
... [0, 0, 1]])
>>> output = keras.ops.convert_to_tensor(
... [[0.9, 0.05, 0.05],
... [0.1, 0.8, 0.1],
... [0.2, 0.3, 0.5]])
>>> categorical_crossentropy(target, output)
array([0.10536054 0.22314355 0.6931472 ], shape=(3,), dtype=float32)
conv function
keras.ops.conv(
inputs, kernel, strides=1, padding="valid", data_format=None, dilation_rate=1
)
General N-D convolution.
This op supports 1D, 2D and 3D convolution.
Arguments
inputs: Tensor of rank N+2. inputs has shape
(batch_size,) + inputs_spatial_shape + (num_channels,) if
data_format="channels_last", or
(batch_size, num_channels) + inputs_spatial_shape if
data_format="channels_first".
kernel: Tensor of rank N+2. kernel has shape
(kernel_spatial_shape, num_input_channels, num_output_channels).
num_input_channels should match the number of channels in
inputs.
strides: int or int tuple/list of len(inputs_spatial_shape),
specifying the strides of the convolution along each spatial
dimension. If strides is int, then every spatial dimension shares
the same strides.
padding: string, either "valid" or "same". "valid" means no
padding is applied, and "same" results in padding evenly to the
left/right or up/down of the input such that output has the
same height/width dimension as the input when strides=1.
data_format: A string, either "channels_last" or "channels_first".
data_format determines the ordering of the dimensions in the
inputs. If data_format="channels_last", inputs is of shape
(batch_size, ..., channels) while if
data_format="channels_first", inputs is of shape
(batch_size, channels, ...).
dilation_rate: int or int tuple/list of len(inputs_spatial_shape),
specifying the dilation rate to use for dilated convolution. If
dilation_rate is int, then every spatial dimension shares
the same dilation_rate.
Returns
A tensor of rank N+2, the result of the conv operation.
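A minimal usage sketch (not from the upstream reference; shapes are hypothetical), assuming a channels_last input and a 3x3 kernel mapping 3 input channels to 16 output channels:
>>> x = keras.random.normal((1, 8, 8, 3))
>>> kernel = keras.random.normal((3, 3, 3, 16))
>>> keras.ops.conv(x, kernel, strides=1, padding="same").shape
(1, 8, 8, 16)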
conv_transpose function
keras.ops.conv_transpose(
inputs,
kernel,
strides=1,
padding="valid",
output_padding=None,
data_format=None,
dilation_rate=1,
)
General N-D convolution transpose.
Also known as de-convolution. This op supports 1D, 2D and 3D convolution.
Arguments
inputs: Tensor of rank N+2. inputs has shape
(batch_size,) + inputs_spatial_shape + (num_channels,) if
data_format="channels_last", or
(batch_size, num_channels) + inputs_spatial_shape if
data_format="channels_first".
kernel: Tensor of rank N+2. kernel has shape
[kernel_spatial_shape, num_output_channels, num_input_channels].
num_input_channels should match the number of channels in
inputs.
strides: int or int tuple/list of len(inputs_spatial_shape),
specifying the strides of the convolution along each spatial
dimension. If strides is int, then every spatial dimension shares
the same strides.
padding: string, either "valid" or "same". "valid" means no
padding is applied, and "same" results in padding evenly to the
left/right or up/down of the input such that output has the
same height/width dimension as the input when strides=1.
output_padding: int or int tuple/list of len(inputs_spatial_shape),
specifying the amount of padding along the height and width of
the output tensor. Can be a single integer to specify the same
value for all spatial dimensions. The amount of output padding
along a given dimension must be lower than the stride along that
same dimension. If set to None (default), the output shape is
inferred.
data_format: A string, either "channels_last" or "channels_first".
data_format determines the ordering of the dimensions in the
inputs. If data_format="channels_last", inputs is of shape
(batch_size, ..., channels) while if
data_format="channels_first", inputs is of shape
(batch_size, channels, ...).
dilation_rate: int or int tuple/list of len(inputs_spatial_shape),
specifying the dilation rate to use for dilated convolution. If
dilation_rate is int, then every spatial dimension shares
the same dilation_rate.
Returns
A tensor of rank N+2, the result of the conv_transpose operation.
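A minimal usage sketch (not from the upstream reference; shapes are hypothetical), assuming a channels_last input and a 3x3 kernel with 8 output channels and 3 input channels; with strides=2 and padding="same" the spatial size doubles:
>>> x = keras.random.normal((1, 4, 4, 3))
>>> kernel = keras.random.normal((3, 3, 8, 3))
>>> keras.ops.conv_transpose(x, kernel, strides=2, padding="same").shape
(1, 8, 8, 8)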
ctc_decode function
keras.ops.ctc_decode(
inputs,
sequence_lengths,
strategy="greedy",
beam_width=100,
top_paths=1,
merge_repeated=True,
mask_index=0,
)
Decodes the output of a CTC model.
Arguments
inputs: A tensor of shape (batch_size, max_length, num_classes)
containing the logits (the output of the model).
They should not be normalized via softmax.
sequence_lengths: A tensor of shape (batch_size,) containing the
sequence lengths for the batch.
strategy: A string for the decoding strategy. Supported values are
"greedy" and "beam_search".
beam_width: An integer scalar beam width used in beam search.
Defaults to 100.
top_paths: An integer scalar, the number of top paths to return.
Defaults to 1.
merge_repeated: A boolean scalar, whether to merge repeated labels
in the output. Defaults to True.
mask_index: An integer scalar, the index of the mask character in
the vocabulary. Defaults to 0.
Returns
A tuple containing:
strategy="greedy", the shape is (1, batch_size, max_length). If
strategy="beam_search", the shape is
(top_paths, batch_size, max_length). Note that: -1 indicates the
blank label.strategy="greedy", a tensor of shape (batch_size, 1)
representing the negative of the sum of the probability logits for
each sequence. If strategy="beam_seatch", a tensor of shape
(batch_size, top_paths) representing the log probability for each
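A minimal usage sketch (not from the upstream reference; the logits are random and the shapes are hypothetical), showing the documented return shapes for strategy="greedy":
>>> logits = keras.random.normal((2, 10, 5))
>>> lengths = keras.ops.convert_to_tensor([10, 8], dtype="int32")
>>> decoded, scores = keras.ops.ctc_decode(logits, lengths, strategy="greedy")
>>> decoded.shape
(1, 2, 10)
>>> scores.shape
(2, 1)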
ctc_loss function
keras.ops.ctc_loss(target, output, target_length, output_length, mask_index=0)
CTC (Connectionist Temporal Classification) loss.
Arguments
target: A tensor of shape (batch_size, max_length) containing
the true labels in integer format.
output: A tensor of shape (batch_size, max_length, num_classes)
containing logits (the output of your model).
target_length: A tensor of shape (batch_size,) containing the
true label lengths.
output_length: A tensor of shape (batch_size,) containing the
output lengths.
mask_index: The index of the mask character in the vocabulary.
Defaults to 0.
depthwise_conv function
keras.ops.depthwise_conv(
inputs, kernel, strides=1, padding="valid", data_format=None, dilation_rate=1
)
General N-D depthwise convolution.
This op supports 1D and 2D depthwise convolution.
Arguments
inputs: Tensor of rank N+2. inputs has shape
(batch_size,) + inputs_spatial_shape + (num_channels,) if
data_format="channels_last", or
(batch_size, num_channels) + inputs_spatial_shape if
data_format="channels_first".
kernel: Tensor of rank N+2. kernel has shape
[kernel_spatial_shape, num_input_channels, num_channels_multiplier].
num_input_channels should match the number of channels in
inputs.
strides: int or int tuple/list of len(inputs_spatial_shape),
specifying the strides of the convolution along each spatial
dimension. If strides is int, then every spatial dimension shares
the same strides.
padding: string, either "valid" or "same". "valid" means no
padding is applied, and "same" results in padding evenly to the
left/right or up/down of the input such that output has the
same height/width dimension as the input when strides=1.
data_format: A string, either "channels_last" or "channels_first".
data_format determines the ordering of the dimensions in the
inputs. If data_format="channels_last", inputs is of shape
(batch_size, ..., channels) while if
data_format="channels_first", inputs is of shape
(batch_size, channels, ...).
dilation_rate: int or int tuple/list of len(inputs_spatial_shape),
specifying the dilation rate to use for dilated convolution. If
dilation_rate is int, then every spatial dimension shares
the same dilation_rate.
Returns
A tensor of rank N+2, the result of the depthwise conv operation.
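A minimal usage sketch (not from the upstream reference; shapes are hypothetical), assuming a channels_last input with 3 channels and a channel multiplier of 2, so the output has 3 * 2 = 6 channels:
>>> x = keras.random.normal((1, 8, 8, 3))
>>> kernel = keras.random.normal((3, 3, 3, 2))
>>> keras.ops.depthwise_conv(x, kernel, strides=1, padding="same").shape
(1, 8, 8, 6)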
dot_product_attention function
keras.ops.dot_product_attention(
query,
key,
value,
bias=None,
mask=None,
scale=None,
is_causal=False,
flash_attention=None,
attn_logits_soft_cap=None,
)
Scaled dot product attention function.
Computes the attention function on Q (query), K (key), and V (value):
attention(Q, K, V) = softmax(Q * K / sqrt(d)) * V. We refer to the output
of Q * K as the logits and to the output of the softmax as the probs.
Throughout this function, we use the following notation to represent the
shapes of arrays:
- B: batch size
- S: length of the key/value
- T: length of the query
- N: number of attention heads
- H: dimensions of each attention head
- K: number of key/value heads
- G: number of groups, which equals to N // K
Arguments
query: The query array with the shape of (B, T, N, H).
key: The key array with the shape of (B, S, K, H). When K equals
N, multi-headed attention (MHA) is performed. Otherwise, grouped
query attention (GQA) is performed if N is a multiple of K, and
multi-query attention (MQA) is performed if K==1 (a special case
of GQA).
value: The value array with the same shape as key.
bias: Optional bias array to be added to the logits. The shape must
be broadcastable to (B, N, T, S).
mask: Optional boolean mask used to filter out logits, where
True indicates the element should take part in
attention. For an additive mask, users should pass it to bias. The
shape must be broadcastable to (B, N, T, S).
scale: Optional scale for the logits. If None, the scale will be set
to 1.0 / sqrt(H).
is_causal: Whether to apply a causal mask.
flash_attention: Whether to use flash attention. If None, it will
attempt to use flash attention if the required conditions are met.
Typically, the inputs must be in float16 and bfloat16 dtype and the
input layout requirements may vary depending on the backend.
attn_logits_soft_cap: Optional value that soft-caps the attention
logits before the softmax is applied.
Returns
An array of the attention output with the same shape as query.
Example
>>> query = keras.random.normal((2, 4, 8, 16))
>>> key = keras.random.normal((2, 6, 8, 16))
>>> value = keras.random.normal((2, 6, 8, 16))
>>> keras.ops.nn.dot_product_attention(query, key, value).shape
(2, 4, 8, 16)
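A further sketch (not from the upstream reference; shapes are hypothetical) of grouped query attention, where the 8 query heads share 2 key/value heads; the output still matches the query shape:
>>> query = keras.random.normal((2, 4, 8, 16))
>>> key = keras.random.normal((2, 6, 2, 16))
>>> value = keras.random.normal((2, 6, 2, 16))
>>> keras.ops.dot_product_attention(query, key, value).shape
(2, 4, 8, 16)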
elu function
keras.ops.elu(x, alpha=1.0)
Exponential Linear Unit activation function.
It is defined as:
f(x) = alpha * (exp(x) - 1.) for x < 0, f(x) = x for x >= 0.
Arguments
x: Input tensor.
alpha: A scalar controlling the value to which the function saturates
for negative inputs. Defaults to 1.0.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_elu = keras.ops.elu(x)
>>> print(x_elu)
array([-0.63212055, 0., 1.], shape=(3,), dtype=float64)
gelu function
keras.ops.gelu(x, approximate=True)
Gaussian Error Linear Unit (GELU) activation function.
If approximate is True, it is defined as:
f(x) = 0.5 * x * (1 + tanh(sqrt(2 / pi) * (x + 0.044715 * x^3)))
Or if approximate is False, it is defined as:
f(x) = x * P(X <= x) = 0.5 * x * (1 + erf(x / sqrt(2))),
where P(X) ~ N(0, 1).
Arguments
x: Input tensor.
approximate: Whether to use the approximate version of GELU.
Defaults to True.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_gelu = keras.ops.gelu(x)
>>> print(x_gelu)
array([-0.15865525, 0., 0.84134475], shape=(3,), dtype=float64)
hard_sigmoid function
keras.ops.hard_sigmoid(x)
Hard sigmoid activation function.
It is defined as:
0 if x < -2.5, 1 if x > 2.5, (0.2 * x) + 0.5 if -2.5 <= x <= 2.5.
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_hard_sigmoid = keras.ops.hard_sigmoid(x)
>>> print(x_hard_sigmoid)
array([0.3, 0.5, 0.7], shape=(3,), dtype=float64)
leaky_relu function
keras.ops.leaky_relu(x, negative_slope=0.2)
Leaky version of a Rectified Linear Unit activation function.
It allows a small gradient when the unit is not active, it is defined as:
f(x) = alpha * x for x < 0 or f(x) = x for x >= 0.
Arguments
x: Input tensor.
negative_slope: Slope of the activation function at x < 0.
Defaults to 0.2.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_leaky_relu = keras.ops.leaky_relu(x)
>>> print(x_leaky_relu)
array([-0.2, 0. , 1. ], shape=(3,), dtype=float64)
log_sigmoid function
keras.ops.log_sigmoid(x)
Logarithm of the sigmoid activation function.
It is defined as f(x) = log(1 / (1 + exp(-x))).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-0.541391, 0.0, 0.50, 5.0])
>>> keras.ops.log_sigmoid(x)
array([-1.0000418, -0.6931472, -0.474077, -0.00671535], dtype=float32)
log_softmax function
keras.ops.log_softmax(x, axis=-1)
Log-softmax activation function.
It is defined as:
f(x) = x - max(x) - log(sum(exp(x - max(x))))
Arguments
x: Input tensor.
axis: Integer, axis along which the log-softmax is applied.
Defaults to -1.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_log_softmax = keras.ops.log_softmax(x)
>>> print(x_log_softmax)
array([-2.40760596, -1.40760596, -0.40760596], shape=(3,), dtype=float64)
max_pool function
keras.ops.max_pool(
inputs, pool_size, strides=None, padding="valid", data_format=None
)
Max pooling operation.
Arguments
inputs: Tensor of rank N+2. inputs has shape
(batch_size,) + inputs_spatial_shape + (num_channels,) if
data_format="channels_last", or
(batch_size, num_channels) + inputs_spatial_shape if
data_format="channels_first". Pooling happens over the spatial
dimensions only.
pool_size: int or tuple/list of integers of size
len(inputs_spatial_shape), specifying the size of the pooling
window for each spatial dimension of the input tensor. If
pool_size is int, then every spatial dimension shares the same
pool_size.
strides: int or tuple/list of integers of size
len(inputs_spatial_shape). The stride of the sliding window for
each spatial dimension of the input tensor. If strides is int,
then every spatial dimension shares the same strides.
padding: string, either "valid" or "same". "valid" means no
padding is applied, and "same" results in padding evenly to the
left/right or up/down of the input such that output has the
same height/width dimension as the input when strides=1.
data_format: A string, either "channels_last" or "channels_first".
data_format determines the ordering of the dimensions in the
inputs. If data_format="channels_last", inputs is of shape
(batch_size, ..., channels) while if
data_format="channels_first", inputs is of shape
(batch_size, channels, ...).
Returns
A tensor of rank N+2, the result of the max pooling operation.
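For illustration, a minimal usage sketch (not part of the upstream reference; the input shape is hypothetical, channels_last format):
>>> x = keras.random.normal((1, 4, 4, 3))
>>> keras.ops.max_pool(x, pool_size=2, strides=2, padding="valid").shape
(1, 2, 2, 3)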
moments function
keras.ops.moments(x, axes, keepdims=False, synchronized=False)
Calculates the mean and variance of x.
The mean and variance are calculated by aggregating the contents of x
across axes. If x is 1-D and axes = [0] this is just the mean and
variance of a vector.
Arguments
x: Input tensor.
axes: A list of axes along which to compute mean and variance.
keepdims: If set to True, the axes which are reduced are left
in the result as dimensions with size one.
synchronized: Only applicable with the TensorFlow backend.
If True, synchronizes the global batch statistics (mean and
variance) across all devices at each training step in a
distributed training strategy. If False, each replica uses its own
local batch statistics.
Returns
A tuple containing two tensors - mean and variance.
Example
>>> x = keras.ops.convert_to_tensor([0, 1, 2, 3, 100], dtype="float32")
>>> keras.ops.moments(x, axes=[0])
(array(21.2, dtype=float32), array(1553.3601, dtype=float32))
multi_hot function
keras.ops.multi_hot(
inputs, num_classes=None, axis=-1, dtype=None, sparse=False, **kwargs
)
Encodes integer labels as multi-hot vectors.
This function encodes integer labels as multi-hot vectors, where each label is mapped to a binary value in the resulting vector.
Arguments
inputs: Tensor of integer labels to be converted to multi-hot vectors.
num_classes: Integer, the total number of unique classes.
axis: (optional) Axis along which the multi-hot encoding should be
added. Defaults to -1, which corresponds to the last dimension.
dtype: (optional) The data type of the resulting tensor. Defaults to
the backend's floating point type.
sparse: Whether to return a sparse tensor, for backends that support
sparse tensors.
Returns
Tensor: The multi-hot encoded tensor.
Example
>>> data = keras.ops.convert_to_tensor([0, 4])
>>> keras.ops.multi_hot(data, num_classes=5)
array([1.0, 0.0, 0.0, 0.0, 1.0], dtype=float32)
normalize function
keras.ops.normalize(x, axis=-1, order=2, epsilon=None)
Normalizes x over the specified axis.
It is defined as: normalize(x) = x / max(norm(x), epsilon).
Arguments
x: Input tensor.
axis: The axis or axes along which to perform normalization.
Defaults to -1.
order: The exponent value in the norm formulation. Defaults to 2.
epsilon: A lower bound value for the norm. Defaults to
backend.epsilon().
Returns
The normalized array.
Example
>>> x = keras.ops.convert_to_tensor([[1, 2, 3], [4, 5, 6]])
>>> x_norm = keras.ops.normalize(x)
>>> print(x_norm)
array([[0.26726124 0.5345225 0.8017837 ]
[0.45584232 0.5698029 0.68376344]], shape=(2, 3), dtype=float32)
one_hot function
keras.ops.one_hot(x, num_classes, axis=-1, dtype=None, sparse=False)
Converts integer tensor x into a one-hot tensor.
The one-hot encoding is a representation where each integer value is
converted into a binary vector with a length equal to num_classes,
and the index corresponding to the integer value is marked as 1, while
all other indices are marked as 0.
Arguments
x: Integer tensor to be encoded. The shape can be arbitrary, but the
dtype should be integer.
num_classes: Number of classes for the one-hot encoding.
axis: Axis along which the encoding is performed.
-1 represents the last axis. Defaults to -1.
dtype: (optional) Data type of the output tensor. If not provided,
it defaults to the default data type of the backend.
sparse: Whether to return a sparse tensor, for backends that support
sparse tensors.
Returns
Integer tensor: One-hot encoded tensor with the same shape as x
except for the specified axis dimension, which will have
a length of num_classes. The dtype of the output tensor
is determined by dtype or the default data type of the backend.
Example
>>> x = keras.ops.convert_to_tensor([1, 3, 2, 0])
>>> one_hot(x, num_classes=4)
array([[0. 1. 0. 0.]
[0. 0. 0. 1.]
[0. 0. 1. 0.]
[1. 0. 0. 0.]], shape=(4, 4), dtype=float32)
psnr function
keras.ops.psnr(x1, x2, max_val)
Peak Signal-to-Noise Ratio (PSNR) function.
This function computes the Peak Signal-to-Noise Ratio between two signals,
x1 and x2. PSNR is a measure of the quality of a reconstructed signal.
The higher the PSNR, the closer the reconstructed signal is to the original
signal. Note that it can become negative when the signal power is
smaller than the noise power.
Arguments
x1: The first input signal.
x2: The second input signal. Must have the same shape as x1.
max_val: The maximum possible value in the signals.
Returns
float: The PSNR value between x1 and x2.
Examples
>>> x1 = keras.random.normal((2, 4, 4, 3))
>>> x2 = keras.random.normal((2, 4, 4, 3))
>>> max_val = 1.0
>>> keras.ops.nn.psnr(x1, x2, max_val)
-3.1697404
relu function
keras.ops.relu(x)
Rectified linear unit activation function.
It is defined as f(x) = max(0, x).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x1 = keras.ops.convert_to_tensor([-1.0, 0.0, 1.0, 0.2])
>>> keras.ops.relu(x1)
array([0.0, 0.0, 1.0, 0.2], dtype=float32)
relu6 function
keras.ops.relu6(x)
Rectified linear unit activation function with upper bound of 6.
It is defined as f(x) = np.clip(x, 0, 6).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-3.0, -2.0, 0.1, 0.2, 6.0, 8.0])
>>> keras.ops.relu6(x)
array([0.0, 0.0, 0.1, 0.2, 6.0, 6.0], dtype=float32)
selu function
keras.ops.selu(x)
Scaled Exponential Linear Unit (SELU) activation function.
It is defined as:
f(x) = scale * alpha * (exp(x) - 1.) for x < 0,
f(x) = scale * x for x >= 0,
where alpha and scale are pre-defined constants
(alpha=1.67326324 and scale=1.05070098).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_selu = keras.ops.selu(x)
>>> print(x_selu)
array([-1.11133055, 0., 1.05070098], shape=(3,), dtype=float64)
separable_conv function
keras.ops.separable_conv(
inputs,
depthwise_kernel,
pointwise_kernel,
strides=1,
padding="valid",
data_format=None,
dilation_rate=1,
)
General N-D separable convolution.
This op supports 1D and 2D separable convolution. separable_conv is
a depthwise conv followed by a pointwise conv.
Arguments
inputs: Tensor of rank N+2. inputs has shape
(batch_size,) + inputs_spatial_shape + (num_channels,) if
data_format="channels_last", or
(batch_size, num_channels) + inputs_spatial_shape if
data_format="channels_first".
depthwise_kernel: Tensor of rank N+2. depthwise_kernel has shape
[kernel_spatial_shape, num_input_channels, num_channels_multiplier].
num_input_channels should match the number of channels in
inputs.
pointwise_kernel: Tensor of rank N+2. pointwise_kernel has shape
(*ones_like(kernel_spatial_shape),
num_input_channels * num_channels_multiplier, num_output_channels).
strides: int or int tuple/list of len(inputs_spatial_shape),
specifying the strides of the convolution along each spatial
dimension. If strides is int, then every spatial dimension shares
the same strides.
padding: string, either "valid" or "same". "valid" means no
padding is applied, and "same" results in padding evenly to the
left/right or up/down of the input such that output has the
same height/width dimension as the input when strides=1.
data_format: A string, either "channels_last" or "channels_first".
data_format determines the ordering of the dimensions in the
inputs. If data_format="channels_last", inputs is of shape
(batch_size, ..., channels) while if
data_format="channels_first", inputs is of shape
(batch_size, channels, ...).
dilation_rate: int or int tuple/list of len(inputs_spatial_shape),
specifying the dilation rate to use for dilated convolution. If
dilation_rate is int, then every spatial dimension shares
the same dilation_rate.
Returns
A tensor of rank N+2, the result of the separable conv operation.
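A minimal usage sketch (not from the upstream reference; shapes are hypothetical): a 3x3 depthwise kernel with multiplier 2 followed by a 1x1 pointwise kernel mapping 3 * 2 = 6 channels to 16 output channels:
>>> x = keras.random.normal((1, 8, 8, 3))
>>> depthwise_kernel = keras.random.normal((3, 3, 3, 2))
>>> pointwise_kernel = keras.random.normal((1, 1, 6, 16))
>>> keras.ops.separable_conv(x, depthwise_kernel, pointwise_kernel, padding="same").shape
(1, 8, 8, 16)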
sigmoid function
keras.ops.sigmoid(x)
Sigmoid activation function.
It is defined as f(x) = 1 / (1 + exp(-x)).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-6.0, 1.0, 0.0, 1.0, 6.0])
>>> keras.ops.sigmoid(x)
array([0.00247262, 0.7310586, 0.5, 0.7310586, 0.9975274], dtype=float32)
silu function
keras.ops.silu(x)
Sigmoid Linear Unit (SiLU) activation function, also known as Swish.
The SiLU activation function is computed by the sigmoid function multiplied
by its input. It is defined as f(x) = x * sigmoid(x).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-6.0, 1.0, 0.0, 1.0, 6.0])
>>> keras.ops.sigmoid(x)
array([0.00247262, 0.7310586, 0.5, 0.7310586, 0.9975274], dtype=float32)
>>> keras.ops.silu(x)
array([-0.0148357, 0.7310586, 0.0, 0.7310586, 5.9851646], dtype=float32)
hard_silu function
keras.ops.hard_silu(x)
Hard SiLU activation function, also known as Hard Swish.
It is defined as:
f(x) = 0 if x < -3,
f(x) = x if x > 3,
f(x) = x * (x + 3) / 6 if -3 <= x <= 3.
It's a faster, piecewise linear approximation of the silu activation.
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-3.0, -1.0, 0.0, 1.0, 3.0])
>>> keras.ops.hard_silu(x)
array([-0.0, -0.3333333, 0.0, 0.6666667, 3.0], shape=(5,), dtype=float32)
softmax function
keras.ops.softmax(x, axis=-1)
Softmax activation function.
The elements of the output vector lie within the range (0, 1), and their
total sum is exactly 1 (excluding the floating point rounding error).
Each vector is processed independently. The axis argument specifies the
axis along which the function is applied within the input.
It is defined as:
f(x) = exp(x) / sum(exp(x))
Arguments
x: Input tensor.
axis: Integer, axis along which the softmax is applied.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_softmax = keras.ops.softmax(x)
>>> print(x_softmax)
array([0.09003057, 0.24472847, 0.66524096], shape=(3,), dtype=float64)
softplus function
keras.ops.softplus(x)
Softplus activation function.
It is defined as f(x) = log(exp(x) + 1), where log is the natural
logarithm and exp is the exponential function.
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-0.555, 0.0, 0.555])
>>> keras.ops.softplus(x)
array([0.45366603, 0.6931472, 1.008666], dtype=float32)
softsign function
keras.ops.softsign(x)
Softsign activation function.
It is defined as f(x) = x / (abs(x) + 1).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-0.100, -10.0, 1.0, 0.0, 100.0])
>>> keras.ops.softsign(x)
Array([-0.09090909, -0.90909094, 0.5, 0.0, 0.990099], dtype=float32)
sparse_categorical_crossentropy function
keras.ops.sparse_categorical_crossentropy(target, output, from_logits=False, axis=-1)
Computes sparse categorical cross-entropy loss.
The sparse categorical cross-entropy loss is similar to categorical cross-entropy, but it is used when the target tensor contains integer class labels instead of one-hot encoded vectors. It measures the dissimilarity between the target and output probabilities or logits.
Arguments
target: The target tensor representing the true class labels as
integers. Its shape should match the shape of the output
tensor except for the last dimension.
output: The output tensor representing the predicted probabilities or
logits. Its shape should match the shape of the target tensor except
for the last dimension.
from_logits: (optional) Whether output is a tensor of logits
or probabilities.
Set it to True if output represents logits; otherwise,
set it to False if output represents probabilities.
Defaults to False.
axis: (optional) The axis along which the sparse categorical
cross-entropy is computed. Defaults to -1, which corresponds to the
last dimension of the tensors.
Returns
Integer tensor: The computed sparse categorical cross-entropy loss
between target and output.
Example
>>> target = keras.ops.convert_to_tensor([0, 1, 2], dtype="int32")
>>> output = keras.ops.convert_to_tensor(
... [[0.9, 0.05, 0.05],
... [0.1, 0.8, 0.1],
... [0.2, 0.3, 0.5]])
>>> sparse_categorical_crossentropy(target, output)
array([0.10536056 0.22314355 0.6931472 ], shape=(3,), dtype=float32)
swish function
keras.ops.swish(x)
Sigmoid Linear Unit (SiLU) activation function, also known as Swish.
The SiLU activation function is computed by the sigmoid function multiplied
by its input. It is defined as f(x) = x * sigmoid(x).
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-6.0, 1.0, 0.0, 1.0, 6.0])
>>> keras.ops.sigmoid(x)
array([0.00247262, 0.7310586, 0.5, 0.7310586, 0.9975274], dtype=float32)
>>> keras.ops.silu(x)
array([-0.0148357, 0.7310586, 0.0, 0.7310586, 5.9851646], dtype=float32)
hard_swish function
keras.ops.hard_swish(x)
Hard SiLU activation function, also known as Hard Swish.
It is defined as:
f(x) = 0 if x < -3,
f(x) = x if x > 3,
f(x) = x * (x + 3) / 6 if -3 <= x <= 3.
It's a faster, piecewise linear approximation of the silu activation.
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = keras.ops.convert_to_tensor([-3.0, -1.0, 0.0, 1.0, 3.0])
>>> keras.ops.hard_silu(x)
array([-0.0, -0.3333333, 0.0, 0.6666667, 3.0], shape=(5,), dtype=float32)
celu function
keras.ops.celu(x, alpha=1.0)
Continuously-differentiable exponential linear unit.
It is defined as:
f(x) = alpha * (exp(x / alpha) - 1) for x < 0, f(x) = x for x >= 0.
Arguments
x: Input tensor.
alpha: The alpha value for the CELU formulation. Defaults to 1.0.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_celu = keras.ops.celu(x)
>>> print(x_celu)
array([-0.63212056, 0. , 1. ], shape=(3,), dtype=float64)
sparsemax function
keras.ops.sparsemax(x, axis=-1)
Sparsemax activation function.
For each batch i and class j, the
sparsemax activation function is defined as:
sparsemax(x)[i, j] = max(x[i, j] - τ(x[i, :]), 0).
Arguments
x: Input tensor.
axis: int, axis along which the sparsemax operation is applied.
Returns
A tensor, output of sparsemax transformation. Has the same type and
shape as x.
Example
>>> x = np.array([-1., 0., 1.])
>>> x_sparsemax = keras.ops.sparsemax(x)
>>> print(x_sparsemax)
array([0., 0., 1.], shape=(3,), dtype=float64)
squareplus function
keras.ops.squareplus(x, b=4)
Squareplus activation function.
The Squareplus activation function is defined as:
f(x) = (x + sqrt(x^2 + b)) / 2
Arguments
x: Input tensor.
b: Smoothness parameter. Defaults to 4.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1.0, 0.0, 1.0])
>>> x_squareplus = keras.ops.squareplus(x)
>>> print(x_squareplus)
array([0.6180, 1.0000, 1.6180], dtype=float32)
sparse_plus function
keras.ops.sparse_plus(x)
SparsePlus activation function.
It is defined as
f(x) = 0 for x <= -1.
f(x) = (1/4) * (x + 1)^2 for -1 < x < 1.
f(x) = x for x >= 1.
Arguments
x: Input tensor.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1.0, 0.0, 1.0])
>>> x_sparse_plus = keras.ops.sparse_plus(x)
>>> print(x_sparse_plus)
Array([0. 0.25 1. ], shape=(3,), dtype=float32)
soft_shrink function
keras.ops.soft_shrink(x, threshold=0.5)
Soft Shrink activation function.
It is defined as
f(x) = x - threshold if x > threshold,
f(x) = x + threshold if x < -threshold,
f(x) = 0 otherwise.
Arguments
x: Input tensor.
threshold: Threshold value. Defaults to 0.5.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1.0, 0.0, 1.0])
>>> x_soft_shrink = keras.ops.soft_shrink(x)
>>> print(x_soft_shrink)
array([-0.5 0. 0.5], shape=(3,), dtype=float64)
threshold function
keras.ops.threshold(x, threshold, default_value)
Threshold activation function.
The function thresholds the input x as follows:
f(x) = x if x > threshold,
f(x) = default_value otherwise.
Arguments
x: Input tensor.
threshold: The value that decides when to retain or replace x.
default_value: Value to assign when x <= threshold.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-1.0, 0.0, 1.0, 2.0])
>>> x_threshold = keras.ops.threshold(x, 1, 0)
>>> print(x_threshold)
array([0., 0., 0., 2.], shape=(4,), dtype=float64)
glu function
keras.ops.glu(x, axis=-1)
Gated Linear Unit (GLU) activation function.
It is defined as:
f(x) = a * sigmoid(b)
where x is split into a and b along the given axis.
Arguments
x: Input tensor.
axis: The axis along which to split the input tensor. Defaults to -1.
Returns
A tensor with the same shape as half of the input.
Example
>>> x = np.array([-1., 0., 1. , 1.])
>>> x_glu = keras.ops.glu(x)
>>> print(x_glu)
array([-0.73105858, 0. ], shape=(2,), dtype=float64)
tanh_shrink function
keras.ops.tanh_shrink(x)
Applies the tanh shrink function element-wise.
It is defined as:
f(x) = x - tanh(x).
Arguments
x: Input tensor.
Returns
Output tensor of the same shape as x, where each element is
transformed according to the tanh shrink operation.
Example
>>> x = np.array([ -1., 0., 1.])
>>> x_tanh_shrink = keras.ops.tanh_shrink(x)
>>> print(x_tanh_shrink)
array([-0.23840584 0. 0.23840584], shape=(3,), dtype=float64)
hard_tanh function
keras.ops.hard_tanh(x)
Applies the HardTanh function element-wise.
It is defined as:
f(x) = -1 for x < -1, f(x) = x for -1 <= x <= 1, f(x) = 1 for x > 1.
Arguments
x: Input tensor.
Returns
Output tensor of same shape as x
where values are clamped between -1 and 1.
Example
>>> x = np.array([-2., -1., 0., 1., 2.])
>>> x_hard_tanh = keras.ops.hard_tanh(x)
>>> print(x_hard_tanh)
array([-1. -1. 0. 1. 1.], shape=(5,), dtype=float64)
hard_shrink function
keras.ops.hard_shrink(x, threshold=0.5)
Hard Shrink activation function.
The Hard Shrink function is a thresholding operation defined as:
f(x) = x if |x| > threshold,
f(x) = 0 otherwise.
Arguments
x: Input tensor.
threshold: Threshold value. Defaults to 0.5.
Returns
A tensor with the same shape as x.
Example
>>> x = np.array([-0.5, 0., 1.])
>>> x_hard_shrink = keras.ops.hard_shrink(x)
>>> print(x_hard_shrink)
array([0. 0. 1.], shape=(3,), dtype=float64)