FeatureCross layer


FeatureCross class

keras_rs.layers.FeatureCross(
    projection_dim: Optional[int] = None,
    diag_scale: Optional[float] = 0.0,
    use_bias: bool = True,
    pre_activation: Union[
        str, keras.src.layers.activations.activation.Activation, NoneType
    ] = None,
    kernel_initializer: Union[
        str, keras.src.initializers.initializer.Initializer
    ] = "glorot_uniform",
    bias_initializer: Union[
        str, keras.src.initializers.initializer.Initializer
    ] = "zeros",
    kernel_regularizer: Union[
        str, NoneType, keras.src.regularizers.regularizers.Regularizer
    ] = None,
    bias_regularizer: Union[
        str, NoneType, keras.src.regularizers.regularizers.Regularizer
    ] = None,
    **kwargs: Any
)

FeatureCross layer in Deep & Cross Network (DCN).

A layer that efficiently creates explicit, bounded-degree feature interactions. The call method accepts two inputs: x0 contains the original features; the second input, x_i (the x argument of call), is the output of the previous FeatureCross layer in the stack, i.e., the i-th FeatureCross layer. For the first FeatureCross layer in the stack, x_i = x0.

The output is x_{i+1} = x0 .* (W * x_i + bias + diag_scale * x_i) + x_i, where .* denotes element-wise multiplication. W can be a full-rank matrix or a low-rank matrix U*V that reduces the computational cost, and diag_scale increases the diagonal of W to improve training stability (especially in the low-rank case).
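The formula can be checked with a small NumPy sketch. It illustrates the arithmetic only and is not the layer's actual implementation: the helper name `cross_step`, the random weights, and the row-vector convention (`xi @ W`, with the batch along the first axis) are assumptions made for this example.

```python
import numpy as np


def cross_step(x0, xi, W, bias, diag_scale=0.0):
    """Hypothetical helper: x0 .* (W * x_i + bias + diag_scale * x_i) + x_i."""
    return x0 * (xi @ W + bias + diag_scale * xi) + xi


rng = np.random.default_rng(0)
batch_size, input_dim, projection_dim = 2, 8, 2

x0 = rng.standard_normal((batch_size, input_dim))
bias = np.zeros(input_dim)

# Full-rank kernel of shape (input_dim, input_dim).
W = rng.standard_normal((input_dim, input_dim))
x1 = cross_step(x0, x0, W, bias, diag_scale=0.1)  # first layer: x_i = x0

# Low-rank kernel W = U @ V with the shapes given in the arguments list.
U = rng.standard_normal((input_dim, projection_dim))
V = rng.standard_normal((projection_dim, input_dim))
x1_low_rank = cross_step(x0, x0, U @ V, bias)

# Adding diag_scale * x_i is equivalent to using the kernel W + diag_scale * I.
x1_shifted = cross_step(x0, x0, W + 0.1 * np.eye(input_dim), bias)
```

Here `x1` and `x1_shifted` agree up to floating-point error, which is the sense in which diag_scale "increases the diagonal of W".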

Arguments

  • projection_dim: int. Dimension for down-projecting the input to reduce computational cost. If None (default), the full matrix W (with shape (input_dim, input_dim)) is used. Otherwise, a low-rank matrix W = U*V is used, where U has shape (input_dim, projection_dim) and V has shape (projection_dim, input_dim). projection_dim needs to be smaller than input_dim//2 to improve the model efficiency. In practice, we've observed that projection_dim = input_dim//4 consistently preserved the accuracy of a full-rank version.
  • diag_scale: non-negative float. Used to increase the diagonal of the kernel W by diag_scale, i.e., W + diag_scale * I, where I is the identity matrix. Defaults to 0.0.
  • use_bias: bool. Whether to add a bias term for this layer. Defaults to True.
  • pre_activation: string or keras.activations activation. Activation applied to the output matrix of the layer, before multiplication with the input. Can be used to control the scale of the layer's outputs and improve stability. Defaults to None.
  • kernel_initializer: string or keras.initializers initializer. Initializer to use for the kernel matrix. Defaults to "glorot_uniform".
  • bias_initializer: string or keras.initializers initializer. Initializer to use for the bias vector. Defaults to "zeros".
  • kernel_regularizer: string or keras.regularizers regularizer. Regularizer to use for the kernel matrix. Defaults to None.
  • bias_regularizer: string or keras.regularizers regularizer. Regularizer to use for the bias vector. Defaults to None.
  • **kwargs: Args to pass to the base class.
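The input_dim//2 guideline for projection_dim follows from the kernel shapes listed above: a full-rank W holds input_dim * input_dim weights, while the low-rank pair U and V holds 2 * input_dim * projection_dim. A minimal sketch of that count, where the helper name `kernel_param_count` is hypothetical and only kernel weights (not the bias) are counted:

```python
def kernel_param_count(input_dim, projection_dim=None):
    """Hypothetical helper: number of kernel weights implied by the documented shapes."""
    if projection_dim is None:
        # Full-rank W of shape (input_dim, input_dim).
        return input_dim * input_dim
    # U of shape (input_dim, projection_dim) plus V of shape (projection_dim, input_dim).
    return 2 * input_dim * projection_dim


input_dim = 64
full = kernel_param_count(input_dim)
quarter = kernel_param_count(input_dim, input_dim // 4)
half = kernel_param_count(input_dim, input_dim // 2)

print(full, quarter, half)  # 4096 2048 4096
```

At projection_dim = input_dim//2 the low-rank kernel is as large as the full one, so savings only appear below that point; at input_dim//4 the kernel is half the size.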

Example

import keras
import keras_rs
import numpy as np

# 1. Simple forward pass
batch_size = 2
embedding_dim = 32
feature1 = np.random.randn(batch_size, embedding_dim)
feature2 = np.random.randn(batch_size, embedding_dim)
crossed_features = keras_rs.layers.FeatureCross()(feature1, feature2)

# 2. After embedding layer in a model
vocabulary_size = 32
embedding_dim = 6

# Create a simple model containing the layer.
inputs = keras.Input(shape=(), name='indices', dtype="int32")
x0 = keras.layers.Embedding(
    input_dim=vocabulary_size,
    output_dim=embedding_dim
)(inputs)
x1 = keras_rs.layers.FeatureCross()(x0, x0)
x2 = keras_rs.layers.FeatureCross()(x0, x1)
logits = keras.layers.Dense(units=10)(x2)
model = keras.Model(inputs, logits)

# Call the model on the inputs.
batch_size = 2
input_data = np.random.randint(0, vocabulary_size, size=(batch_size,))
outputs = model(input_data)


call method

FeatureCross.call(x0: Any, x: Optional[Any] = None)

Forward pass of the cross layer.

Arguments

  • x0: a Tensor. The input to the cross layer. N-rank tensor with shape (batch_size, ..., input_dim).
  • x: a Tensor. Optional. If provided, the layer will compute crosses between x0 and x. Otherwise, the layer will compute crosses between x0 and itself. Should have the same shape as x0.

Returns

Tensor of crosses, with the same shape as x0.
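
The call contract described above (x falls back to x0, and the output keeps the shape of x0 for any number of leading dimensions) can be sketched in NumPy. This is a stand-in for illustration, not the library's code: the function name `feature_cross_call`, the explicit `W` and `bias` arguments, and the shape check are assumptions.

```python
import numpy as np


def feature_cross_call(x0, x=None, *, W, bias):
    """Hypothetical stand-in for the documented call behavior."""
    if x is None:
        # No second input: cross x0 with itself.
        x = x0
    if x.shape != x0.shape:
        raise ValueError(f"x0 and x must have the same shape, got {x0.shape} and {x.shape}")
    # matmul contracts the last axis (input_dim) and broadcasts over leading axes.
    return x0 * (x @ W + bias) + x


rng = np.random.default_rng(0)
input_dim = 4
x0 = rng.standard_normal((2, 5, input_dim))  # (batch_size, ..., input_dim)
W = rng.standard_normal((input_dim, input_dim))
bias = np.zeros(input_dim)

out_default = feature_cross_call(x0, W=W, bias=bias)
out_explicit = feature_cross_call(x0, x0, W=W, bias=bias)
```

Both calls return a tensor with the shape of `x0`, and passing `x0` explicitly as the second input gives the same result as omitting it.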