
The RetinaNet model


RetinaNet class

keras_cv.models.RetinaNet(
    classes,
    bounding_box_format,
    backbone,
    include_rescaling=None,
    backbone_weights=None,
    anchor_generator=None,
    label_encoder=None,
    prediction_decoder=None,
    feature_pyramid=None,
    classification_head=None,
    box_head=None,
    evaluate_train_time_metrics=False,
    name="RetinaNet",
    **kwargs
)

A Keras model implementing the RetinaNet architecture.

Implements the RetinaNet architecture for object detection. The constructor requires classes, bounding_box_format, and a backbone. Optionally, a custom label encoder, feature pyramid network, and prediction decoder may be provided.

Usage:

retina_net = keras_cv.models.RetinaNet(
    classes=20,
    bounding_box_format="xywh",
    backbone="resnet50",
    backbone_weights="imagenet",
    include_rescaling=True,
)
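The bounding_box_format="xywh" above determines how box coordinates are interpreted throughout the model. As a plain-Python illustration of that format's semantics (the helper below is hypothetical and not part of keras_cv), "xywh" boxes of the form [x_min, y_min, width, height] relate to corner-style "xyxy" boxes like so:

```python
# Hypothetical helper illustrating the "xywh" box format used above.
# keras_cv's "xywh" encodes [x_min, y_min, width, height]; "xyxy"
# encodes [x_min, y_min, x_max, y_max].
def xywh_to_xyxy(box):
    """Convert a single [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]

print(xywh_to_xyxy([10.0, 20.0, 30.0, 40.0]))  # [10.0, 20.0, 40.0, 60.0]
```

Mixing formats between your dataset and the model is a common source of silently wrong training targets, which is why the constructor makes the format explicit.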

Arguments

  • classes: the number of classes in your dataset excluding the background class. Classes should be represented by integers in the range [0, classes).
  • bounding_box_format: The format of bounding boxes in the input dataset. Refer to the keras.io docs for more details on supported bounding box formats.
  • backbone: Either "resnet50" or a custom backbone model.
  • include_rescaling: Required if the provided backbone is a pre-configured model. If set to True, inputs will be passed through a Rescaling(1/255.0) layer.
  • backbone_weights: (Optional) if using a KerasCV provided backbone, the underlying backbone model will be loaded using the weights provided in this argument. Can be a model checkpoint path, or a string from the supported weight sets in the underlying model.
  • anchor_generator: (Optional) a keras_cv.layers.AnchorGenerator. If provided, the anchor generator will be passed to both the label_encoder and the prediction_decoder. Only to be used when label_encoder and prediction_decoder are both None. Defaults to an anchor generator with the parameterization: strides=[2**i for i in range(3, 8)], scales=[2**x for x in [0, 1 / 3, 2 / 3]], sizes=[32.0, 64.0, 128.0, 256.0, 512.0], and aspect_ratios=[0.5, 1.0, 2.0].
  • label_encoder: (Optional) a keras.Layer that accepts an image Tensor and a bounding box Tensor to its call() method, and returns RetinaNet training targets. By default, a KerasCV standard LabelEncoder is created and used. Results of this call() method are passed to the loss object passed into compile() as the y_true argument.
  • prediction_decoder: (Optional) A keras.Layer that is responsible for transforming RetinaNet predictions into usable bounding box Tensors. If not provided, a default layer is used, which prunes boxes with a NonMaxSuppression operation.
  • feature_pyramid: (Optional) A keras.Model representing a feature pyramid network (FPN). The feature pyramid network is called on the outputs of the backbone. The KerasCV default backbones return three outputs in a list, but custom backbones may be written and used with custom feature pyramid networks. If not provided, a default feature pyramid network is produced by the library. The default feature pyramid network is compatible with all standard keras_cv backbones.
  • classification_head: (Optional) A keras.Layer that performs classification of the bounding boxes. If not provided, a simple ConvNet with 1 layer will be used.
  • box_head: (Optional) A keras.Layer that performs regression of the bounding boxes. If not provided, a simple ConvNet with 1 layer will be used.
  • evaluate_train_time_metrics: (Optional) whether or not to evaluate metrics passed in compile() inside of the train_step(). This is NOT recommended, as it dramatically reduces performance due to the synchronous label decoding and COCO metric evaluation. For example, on a single GPU on the PascalVOC dataset epoch time goes from 3 minutes to 30 minutes with this set to True. Defaults to False.
  • name: (Optional) name for the model, defaults to "RetinaNet".
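Evaluated numerically, the default anchor parameterization listed under anchor_generator expands as follows (a plain-Python sketch; the per-location anchor count is an inference from the parameterization, not stated in the argument description):

```python
# The default anchor parameterization from the anchor_generator argument,
# evaluated numerically.
strides = [2**i for i in range(3, 8)]            # one stride per FPN level P3..P7
scales = [2**x for x in [0, 1 / 3, 2 / 3]]       # ~[1.0, 1.26, 1.59]
sizes = [32.0, 64.0, 128.0, 256.0, 512.0]        # base anchor size per level
aspect_ratios = [0.5, 1.0, 2.0]

print(strides)                                   # [8, 16, 32, 64, 128]
# Every spatial location on a feature map gets one anchor per
# (scale, aspect_ratio) pair:
print(len(scales) * len(aspect_ratios))          # 9
```

This matches the standard RetinaNet setup of five pyramid levels with nine anchors per location.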