### `affine_transform` function

```python
keras.utils.bounding_boxes.affine_transform(
    boxes,
    angle,
    translate_x,
    translate_y,
    scale,
    shear_x,
    shear_y,
    height,
    width,
    center_x=None,
    center_y=None,
    bounding_box_format="xyxy",
)
```
Applies an affine transformation to the bounding boxes.
The height and width parameters are used to normalize the
translation and scaling factors.
__Arguments__

- `boxes`: Bounding boxes of shape `(N, 4)` or `(batch_size, N, 4)`.
- `angle`: Rotation angle.
- `translate_x`: Horizontal translation factor.
- `translate_y`: Vertical translation factor.
- `scale`: Scaling factor.
- `shear_x`: Shear factor along the x axis.
- `shear_y`: Shear factor along the y axis.
- `height`: Image height, used to normalize the translation and scaling factors.
- `width`: Image width, used to normalize the translation and scaling factors.
- `center_x`: x coordinate of the transform center. Defaults to `None`.
- `center_y`: y coordinate of the transform center. Defaults to `None`.
- `bounding_box_format`: The format of the input boxes. Defaults to `"xyxy"`.

__Returns__

The transformed bounding boxes, a tensor/array with the same shape as the input `boxes`.
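The heavy lifting happens inside Keras, but the core geometry can be sketched in NumPy. The helper below is hypothetical and simplified (rotation about the image center plus translation only, angle in degrees, no scale or shear): it maps each box's four corners through the transform and re-takes the axis-aligned bounding box, which is why rotated boxes can grow.

```python
import numpy as np

def affine_boxes_sketch(boxes, angle, tx, ty, height, width):
    # Hypothetical helper: rotate xyxy boxes about the image center by
    # `angle` degrees, translate by fractions (tx, ty) of the image size,
    # then re-take the axis-aligned bounding box of the corners.
    theta = np.deg2rad(angle)
    cos, sin = np.cos(theta), np.sin(theta)
    cx, cy = width / 2.0, height / 2.0
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    # The four corners of every box, shape (N, 4, 2).
    corners = np.stack([
        np.stack([x1, y1], axis=-1),
        np.stack([x2, y1], axis=-1),
        np.stack([x2, y2], axis=-1),
        np.stack([x1, y2], axis=-1),
    ], axis=1)
    shifted = corners - np.array([cx, cy])
    rot = np.array([[cos, -sin], [sin, cos]])
    rotated = shifted @ rot.T + np.array([cx, cy])
    rotated += np.array([tx * width, ty * height])
    # Re-box the transformed corners.
    mins = rotated.min(axis=1)
    maxs = rotated.max(axis=1)
    return np.concatenate([mins, maxs], axis=-1)
```

With `angle=0` and zero translation this is the identity, and a pure translation shifts every coordinate by the pixel offset.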
----

### `clip_to_image_size` function

```python
keras.utils.bounding_boxes.clip_to_image_size(
    bounding_boxes, height=None, width=None, bounding_box_format="xyxy"
)
```

Clips bounding boxes to lie within the image dimensions.

__Arguments__

- `bounding_boxes`: A dictionary with `'boxes'` of shape `(N, 4)` or `(batch, N, 4)` and `'labels'` of shape `(N,)` or `(batch, N)`.
- `height`: Image height. Defaults to `None`.
- `width`: Image width. Defaults to `None`.
- `bounding_box_format`: The format of the input boxes. Defaults to `"xyxy"`.

__Returns__

Clipped bounding boxes.
__Example__

```python
boxes = {"boxes": np.array([[-10, -20, 150, 160], [50, 40, 70, 80]]),
         "labels": np.array([0, 1])}
clipped_boxes = keras.utils.bounding_boxes.clip_to_image_size(
    boxes, height=100, width=120,
)
# Output will have boxes clipped to the image boundaries, and labels
# potentially adjusted if the clipped area becomes zero
```
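For the unbatched `"xyxy"` case, the clipping arithmetic amounts to clamping each coordinate to the image bounds. A minimal NumPy sketch (it omits the dict/`labels` bookkeeping the real utility performs):

```python
import numpy as np

def clip_boxes_sketch(boxes, height, width):
    # Clamp xyxy boxes to the [0, width] x [0, height] image rectangle.
    x1 = np.clip(boxes[:, 0], 0, width)
    y1 = np.clip(boxes[:, 1], 0, height)
    x2 = np.clip(boxes[:, 2], 0, width)
    y2 = np.clip(boxes[:, 3], 0, height)
    return np.stack([x1, y1, x2, y2], axis=-1)
```

Applied to the example above, the first box `[-10, -20, 150, 160]` becomes `[0, 0, 120, 100]` and the second is unchanged.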
----

### `compute_ciou` function

```python
keras.utils.bounding_boxes.compute_ciou(
    boxes1, boxes2, bounding_box_format, image_shape=None
)
```
Computes the Complete IoU (CIoU) between two bounding boxes or between two batches of bounding boxes.
CIoU loss is an extension of GIoU loss, which further improves the IoU optimization for object detection. CIoU loss not only penalizes the bounding box coordinates but also considers the aspect ratio and center distance of the boxes. The length of the last dimension should be 4 to represent the bounding boxes.
__Arguments__

- `boxes1`: Bounding boxes with a last dimension of 4.
- `boxes2`: Bounding boxes with a last dimension of 4.
- `bounding_box_format`: The format of the input boxes.
- `image_shape`: `Tuple[int]`. The shape of the image `(height, width, 3)`. When using a relative bounding box format, `image_shape` is used for normalization.

__Returns__

The CIoU values between the box pairs.
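The CIoU formula itself is standard and can be sketched elementwise in NumPy for unbatched `"xyxy"` inputs: plain IoU, minus a center-distance penalty normalized by the enclosing box's diagonal, minus an aspect-ratio consistency term. This is an illustrative implementation of the published CIoU definition, not Keras's exact code path:

```python
import numpy as np

def ciou_sketch(b1, b2, eps=1e-7):
    # Elementwise CIoU between xyxy box arrays of the same shape (N, 4).
    # Intersection and union for the IoU term.
    x1 = np.maximum(b1[:, 0], b2[:, 0])
    y1 = np.maximum(b1[:, 1], b2[:, 1])
    x2 = np.minimum(b1[:, 2], b2[:, 2])
    y2 = np.minimum(b1[:, 3], b2[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    w1, h1 = b1[:, 2] - b1[:, 0], b1[:, 3] - b1[:, 1]
    w2, h2 = b2[:, 2] - b2[:, 0], b2[:, 3] - b2[:, 1]
    union = w1 * h1 + w2 * h2 - inter
    iou = inter / (union + eps)
    # Squared distance between box centers...
    cx1, cy1 = (b1[:, 0] + b1[:, 2]) / 2, (b1[:, 1] + b1[:, 3]) / 2
    cx2, cy2 = (b2[:, 0] + b2[:, 2]) / 2, (b2[:, 1] + b2[:, 3]) / 2
    rho2 = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2
    # ...normalized by the squared diagonal of the smallest enclosing box.
    ex1 = np.minimum(b1[:, 0], b2[:, 0])
    ey1 = np.minimum(b1[:, 1], b2[:, 1])
    ex2 = np.maximum(b1[:, 2], b2[:, 2])
    ey2 = np.maximum(b1[:, 3], b2[:, 3])
    c2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps
    # Aspect-ratio consistency term.
    v = (4 / np.pi ** 2) * (
        np.arctan(w1 / (h1 + eps)) - np.arctan(w2 / (h2 + eps))
    ) ** 2
    alpha = v / (1 - iou + v + eps)
    return iou - rho2 / c2 - alpha * v
```

Identical boxes score 1; fully disjoint boxes go negative, which is what gives the loss a useful gradient even with zero overlap.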
----

### `compute_iou` function

```python
keras.utils.bounding_boxes.compute_iou(
    boxes1, boxes2, bounding_box_format, use_masking=False, mask_val=-1, image_shape=None
)
```
Computes a lookup table vector containing the IoUs for a given set of boxes.

The lookup vector is to be indexed by `[boxes1_index, boxes2_index]` if the
boxes are unbatched and by `[batch, boxes1_index, boxes2_index]` if the
boxes are batched.

Users can pass `boxes1` and `boxes2` with different ranks. For example:
1) boxes1: [batch_size, M, 4], boxes2: [batch_size, N, 4] -> return
[batch_size, M, N].
2) boxes1: [batch_size, M, 4], boxes2: [N, 4] -> return
[batch_size, M, N]
3) boxes1: [M, 4], boxes2: [batch_size, N, 4] -> return
[batch_size, M, N]
4) boxes1: [M, 4], boxes2: [N, 4] -> return [M, N]
__Arguments__

- `boxes1`: A tensor of bounding boxes with a last dimension of 4.
- `boxes2`: A tensor of bounding boxes with a last dimension of 4.
- `bounding_box_format`: The format of the boxes, e.g. `"xyxy"`, `"rel_xyxy"`, `"xywh"`, `"center_xywh"`, `"yxyx"`, `"rel_yxyx"`.
- `use_masking`: Whether to mask entries of `boxes1` or `boxes2` that have values less than 0 in all 4 of their dimensions. Defaults to `False`.
- `mask_val`: The value used for masked entries. Defaults to `-1`.
- `image_shape`: `Tuple[int]`. The shape of the image `(height, width, 3)`. When using a relative bounding box format, `image_shape` is used for normalization.

__Returns__

The IoU lookup table described above.
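The shape rules above follow directly from broadcasting. A NumPy sketch for case 4 (`boxes1: [M, 4]`, `boxes2: [N, 4]` in `"xyxy"` format), without the masking or relative-format handling:

```python
import numpy as np

def iou_lookup_sketch(boxes1, boxes2):
    # Pairwise IoU lookup table for unbatched xyxy boxes:
    # result[i, j] = IoU(boxes1[i], boxes2[j]), shape (M, N).
    b1 = boxes1[:, None, :]  # (M, 1, 4)
    b2 = boxes2[None, :, :]  # (1, N, 4)
    inter_w = np.clip(
        np.minimum(b1[..., 2], b2[..., 2]) - np.maximum(b1[..., 0], b2[..., 0]),
        0, None,
    )
    inter_h = np.clip(
        np.minimum(b1[..., 3], b2[..., 3]) - np.maximum(b1[..., 1], b2[..., 1]),
        0, None,
    )
    inter = inter_w * inter_h
    area1 = (b1[..., 2] - b1[..., 0]) * (b1[..., 3] - b1[..., 1])
    area2 = (b2[..., 2] - b2[..., 0]) * (b2[..., 3] - b2[..., 1])
    return inter / (area1 + area2 - inter + 1e-7)
```

The batched cases are the same computation with a leading batch axis on one or both inputs.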
----

### `convert_format` function

```python
keras.utils.bounding_boxes.convert_format(
    boxes, source, target, height=None, width=None, dtype="float32"
)
```

Converts bounding boxes between formats.
Converts bounding boxes between formats.
Supported formats (case-insensitive):
- `"xyxy"`: `[left, top, right, bottom]`
- `"yxyx"`: `[top, left, bottom, right]`
- `"xywh"`: `[left, top, width, height]`
- `"center_xywh"`: `[center_x, center_y, width, height]`
- `"center_yxhw"`: `[center_y, center_x, height, width]`
- `"rel_xyxy"`, `"rel_yxyx"`, `"rel_xywh"`, `"rel_center_xywh"`: Relative versions of the above formats, where coordinates are normalized to the range `[0, 1]` based on the image height and width.
__Arguments__

- `boxes`: Bounding boxes tensor/array, or a dictionary with keys `boxes` and `labels`.
- `source`: The format of the input boxes.
- `target`: The format to convert the boxes to.
- `height`: Image height, required when a relative format is involved. Defaults to `None`.
- `width`: Image width, required when a relative format is involved. Defaults to `None`.
- `dtype`: The dtype of the output boxes. Defaults to `"float32"`.

__Returns__

Converted boxes.

__Raises__

- `ValueError`: If `source` or `target` is not one of the supported formats, or if `height` and `width` are not provided when converting to or from a relative format.
__Example__

```python
boxes = np.array([[10, 20, 30, 40], [50, 60, 70, 80]])

# Convert from 'xyxy' to 'xywh' format
boxes_xywh = keras.utils.bounding_boxes.convert_format(
    boxes, source='xyxy', target='xywh'
)  # Output: [[10. 20. 20. 20.], [50. 60. 20. 20.]]

# Convert to relative 'rel_xyxy' format
boxes_rel_xyxy = keras.utils.bounding_boxes.convert_format(
    boxes, source='xyxy', target='rel_xyxy', height=200, width=300
)  # Output: [[0.03333334 0.1        0.1        0.2       ],
   #          [0.16666667 0.3        0.23333333 0.4       ]]
```
----

### `crop` function

```python
keras.utils.bounding_boxes.crop(
    boxes, top, left, height, width, bounding_box_format="xyxy"
)
```
Crops bounding boxes based on the given offsets and dimensions.

This function crops bounding boxes to a specified region defined by
`top`, `left`, `height`, and `width`. The boxes are first converted to
`xyxy` format, cropped, and then returned.
Arguments
(N, 4) or (batch_size, N, 4).None.None."xyxy".Returns
The cropped bounding boxes.
Example
boxes = np.array([[10, 20, 50, 60], [70, 80, 100, 120]]) # xyxy format
cropped_boxes = keras.utils.bounding_boxes.crop(
boxes, bounding_box_format="xyxy", top=10, left=20, height=40, width=30
) # Cropping a 30x40 region starting at (20, 10)
print(cropped_boxes)
__Expected output__
:
__array([[ 0., 10., 30., 50.]__
,
__ [50., 70., 80., 110.]])__
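The arithmetic in the example can be reproduced with a tiny NumPy sketch: shift the boxes into the crop's coordinate frame, then clamp negative coordinates at zero. This mirrors only the example; the real utility also handles format conversion and batching.

```python
import numpy as np

def crop_boxes_sketch(boxes, top, left):
    # Shift xyxy boxes by the crop offsets, clamping negatives at zero,
    # as in the example output above.
    offset = np.array([left, top, left, top], dtype=float)
    return np.clip(boxes - offset, 0.0, None)
```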
----
<span style="float:right;">[[source]](https://github.com/keras-team/keras/tree/v3.13.2/keras/src/layers/preprocessing/image_preprocessing/bounding_boxes/converters.py#L347)</span>
### `decode_deltas_to_boxes` function
```python
keras.utils.bounding_boxes.decode_deltas_to_boxes(
anchors,
boxes_delta,
anchor_format,
box_format,
encoded_format="center_yxhw",
variance=None,
image_shape=None,
)
```
Converts bounding boxes from delta format to the specified `box_format`.

This function decodes bounding box deltas relative to anchors to obtain the
final bounding box coordinates. The boxes are encoded in a specific
`encoded_format` (`center_yxhw` by default) during the decoding process.
This allows flexibility in how the deltas are applied to the anchors.
__Arguments__

- `anchors`: Can be Tensors or Dict[Tensors], where keys are level indices and values are the corresponding anchor boxes. The shape of the array/tensor should be `(N, 4)`, where N is the number of anchors.
- `boxes_delta`: Can be Tensors or Dict[Tensors]. Bounding box deltas must have the same type and structure as `anchors`. The shape of the array/tensor can be `(N, 4)` or `(B, N, 4)`, where N is the number of boxes.
- `anchor_format`: The format of the `anchors` (e.g., `"xyxy"`, `"xywh"`, etc.).
- `box_format`: The format of the output boxes (e.g., `"xyxy"`, `"xywh"`, etc.).
- `encoded_format`: The format used to encode the boxes. Defaults to `"center_yxhw"`.
- `variance`: `List[float]`. A 4-element array/tensor representing variance factors to scale the box deltas. If provided, the deltas are multiplied by the variance before being applied to the anchors. Defaults to `None`.
- `image_shape`: `Tuple[int]`. The shape of the image `(height, width, 3)`. When using a relative bounding box format for `box_format`, `image_shape` is used for normalization.

__Returns__

Decoded box coordinates. The return type matches the `box_format`.
__Raises__

- `ValueError`: If `variance` is not `None` and its length is not 4.
- `ValueError`: If `encoded_format` is not `"center_xywh"` or `"center_yxhw"`.

----

### `encode_box_to_deltas` function

```python
keras.utils.bounding_boxes.encode_box_to_deltas(
    anchors,
    boxes,
    anchor_format,
    box_format,
    encoding_format="center_yxhw",
    variance=None,
    image_shape=None,
)
```
Encodes bounding boxes relative to anchors as deltas.
This function calculates the deltas that represent the difference between bounding boxes and provided anchors. Deltas encode the offsets and scaling factors to apply to anchors to obtain the target boxes.
Boxes and anchors are first converted to the specified `encoding_format`
(defaulting to `center_yxhw`) for consistent delta representation.
__Arguments__

- `anchors`: Tensors. Anchor boxes with shape `(N, 4)`, where N is the number of anchors.
- `boxes`: Tensors. Bounding boxes to encode. Boxes can be of shape `(B, N, 4)` or `(N, 4)`.
- `anchor_format`: The format of the `anchors` (e.g., `"xyxy"`, `"xywh"`, etc.).
- `box_format`: The format of the `boxes` (e.g., `"xyxy"`, `"xywh"`, etc.).
- `encoding_format`: The format used to encode the deltas. Defaults to `"center_yxhw"`.
- `variance`: `List[float]`. A 4-element array/tensor representing variance factors to scale the box deltas. If provided, the calculated deltas are divided by the variance. Defaults to `None`.
- `image_shape`: `Tuple[int]`. The shape of the image `(height, width, 3)`. When using a relative bounding box format for `box_format`, `image_shape` is used for normalization.

__Returns__

Encoded box deltas. The return type matches the `encoding_format`.
__Raises__

- `ValueError`: If `variance` is not `None` and its length is not 4.
- `ValueError`: If `encoding_format` is not `"center_xywh"` or `"center_yxhw"`.

----

### `pad` function

```python
keras.utils.bounding_boxes.pad(
    boxes, top, left, height=None, width=None, bounding_box_format="xyxy"
)
```

Pads bounding boxes by adding `top` and `left` offsets.

This function adds padding to the bounding boxes by increasing the `top`
and `left` coordinates by the specified amounts. The method assumes the
input `bounding_box_format` is `xyxy`.

__Arguments__

- `boxes`: Bounding boxes of shape `(N, 4)` or `(batch, N, 4)`.
- `top`: Vertical padding offset.
- `left`: Horizontal padding offset.
- `height`: Image height. Defaults to `None`.
- `width`: Image width. Defaults to `None`.
- `bounding_box_format`: The format of the input boxes. Defaults to `"xyxy"`.

__Returns__

Padded bounding boxes in the original format.
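As a closing illustration, the anchor-delta scheme behind `encode_box_to_deltas` and `decode_deltas_to_boxes` can be sketched in NumPy using a common `center_yxhw` parameterization. This is an assumption about the general scheme, not Keras's exact code (it ignores `variance`, dict inputs, and formats other than `xyxy`); its key property is that decoding inverts encoding.

```python
import numpy as np

def to_center_yxhw(xyxy):
    # xyxy -> (center_y, center_x, height, width).
    x1, y1, x2, y2 = np.split(xyxy, 4, axis=-1)
    return np.concatenate(
        [(y1 + y2) / 2, (x1 + x2) / 2, y2 - y1, x2 - x1], axis=-1
    )

def to_xyxy(cyxhw):
    # (center_y, center_x, height, width) -> xyxy.
    cy, cx, h, w = np.split(cyxhw, 4, axis=-1)
    return np.concatenate(
        [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], axis=-1
    )

def encode_sketch(anchors, boxes):
    # Deltas: center offsets scaled by anchor size, log size ratios.
    a, b = to_center_yxhw(anchors), to_center_yxhw(boxes)
    return np.concatenate([
        (b[..., :1] - a[..., :1]) / a[..., 2:3],
        (b[..., 1:2] - a[..., 1:2]) / a[..., 3:4],
        np.log(b[..., 2:3] / a[..., 2:3]),
        np.log(b[..., 3:4] / a[..., 3:4]),
    ], axis=-1)

def decode_sketch(anchors, deltas):
    # Exact inverse of encode_sketch.
    a = to_center_yxhw(anchors)
    cy = deltas[..., :1] * a[..., 2:3] + a[..., :1]
    cx = deltas[..., 1:2] * a[..., 3:4] + a[..., 1:2]
    h = np.exp(deltas[..., 2:3]) * a[..., 2:3]
    w = np.exp(deltas[..., 3:4]) * a[..., 3:4]
    return to_xyxy(np.concatenate([cy, cx, h, w], axis=-1))
```

The round trip `decode_sketch(anchors, encode_sketch(anchors, boxes))` recovers the original boxes, which is the invariant any encode/decode pair must satisfy.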