kwimage.structs.mask

Data structure for Binary Masks

Structure for efficient encoding of per-annotation segmentation masks. Based on efficient cython/C code in the cocoapi [1].

References

[1] https://github.com/nightrome/cocostuffapi/blob/master/PythonAPI/pycocotools/_mask.pyx

[2] https://github.com/nightrome/cocostuffapi/blob/master/common/maskApi.c

[3] https://github.com/nightrome/cocostuffapi/blob/master/common/maskApi.h

[4] https://github.com/nightrome/cocostuffapi/blob/master/PythonAPI/pycocotools/mask.py

Goals:

The goal of this file is to create a data structure that lets the developer seamlessly convert between:

  1. raw binary uint8 masks

  2. memory-efficient compressed run-length-encodings of binary segmentation masks

  3. convex polygons

  4. convex hull polygons

  5. bounding box

It is not there yet, and the API is subject to change in order to better accomplish these goals.

Notes

IN THIS FILE ONLY: size corresponds to a h/w tuple to be compatible with the coco semantics. Everywhere else in this repo, size uses opencv semantics which are w/h.
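
The two conventions can be sketched as follows. The helper names `coco_size_to_wh` and `wh_to_coco_size` are hypothetical and are not part of kwimage; they only illustrate the axis swap described in the note.

```python
def coco_size_to_wh(size):
    """Convert a COCO-style (height, width) size to a (width, height) tuple."""
    height, width = size
    return (width, height)


def wh_to_coco_size(wh):
    """Convert a (width, height) tuple to a COCO-style [height, width] size."""
    width, height = wh
    return [height, width]


# A COCO segmentation whose mask has 5 rows and 9 columns.
coco_size = [5, 9]
wh = coco_size_to_wh(coco_size)
print(wh)  # (9, 5)
```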

Module Contents

Classes

Mask

Manages a single segmentation mask and can convert to and from multiple formats

MaskList

Store and manipulate multiple masks, usually within the same image

class kwimage.structs.mask.Mask(data=None, format=None)[source]

Bases: ubelt.NiceRepr, _MaskConversionMixin, _MaskConstructorMixin, _MaskTransformMixin, _MaskDrawMixin

Manages a single segmentation mask and can convert to and from multiple formats including:

  • bytes_rle - byte encoded run length encoding

  • array_rle - raw run length encoding

  • c_mask - c-style binary mask

  • f_mask - fortran-style binary mask
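
To make the relationship between a binary mask and a raw run-length encoding concrete, here is a minimal pure-Python sketch. It assumes the COCO convention of visiting pixels in column-major (Fortran) order and starting with a run of zeros; `encode_array_rle` is a hypothetical helper, not the kwimage implementation, and the dictionary it returns only mimics the `size` and `counts` keys.

```python
def encode_array_rle(mask):
    """Run-length encode a row-major 2D list of 0/1 values.

    Pixels are visited in column-major order. The first count is always the
    number of leading zeros (possibly 0), and runs alternate after that.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    counts = []
    current = 0  # value of the run being counted; the first run is zeros
    run = 0
    for col in range(width):
        for row in range(height):
            value = 1 if mask[row][col] else 0
            if value == current:
                run += 1
            else:
                counts.append(run)
                current = value
                run = 1
    counts.append(run)
    return {'size': [height, width], 'counts': counts}


c_mask = [
    [0, 0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
]
rle = encode_array_rle(c_mask)
print(rle['counts'])  # [11, 15, 1, 1, 2, 1, 1, 4, 1, 3, 5]
```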

Example

>>> # xdoc: +REQUIRES(--mask)
>>> # a ms-coco style compressed bytes rle segmentation
>>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
>>> mask = Mask(segmentation, 'bytes_rle')
>>> # convert to binary numpy representation
>>> binary_mask = mask.to_c_mask().data
>>> print(ub.repr2(binary_mask.tolist(), nl=1, nobr=1))
[0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
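
As an illustration of what the compressed string holds, the sketch below decodes it in pure Python. It follows the decoding scheme of `rleFrString` in maskApi.c [2] as I understand it (5 data bits per character, a continuation bit, and counts after the third stored as differences from the count two positions earlier). The function names are hypothetical, and this is not how kwimage performs the conversion.

```python
def decode_bytes_rle_counts(encoded):
    """Decompress a COCO bytes-RLE string into a list of run lengths."""
    counts = []
    pos = 0
    while pos < len(encoded):
        value = 0
        k = 0
        more = True
        while more:
            c = ord(encoded[pos]) - 48
            value |= (c & 0x1F) << (5 * k)  # low 5 bits carry data
            more = bool(c & 0x20)           # bit 6 flags a continuation
            pos += 1
            k += 1
            if not more and (c & 0x10):
                value |= -1 << (5 * k)      # sign-extend negative values
        if len(counts) > 2:
            # Later counts are stored relative to the count two steps back.
            value += counts[-2]
        counts.append(value)
    return counts


def counts_to_c_mask(counts, height, width):
    """Expand column-major run lengths into a row-major 2D list."""
    flat = []
    value = 0  # runs alternate, starting with zeros
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    # Pixel (row, col) lives at index col * height + row in column-major order.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
counts = decode_bytes_rle_counts(segmentation['counts'])
print(counts)  # [11, 15, 1, 1, 2, 1, 1, 4, 1, 3, 5]
height, width = segmentation['size']
decoded = counts_to_c_mask(counts, height, width)
```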
property dtype(self)[source]
__nice__(self)[source]
classmethod random(Mask, rng=None, shape=(32, 32))[source]

Create a random binary mask object

Parameters
  • rng (int | RandomState | None) – the random seed

  • shape (Tuple[int, int]) – the height / width of the returned mask

Returns

the random mask

Return type

Mask

Example

>>> import kwimage
>>> mask = kwimage.Mask.random()
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> mask.draw()
>>> kwplot.show_if_requested()
classmethod demo(cls)[source]

Demo mask with holes and disjoint shapes

Returns

the demo mask

Return type

Mask

copy(self)[source]

Performs a deep copy of the mask data

Returns

the copied mask

Return type

Mask

Example

>>> self = Mask.random(shape=(8, 8), rng=0)
>>> other = self.copy()
>>> assert other.data is not self.data
union(self, *others)[source]

This can be used as a staticmethod or an instancemethod

Parameters

*others – multiple input masks to union

Returns

the unioned mask

Return type

Mask

Example

>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(2)]
>>> mask = Mask.union(*masks)
>>> print(mask.area)
>>> masks = [m.to_c_mask() for m in masks]
>>> mask = Mask.union(*masks)
>>> print(mask.area)
>>> masks = [m.to_bytes_rle() for m in masks]
>>> mask = Mask.union(*masks)
>>> print(mask.area)
Benchmark:

    import ubelt as ub
    ti = ub.Timerit(100, bestof=10, verbose=2)

    masks = [Mask.random(shape=(172, 172), rng=i) for i in range(2)]

    for timer in ti.reset('native rle union'):
        masks = [m.to_bytes_rle() for m in masks]
        with timer:
            mask = Mask.union(*masks)

    for timer in ti.reset('native cmask union'):
        masks = [m.to_c_mask() for m in masks]
        with timer:
            mask = Mask.union(*masks)

    for timer in ti.reset('cmask->rle union'):
        masks = [m.to_c_mask() for m in masks]
        with timer:
            mask = Mask.union(*[m.to_bytes_rle() for m in masks])

intersection(self, *others)[source]

This can be used as a staticmethod or an instancemethod

Parameters

*others – multiple input masks to intersect

Returns

the intersection of the masks

Return type

Mask

Example

>>> n = 3
>>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(n)]
>>> items = masks
>>> mask = Mask.intersection(*masks)
>>> areas = [item.area for item in items]
>>> print('areas = {!r}'.format(areas))
>>> print(mask.area)
>>> print(Mask.intersection(*masks).area / Mask.union(*masks).area)
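
On uncompressed masks, union and intersection both reduce to pixelwise logic. The sketch below shows that idea on plain nested lists; `combine_masks` is a hypothetical helper and does not reflect how kwimage or the COCO C code operate on run-length encodings.

```python
def combine_masks(masks, how):
    """Pixelwise combine equally shaped row-major 0/1 masks.

    `how` is the builtin `any` for a union or `all` for an intersection.
    """
    masks = list(masks)
    if not masks:
        raise ValueError('need at least one mask')
    shape = (len(masks[0]), len(masks[0][0]) if masks[0] else 0)
    for m in masks:
        if (len(m), len(m[0]) if m else 0) != shape:
            raise ValueError('all masks must share one shape')
    height, width = shape
    return [
        [int(how(m[r][c] for m in masks)) for c in range(width)]
        for r in range(height)
    ]


a = [[1, 1, 0],
     [0, 0, 0]]
b = [[0, 1, 1],
     [0, 0, 1]]
union = combine_masks([a, b], any)         # [[1, 1, 1], [0, 0, 1]]
intersection = combine_masks([a, b], all)  # [[0, 1, 0], [0, 0, 0]]
```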
property shape(self)[source]
property area(self)[source]

Returns the number of non-zero pixels

Returns

the number of non-zero pixels

Return type

int

Example

>>> self = Mask.demo()
>>> self.area
150
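
For a run-length encoded mask the area can be read directly from the counts without decoding. A minimal sketch, assuming COCO-style counts that alternate between background and foreground and start with background (`rle_area` is a hypothetical helper, and the counts are the run lengths of the 5x9 mask from the Mask class example):

```python
def rle_area(counts):
    """Number of foreground pixels encoded by COCO-style run lengths.

    Runs alternate between background and foreground, starting with
    background, so the foreground runs sit at the odd indices.
    """
    return sum(counts[1::2])


counts = [11, 15, 1, 1, 2, 1, 1, 4, 1, 3, 5]
print(rle_area(counts))  # 24
```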
get_patch(self)[source]

Extract the patch with non-zero data

Example

>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> self.get_patch()
get_xywh(self)[source]

Gets the bounding xywh box coordinates of this mask

Returns

x, y, w, h: Note we don't use a Boxes object because a general singular version does not yet exist.

Return type

ndarray

Example

>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> self.get_xywh().tolist()
>>> self = Mask.random(rng=0).translate((10, 10))
>>> self.get_xywh().tolist()

Example

>>> # test empty case
>>> import kwimage
>>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask')
>>> assert self.get_xywh().tolist() == [0, 0, 0, 0]
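
The idea behind a mask bounding box can be sketched on a plain nested list. `mask_to_xywh` is a hypothetical helper; it measures width and height inclusively (max - min + 1), which is an assumption and may differ from the convention kwimage uses. The empty case mirrors the all-zero result shown above.

```python
def mask_to_xywh(mask):
    """Bounding box [x, y, w, h] of the non-zero pixels in a row-major mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        # No foreground pixels (or an empty mask): return a degenerate box.
        return [0, 0, 0, 0]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    x, y = cols[0], rows[0]
    w = cols[-1] - x + 1  # inclusive extent, an assumption of this sketch
    h = rows[-1] - y + 1
    return [x, y, w, h]


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
print(mask_to_xywh(mask))  # [1, 1, 3, 2]
```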
Ignore:
>>> import kwimage
>>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask')
>>> x_coords = np.array([621, 752])
>>> y_coords = np.array([366, 292])
>>> self.data[y_coords, x_coords] = 1
>>> self.get_xywh()
>>> # References:
>>> # https://stackoverflow.com/questions/33281957/faster-alternative-to-numpy-where
>>> # https://answers.opencv.org/question/4183/what-is-the-best-way-to-find-bounding-box-for-binary-mask/
>>> import timerit
>>> ti = timerit.Timerit(100, bestof=10, verbose=2)
>>> for timer in ti.reset('time'):
>>>     with timer:
>>>         y_coords, x_coords = np.where(self.data)
>>> #
>>> for timer in ti.reset('time'):
>>>     with timer:
>>>         cv2.findNonZero(self.data)

self.data = np.random.rand(800, 700) > 0.5

import timerit
ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('time'):
    with timer:
        y_coords, x_coords = np.where(self.data)

#
for timer in ti.reset('time'):
    with timer:
        data = np.ascontiguousarray(self.data).astype(np.uint8)
        cv2_coords = cv2.findNonZero(data)

>>> poly = self.to_multi_polygon()
get_polygon(self)[source]

DEPRECATED: USE to_multi_polygon

Returns a list of (x,y)-coordinate lists. The length of the list is equal to the number of disjoint regions in the mask.

Returns

polygon around each connected component of the

mask. Each ndarray is an Nx2 array of xy points.

Return type

List[ndarray]

Note

The returned polygon may not surround points that are only one pixel thick.

Example

>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> polygons = self.get_polygon()
>>> print('polygons = ' + ub.repr2(polygons))
>>> polygons = self.get_polygon()
>>> self = self.to_bytes_rle()
>>> other = Mask.from_polygons(polygons, self.shape)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> image = np.ones(self.shape)
>>> image = self.draw_on(image, color='blue')
>>> image = other.draw_on(image, color='red')
>>> kwplot.imshow(image)
polygons = [
    np.array([[6, 4],[7, 4]], dtype=np.int32),
    np.array([[0, 1],[0, 3],[2, 3],[2, 1]], dtype=np.int32),
]

to_mask(self, dims=None)[source]

Converts to a mask object (which does nothing because this already is a mask object!)

Returns

kwimage.Mask

to_boxes(self)[source]

Returns the bounding box of the mask.

Returns

kwimage.Boxes

to_multi_polygon(self)[source]

Returns a MultiPolygon object fit around this raster including disjoint pieces and holes.

Returns

vectorized representation

Return type

MultiPolygon

Example

>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.demo()
>>> self = self.scale(5)
>>> multi_poly = self.to_multi_polygon()
>>> # xdoc: +REQUIRES(module:kwplot)
>>> # xdoc: +REQUIRES(--show)
>>> self.draw(color='red')
>>> multi_poly.scale(1.1).draw(color='blue')
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> image = np.ones(self.shape)
>>> image = self.draw_on(image, color='blue')
>>> #image = other.draw_on(image, color='red')
>>> kwplot.imshow(image)
>>> multi_poly.draw()

Example

>>> import kwimage
>>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask')
>>> poly = self.to_multi_polygon()
>>> poly.to_multi_polygon()

Example

>>> # Corner case, only two pixels are on
>>> import kwimage
>>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask')
>>> x_coords = np.array([621, 752])
>>> y_coords = np.array([366, 292])
>>> self.data[y_coords, x_coords] = 1
>>> poly = self.to_multi_polygon()

poly.to_mask(self.shape).data.sum()

self.to_array_rle().to_c_mask().data.sum()
temp.to_c_mask().data.sum()

Example

>>> # TODO: how do we correctly handle the 1 or 2 point to a poly
>>> # case?
>>> import kwimage
>>> data = np.zeros((8, 8), dtype=np.uint8)
>>> data[0, 3:5] = 1
>>> data[7, 3:5] = 1
>>> data[3:5, 0:2] = 1
>>> self = kwimage.Mask.coerce(data)
>>> polys = self.to_multi_polygon()
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(data)
>>> polys.draw(border=True, linewidth=5, alpha=0.5, radius=0.2)
get_convex_hull(self)[source]

Returns a list of xy points around the convex hull of this mask

Note

The returned polygon may not surround points that are only one pixel thick.

Example

>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> polygons = self.get_convex_hull()
>>> print('polygons = ' + ub.repr2(polygons))
>>> other = Mask.from_polygons(polygons, self.shape)
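
For intuition, a convex hull of the foreground pixels can be computed with Andrew's monotone chain algorithm. This is a swapped-in pure-Python technique, not the routine kwimage uses, and it works on integer pixel indices, so regions one pixel thick collapse to a point or a segment. The names `convex_hull` and `mask_hull` are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain: hull vertices of a set of (x, y) points."""
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of the cross product of vectors o->a and o->b
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower = []
    for p in pts:
        # Drop points that do not make a strict turn (this removes collinear points).
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def mask_hull(mask):
    """Hull of the (x, y) indices of non-zero pixels in a row-major mask."""
    points = [(x, y) for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    return convex_hull(points)


mask = [[1, 1, 1],
        [1, 1, 1],
        [1, 1, 1]]
print(mask_hull(mask))  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```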
iou(self, other)[source]

The area of intersection over the area of union

Todo

  • [ ] Write plural Masks version of this class, which should be able to perform this operation more efficiently.

CommandLine:

xdoctest -m kwimage.structs.mask Mask.iou

Example

>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.demo()
>>> other = self.translate(1)
>>> iou = self.iou(other)
>>> print('iou = {:.4f}'.format(iou))
iou = 0.0830
>>> iou2 = self.intersection(other).area / self.union(other).area
>>> print('iou2 = {:.4f}'.format(iou2))
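
The quantity itself is simple to state on uncompressed masks: count the pixels set in both masks and divide by the pixels set in either. A minimal sketch on nested lists follows; `mask_iou` is a hypothetical helper, and returning 0.0 when both masks are empty is an assumption of this sketch rather than documented kwimage behavior.

```python
def mask_iou(mask_a, mask_b):
    """Intersection over union of two equally shaped row-major 0/1 masks."""
    inter = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    # Assumption: two empty masks are given an IoU of 0.0.
    return inter / union if union else 0.0


a = [[1, 1, 0],
     [1, 1, 0]]
b = [[0, 1, 1],
     [0, 1, 1]]
iou = mask_iou(a, b)  # 2 shared pixels out of 6 covered pixels
print('iou = {:.4f}'.format(iou))  # iou = 0.3333
```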
classmethod coerce(Mask, data, dims=None)[source]

Attempts to auto-inspect the format of the data and convert to Mask

Parameters
  • data – the data to coerce

  • dims (Tuple) – required for certain formats like polygons; the height / width of the source image

Returns

the constructed mask object

Return type

Mask

Example

>>> # xdoc: +REQUIRES(--mask)
>>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
>>> polygon = [
>>>     [np.array([[3, 0],[2, 1],[2, 4],[4, 4],[4, 3],[7, 0]])],
>>>     [np.array([[2, 1],[2, 2],[4, 2],[4, 1]])],
>>> ]
>>> dims = (9, 5)
>>> mask = (np.random.rand(32, 32) > .5).astype(np.uint8)
>>> Mask.coerce(polygon, dims).to_bytes_rle()
>>> Mask.coerce(segmentation).to_bytes_rle()
>>> Mask.coerce(mask).to_bytes_rle()
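
Format inspection of this kind can be sketched as a dispatch on the type and keys of the input. The function `guess_mask_format` and its rules are hypothetical and only illustrate the idea; the actual checks performed by `Mask.coerce` may differ.

```python
def guess_mask_format(data):
    """Guess which representation `data` holds (hypothetical helper)."""
    if isinstance(data, dict) and 'counts' in data:
        # COCO dictionaries: compressed counts are text, raw counts are ints.
        if isinstance(data['counts'], (str, bytes)):
            return 'bytes_rle'
        return 'array_rle'
    if isinstance(data, (list, tuple)):
        # Assumed to be polygon coordinates, which need dims to rasterize.
        return 'polygon'
    if len(getattr(data, 'shape', ())) == 2:
        # Any array-like object exposing a two-dimensional shape.
        return 'c_mask'
    raise TypeError('cannot infer mask format from {!r}'.format(type(data)))


segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
print(guess_mask_format(segmentation))  # bytes_rle
```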
_to_coco(self)[source]

Use to_coco instead.

to_coco(self, style='orig')[source]

Convert the Mask to a COCO json representation based on the current format.

A COCO mask is formatted as a run-length-encoding (RLE), of which there are two variants: (1) an array RLE, which is slightly more readable and extensible, and (2) a bytes RLE, which is slightly more concise. The returned format will depend on the current format of the Mask object. If it is in “bytes_rle” format, it will be returned in that format, otherwise it will be converted to the “array_rle” format and returned as such.

Parameters

style (str) – Does nothing for this particular method, exists for API compatibility and if alternate encoding styles are implemented in the future.

Returns

either a bytes-rle or array-rle encoding, depending on the current mask format. The keys in this dictionary are as follows:

counts (List[int] | str): the array or bytes rle encoding

size (Tuple[int]): the height and width of the encoded mask, see note.

shape (Tuple[int]): only present in array-rle mode. This is also the height/width of the underlying encoded array. This exists for semantic consistency with other kwimage conventions, and is not part of the original coco spec.

order (str): only present in array-rle mode. Either C or F, indicating if counts is arranged in row-major or column-major order. For COCO-compatibility this is always returned in F (column-major) order.

binary (bool): only present in array-rle mode. For COCO-compatibility this is always returned as True, indicating the mask only contains binary 0 or 1 values.

Return type

dict

Note

The output dictionary will contain a key named “size”. This is the only location in kwimage where “size” refers to a tuple in (height/width) order, in order to be backwards compatible with the original coco spec. In all other locations in kwimage a “size” will refer to a (width/height) ordered tuple.

SeeAlso:

kwimage.im_runlen.encode_run_length - backend function that does array-style run length encoding.

Example

>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.demo()
>>> coco_data1 = self.toformat('array_rle').to_coco()
>>> coco_data2 = self.toformat('bytes_rle').to_coco()
>>> print('coco_data1 = {}'.format(ub.repr2(coco_data1, nl=1)))
>>> print('coco_data2 = {}'.format(ub.repr2(coco_data2, nl=1)))
coco_data1 = {
    'binary': True,
    'counts': [47, 5, 3, 1, 14, ... 1, 4, 19, 141],
    'order': 'F',
    'shape': (23, 32),
    'size': (23, 32),
}
coco_data2 = {
    'counts': '_153L;4EL...ON3060L0N060L0Nb0Y4',
    'size': [23, 32],
}
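
A quick sanity check for an array-rle dictionary is that its counts cover every pixel exactly once. The sketch below uses a hypothetical helper, `check_array_rle`, and reads `size` in the height/width order described in the note. Because the check only compares against the product height * width, it cannot detect a swapped `size`.

```python
def check_array_rle(coco):
    """Return True if the run lengths cover exactly height * width pixels."""
    height, width = coco['size']  # COCO order: height first, then width
    counts = coco['counts']
    if isinstance(counts, (str, bytes)):
        raise TypeError('expected an array-rle, got a bytes-rle')
    if any(c < 0 for c in counts):
        return False
    return sum(counts) == height * width


# Run lengths of the 5x9 mask from the Mask class example.
coco_data = {
    'counts': [11, 15, 1, 1, 2, 1, 1, 4, 1, 3, 5],
    'size': [5, 9],
}
print(check_array_rle(coco_data))  # True
```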
class kwimage.structs.mask.MaskList[source]

Bases: kwimage.structs._generic.ObjectList

Store and manipulate multiple masks, usually within the same image

to_polygon_list(self)[source]

Converts all mask objects to multi-polygon objects

Returns

kwimage.PolygonList

to_segmentation_list(self)[source]

Converts all items to segmentation objects

Returns

kwimage.SegmentationList

to_mask_list(self)[source]

Returns this object

Returns

kwimage.MaskList