kwimage.structs.mask
Data structure for Binary Masks
Structure for efficient encoding of per-annotation segmentation masks, based on the efficient Cython/C code in the cocoapi [1].
References
[1] https://github.com/nightrome/cocostuffapi/blob/master/PythonAPI/pycocotools/_mask.pyx
[2] https://github.com/nightrome/cocostuffapi/blob/master/common/maskApi.c
[3] https://github.com/nightrome/cocostuffapi/blob/master/common/maskApi.h
[4] https://github.com/nightrome/cocostuffapi/blob/master/PythonAPI/pycocotools/mask.py
- Goals:
The goal of this file is to create a datastructure that lets the developer seamlessly convert between:
(1) raw binary uint8 masks
(2) memory-efficient compressed run-length-encodings of binary segmentation masks
(3) convex polygons
(4) convex hull polygons
(5) bounding box
It is not there yet, and the API is subject to change in order to better accomplish these goals.
Notes
IN THIS FILE ONLY: size corresponds to a h/w tuple to be compatible with the coco semantics. Everywhere else in this repo, size uses opencv semantics which are w/h.
Module Contents
Classes
Mask | Manages a single segmentation mask and can convert to and from multiple formats
MaskList | Store and manipulate multiple masks, usually within the same image
- class kwimage.structs.mask.Mask(data=None, format=None)
Bases: ubelt.NiceRepr, _MaskConversionMixin, _MaskConstructorMixin, _MaskTransformMixin, _MaskDrawMixin
Manages a single segmentation mask and can convert to and from multiple formats including:
bytes_rle - byte encoded run length encoding
array_rle - raw run length encoding
c_mask - c-style binary mask
f_mask - fortran-style binary mask
Example
>>> # xdoc: +REQUIRES(--mask)
>>> # a ms-coco style compressed bytes rle segmentation
>>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
>>> mask = Mask(segmentation, 'bytes_rle')
>>> # convert to binary numpy representation
>>> binary_mask = mask.to_c_mask().data
>>> print(ub.repr2(binary_mask.tolist(), nl=1, nobr=1))
[0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
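To make the relationship between the c_mask, f_mask and array_rle formats more concrete, here is a minimal pure-Python sketch. It is not kwimage's implementation. It assumes the COCO convention that the counts alternate between runs of zeros and runs of ones, starting with zeros, and that pixels are visited in column-major (Fortran) order. The helper names `flatten` and `encode_array_rle` are hypothetical.

```python
# Sketch only: array-style run-length encoding on nested lists.
# `flatten` and `encode_array_rle` are hypothetical helpers, not kwimage API.

def flatten(mask, order='F'):
    """Flatten a 2D list in row-major ('C') or column-major ('F') order."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    if order == 'C':
        return [mask[r][c] for r in range(height) for c in range(width)]
    return [mask[r][c] for c in range(width) for r in range(height)]


def encode_array_rle(mask, order='F'):
    """Return run lengths alternating zeros, ones, zeros, ... (zeros first)."""
    counts = []
    current = 0  # ASSUMPTION: the first run always counts zeros (may be 0 long)
    run = 0
    for value in flatten(mask, order):
        if value == current:
            run += 1
        else:
            counts.append(run)
            current = value
            run = 1
    counts.append(run)
    return counts


# The 5x9 binary mask printed in the class example
binary_mask = [
    [0, 0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
]
rle = {
    'size': [len(binary_mask), len(binary_mask[0])],  # height, width
    'counts': encode_array_rle(binary_mask, order='F'),
}
print(rle)
```

The same pixels produce different counts in C order and in F order, which is why an array RLE has to record the order it was built with.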
- classmethod random(Mask, rng=None, shape=(32, 32))
Create a random binary mask object
- Parameters
rng (int | RandomState | None) – the random seed
shape (Tuple[int, int]) – the height / width of the returned mask
- Returns
the random mask
- Return type
Mask
Example
>>> import kwimage
>>> mask = kwimage.Mask.random()
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> mask.draw()
>>> kwplot.show_if_requested()
- classmethod demo(cls)
Demo mask with holes and disjoint shapes
- Returns
the demo mask
- Return type
Mask
- copy(self)
Performs a deep copy of the mask data
- Returns
the copied mask
- Return type
Mask
Example
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> other = self.copy()
>>> assert other.data is not self.data
- union(self, *others)
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to union
- Returns
the unioned mask
- Return type
Mask
Example
>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(2)]
>>> mask = Mask.union(*masks)
>>> print(mask.area)
>>> masks = [m.to_c_mask() for m in masks]
>>> mask = Mask.union(*masks)
>>> print(mask.area)

>>> masks = [m.to_bytes_rle() for m in masks]
>>> mask = Mask.union(*masks)
>>> print(mask.area)
- Benchmark:
import ubelt as ub
ti = ub.Timerit(100, bestof=10, verbose=2)

masks = [Mask.random(shape=(172, 172), rng=i) for i in range(2)]

for timer in ti.reset('native rle union'):
    masks = [m.to_bytes_rle() for m in masks]
    with timer:
        mask = Mask.union(*masks)

for timer in ti.reset('native cmask union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*masks)

for timer in ti.reset('cmask->rle union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*[m.to_bytes_rle() for m in masks])
- intersection(self, *others)
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to intersect
- Returns
the intersection of the masks
- Return type
Mask
Example
>>> n = 3
>>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(n)]
>>> items = masks
>>> mask = Mask.intersection(*masks)
>>> areas = [item.area for item in items]
>>> print('areas = {!r}'.format(areas))
>>> print(mask.area)
>>> print(Mask.intersection(*masks).area / Mask.union(*masks).area)
- property area(self)
Returns the number of non-zero pixels
- Returns
the number of non-zero pixels
- Return type
int
Example
>>> self = Mask.demo()
>>> self.area
150
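As a rough illustration of why the area is cheap to compute from a run-length encoding, the sketch below sums the runs of ones directly from the counts and cross-checks the result against a dense pixel count. It assumes the COCO convention that the counts start with a run of zeros. The helpers `area_from_counts` and `area_from_mask` are hypothetical, and the counts are the uncompressed column-major run lengths of the 5x9 mask shown in the class example.

```python
# Sketch only: hypothetical helpers, not kwimage API.

def area_from_counts(counts):
    """Sum the runs of ones; ASSUMES counts alternate zeros, ones, zeros, ..."""
    return sum(counts[1::2])


def area_from_mask(mask):
    """Count non-zero pixels in a dense 2D list."""
    return sum(1 for row in mask for value in row if value)


counts = [11, 15, 1, 1, 2, 1, 1, 4, 1, 3, 5]
binary_mask = [
    [0, 0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
]
print(area_from_counts(counts), area_from_mask(binary_mask))
```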
- get_patch(self)
Extract the patch with non-zero data
Example
>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> self.get_patch()
- get_xywh(self)
Gets the bounding xywh box coordinates of this mask
- Returns
- x, y, w, h: Note we don't use a Boxes object because a general singular version does not yet exist.
- Return type
ndarray
Example
>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> self.get_xywh().tolist()
>>> self = Mask.random(rng=0).translate((10, 10))
>>> self.get_xywh().tolist()
Example
>>> # test empty case
>>> import kwimage
>>> import numpy as np
>>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask')
>>> assert self.get_xywh().tolist() == [0, 0, 0, 0]
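The idea behind a bounding xywh box can be sketched in pure Python as follows. This is an illustration, not the library's code. The helper `bounding_xywh` is hypothetical, and the width and height here count pixels inclusively (max - min + 1), which is an assumption; the text above does not say which convention kwimage uses. The empty case returns all zeros, matching the empty-case example above.

```python
# Sketch only: `bounding_xywh` is a hypothetical helper, not kwimage API.

def bounding_xywh(mask):
    """Return [x, y, w, h] of the non-zero pixels in a dense 2D list."""
    coords = [
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    ]
    if not coords:
        # Mirrors the empty case shown in the documentation example
        return [0, 0, 0, 0]
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    x_min, y_min = min(xs), min(ys)
    # ASSUMPTION: inclusive pixel extent, so a single pixel has w = h = 1
    return [x_min, y_min, max(xs) - x_min + 1, max(ys) - y_min + 1]


binary_mask = [
    [0, 0, 0, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 1, 0],
]
print(bounding_xywh(binary_mask))
```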
- Ignore:
>>> import kwimage
>>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask')
>>> x_coords = np.array([621, 752])
>>> y_coords = np.array([366, 292])
>>> self.data[y_coords, x_coords] = 1
>>> self.get_xywh()

>>> # References:
>>> # https://stackoverflow.com/questions/33281957/faster-alternative-to-numpy-where
>>> # https://answers.opencv.org/question/4183/what-is-the-best-way-to-find-bounding-box-for-binary-mask/
>>> import timerit
>>> ti = timerit.Timerit(100, bestof=10, verbose=2)
>>> for timer in ti.reset('time'):
>>>     with timer:
>>>         y_coords, x_coords = np.where(self.data)
>>> #
>>> for timer in ti.reset('time'):
>>>     with timer:
>>>         cv2.findNonZero(self.data)

self.data = np.random.rand(800, 700) > 0.5

import timerit
ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('time'):
    with timer:
        y_coords, x_coords = np.where(self.data)
#
for timer in ti.reset('time'):
    with timer:
        data = np.ascontiguousarray(self.data).astype(np.uint8)
        cv2_coords = cv2.findNonZero(data)

>>> poly = self.to_multi_polygon()
- get_polygon(self)
DEPRECATED: USE to_multi_polygon
Returns a list of (x,y)-coordinate lists. The length of the list is equal to the number of disjoint regions in the mask.
- Returns
- polygon around each connected component of the
mask. Each ndarray is an Nx2 array of xy points.
- Return type
List[ndarray]
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> polygons = self.get_polygon()
>>> print('polygons = ' + ub.repr2(polygons))
>>> polygons = self.get_polygon()
>>> self = self.to_bytes_rle()
>>> other = Mask.from_polygons(polygons, self.shape)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> image = np.ones(self.shape)
>>> image = self.draw_on(image, color='blue')
>>> image = other.draw_on(image, color='red')
>>> kwplot.imshow(image)

polygons = [
    np.array([[6, 4],[7, 4]], dtype=np.int32),
    np.array([[0, 1],[0, 3],[2, 3],[2, 1]], dtype=np.int32),
]
- to_mask(self, dims=None)
Converts to a mask object (which does nothing because this already is a mask object!)
- Returns
kwimage.Mask
- to_multi_polygon(self)
Returns a MultiPolygon object fit around this raster including disjoint pieces and holes.
- Returns
vectorized representation
- Return type
kwimage.MultiPolygon
Example
>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.demo()
>>> self = self.scale(5)
>>> multi_poly = self.to_multi_polygon()
>>> # xdoc: +REQUIRES(module:kwplot)
>>> # xdoc: +REQUIRES(--show)
>>> self.draw(color='red')
>>> multi_poly.scale(1.1).draw(color='blue')

>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> image = np.ones(self.shape)
>>> image = self.draw_on(image, color='blue')
>>> #image = other.draw_on(image, color='red')
>>> kwplot.imshow(image)
>>> multi_poly.draw()
Example
>>> import kwimage
>>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask')
>>> poly = self.to_multi_polygon()
>>> poly.to_multi_polygon()
Example
>>> # Corner case, only two pixels are on
>>> import kwimage
>>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask')
>>> x_coords = np.array([621, 752])
>>> y_coords = np.array([366, 292])
>>> self.data[y_coords, x_coords] = 1
>>> poly = self.to_multi_polygon()

poly.to_mask(self.shape).data.sum()
self.to_array_rle().to_c_mask().data.sum()
temp.to_c_mask().data.sum()
Example
>>> # TODO: how do we correctly handle the 1 or 2 point to a poly
>>> # case?
>>> import kwimage
>>> data = np.zeros((8, 8), dtype=np.uint8)
>>> data[0, 3:5] = 1
>>> data[7, 3:5] = 1
>>> data[3:5, 0:2] = 1
>>> self = kwimage.Mask.coerce(data)
>>> polys = self.to_multi_polygon()
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(data)
>>> polys.draw(border=True, linewidth=5, alpha=0.5, radius=0.2)
- get_convex_hull(self)
Returns a list of xy points around the convex hull of this mask
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.random(shape=(8, 8), rng=0)
>>> polygons = self.get_convex_hull()
>>> print('polygons = ' + ub.repr2(polygons))
>>> other = Mask.from_polygons(polygons, self.shape)
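For readers unfamiliar with convex hulls, the following pure-Python sketch computes the hull of the non-zero pixel coordinates of a small mask using Andrew's monotone chain algorithm. This is a swapped-in textbook technique chosen for illustration; the text above does not say how get_convex_hull is implemented, and its output may differ (for example in point order or in how thin regions are handled). The helpers `cross` and `convex_hull_xy` are hypothetical.

```python
# Sketch only: textbook monotone-chain hull, not kwimage's implementation.

def cross(o, a, b):
    """Z component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_xy(mask):
    """Return hull vertices as (x, y) tuples for the non-zero pixels."""
    points = sorted(
        (x, y)
        for y, row in enumerate(mask)
        for x, value in enumerate(row)
        if value
    )
    if len(points) <= 2:
        return points
    lower = []
    for p in points:
        # Drop points that would make a clockwise or collinear turn
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(points):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
]
print(convex_hull_xy(mask))
```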
- iou(self, other)
The area of intersection over the area of union
Todo
- [ ] Write plural Masks version of this class, which should
be able to perform this operation more efficiently.
- CommandLine:
xdoctest -m kwimage.structs.mask Mask.iou
Example
>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.demo()
>>> other = self.translate(1)
>>> iou = self.iou(other)
>>> print('iou = {:.4f}'.format(iou))
iou = 0.0830
>>> iou2 = self.intersection(other).area / self.union(other).area
>>> print('iou2 = {:.4f}'.format(iou2))
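The definition above (area of intersection over area of union) can be sketched on dense masks without any library support. The helper `mask_iou` is hypothetical, and returning 0.0 when both masks are empty is an assumption made here to avoid dividing by zero; the text does not say what the library does in that case.

```python
# Sketch only: `mask_iou` is a hypothetical helper, not kwimage API.

def mask_iou(mask_a, mask_b):
    """Intersection over union of two equally shaped dense 2D lists."""
    intersection_area = 0
    union_area = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            if a and b:
                intersection_area += 1
            if a or b:
                union_area += 1
    # ASSUMPTION: define the IoU of two empty masks as 0.0
    return intersection_area / union_area if union_area else 0.0


mask_a = [
    [1, 1, 0],
    [1, 1, 0],
]
mask_b = [
    [0, 1, 1],
    [0, 1, 1],
]
print(mask_iou(mask_a, mask_b))
```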
- classmethod coerce(Mask, data, dims=None)
Attempts to auto-inspect the format of the data and convert to Mask
- Parameters
data – the data to coerce
dims (Tuple) – required for certain formats like polygons; the height / width of the source image
- Returns
the constructed mask object
- Return type
Mask
Example
>>> # xdoc: +REQUIRES(--mask)
>>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
>>> polygon = [
>>>     [np.array([[3, 0],[2, 1],[2, 4],[4, 4],[4, 3],[7, 0]])],
>>>     [np.array([[2, 1],[2, 2],[4, 2],[4, 1]])],
>>> ]
>>> dims = (9, 5)
>>> mask = (np.random.rand(32, 32) > .5).astype(np.uint8)
>>> Mask.coerce(polygon, dims).to_bytes_rle()
>>> Mask.coerce(segmentation).to_bytes_rle()
>>> Mask.coerce(mask).to_bytes_rle()
- to_coco(self, style='orig')
Convert the Mask to a COCO json representation based on the current format.
A COCO mask is formatted as a run-length-encoding (RLE), of which there are two variants: (1) an array RLE, which is slightly more readable and extensible, and (2) a bytes RLE, which is slightly more concise. The returned format will depend on the current format of the Mask object. If it is in “bytes_rle” format, it will be returned in that format, otherwise it will be converted to the “array_rle” format and returned as such.
- Parameters
style (str) – Does nothing for this particular method, exists for API compatibility and if alternate encoding styles are implemented in the future.
- Returns
either a bytes-rle or array-rle encoding, depending on the current mask format. The keys in this dictionary are as follows:
- counts (List[int] | str): the array or bytes rle encoding
- size (Tuple[int]): the height and width of the encoded mask, see note.
- shape (Tuple[int]): only present in array-rle mode. This is also the height/width of the underlying encoded array. This exists for semantic consistency with other kwimage conventions, and is not part of the original coco spec.
- order (str): only present in array-rle mode. Either C or F, indicating if counts is arranged in row-major or column-major order. For COCO-compatibility this is always returned in F (column-major) order.
- binary (bool): only present in array-rle mode. For COCO-compatibility this is always returned as True, indicating the mask only contains binary 0 or 1 values.
- Return type
dict
Note
The output dictionary will contain a key named “size”, this is the only location in kwimage where “size” refers to a tuple in (height/width) order, in order to be backwards compatible with the original coco spec. In all other locations in kwimage a “size” will refer to a (width/height) ordered tuple.
- SeeAlso:
kwimage.im_runlen.encode_run_length() - backend function that does array-style run length encoding.
Example
>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.demo()
>>> coco_data1 = self.toformat('array_rle').to_coco()
>>> coco_data2 = self.toformat('bytes_rle').to_coco()
>>> print('coco_data1 = {}'.format(ub.repr2(coco_data1, nl=1)))
>>> print('coco_data2 = {}'.format(ub.repr2(coco_data2, nl=1)))
coco_data1 = {
    'binary': True,
    'counts': [47, 5, 3, 1, 14, ... 1, 4, 19, 141],
    'order': 'F',
    'shape': (23, 32),
    'size': (23, 32),
}
coco_data2 = {
    'counts': '_153L;4EL...ON3060L0N060L0Nb0Y4',
    'size': [23, 32],
}
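The two RLE variants carry the same run lengths; the bytes variant only packs them into a compact string. The sketch below decodes such a string back into integer counts. It is a Python transliteration of the decoding loop in maskApi.c [2] as understood here, so treat it as an approximation of that scheme and not a reference implementation. The name `decode_coco_counts` is hypothetical. In this scheme each character carries 5 payload bits plus a continuation bit, and from the fourth count onward each value is stored as a difference from the count two positions earlier.

```python
# Sketch only: `decode_coco_counts` is a hypothetical helper that follows
# the string-decoding loop in maskApi.c as understood by this example.

def decode_coco_counts(encoded):
    """Decode a COCO bytes-RLE counts string into a list of run lengths."""
    counts = []
    pos = 0
    while pos < len(encoded):
        value = 0
        chunk = 0
        more = True
        while more:
            char = ord(encoded[pos]) - 48
            value |= (char & 0x1F) << (5 * chunk)  # 5 payload bits per char
            more = bool(char & 0x20)               # continuation flag
            pos += 1
            chunk += 1
            if not more and (char & 0x10):
                value |= -1 << (5 * chunk)         # sign-extend negatives
        if len(counts) > 2:
            # Later counts are deltas relative to the count two steps back
            value += counts[-2]
        counts.append(value)
    return counts


segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
counts = decode_coco_counts(segmentation['counts'])
height, width = segmentation['size']  # COCO "size" is (height, width)
print(counts, sum(counts) == height * width)
```

Because the runs must cover every pixel, checking that the decoded counts sum to height times width is a cheap sanity test for a decoded segmentation.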
- class kwimage.structs.mask.MaskList
Bases: kwimage.structs._generic.ObjectList
Store and manipulate multiple masks, usually within the same image
- to_polygon_list(self)
Converts all mask objects to multi-polygon objects
- Returns
kwimage.PolygonList