Welcome to kwimage’s documentation!
API Reference
This page contains auto-generated API reference documentation.
kwimage
The Kitware Image Module (kwimage) contains functions to accomplish lower-level image operations via a high-level API.
Subpackages
kwimage.algo
mkinit ~/code/kwimage/kwimage/algo/__init__.py -w --relative
Subpackages
kwimage.algo._nms_backend
kwimage.algo._nms_backend.py_nms
Fast R-CNN
Copyright (c) 2015 Microsoft
Licensed under The MIT License [see LICENSE for details]
Written by Ross Girshick
Function name | Description |
---|---|
py_nms | Pure Python NMS baseline. |
- kwimage.algo._nms_backend.py_nms.py_nms(np_ltrb, np_scores, thresh, bias=1)
Pure Python NMS baseline.
References
https://github.com/rbgirshick/fast-rcnn/blob/master/lib/utils/nms.py
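The greedy procedure that a pure-Python NMS baseline follows can be sketched in a few lines. This is an illustrative sketch, not the kwimage implementation: the names `iou_sketch` and `greedy_nms_sketch` are hypothetical, boxes are assumed to be well-formed (left <= right, top <= bottom), and the bias is assumed to be added to every width and height, as in the Fast R-CNN code referenced above.

```python
def iou_sketch(box_a, box_b, bias=0.0):
    """Intersection-over-union of two (left, top, right, bottom) boxes."""
    # Width and height of the overlap region (negative when disjoint).
    iw = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]) + bias
    ih = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]) + bias
    inter = max(iw, 0.0) * max(ih, 0.0)
    area_a = (box_a[2] - box_a[0] + bias) * (box_a[3] - box_a[1] + bias)
    area_b = (box_b[2] - box_b[0] + bias) * (box_b[3] - box_b[1] + bias)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms_sketch(boxes, scores, thresh, bias=0.0):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep box i only if no already-kept box overlaps it by more than thresh.
        if all(iou_sketch(boxes[i], boxes[j], bias) <= thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(greedy_nms_sketch(boxes, scores, thresh=0.5))  # [0, 2]
print(greedy_nms_sketch(boxes, scores, thresh=0.7))  # [0, 1, 2]
```

The first two boxes overlap with an IoU of 81/119 (about 0.68), so the lower-scoring one is dropped at thresh=0.5 and survives at thresh=0.7.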
Example
>>> import numpy as np
>>> from kwimage.algo._nms_backend.py_nms import py_nms
>>> np_ltrb = np.array([
>>>     [0, 0, 100, 100],
>>>     [0, 0, 100, 100],
>>>     [100, 100, 10, 10],
>>>     [10, 10, 100, 100],
>>>     [50, 50, 100, 100],
>>>     [100, 100, 150, 101],
>>>     [120, 100, 180, 101],
>>>     [150, 100, 200, 101],
>>> ], dtype=np.float32)
>>> np_scores = np.linspace(0, 1, len(np_ltrb))
>>> thresh = 0.1
>>> bias = 0.0
>>> keep = sorted(py_nms(np_ltrb, np_scores, thresh, bias))
>>> print('keep = {!r}'.format(keep))
keep = [2, 4, 5, 7]
Example
>>> from kwimage.algo._nms_backend.py_nms import *  # NOQA
>>> import numpy as np
>>> np_ltrb = np.array([
>>>     [0, 0, 100, 100],
>>>     [100, 100, 10, 10],
>>>     [10, 10, 100, 100],
>>>     [50, 50, 100, 100],
>>> ], dtype=np.float32)
>>> np_scores = np.array([.1, .5, .9, .1])
>>> keep = list(py_nms(np_ltrb, np_scores, thresh=0.0, bias=1.0))
>>> print('keep@0.0 = {!r}'.format(keep))
>>> keep = list(py_nms(np_ltrb, np_scores, thresh=0.2, bias=1.0))
>>> print('keep@0.2 = {!r}'.format(keep))
>>> keep = list(py_nms(np_ltrb, np_scores, thresh=0.5, bias=1.0))
>>> print('keep@0.5 = {!r}'.format(keep))
>>> keep = list(py_nms(np_ltrb, np_scores, thresh=1.0, bias=1.0))
>>> print('keep@1.0 = {!r}'.format(keep))
keep@0.0 = [2, 1]
keep@0.2 = [2, 1]
keep@0.5 = [2, 1, 3]
keep@1.0 = [2, 1, 3, 0]
kwimage.algo._nms_backend.torch_nms
Function name | Description |
---|---|
torch_nms | Non-maximum suppression implemented with PyTorch tensors |
- kwimage.algo._nms_backend.torch_nms.torch_nms(ltrb, scores, classes=None, thresh=0.5, bias=0, fast=False)
Non-maximum suppression implemented with PyTorch tensors
CURRENTLY NOT WORKING
- Parameters
ltrb (Tensor) – Bounding boxes of one image in the format (ltrb)
scores (Tensor) – Scores of each box
classes (Tensor, optional) – the classes of each box. If specified nms is applied to each class separately.
thresh (float) – iou threshold
- Returns
keep: boolean array indicating which boxes were not pruned.
- Return type
ByteTensor
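Note that torch_nms is documented as returning a boolean mask, whereas the py_nms examples print a list of kept indices. A minimal pure-Python sketch of converting between the two conventions follows; the helper names `indices_to_flags` and `flags_to_indices` are hypothetical and are not part of kwimage.

```python
def indices_to_flags(keep_indices, num_boxes):
    """Turn a list of kept indices into a per-box boolean mask."""
    flags = [False] * num_boxes
    for idx in keep_indices:
        flags[idx] = True
    return flags


def flags_to_indices(flags):
    """Turn a per-box boolean mask into a sorted list of kept indices."""
    return [idx for idx, flag in enumerate(flags) if flag]


flags = indices_to_flags([2, 1], num_boxes=4)
print(flags)                    # [False, True, True, False]
print(flags_to_indices(flags))  # [1, 2]
```

The mask form loses the score ordering that an index list can carry, so the round trip returns indices in ascending order.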
Example
>>> # DISABLE_DOCTEST
>>> # xdoctest: +REQUIRES(module:torch)
>>> import torch
>>> import numpy as np
>>> from kwimage.algo._nms_backend.torch_nms import torch_nms
>>> ltrb = torch.FloatTensor(np.array([
>>>     [0, 0, 100, 100],
>>>     [100, 100, 10, 10],
>>>     [10, 10, 100, 100],
>>>     [50, 50, 100, 100],
>>>     [100, 100, 130, 130],
>>>     [100, 100, 130, 130],
>>>     [100, 100, 130, 130],
>>> ], dtype=np.float32))
>>> scores = torch.FloatTensor(np.array([.1, .5, .9, .1, .3, .5, .4]))
>>> classes = torch.LongTensor(np.array([0, 0, 0, 0, 0, 0, 0]))
>>> thresh = .5
>>> flags = torch_nms(ltrb, scores, classes, thresh)
>>> # flags is a boolean mask; convert it to integer indices
>>> keep = torch.nonzero(flags).view(-1)
>>> ltrb[flags]
>>> ltrb[keep]
Example
>>> # DISABLE_DOCTEST
>>> # xdoctest: +REQUIRES(module:torch)
>>> import torch
>>> import numpy as np
>>> from kwimage.algo._nms_backend.torch_nms import torch_nms
>>> # Test to check that conflicts are correctly resolved
>>> ltrb = torch.FloatTensor(np.array([
>>>     [100, 100, 150, 101],
>>>     [120, 100, 180, 101],
>>>     [150, 100, 200, 101],
>>> ], dtype=np.float32))
>>> scores = torch.FloatTensor(np.linspace(.8, .9, len(ltrb)))
>>> classes = None
>>> thresh = .3
>>> keep = torch_nms(ltrb, scores, classes, thresh, fast=False)
>>> ltrb[keep]
Submodules
kwimage.algo.algo_nms
Generic Non-Maximum Suppression API with efficient backend implementations
Function name | Description |
---|---|
daq_spatial_nms | Divide-and-conquer speedup of the non-max-suppression algorithm for when bboxes have a known max size |
available_nms_impls | List available values for the impl kwarg of non_max_supression |
_heuristic_auto_nms_impl | Defined with help from ~/code/kwimage/dev/bench_nms.py |
non_max_supression | Non-Maximum Suppression - remove redundant bounding boxes |
- kwimage.algo.algo_nms.daq_spatial_nms(ltrb, scores, diameter, thresh, max_depth=6, stop_size=2048, recsize=2048, impl='auto', device_id=None)
Divide-and-conquer speedup of the non-max-suppression algorithm for when bboxes have a known max size
- Parameters
ltrb (ndarray) – boxes in (tlx, tly, brx, bry) format
scores (ndarray) – scores of each box
diameter (int or Tuple[int, int]) – Distance from split point to consider rectification. If specified as an integer, then number is used for both height and width. If specified as a tuple, then dims are assumed to be in [height, width] format.
thresh (float) – iou threshold. Boxes are removed if they overlap greater than this threshold. 0 is the most strict, resulting in the fewest boxes, and 1 is the most permissive resulting in the most.
max_depth (int) – maximum number of times we can divide and conquer
stop_size (int) – number of boxes that triggers full NMS computation
recsize (int) – number of boxes that triggers full NMS recombination
impl (str) – algorithm to use
- LookInfo:
# Didn’t read yet but it seems similar http://www.cyberneum.de/fileadmin/user_upload/files/publications/CVPR2010-Lampert_[0].pdf
https://www.researchgate.net/publication/220929789_Efficient_Non-Maximum_Suppression
# This seems very similar https://projet.liris.cnrs.fr/m2disco/pub/Congres/2006-ICPR/DATA/C03_0406.PDF
Example
>>> import kwimage
>>> import numpy as np
>>> from kwimage.algo.algo_nms import daq_spatial_nms, non_max_supression
>>> # Make a bunch of boxes with the same width and height
>>> #boxes = kwimage.Boxes.random(230397, scale=1000, format='cxywh')
>>> boxes = kwimage.Boxes.random(237, scale=1000, format='cxywh')
>>> boxes.data.T[2] = 10
>>> boxes.data.T[3] = 10
>>> #
>>> ltrb = boxes.to_ltrb().data.astype(np.float32)
>>> scores = np.arange(0, len(ltrb)).astype(np.float32)
>>> #
>>> n_megabytes = (ltrb.size * ltrb.dtype.itemsize) / (2 ** 20)
>>> print('n_megabytes = {!r}'.format(n_megabytes))
>>> #
>>> thresh = iou_thresh = 0.01
>>> impl = 'auto'
>>> max_depth = 20
>>> diameter = 10
>>> stop_size = 2000
>>> recsize = 500
>>> #
>>> import ubelt as ub
>>> #
>>> with ub.Timer(label='daq'):
>>>     keep1 = daq_spatial_nms(ltrb, scores,
>>>         diameter=diameter, thresh=thresh, max_depth=max_depth,
>>>         stop_size=stop_size, recsize=recsize, impl=impl)
>>> #
>>> with ub.Timer(label='full'):
>>>     keep2 = non_max_supression(ltrb, scores,
>>>         thresh=thresh, impl=impl)
>>> #
>>> # Due to the greedy nature of the algorithm, there will be slight
>>> # differences in results, but they will be mostly similar.
>>> similarity = len(set(keep1) & set(keep2)) / len(set(keep1) | set(keep2))
>>> print('similarity = {!r}'.format(similarity))
- kwimage.algo.algo_nms.available_nms_impls()
List available values for the impl kwarg of non_max_supression
- CommandLine:
xdoctest -m kwimage.algo.algo_nms available_nms_impls
Example
>>> from kwimage.algo.algo_nms import available_nms_impls
>>> impls = available_nms_impls()
>>> assert 'numpy' in impls
>>> print('impls = {!r}'.format(impls))
- kwimage.algo.algo_nms._heuristic_auto_nms_impl(code, num, valid=None)
Defined with help from ~/code/kwimage/dev/bench_nms.py
- Parameters
code (str) – text that indicates which type of data you have: tensor0 is a tensor on a cuda device, tensor is a tensor on the cpu, and numpy is an ndarray.
num (int) – number of boxes you have to suppress.
valid (List[str]) – the list of valid implementations. An error will be raised if the heuristic preferences do not intersect with this list.
- Ignore:
_impls._funcs
valid_pref = ub.oset(preference) & set(_impls._funcs.keys())
python ~/code/kwimage/dev/bench_nms.py --show --small-boxes --thresh=0.6
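The intersection of an ordered preference list with the set of valid implementations, as hinted at by the `valid_pref` line above, might look like the following pure-Python sketch. The function name `choose_impl_sketch` and the preference order are assumptions made for illustration; the real heuristic derives its preferences from `code` and `num` using benchmark results that are not reproduced here.

```python
def choose_impl_sketch(preference, valid):
    """Return the first preferred implementation that is also valid."""
    valid_set = set(valid)
    # Keep the preference order while dropping unavailable names.
    valid_pref = [name for name in preference if name in valid_set]
    if not valid_pref:
        raise ValueError(
            'no preferred implementation is valid: '
            'preference={!r}, valid={!r}'.format(preference, valid))
    return valid_pref[0]


# Hypothetical preference order, most preferred first.
preference = ['cython_gpu', 'torch', 'cython_cpu', 'numpy']
print(choose_impl_sketch(preference, valid=['numpy', 'cython_cpu']))  # cython_cpu
```

Raising when the intersection is empty mirrors the documented behavior of the `valid` parameter.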
- kwimage.algo.algo_nms.non_max_supression(ltrb, scores, thresh, bias=0.0, classes=None, impl='auto', device_id=None)
Non-Maximum Suppression - remove redundant bounding boxes
- Parameters
ltrb (ndarray[float32]) – Nx4 boxes in ltrb format
scores (ndarray[float32]) – score for each bbox
thresh (float) – iou threshold. Boxes are removed if their overlap is greater than this threshold (i.e. boxes are removed if iou > thresh). thresh = 0 is the most strict, resulting in the fewest boxes, and 1 is the most permissive, resulting in the most.
bias (float) – bias for iou computation, either 0 or 1
classes (ndarray[int64] or None) – integer classes. If specified, NMS is done on a per-class basis.
impl (str) – implementation can be "auto", "python", "cython_cpu", "gpu", "torch", or "torchvision". The values actually available depend on your installation; available_nms_impls() lists them.
device_id (int) – used if impl is gpu, device id to work on. If not specified torch.cuda.current_device() is used.
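The per-class behavior of the `classes` argument can be pictured as grouping boxes by class, running NMS on each group, and mapping the kept positions back to the original indices. The sketch below is only an illustration of that idea under stated assumptions: `per_class_nms_sketch` is a hypothetical name, and `keep_best_only` is a toy stand-in for a real NMS routine, not a kwimage function.

```python
from collections import defaultdict


def keep_best_only(boxes, scores):
    """Toy stand-in for a real NMS routine: keep only the top-scoring box."""
    return [max(range(len(scores)), key=lambda i: scores[i])]


def per_class_nms_sketch(boxes, scores, classes, nms_func):
    """Run nms_func separately on the boxes of each class."""
    groups = defaultdict(list)
    for idx, cls in enumerate(classes):
        groups[cls].append(idx)
    keep = []
    for idxs in groups.values():
        sub_keep = nms_func([boxes[i] for i in idxs], [scores[i] for i in idxs])
        # Map positions within the class subset back to the original indices.
        keep.extend(idxs[k] for k in sub_keep)
    return sorted(keep)


boxes = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7]
classes = [0, 0, 1]
print(per_class_nms_sketch(boxes, scores, classes, keep_best_only))  # [0, 2]
```

All three boxes are identical, yet box 2 survives because it is the only member of class 1.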
Notes
Using impl='cython_gpu' may result in a CUDA memory error that is not exposed to the Python process. In other words, your program will hard crash if impl='cython_gpu' and you feed it too many bounding boxes. Ideally this will be fixed in the future.
References
https://github.com/facebookresearch/Detectron/blob/master/detectron/utils/cython_nms.pyx
https://www.pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/
https://github.com/bharatsingh430/soft-nms/blob/master/lib/nms/cpu_nms.pyx <- TODO
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/algo/algo_nms.py non_max_supression
Example
>>> import numpy as np
>>> import kwarray
>>> from kwimage.algo.algo_nms import *
>>> from kwimage.algo.algo_nms import _impls
>>> ltrb = np.array([
>>>     [0, 0, 100, 100],
>>>     [100, 100, 10, 10],
>>>     [10, 10, 100, 100],
>>>     [50, 50, 100, 100],
>>> ], dtype=np.float32)
>>> scores = np.array([.1, .5, .9, .1])
>>> keep = non_max_supression(ltrb, scores, thresh=0.5, impl='numpy')
>>> print('keep = {!r}'.format(keep))
>>> assert list(keep) == [2, 1, 3]
>>> thresh = 0.0
>>> non_max_supression(ltrb, scores, thresh, impl='numpy')
>>> if 'numpy' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='numpy')
>>>     assert list(keep) == [2, 1]
>>> if 'cython_cpu' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='cython_cpu')
>>>     assert list(keep) == [2, 1]
>>> if 'cython_gpu' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='cython_gpu')
>>>     assert list(keep) == [2, 1]
>>> if 'torch' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='torch')
>>>     assert set(keep.tolist()) == {2, 1}
>>> if 'torchvision' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='torchvision')  # note torchvision has no bias
>>>     assert list(keep) == [2]
>>> thresh = 1.0
>>> if 'numpy' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='numpy')
>>>     assert list(keep) == [2, 1, 3, 0]
>>> if 'cython_cpu' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='cython_cpu')
>>>     assert list(keep) == [2, 1, 3, 0]
>>> if 'cython_gpu' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='cython_gpu')
>>>     assert list(keep) == [2, 1, 3, 0]
>>> if 'torch' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='torch')
>>>     assert set(keep.tolist()) == {2, 1, 3, 0}
>>> if 'torchvision' in available_nms_impls():
>>>     keep = non_max_supression(ltrb, scores, thresh, impl='torchvision')  # note torchvision has no bias
>>>     assert set(kwarray.ArrayAPI.tolist(keep)) == {2, 1, 3, 0}
Example
>>> import numpy as np
>>> import ubelt as ub
>>> from kwimage.algo.algo_nms import non_max_supression, _impls
>>> ltrb = np.array([
>>>     [0, 0, 100, 100],
>>>     [100, 100, 10, 10],
>>>     [10, 10, 100, 100],
>>>     [50, 50, 100, 100],
>>>     [100, 100, 150, 101],
>>>     [120, 100, 180, 101],
>>>     [150, 100, 200, 101],
>>> ], dtype=np.float32)
>>> scores = np.linspace(0, 1, len(ltrb))
>>> thresh = .2
>>> solutions = {}
>>> if not _impls._funcs:
>>>     _impls._lazy_init()
>>> for impl in _impls._funcs:
>>>     keep = non_max_supression(ltrb, scores, thresh, impl=impl)
>>>     solutions[impl] = sorted(keep)
>>> assert 'numpy' in solutions
>>> print('solutions = {}'.format(ub.repr2(solutions, nl=1)))
>>> assert ub.allsame(solutions.values())
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/algo/algo_nms.py non_max_supression
Example
>>> import numpy as np
>>> import ubelt as ub
>>> from kwimage.algo.algo_nms import non_max_supression, _impls
>>> # Check that zero-area boxes are ok
>>> ltrb = np.array([
>>>     [0, 0, 0, 0],
>>>     [0, 0, 0, 0],
>>>     [10, 10, 10, 10],
>>> ], dtype=np.float32)
>>> scores = np.array([1, 2, 3], dtype=np.float32)
>>> thresh = .2
>>> solutions = {}
>>> if not _impls._funcs:
>>>     _impls._lazy_init()
>>> for impl in _impls._funcs:
>>>     keep = non_max_supression(ltrb, scores, thresh, impl=impl)
>>>     solutions[impl] = sorted(keep)
>>> assert 'numpy' in solutions
>>> print('solutions = {}'.format(ub.repr2(solutions, nl=1)))
>>> assert ub.allsame(solutions.values())
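Package Contents
Function name | Description |
---|---|
available_nms_impls | List available values for the impl kwarg of non_max_supression |
daq_spatial_nms | Divide-and-conquer speedup of the non-max-suppression algorithm for when bboxes have a known max size |
non_max_supression | Non-Maximum Suppression - remove redundant bounding boxes |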
- kwimage.algo.available_nms_impls()
List available values for the impl kwarg of non_max_supression
- kwimage.algo.daq_spatial_nms(ltrb, scores, diameter, thresh, max_depth=6, stop_size=2048, recsize=2048, impl='auto', device_id=None)
Divide-and-conquer speedup of the non-max-suppression algorithm for when bboxes have a known max size
- kwimage.algo.non_max_supression(ltrb, scores, thresh, bias=0.0, classes=None, impl='auto', device_id=None)
Non-Maximum Suppression - remove redundant bounding boxes
kwimage.structs
mkinit ~/code/kwimage/kwimage/structs/__init__.py -w --relative --nomod
A common thread in many kwimage.structs / kwannot objects is that they attempt to store multiple data elements using a single data structure when possible, e.g. the classes are Boxes, Points, Detections, Coords, and not Box, Detection, Coord. The exceptions are Polygon, Heatmap, and Mask, where it made more sense to have one object per item because each individual item is a reasonably sized chunk of data.
Another commonality is that objects have only two main attributes: .data and .meta. These allow the underlying representation of the object to vary as needed.
Currently Boxes and Mask do not have a .meta attribute. They instead have a .format attribute which is a text-code indicating the underlying layout of the data.
The data and meta instance attributes in the Points, Detections, and Heatmap classes are dictionaries. These classes also have __datakeys__ and __metakeys__ class attributes, which are lists of strings. These lists specify which keys are expected in each dictionary. For instance, Points.__datakeys__ = ['xy', 'class_idxs', 'visible'] and Points.__metakeys__ = ['classes']. All objects in the data dictionary are expected to be aligned, whereas the meta dictionary is for auxiliary data. For example in Points, the xy position data['xy'][i] is expected to have the class index data['class_idxs'][i]. By convention, a class index indexes into the list of category names stored in meta['classes'].
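The alignment convention can be illustrated with plain dictionaries. This is a sketch of the idea only: it does not construct a real Points object, the category names are made up, and the helpers `describe_point` and `check_aligned` are hypothetical.

```python
# Per-item arrays: entry i of every value describes the same point.
data = {
    'xy': [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)],
    'class_idxs': [0, 2, 1],
    'visible': [True, True, False],
}
# Auxiliary information shared by all items (hypothetical category names).
meta = {'classes': ['head', 'tail', 'wing']}


def describe_point(data, meta, i):
    """Look up the position and category name of the i-th point."""
    class_name = meta['classes'][data['class_idxs'][i]]
    return data['xy'][i], class_name


def check_aligned(data):
    """True if every value in the data dictionary has the same length."""
    return len({len(value) for value in data.values()}) <= 1


print(describe_point(data, meta, 1))  # ((30.0, 40.0), 'wing')
print(check_aligned(data))            # True
```

Because the data entries are aligned, selecting a subset of items means applying the same index to every value in data while leaving meta untouched.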
Heatmap.data behaves slightly differently than Points. Its data dictionary stores different per-pixel attributes like class probability scores or offset vectors. The meta dictionary stores data like the original image dimensions (heatmaps are usually downsampled with respect to the image that they correspond to) and the transformation matrices that would warp the "data" space back onto the original image space.
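As a rough illustration of such a transformation matrix, the sketch below maps a heatmap-space coordinate back to image space with a 3x3 homogeneous transform. The key name `tf_data_to_img`, the downsampling factor of 4, and the helper `warp_point` are assumptions chosen for the example, not values taken from the library.

```python
def warp_point(matrix, xy):
    """Apply a 3x3 homogeneous transform to a single (x, y) point."""
    x, y = xy
    xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
    return (xh / w, yh / w)


# Assumed meta entry: the heatmap is 4x smaller than the image in each axis.
meta = {
    'img_dims': (480, 640),
    'tf_data_to_img': [
        [4.0, 0.0, 0.0],
        [0.0, 4.0, 0.0],
        [0.0, 0.0, 1.0],
    ],
}

print(warp_point(meta['tf_data_to_img'], (8, 5)))  # (32.0, 20.0)
```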
Note that the developer can add any extra data or meta keys that they like, but they should keep in mind that all items in data should be aligned, whereas meta can contain arbitrary information.
Subpackages
Submodules
kwimage.structs._generic
Name | Description |
---|---|
Spatial | Abstract base class defining the spatial annotation API |
ObjectList | Stores a list of potentially heterogeneous structures, where each item usually corresponds to a different object. |
_consistent_dtype_fixer | helper for ensuring out.dtype == in.dtype |
_issubclass2 | Uses string comparisons to avoid ipython reload errors. |
_isinstance2 | Uses string comparisons to avoid ipython reload errors. |
- class kwimage.structs._generic.Spatial
Bases: ubelt.NiceRepr
Abstract base class defining the spatial annotation API
- class kwimage.structs._generic.ObjectList(data, meta=None)
Bases: Spatial
Stores a list of potentially heterogeneous structures, where each item usually corresponds to a different object.
- classmethod concatenate(cls, items, axis=0)
- Parameters
items (Sequence[ObjectList]) – multiple object lists of the same type
axis (int | None) – unused, always implied to be axis 0
- Returns
combined object list
- Return type
ObjectList
Example
>>> import kwimage
>>> cls = kwimage.MaskList
>>> sub_cls = kwimage.Mask
>>> item1 = cls([sub_cls.random(), sub_cls.random()])
>>> item2 = cls([sub_cls.random()])
>>> items = [item1, item2]
>>> new = cls.concatenate(items)
>>> assert len(new) == 3
- kwimage.structs._generic._consistent_dtype_fixer(data)
helper for ensuring out.dtype == in.dtype
- kwimage.structs._generic._issubclass2(child, parent)
Uses string comparisons to avoid ipython reload errors. Much less robust though.
- kwimage.structs._generic._isinstance2(obj, cls)
Uses string comparisons to avoid ipython reload errors. Much less robust though.
Example
import kwimage
from kwimage.structs import _generic
cls = kwimage.structs._generic.ObjectList
obj = kwimage.MaskList([])
_generic._isinstance2(obj, cls)

_generic._isinstance2(kwimage.MaskList([]), _generic.ObjectList)

dets = kwimage.Detections(
    boxes=kwimage.Boxes.random(3).numpy(),
    class_idxs=[0, 1, 1],
    segmentations=kwimage.MaskList([None] * 3)
)
kwimage.structs.boxes
Vectorized Bounding Boxes
kwimage.Boxes is a tool for efficiently transporting a set of bounding boxes within Python; it also provides methods for operating on those boxes. It is a VERY thin wrapper around a pure numpy/torch array/tensor representation, and thus it is very fast.
Raw bounding boxes come in lots of different formats. There are lots of ways to parameterize two points! Because of this THE USER MUST ALWAYS BE EXPLICIT ABOUT THE BOX FORMAT.
- There are 3 main bounding box formats:
xywh: top left xy-coordinates and width height offsets
cxywh: center xy-coordinates and width height offsets
ltrb: top left and bottom right xy coordinates
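The relationship between the three formats comes down to a little arithmetic on a single box. The following pure-Python sketch shows it; the function names are hypothetical, and kwimage itself performs these conversions in a vectorized way on arrays or tensors.

```python
def xywh_to_ltrb(box):
    """(top-left x, top-left y, width, height) -> (left, top, right, bottom)."""
    x, y, w, h = box
    return (x, y, x + w, y + h)


def xywh_to_cxywh(box):
    """(top-left x, top-left y, width, height) -> (center x, center y, width, height)."""
    x, y, w, h = box
    return (x + w / 2, y + h / 2, w, h)


def ltrb_to_xywh(box):
    """(left, top, right, bottom) -> (top-left x, top-left y, width, height)."""
    left, top, right, bottom = box
    return (left, top, right - left, bottom - top)


print(xywh_to_ltrb((25, 30, 15, 10)))   # (25, 30, 40, 40)
print(xywh_to_cxywh((25, 30, 15, 10)))  # (32.5, 35.0, 15, 10)
print(ltrb_to_xywh((5, 5, 50, 50)))     # (5, 5, 45, 45)
```

The same four numbers mean different boxes under different codes, which is why the format must always be stated explicitly.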
Here is some example usage
Example
>>> import numpy as np
>>> import ubelt as ub
>>> from kwimage.structs.boxes import Boxes
>>> data = np.array([[ 0, 0, 10, 10],
>>> [ 5, 5, 50, 50],
>>> [10, 0, 20, 10],
>>> [20, 0, 30, 10]])
>>> # Note that the format of raw data is ambiguous, so you must specify
>>> boxes = Boxes(data, 'ltrb')
>>> print('boxes = {!r}'.format(boxes))
boxes = <Boxes(ltrb,
array([[ 0, 0, 10, 10],
[ 5, 5, 50, 50],
[10, 0, 20, 10],
[20, 0, 30, 10]]))>
>>> # Now you can operate on those boxes easily
>>> print(boxes.translate((10, 10)))
<Boxes(ltrb,
array([[10., 10., 20., 20.],
[15., 15., 60., 60.],
[20., 10., 30., 20.],
[30., 10., 40., 20.]]))>
>>> print(boxes.to_cxywh())
<Boxes(cxywh,
array([[ 5. , 5. , 10. , 10. ],
[27.5, 27.5, 45. , 45. ],
[15. , 5. , 10. , 10. ],
[25. , 5. , 10. , 10. ]]))>
>>> print(ub.repr2(boxes.ious(boxes), precision=2, with_dtype=False))
np.array([[1. , 0.01, 0. , 0. ],
[0.01, 1. , 0.02, 0.02],
[0. , 0.02, 1. , 0. ],
[0. , 0.02, 0. , 1. ]])
Name | Description |
---|---|
Boxes | Converts boxes between different formats as long as the last dimension contains 4 coordinates and the format is specified. |
- class kwimage.structs.boxes.Boxes(data, format=None, check=True)
Bases: _BoxConversionMixins, _BoxPropertyMixins, _BoxTransformMixins, _BoxDrawMixins, ubelt.NiceRepr
Converts boxes between different formats as long as the last dimension contains 4 coordinates and the format is specified.
This is a convenience class, and should not store the data for very long. The general idiom should be: create the class, convert the data, then get the raw data and let the class be garbage collected. This will help ensure that your code is portable and understandable if this class is not available.
Example
>>> # xdoctest: +IGNORE_WHITESPACE
>>> import kwimage
>>> import numpy as np
>>> import ubelt as ub
>>> # Given an array / tensor that represents one or more boxes
>>> data = np.array([[ 0, 0, 10, 10],
>>>                  [ 5, 5, 50, 50],
>>>                  [20, 0, 30, 10]])
>>> # The kwimage.Boxes data structure is a thin fast wrapper
>>> # that provides methods for operating on the boxes.
>>> # It requires that the user explicitly provide a code that denotes
>>> # the format of the boxes (i.e. what each column represents)
>>> boxes = kwimage.Boxes(data, 'ltrb')
>>> # This means that there is no ambiguity about box format
>>> # The representation string of the Boxes object demonstrates this
>>> print('boxes = {!r}'.format(boxes))
boxes = <Boxes(ltrb,
    array([[ 0, 0, 10, 10],
           [ 5, 5, 50, 50],
           [20, 0, 30, 10]]))>
>>> # if you pass this data around. You can convert to other formats
>>> # For docs on available format codes see :class:`BoxFormat`.
>>> # In this example we will convert (left, top, right, bottom)
>>> # to (left-x, top-y, width, height).
>>> boxes.toformat('xywh')
<Boxes(xywh,
    array([[ 0, 0, 10, 10],
           [ 5, 5, 45, 45],
           [20, 0, 10, 10]]))>
>>> # In addition to format conversion there are other operations
>>> # We can quickly (using a C-backend) find IoUs
>>> ious = boxes.ious(boxes)
>>> print('{}'.format(ub.repr2(ious, nl=1, precision=2, with_dtype=False)))
np.array([[1. , 0.01, 0. ],
          [0.01, 1. , 0.02],
          [0. , 0.02, 1. ]])
>>> # We can ask for the area of each box
>>> print('boxes.area = {}'.format(ub.repr2(boxes.area, nl=0, with_dtype=False)))
boxes.area = np.array([[ 100],[2025],[ 100]])
>>> # We can ask for the center of each box
>>> print('boxes.center = {}'.format(ub.repr2(boxes.center, nl=1, with_dtype=False)))
boxes.center = (
    np.array([[ 5. ],[27.5],[25. ]]),
    np.array([[ 5. ],[27.5],[ 5. ]]),
)
>>> # We can translate / scale the boxes
>>> boxes.translate((10, 10)).scale(100)
<Boxes(ltrb,
    array([[1000., 1000., 2000., 2000.],
           [1500., 1500., 6000., 6000.],
           [3000., 1000., 4000., 2000.]]))>
>>> # We can clip the bounding boxes
>>> boxes.translate((10, 10)).scale(100).clip(1200, 1200, 1700, 1800)
<Boxes(ltrb,
    array([[1200., 1200., 1700., 1800.],
           [1500., 1500., 1700., 1800.],
           [1700., 1200., 1700., 1800.]]))>
>>> # We can perform arbitrary warping of the boxes
>>> # (note that if the transform is not axis aligned, the axis aligned
>>> # bounding box of the transform result will be returned)
>>> transform = np.array([[-0.83907153, 0.54402111, 0. ],
>>>                       [-0.54402111, -0.83907153, 0. ],
>>>                       [ 0. , 0. , 1. ]])
>>> boxes.warp(transform)
<Boxes(ltrb,
    array([[ -8.3907153 , -13.8309264 , 5.4402111 , 0. ],
           [-39.23347095, -69.154632 , 23.00569785, -6.9154632 ],
           [-25.1721459 , -24.7113486 , -11.3412195 , -10.8804222 ]]))>
>>> # Note, that we can transform the box to a Polygon for more
>>> # accurate warping.
>>> transform = np.array([[-0.83907153, 0.54402111, 0. ],
>>>                       [-0.54402111, -0.83907153, 0. ],
>>>                       [ 0. , 0. , 1. ]])
>>> warped_polys = boxes.to_polygons().warp(transform)
>>> print(ub.repr2(warped_polys.data, sv=1))
[
    <Polygon({
        'exterior': <Coords(data=
            array([[ 0. , 0. ],
                   [ 5.4402111, -8.3907153],
                   [ -2.9505042, -13.8309264],
                   [ -8.3907153, -5.4402111],
                   [ 0. , 0. ]]))>,
        'interiors': [],
    })>,
    <Polygon({
        'exterior': <Coords(data=
            array([[ -1.4752521 , -6.9154632 ],
                   [ 23.00569785, -44.67368205],
                   [-14.752521 , -69.154632 ],
                   [-39.23347095, -31.39641315],
                   [ -1.4752521 , -6.9154632 ]]))>,
        'interiors': [],
    })>,
    <Polygon({
        'exterior': <Coords(data=
            array([[-16.7814306, -10.8804222],
                   [-11.3412195, -19.2711375],
                   [-19.7319348, -24.7113486],
                   [-25.1721459, -16.3206333],
                   [-16.7814306, -10.8804222]]))>,
        'interiors': [],
    })>,
]
>>> # The kwimage.Boxes data structure is also convertible to
>>> # several alternative data structures, like shapely, coco, and imgaug.
>>> print(ub.repr2(boxes.to_shapely(), sv=1))
[
    POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0)),
    POLYGON ((5 5, 5 50, 50 50, 50 5, 5 5)),
    POLYGON ((20 0, 20 10, 30 10, 30 0, 20 0)),
]
>>> # xdoctest: +REQUIRES(module:imgaug)
>>> print(ub.repr2(boxes[0:1].to_imgaug(shape=(100, 100)), sv=1))
BoundingBoxesOnImage([BoundingBox(x1=0.0000, y1=0.0000, x2=10.0000, y2=10.0000, label=None)], shape=(100, 100))
>>> # xdoctest: -REQUIRES(module:imgaug)
>>> print(ub.repr2(list(boxes.to_coco()), sv=1))
[
    [0, 0, 10, 10],
    [5, 5, 45, 45],
    [20, 0, 10, 10],
]
>>> # Finally, when you are done with your boxes object, you can
>>> # unwrap the raw data by using the ``.data`` attribute
>>> # all operations are done on this data, which gives the
>>> # kwimage.Boxes data structure almost no overhead when
>>> # inserted into existing code.
>>> print('boxes.data =\n{}'.format(ub.repr2(boxes.data, nl=1)))
boxes.data =
np.array([[ 0, 0, 10, 10],
          [ 5, 5, 50, 50],
          [20, 0, 30, 10]], dtype=np.int64)
>>> # xdoctest: +REQUIRES(module:torch)
>>> # This data structure was designed for use with both torch
>>> # and numpy, the underlying data can be either an array or tensor.
>>> boxes.tensor()
<Boxes(ltrb,
    tensor([[ 0, 0, 10, 10],
            [ 5, 5, 50, 50],
            [20, 0, 30, 10]]))>
>>> boxes.numpy()
<Boxes(ltrb,
    array([[ 0, 0, 10, 10],
           [ 5, 5, 50, 50],
           [20, 0, 30, 10]]))>
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> # Demo of conversion methods >>> import kwimage >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh') <Boxes(xywh, array([[25, 30, 15, 10]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_xywh() <Boxes(xywh, array([[25, 30, 15, 10]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_cxywh() <Boxes(cxywh, array([[32.5, 35. , 15. , 10. ]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_ltrb() <Boxes(ltrb, array([[25, 30, 40, 40]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').scale(2).to_ltrb() <Boxes(ltrb, array([[50., 60., 80., 80.]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> kwimage.Boxes(torch.FloatTensor([[25, 30, 15, 20]]), 'xywh').scale(.1).to_ltrb() <Boxes(ltrb, tensor([[ 2.5000, 3.0000, 4.0000, 5.0000]]))>
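The format conversions in the example above amount to simple arithmetic on the four box components. The following is a minimal plain-Python sketch of that arithmetic, not kwimage's implementation; the helper names `xywh_to_ltrb` and `xywh_to_cxywh` are hypothetical and exist only for illustration.

```python
def xywh_to_ltrb(box):
    """Convert (x, y, width, height) to (left, top, right, bottom)."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xywh_to_cxywh(box):
    """Convert (x, y, width, height) to (center_x, center_y, width, height)."""
    x, y, w, h = box
    return [x + w / 2, y + h / 2, w, h]


print(xywh_to_ltrb([25, 30, 15, 10]))   # [25, 30, 40, 40]
print(xywh_to_cxywh([25, 30, 15, 10]))  # [32.5, 35.0, 15, 10]
```

The printed values agree with the `to_ltrb()` and `to_cxywh()` outputs shown in the doctest.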
Notes
In the following examples we show cases where Boxes can hold a single 1-dimensional box array. This is a holdover from an older codebase, and some functions may assume that the input is at least 2-D. Thus, when representing a single bounding box, it is best practice to view it as a list of 1 box. While many functions will work in the 1-D case, not all functions have been tested, and thus we cannot guarantee correctness.
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes([25, 30, 15, 10], 'xywh') <Boxes(xywh, array([25, 30, 15, 10]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_xywh() <Boxes(xywh, array([25, 30, 15, 10]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_cxywh() <Boxes(cxywh, array([32.5, 35. , 15. , 10. ]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_ltrb() <Boxes(ltrb, array([25, 30, 40, 40]))> >>> Boxes([25, 30, 15, 10], 'xywh').scale(2).to_ltrb() <Boxes(ltrb, array([50., 60., 80., 80.]))> >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes(torch.FloatTensor([[25, 30, 15, 20]]), 'xywh').scale(.1).to_ltrb() <Boxes(ltrb, tensor([[ 2.5000, 3.0000, 4.0000, 5.0000]]))>
Example
>>> datas = [ >>> [1, 2, 3, 4], >>> [[1, 2, 3, 4], [4, 5, 6, 7]], >>> [[[1, 2, 3, 4], [4, 5, 6, 7]]], >>> ] >>> formats = BoxFormat.cannonical >>> for format1 in formats: >>> for data in datas: >>> self = box1 = Boxes(data, format1) >>> for format2 in formats: >>> box2 = box1.toformat(format2) >>> back = box2.toformat(format1) >>> assert box1 == back
- __eq__(self, other)[source]¶
Tests equality of two Boxes objects
Example
>>> box0 = box1 = Boxes([[1, 2, 3, 4]], 'xywh') >>> box2 = Boxes(box0.data, 'ltrb') >>> box3 = Boxes([[0, 2, 3, 4]], box0.format) >>> box4 = Boxes(box0.data, box2.format) >>> assert box0 == box1 >>> assert not box0 == box2 >>> assert not box2 == box3 >>> assert box2 == box4
- classmethod random(Boxes, num=1, scale=1.0, format=BoxFormat.XYWH, anchors=None, anchor_std=1.0 / 6, tensor=False, rng=None)[source]¶
Makes random boxes; typically for testing purposes
- Parameters
num (int) – number of boxes to generate
scale (float | Tuple[float, float]) – size of imgdims
format (str) – format of boxes to be created (e.g. ltrb, xywh)
anchors (ndarray) – normalized width / heights of anchor boxes to perturb and randomly place. (must be in range 0-1)
anchor_std (float) – magnitude of noise applied to anchor shapes
tensor (bool) – if True, returns boxes in tensor format
rng (None | int | RandomState) – initial random seed
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes.random(3, rng=0, scale=100) <Boxes(xywh, array([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes.random(3, rng=0, scale=100).tensor() <Boxes(xywh, tensor([[ 54, 54, 6, 17], [ 42, 64, 1, 25], [ 79, 38, 17, 14]]))> >>> anchors = np.array([[.5, .5], [.3, .3]]) >>> Boxes.random(3, rng=0, scale=100, anchors=anchors) <Boxes(xywh, array([[ 2, 13, 51, 51], [32, 51, 32, 36], [36, 28, 23, 26]]))>
Example
>>> # Boxes position/shape within 0-1 space should be uniform. >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> fig = kwplot.figure(fnum=1, doclf=True) >>> fig.gca().set_xlim(0, 128) >>> fig.gca().set_ylim(0, 128) >>> import kwimage >>> kwimage.Boxes.random(num=10).scale(128).draw()
- classmethod concatenate(cls, boxes, axis=0)[source]¶
Concatenates multiple boxes together
- Parameters
boxes (Sequence[Boxes]) – list of boxes to concatenate
axis (int, default=0) – axis to stack on
- Returns
stacked boxes
- Return type
Example
>>> boxes = [Boxes.random(3) for _ in range(3)] >>> new = Boxes.concatenate(boxes) >>> assert len(new) == 9 >>> assert np.all(new.data[3:6] == boxes[1].data)
Example
>>> boxes = [Boxes.random(3) for _ in range(3)] >>> boxes[0].data = boxes[0].data[0] >>> boxes[1].data = boxes[0].data[0:0] >>> new = Boxes.concatenate(boxes) >>> assert len(new) == 4 >>> # xdoctest: +REQUIRES(module:torch) >>> new = Boxes.concatenate([b.tensor() for b in boxes]) >>> assert len(new) == 4
- compress(self, flags, axis=0, inplace=False)[source]¶
Filters boxes based on a boolean criterion
- Parameters
flags (ArrayLike[bool]) – true for items to be kept
axis (int) – you usually want this to be 0
inplace (bool) – if True, modifies this object
Example
>>> self = Boxes([[25, 30, 15, 10]], 'ltrb') >>> self.compress([True]) <Boxes(ltrb, array([[25, 30, 15, 10]]))> >>> self.compress([False]) <Boxes(ltrb, array([], shape=(0, 4), dtype=int64))>
- take(self, idxs, axis=0, inplace=False)[source]¶
Takes a subset of items at specific indices
- Parameters
idxs (ArrayLike[int]) – indices of items to take
axis (int) – you usually want this to be 0
inplace (bool) – if True, modifies this object
Example
>>> self = Boxes([[25, 30, 15, 10]], 'ltrb') >>> self.take([0]) <Boxes(ltrb, array([[25, 30, 15, 10]]))> >>> self.take([]) <Boxes(ltrb, array([], shape=(0, 4), dtype=int64))>
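The difference between `compress` (boolean mask) and `take` (integer indices) can be sketched with plain lists. This is an illustrative sketch only; the free functions below are hypothetical stand-ins and do not reproduce the `axis` or `inplace` behaviour of the real methods.

```python
def compress(rows, flags):
    """Keep the rows whose flag is true (boolean-mask selection)."""
    return [row for row, flag in zip(rows, flags) if flag]


def take(rows, idxs):
    """Pick rows by integer position; reordering and repeats are allowed."""
    return [rows[i] for i in idxs]


boxes = [[0, 0, 10, 10], [5, 5, 50, 50], [20, 0, 30, 10]]
kept = compress(boxes, [True, False, True])  # rows 0 and 2
picked = take(boxes, [2, 0])                 # row 2, then row 0
```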
- _impl(self)[source]¶
returns the kwarray.ArrayAPI implementation for the data
Example
>>> assert Boxes.random().numpy()._impl.is_numpy >>> # xdoctest: +REQUIRES(module:torch) >>> assert Boxes.random().tensor()._impl.is_tensor
- astype(self, dtype)[source]¶
Changes the type of the internal array used to represent the boxes
Notes
this operation is not inplace
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes.random(3, 100, rng=0).tensor().astype('int32') <Boxes(xywh, tensor([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]], dtype=torch.int32))> >>> Boxes.random(3, 100, rng=0).numpy().astype('int32') <Boxes(xywh, array([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]], dtype=int32))> >>> Boxes.random(3, 100, rng=0).tensor().astype('float32') >>> Boxes.random(3, 100, rng=0).numpy().astype('float32')
- round(self, inplace=False)[source]¶
Rounds data coordinates to the nearest integer.
This operation is applied directly to the box coordinates, so its output will depend on the format the boxes are stored in.
- Parameters
inplace (bool, default=False) – if True, modifies this object
- SeeAlso:
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0).scale(10) >>> new = self.round() >>> print('self = {!r}'.format(self)) >>> print('new = {!r}'.format(new)) self = <Boxes(xywh, array([[5.48813522, 5.44883192, 0.53949833, 1.70306146], [4.23654795, 6.4589411 , 0.13932407, 2.45878875], [7.91725039, 3.83441508, 1.71937704, 1.45453393]]))> new = <Boxes(xywh, array([[5., 5., 1., 2.], [4., 6., 0., 2.], [8., 4., 2., 1.]]))>
- quantize(self, inplace=False, dtype=np.int32)[source]¶
Converts the box to integer coordinates.
This operation takes the floor of the left side and the ceil of the right side. Thus the area of the box will never decrease.
- Parameters
inplace (bool, default=False) – if True, modifies this object
dtype (type) – type to cast as
- SeeAlso:
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0).scale(10) >>> new = self.quantize() >>> print('self = {!r}'.format(self)) >>> print('new = {!r}'.format(new)) self = <Boxes(xywh, array([[5.48813522, 5.44883192, 0.53949833, 1.70306146], [4.23654795, 6.4589411 , 0.13932407, 2.45878875], [7.91725039, 3.83441508, 1.71937704, 1.45453393]]))> new = <Boxes(xywh, array([[5, 5, 2, 3], [4, 6, 1, 3], [7, 3, 3, 3]], dtype=int32))>
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0) >>> orig = self.copy() >>> self.quantize(inplace=True) >>> assert np.any(self.data != orig.data)
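To make the contrast between `round` and `quantize` concrete, here is a small plain-Python sketch for a single box in xywh format. It assumes that quantization floors the left/top edge and ceils the right/bottom edge, which is consistent with the outputs shown above; the helper names are hypothetical and this is not the library's code.

```python
import math


def round_xywh(box):
    """Round each stored component independently (format dependent)."""
    return [round(v) for v in box]


def quantize_xywh(box):
    """Floor the left/top edge and ceil the right/bottom edge."""
    x, y, w, h = box
    left, top = math.floor(x), math.floor(y)
    right, bottom = math.ceil(x + w), math.ceil(y + h)
    # Convert the integer edges back to xywh; the box can only grow.
    return [left, top, right - left, bottom - top]


box = [5.48813522, 5.44883192, 0.53949833, 1.70306146]
print(round_xywh(box))     # [5, 5, 1, 2]
print(quantize_xywh(box))  # [5, 5, 2, 3]
```

Rounding can shrink a box (a width of 0.14 rounds to 0), whereas the quantized box always covers the original one.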
- numpy(self)[source]¶
Converts tensors to numpy. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(3).tensor() >>> newself = self.numpy() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- tensor(self, device=ub.NoParam)[source]¶
Converts numpy to tensors. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(3) >>> # xdoctest: +REQUIRES(module:torch) >>> newself = self.tensor() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- ious(self, other, bias=0, impl='auto', mode=None)[source]¶
Intersection over union.
Compute IOUs (intersection area over union area) between these boxes and another set of boxes. This is a symmetric measure of similarity between boxes.
Todo
- [ ] Add pairwise flag to toggle between one-vs-one and all-vs-all computation, i.e. add an option for componentwise calculation.
- Parameters
other (Boxes) – boxes to compare IoUs against
bias (int, default=0) – either 0 or 1, does TL=BR have area of 0 or 1?
impl (str, default=’auto’) – code to specify the implementation used to compute IoUs. Can be either torch, py, c, or auto. Efficiency and the exact result will vary by implementation, but they will always be close. Some implementations only accept certain data types (e.g. impl=’c’ only accepts float32 numpy arrays). See ~/code/kwimage/dev/bench_bbox.py for benchmark details. On my system the torch impl was fastest (when the data was on the GPU).
mode – deprecated, use impl
- SeeAlso:
iooas - for a measure of coverage between boxes
Examples
>>> import kwimage >>> self = kwimage.Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = kwimage.Boxes(np.array([6, 2, 20, 10]), 'ltrb') >>> overlaps = self.ious(other, bias=1).round(2) >>> assert np.all(np.isclose(overlaps, [0.21, 0.63, 0.04])), repr(overlaps)
Examples
>>> import kwimage >>> boxes1 = kwimage.Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = kwimage.Boxes(np.array([[6, 2, 20, 10], >>> [100, 200, 300, 300]]), 'ltrb') >>> overlaps = boxes1.ious(other) >>> print('{}'.format(ub.repr2(overlaps, precision=2, nl=1))) np.array([[0.18, 0. ], [0.61, 0. ], [0. , 0. ]]...)
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes(np.empty(0), 'xywh').ious(Boxes(np.empty(4), 'xywh')).shape (0,) >>> #Boxes(np.empty(4), 'xywh').ious(Boxes(np.empty(0), 'xywh')).shape >>> Boxes(np.empty((0, 4)), 'xywh').ious(Boxes(np.empty((0, 4)), 'xywh')).shape (0, 0) >>> Boxes(np.empty((1, 4)), 'xywh').ious(Boxes(np.empty((0, 4)), 'xywh')).shape (1, 0) >>> Boxes(np.empty((0, 4)), 'xywh').ious(Boxes(np.empty((1, 4)), 'xywh')).shape (0, 1)
Examples
>>> # xdoctest: +REQUIRES(module:torch)
>>> import torch
>>> formats = BoxFormat.cannonical
>>> istensors = [False, True]
>>> results = {}
>>> for format in formats:
>>>     for tensor in istensors:
>>>         boxes1 = Boxes.random(5, scale=10.0, rng=0, format=format, tensor=tensor)
>>>         boxes2 = Boxes.random(7, scale=10.0, rng=1, format=format, tensor=tensor)
>>>         ious = boxes1.ious(boxes2)
>>>         results[(format, tensor)] = ious
>>> results = {k: v.numpy() if torch.is_tensor(v) else v for k, v in results.items() }
>>> results = {k: v.tolist() for k, v in results.items()}
>>> print(ub.repr2(results, sk=True, precision=3, nl=2))
>>> from functools import partial
>>> assert ub.allsame(results.values(), partial(np.allclose, atol=1e-07))
- Ignore:
>>> # does this work with backprop? >>> # xdoctest: +REQUIRES(module:torch) >>> import torch >>> import kwimage >>> num = 1000 >>> true_boxes = kwimage.Boxes.random(num).tensor() >>> inputs = torch.rand(num, 10) >>> regress = torch.nn.Linear(10, 4) >>> energy = regress(inputs) >>> energy.retain_grad() >>> outputs = energy.sigmoid() >>> outputs.retain_grad() >>> out_boxes = kwimage.Boxes(outputs, 'cxywh') >>> ious = out_boxes.ious(true_boxes) >>> loss = ious.sum() >>> loss.backward()
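The role of the `bias` parameter in the IoU computation can be seen in a one-vs-one sketch for boxes in ltrb format. This is a plain-Python illustration, not one of the torch/py/c implementations selected by `impl`; the function name `iou_ltrb` is hypothetical, and returning 0.0 for an empty union is an assumption of this sketch.

```python
def iou_ltrb(box1, box2, bias=0):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    l1, t1, r1, b1 = box1
    l2, t2, r2, b2 = box2
    # With bias=1 a box whose top-left equals its bottom-right has area 1.
    area1 = (r1 - l1 + bias) * (b1 - t1 + bias)
    area2 = (r2 - l2 + bias) * (b2 - t2 + bias)
    # Overlap extent along each axis, clamped at zero when disjoint.
    iw = max(0, min(r1, r2) - max(l1, l2) + bias)
    ih = max(0, min(b1, b2) - max(t1, t2) + bias)
    isect = iw * ih
    union = area1 + area2 - isect
    return isect / union if union > 0 else 0.0  # assumption: empty union -> 0


boxes = [[0, 0, 10, 10], [10, 0, 20, 10], [20, 0, 30, 10]]
other = [6, 2, 20, 10]
print([round(iou_ltrb(b, other, bias=1), 2) for b in boxes])  # [0.21, 0.63, 0.04]
print([round(iou_ltrb(b, other, bias=0), 2) for b in boxes])  # [0.18, 0.61, 0.0]
```

Both printed rows agree with the values in the doctests above; an all-vs-all matrix is the same computation applied to every pair.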
- iooas(self, other, bias=0)[source]¶
Intersection over other area.
This is an asymmetric measure of coverage: how much of the “other” boxes is covered by these boxes. It is the ratio of the area of intersection between each pair of boxes to the area of the “other” boxes.
- SeeAlso:
ious - for a measure of similarity between boxes
- Parameters
other (Boxes) – boxes to compare IoOA against
bias (int, default=0) – either 0 or 1, does TL=BR have area of 0 or 1?
Examples
>>> self = Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = Boxes(np.array([[6, 2, 20, 10], [0, 0, 0, 3]]), 'xywh') >>> coverage = self.iooas(other, bias=0).round(2) >>> print('coverage = {!r}'.format(coverage))
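The asymmetry of this measure is easiest to see on a pair of nested boxes. The sketch below is plain Python for two boxes in ltrb format; the name `iooa_ltrb` is hypothetical, and returning NaN when the other box has zero area is an assumption of this sketch rather than documented behaviour.

```python
def iooa_ltrb(box, other, bias=0):
    """Fraction of `other` that is covered by `box` (both in ltrb form)."""
    l1, t1, r1, b1 = box
    l2, t2, r2, b2 = other
    iw = max(0, min(r1, r2) - max(l1, l2) + bias)
    ih = max(0, min(b1, b2) - max(t1, t2) + bias)
    other_area = (r2 - l2 + bias) * (b2 - t2 + bias)
    # Assumption: an empty `other` box has undefined coverage.
    return (iw * ih) / other_area if other_area > 0 else float('nan')


big = [0, 0, 10, 10]
small = [2, 2, 4, 4]
print(iooa_ltrb(big, small))  # 1.0, the small box is fully covered
print(iooa_ltrb(small, big))  # 0.04, only 4 of 100 area units are covered
```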
- isect_area(self, other, bias=0)[source]¶
Intersection part of intersection over union computation
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> self = Boxes.random(5, scale=10.0, rng=0, format='ltrb') >>> other = Boxes.random(3, scale=10.0, rng=1, format='ltrb') >>> isect = self.isect_area(other, bias=0) >>> ious_v1 = isect / ((self.area + other.area.T) - isect) >>> ious_v2 = self.ious(other, bias=0) >>> assert np.allclose(ious_v1, ious_v2)
- intersection(self, other)[source]¶
Componentwise intersection between two sets of Boxes
The intersection of two axis-aligned boxes is always an axis-aligned box, so the result can be stored as Boxes.
- Returns
intersected boxes
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> other = self.translate(1) >>> new = self.intersection(other) >>> new_area = np.nan_to_num(new.area).ravel() >>> alt_area = np.diag(self.isect_area(other)) >>> close = np.isclose(new_area, alt_area) >>> assert np.all(close)
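A componentwise intersection of two ltrb boxes takes the larger of the two left/top edges and the smaller of the two right/bottom edges. The sketch below is illustrative plain Python with a hypothetical function name. The doctest's use of `np.nan_to_num` suggests that non-overlapping pairs are marked with NaN in kwimage; this sketch returns `None` instead, which is an assumption made for simplicity.

```python
def intersect_ltrb(box1, box2):
    """Intersection of two (left, top, right, bottom) boxes, or None."""
    left = max(box1[0], box2[0])
    top = max(box1[1], box2[1])
    right = min(box1[2], box2[2])
    bottom = min(box1[3], box2[3])
    if right < left or bottom < top:
        return None  # the boxes do not overlap
    return [left, top, right, bottom]


print(intersect_ltrb([0, 0, 10, 10], [5, 5, 15, 15]))  # [5, 5, 10, 10]
print(intersect_ltrb([0, 0, 1, 1], [2, 2, 3, 3]))      # None
```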
- union_hull(self, other)[source]¶
Componentwise hull union between two sets of Boxes
NOTE: convert to polygon to do a real union.
- Returns
unioned boxes
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> other = self.translate(1) >>> new = self.union_hull(other) >>> new_area = np.nan_to_num(new.area).ravel()
- bounding_box(self)[source]¶
Returns the box that bounds all of the contained boxes
- Returns
a single box
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE
>>> from kwimage.structs.boxes import * # NOQA
>>> self = Boxes.random(5, rng=0).scale(10.)
>>> new = self.bounding_box()
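The hull of a pair of boxes and the bounding box of a whole set are both min/max reductions over ltrb edges. A minimal plain-Python sketch under that reading follows; the function names are hypothetical and the real methods return Boxes objects rather than lists.

```python
def union_hull_ltrb(box1, box2):
    """Smallest axis-aligned box containing both input boxes."""
    return [min(box1[0], box2[0]), min(box1[1], box2[1]),
            max(box1[2], box2[2]), max(box1[3], box2[3])]


def bounding_box_ltrb(boxes):
    """Smallest axis-aligned box containing every box in the list."""
    return [min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes)]


boxes = [[0, 0, 10, 10], [5, 5, 50, 50], [20, 0, 30, 10]]
print(union_hull_ltrb(boxes[0], boxes[2]))  # [0, 0, 30, 10]
print(bounding_box_ltrb(boxes))             # [0, 0, 50, 50]
```

Note that the hull of two disjoint boxes also covers the gap between them, which is why a true union requires converting to polygons.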
- contains(self, other)[source]¶
Determine if points are completely contained by these boxes
- Parameters
other (Points) – points to test for containment. TODO: support generic data types
- Returns
flags – N x M boolean matrix indicating which box contains which points, where N is the number of boxes and M is the number of points.
- Return type
ArrayLike
Examples
>>> import kwimage >>> self = kwimage.Boxes.random(10).scale(10).round() >>> other = kwimage.Points.random(10).scale(10).round() >>> flags = self.contains(other) >>> flags = self.contains(self.xy_center) >>> assert np.all(np.diag(flags))
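The shape of the returned matrix can be illustrated with nested lists: one row per box, one column per point. This is a plain-Python sketch with a hypothetical free function; treating points on the box boundary as contained is an assumption of the sketch.

```python
def contains(boxes, points):
    """N x M matrix: entry [i][j] is True if box i contains point j."""
    return [
        [(left <= x <= right) and (top <= y <= bottom) for (x, y) in points]
        for (left, top, right, bottom) in boxes
    ]


boxes = [[0, 0, 10, 10], [20, 0, 30, 10]]
points = [(5, 5), (25, 5), (100, 100)]
flags = contains(boxes, points)
# flags has 2 rows (boxes) and 3 columns (points)
```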
- view(self, *shape)[source]¶
Passthrough method to view or reshape
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(6, scale=10.0, rng=0, format='xywh').tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4] >>> self = Boxes.random(6, scale=10.0, rng=0, format='ltrb').tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4]
kwimage.structs.coords¶
Coordinates are the fundamental “point” datatype. They do not contain metadata, only geometry. See the Points data type for a structure that maintains metadata on top of coordinate data.
A data structure to store n-dimensional coordinate geometry. |
- class kwimage.structs.coords.Coords(data=None, meta=None)[source]¶
Bases: kwimage.structs._generic.Spatial, ubelt.NiceRepr
A data structure to store n-dimensional coordinate geometry.
Currently it is up to the user to maintain what coordinate system this geometry belongs to.
Note
This class was designed to hold coordinates in r/c format, but in general this class is agnostic to dimension ordering as long as you are consistent. However, there are two places where this matters:
(1) drawing and (2) gdal/imgaug-warping. In these places we will assume x/y for legacy reasons. This may change in the future.
The term axes with respect to Coords always refers to the final numpy axis. In other words, the final numpy-axis represents ALL of the coordinate-axes.
- CommandLine:
xdoctest -m kwimage.structs.coords Coords
Example
>>> from kwimage.structs.coords import * # NOQA >>> import kwarray >>> rng = kwarray.ensure_rng(0) >>> self = Coords.random(num=4, dim=3, rng=rng) >>> print('self = {}'.format(self)) self = <Coords(data= array([[0.5488135 , 0.71518937, 0.60276338], [0.54488318, 0.4236548 , 0.64589411], [0.43758721, 0.891773 , 0.96366276], [0.38344152, 0.79172504, 0.52889492]]))> >>> matrix = rng.rand(4, 4) >>> self.warp(matrix) <Coords(data= array([[0.71037426, 1.25229659, 1.39498435], [0.60799503, 1.26483447, 1.42073131], [0.72106004, 1.39057144, 1.38757508], [0.68384299, 1.23914654, 1.29258196]]))> >>> self.translate(3, inplace=True) <Coords(data= array([[3.5488135 , 3.71518937, 3.60276338], [3.54488318, 3.4236548 , 3.64589411], [3.43758721, 3.891773 , 3.96366276], [3.38344152, 3.79172504, 3.52889492]]))> >>> self.translate(3, inplace=True) <Coords(data= array([[6.5488135 , 6.71518937, 6.60276338], [6.54488318, 6.4236548 , 6.64589411], [6.43758721, 6.891773 , 6.96366276], [6.38344152, 6.79172504, 6.52889492]]))> >>> self.scale(2) <Coords(data= array([[13.09762701, 13.43037873, 13.20552675], [13.08976637, 12.8473096 , 13.29178823], [12.87517442, 13.783546 , 13.92732552], [12.76688304, 13.58345008, 13.05778984]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> self.tensor() >>> self.tensor().tensor().numpy().numpy() >>> self.numpy() >>> #self.draw_on()
- classmethod random(Coords, num=1, dim=2, rng=None, meta=None)[source]¶
Makes random coordinates; typically for testing purposes
- compress(self, flags, axis=0, inplace=False)[source]¶
Filters items based on a boolean criterion
- Parameters
flags (ArrayLike[bool]) – true for items to be kept
axis (int) – you usually want this to be 0
inplace (bool, default=False) – if True, modifies this object
- Returns
filtered coords
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> self.compress([True] * len(self)) >>> self.compress([False] * len(self)) <Coords(data=array([], shape=(0, 2), dtype=float64))> >>> # xdoctest: +REQUIRES(module:torch) >>> self = self.tensor() >>> self.compress([True] * len(self)) >>> self.compress([False] * len(self))
- take(self, indices, axis=0, inplace=False)[source]¶
Takes a subset of items at specific indices
- Parameters
indices (ArrayLike[int]) – indexes of items to take
axis (int) – you usually want this to be 0
inplace (bool, default=False) – if True, modifies this object
- Returns
filtered coords
- Return type
Example
>>> self = Coords(np.array([[25, 30, 15, 10]])) >>> self.take([0]) <Coords(data=array([[25, 30, 15, 10]]))> >>> self.take([]) <Coords(data=array([], shape=(0, 4), dtype=int64))>
- astype(self, dtype, inplace=False)[source]¶
Changes the data type
- Parameters
dtype – new type
inplace (bool, default=False) – if True, modifies this object
- Returns
modified coordinates
- Return type
- round(self, inplace=False)[source]¶
Rounds data to the nearest integer
- Parameters
inplace (bool, default=False) – if True, modifies this object
Example
>>> import kwimage >>> self = kwimage.Coords.random(3).scale(10) >>> self.round()
- view(self, *shape)[source]¶
Passthrough method to view or reshape
- Parameters
*shape – new shape of the data
- Returns
modified coordinates
- Return type
Example
>>> self = Coords.random(6, dim=4).numpy() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4] >>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(6, dim=4).tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4]
- classmethod concatenate(cls, coords, axis=0)[source]¶
Concatenates lists of coordinates together
- Parameters
coords (Sequence[Coords]) – list of coords to concatenate
axis (int, default=0) – axis to stack on
- Returns
stacked coords
- Return type
- CommandLine:
xdoctest -m kwimage.structs.coords Coords.concatenate
Example
>>> coords = [Coords.random(3) for _ in range(3)] >>> new = Coords.concatenate(coords) >>> assert len(new) == 9 >>> assert np.all(new.data[3:6] == coords[1].data)
- tensor(self, device=ub.NoParam)[source]¶
Converts numpy to tensors. Does not change memory if possible.
- Returns
modified coordinates
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(3).numpy() >>> newself = self.tensor() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- numpy(self)[source]¶
Converts tensors to numpy. Does not change memory if possible.
- Returns
modified coordinates
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(3).tensor() >>> newself = self.numpy() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- reorder_axes(self, new_order, inplace=False)[source]¶
Change the ordering of the coordinate axes.
- Parameters
new_order (Tuple[int]) – new_order[i] should specify which axes in the original coordinates should be mapped to the i-th position in the returned axes.
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Note
This is the ordering of the “columns” in final numpy axis, not the numpy axes themselves.
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords(data=np.array([ >>> [7, 11], >>> [13, 17], >>> [21, 23], >>> ])) >>> new = self.reorder_axes((1, 0)) >>> print('new = {!r}'.format(new)) new = <Coords(data= array([[11, 7], [17, 13], [23, 21]]))>
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> new = self.reorder_axes((1, 0)) >>> # Remapping using 1, 0 reverses the axes >>> assert np.all(new.data[:, 0] == self.data[:, 1]) >>> assert np.all(new.data[:, 1] == self.data[:, 0]) >>> # Remapping using 0, 1 does nothing >>> eye = self.reorder_axes((0, 1)) >>> assert np.all(eye.data == self.data) >>> # Remapping using 0, 0, destroys the 1-th column >>> bad = self.reorder_axes((0, 0)) >>> assert np.all(bad.data[:, 0] == self.data[:, 0]) >>> assert np.all(bad.data[:, 1] == self.data[:, 0])
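The rule "column i of the result is column new_order[i] of the input" can be written in one line of plain Python. The function below is a hypothetical list-based stand-in for the method, shown only to make the mapping explicit.

```python
def reorder_axes(rows, new_order):
    """Column i of the output is column new_order[i] of the input."""
    return [[row[src] for src in new_order] for row in rows]


data = [[7, 11], [13, 17], [21, 23]]
print(reorder_axes(data, (1, 0)))  # [[11, 7], [17, 13], [23, 21]]
print(reorder_axes(data, (0, 0)))  # [[7, 7], [13, 13], [21, 21]]
```

The second call shows how a repeated index duplicates one column and drops the other, as noted in the doctest above.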
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)[source]¶
Generalized coordinate transform.
- Parameters
transform (GeometricTransform | ArrayLike | Augmenter | callable) – scikit-image transform, a 3x3 transformation matrix, an imgaug Augmenter, or generic callable which transforms an NxD ndarray.
input_dims (Tuple) – shape of the image these objects correspond to (only needed / used when transform is an imgaug augmenter)
output_dims (Tuple) – unused in non-raster structures, only exists for compatibility.
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Notes
Let D = self.dims
- transformation matrices can be either:
(D + 1) x (D + 1) # for homog
D x D # for scale / rotate
D x (D + 1) # for affine
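How the three matrix shapes listed above act on D-dimensional points can be sketched as follows. This is a plain-Python illustration under the usual conventions (append a homogeneous 1 when the matrix has D + 1 columns, divide by the last output component when it has D + 1 rows); the function name `warp_points` is hypothetical and the sketch is not the library's implementation.

```python
def warp_points(points, matrix):
    """Apply a DxD, Dx(D+1), or (D+1)x(D+1) matrix to D-dim points."""
    dim = len(points[0])
    n_rows, n_cols = len(matrix), len(matrix[0])
    warped = []
    for pt in points:
        # Append a homogeneous 1 when the matrix has a translation column.
        vec = list(pt) + [1] if n_cols == dim + 1 else list(pt)
        out = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        if n_rows == dim + 1:
            # Full homogeneous matrix: divide by the last component.
            out = [v / out[-1] for v in out[:-1]]
        warped.append(out)
    return warped


pts = [[1, 2]]
scaled = warp_points(pts, [[2, 0], [0, 2]])                   # D x D
shifted = warp_points(pts, [[1, 0, 10], [0, 1, 20]])          # D x (D + 1)
homog = warp_points(pts, [[1, 0, 10], [0, 1, 20], [0, 0, 1]])  # (D+1) x (D+1)
```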
Example
>>> from kwimage.structs.coords import * # NOQA
>>> import skimage.transform
>>> self = Coords.random(10, rng=0)
>>> transform = skimage.transform.AffineTransform(scale=(2, 2))
>>> new = self.warp(transform)
>>> assert np.all(new.data == self.scale(2).data)
- Doctest:
>>> self = Coords.random(10, rng=0) >>> assert np.all(self.warp(np.eye(3)).data == self.data) >>> assert np.all(self.warp(np.eye(2)).data == self.data)
- Doctest:
>>> # xdoctest: +REQUIRES(module:osgeo) >>> from osgeo import osr >>> wgs84_crs = osr.SpatialReference() >>> wgs84_crs.ImportFromEPSG(4326) >>> dst_crs = osr.SpatialReference() >>> dst_crs.ImportFromEPSG(2927) >>> transform = osr.CoordinateTransformation(wgs84_crs, dst_crs) >>> self = Coords.random(10, rng=0) >>> new = self.warp(transform) >>> assert np.all(new.data != self.data)
>>> # Alternative using generic func >>> def _gdal_coord_tranform(pts): ... return np.array([transform.TransformPoint(x, y, 0)[0:2] ... for x, y in pts]) >>> alt = self.warp(_gdal_coord_tranform) >>> assert np.all(alt.data != self.data) >>> assert np.all(alt.data == new.data)
- Doctest:
>>> # can use a generic function >>> def func(xy): ... return np.zeros_like(xy) >>> self = Coords.random(10, rng=0) >>> assert np.all(self.warp(func).data == 0)
- _warp_imgaug(self, augmenter, input_dims, inplace=False)[source]¶
Warps by applying an augmenter from the imgaug library
Note
We are assuming you are using X/Y coordinates here.
- Parameters
augmenter (imgaug.augmenters.Augmenter)
input_dims (Tuple) – h/w of the input image
inplace (bool, default=False) – if True, modifies data inplace
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/coords.py Coords._warp_imgaug
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> import imgaug >>> input_dims = (10, 10) >>> self = Coords.random(10).scale(input_dims) >>> augmenter = imgaug.augmenters.Fliplr(p=1) >>> new = self._warp_imgaug(augmenter, input_dims) >>> # y coordinate should not change >>> assert np.allclose(self.data[:, 1], new.data[:, 1]) >>> assert np.allclose(input_dims[0] - self.data[:, 0], new.data[:, 0])
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.figure(fnum=1, doclf=True)
>>> from matplotlib import pyplot as plt
>>> ax = plt.gca()
>>> ax.set_xlim(0, input_dims[0])
>>> ax.set_ylim(0, input_dims[1])
>>> self.draw(color='red', alpha=.4, radius=0.1)
>>> new.draw(color='blue', alpha=.4, radius=0.1)
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> import imgaug >>> input_dims = (32, 32) >>> inplace = 0 >>> self = Coords.random(1000, rng=142).scale(input_dims).scale(.8) >>> self.data = self.data.astype(np.int32).astype(np.float32) >>> augmenter = imgaug.augmenters.CropAndPad(px=(-4, 4), keep_size=1).to_deterministic() >>> new = self._warp_imgaug(augmenter, input_dims) >>> # Change should be linear >>> norm1 = (self.data - self.data.min(axis=0)) / (self.data.max(axis=0) - self.data.min(axis=0)) >>> norm2 = (new.data - new.data.min(axis=0)) / (new.data.max(axis=0) - new.data.min(axis=0)) >>> diff = norm1 - norm2 >>> assert np.allclose(diff, 0, atol=1e-6, rtol=1e-4) >>> #assert np.allclose(self.data[:, 1], new.data[:, 1]) >>> #assert np.allclose(input_dims[0] - self.data[:, 0], new.data[:, 0]) >>> # xdoc: +REQUIRES(--show) >>> import kwimage >>> im = kwimage.imresize(kwimage.grab_test_image(), dsize=input_dims[::-1]) >>> new_im = augmenter.augment_image(im) >>> import kwplot >>> plt = kwplot.autoplt() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(im, pnum=(1, 2, 1), fnum=1) >>> self.draw(color='red', alpha=.8, radius=0.5) >>> kwplot.imshow(new_im, pnum=(1, 2, 2), fnum=1) >>> new.draw(color='blue', alpha=.8, radius=0.5, coord_axes=[1, 0])
- to_imgaug(self, input_dims)[source]¶
Translate to an imgaug object
- Returns
imgaug data structure
- Return type
imgaug.KeypointsOnImage
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10) >>> input_dims = (10, 10) >>> kpoi = self.to_imgaug(input_dims) >>> new = Coords.from_imgaug(kpoi) >>> assert np.allclose(new.data, self.data)
- scale(self, factor, about=None, output_dims=None, inplace=False)[source]¶
Scale coordinates by a factor
- Parameters
factor (float or Tuple[float, float]) – scale factor as either a scalar or per-dimension tuple.
about (Tuple | None) – if unspecified scales about the origin (0, 0), otherwise the scaling is about this point.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> new = self.scale(10) >>> assert new.data.max() <= 10
>>> self = Coords.random(10, rng=0) >>> self.data = (self.data * 10).astype(int) >>> new = self.scale(10) >>> assert new.data.dtype.kind == 'i' >>> new = self.scale(10.0) >>> assert new.data.dtype.kind == 'f'
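Scaling about a point other than the origin means shifting so that point becomes the origin, scaling, and shifting back. A minimal plain-Python sketch of that arithmetic follows; `scale_about` is a hypothetical helper, not the method itself.

```python
def scale_about(points, factor, about=None):
    """Scale points by a scalar or per-dimension factor about a fixed point."""
    dim = len(points[0])
    factors = [factor] * dim if isinstance(factor, (int, float)) else list(factor)
    center = [0] * dim if about is None else list(about)
    # center + f * (v - center) leaves the `about` point unchanged.
    return [[c + f * (v - c) for v, f, c in zip(pt, factors, center)]
            for pt in points]


print(scale_about([[1, 2]], 10))               # [[10, 20]]
print(scale_about([[1, 2]], (2, 3)))           # [[2, 6]]
print(scale_about([[2, 2]], 2, about=(1, 1)))  # [[3, 3]]
```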
- translate(self, offset, output_dims=None, inplace=False)[source]¶
Shift the coordinates
- Parameters
offset (float or Tuple[float]) – translation offset as either a scalar or a per-dimension tuple.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=3, rng=0) >>> new = self.translate(10) >>> assert new.data.min() >= 10 >>> assert new.data.max() <= 11 >>> Coords.random(3, dim=3, rng=0) >>> Coords.random(3, dim=3, rng=0).translate((1, 2, 3))
- rotate(self, theta, about=None, output_dims=None, inplace=False)[source]¶
Rotate the coordinates about a point.
- Parameters
theta (float) – rotation angle in radians
about (Tuple | None) – if unspecified rotates about the origin (0, 0), otherwise the rotation is about this point.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Todo
[ ] Generalized ND Rotations?
References
https://math.stackexchange.com/questions/197772/gen-rot-matrix
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=2, rng=0) >>> theta = np.pi / 2 >>> new = self.rotate(theta)
>>> # Test rotate agrees with warp >>> sin_ = np.sin(theta) >>> cos_ = np.cos(theta) >>> rot_ = np.array([[cos_, -sin_], [sin_, cos_]]) >>> new2 = self.warp(rot_) >>> assert np.allclose(new.data, new2.data)
>>> # >>> # Rotate about a custom point >>> theta = np.pi / 2 >>> new3 = self.rotate(theta, about=(0.5, 0.5)) >>> # >>> # Rotate about the center of mass >>> about = self.data.mean(axis=0) >>> new4 = self.rotate(theta, about=about) >>> # xdoc: +REQUIRES(--show) >>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> plt = kwplot.autoplt() >>> self.draw(radius=0.01, color='blue', alpha=.5, coord_axes=[1, 0], setlim='grow') >>> plt.gca().set_aspect('equal') >>> new3.draw(radius=0.01, color='red', alpha=.5, coord_axes=[1, 0], setlim='grow')
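Rotation about an arbitrary point uses the same 2x2 matrix as the `rot_` array in the example, applied to offsets from that point. The sketch below is plain Python for 2D points; the helper name `rotate_about` is hypothetical.

```python
import math


def rotate_about(points, theta, about=(0.0, 0.0)):
    """Rotate 2D points counter-clockwise by theta radians about a point."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    ax, ay = about
    rotated = []
    for x, y in points:
        dx, dy = x - ax, y - ay  # offset from the rotation center
        rotated.append((ax + cos_t * dx - sin_t * dy,
                        ay + sin_t * dx + cos_t * dy))
    return rotated


# A quarter turn about the origin sends (1, 0) to approximately (0, 1).
origin_turn = rotate_about([(1.0, 0.0)], math.pi / 2)
# The same turn about (0.5, 0.5) sends (1, 0.5) to approximately (0.5, 1).
center_turn = rotate_about([(1.0, 0.5)], math.pi / 2, about=(0.5, 0.5))
```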
- _rectify_about(self, about)[source]¶
Ensures that about returns a specified point. Allows for special keys like center to be used.
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=2, rng=0)
- fill(self, image, value, coord_axes=None, interp='bilinear')[source]¶
Sets sub-coordinate locations in a grid to a particular value
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
- Returns
image with coordinates rasterized on it
- Return type
ndarray
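The meaning of `coord_axes` can be illustrated with a nearest-pixel fill on a nested-list image. This is a simplified sketch under stated assumptions: it ignores the `interp` argument, rounds each coordinate to the nearest pixel, and silently skips out-of-bounds points. The function name `fill_nearest` is hypothetical.

```python
def fill_nearest(image, coords, value, coord_axes):
    """Set the nearest pixel to each 2D coordinate to `value`."""
    height, width = len(image), len(image[0])
    for coord in coords:
        index = [0, 0]
        # coord_axes[i] names the image axis that coordinate column i indexes.
        for col, axis in enumerate(coord_axes):
            index[axis] = int(round(coord[col]))
        row, column = index
        if 0 <= row < height and 0 <= column < width:
            image[row][column] = value
    return image


# A 3-row by 4-column image and one point stored as x/y = (3, 1).
xy_image = fill_nearest([[0] * 4 for _ in range(3)], [(3, 1)], 1, coord_axes=[1, 0])
# Interpreting the same numbers as r/c puts row 3 outside the image.
rc_image = fill_nearest([[0] * 4 for _ in range(3)], [(3, 1)], 1, coord_axes=[0, 1])
```

With `coord_axes=[1, 0]` the pixel at row 1, column 3 is set; with `[0, 1]` nothing is drawn because row 3 does not exist.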
- soft_fill(self, image, coord_axes=None, radius=5)[source]¶
Used for drawing keypoint truth in heatmaps
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
In other words the i-th entry in coord_axes specifies which row-major spatial dimension the i-th column of a coordinate corresponds to. The index is the coordinate dimension and the value is the axes dimension.
- Returns
image with coordinates rasterized on it
- Return type
ndarray
References
https://stackoverflow.com/questions/54726703/generating-keypoint-heatmaps-in-tensorflow
Example
>>> from kwimage.structs.coords import * # NOQA
>>> s = 64
>>> self = Coords.random(10, meta={'shape': (s, s)}).scale(s)
>>> # Put points on edges to verify "edge cases"
>>> self.data[1] = [0, 0]  # top left
>>> self.data[2] = [s, s]  # bottom right
>>> self.data[3] = [0, s + 10]  # bottom left
>>> self.data[4] = [-3, s // 2]  # middle left
>>> self.data[5] = [s + 1, -1]  # top right
>>> # Put points in the middle to verify overlap blending
>>> self.data[6] = [32.5, 32.5]  # middle
>>> self.data[7] = [34.5, 34.5]  # middle
>>> fill_value = 1
>>> coord_axes = [1, 0]
>>> radius = 10
>>> image1 = np.zeros((s, s))
>>> self.soft_fill(image1, coord_axes=coord_axes, radius=radius)
>>> radius = 3.0
>>> image2 = np.zeros((s, s))
>>> self.soft_fill(image2, coord_axes=coord_axes, radius=radius)
>>> # xdoc: +REQUIRES(--show)
>>> # xdoc: +REQUIRES(module:kwplot)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(image1, pnum=(1, 2, 1))
>>> kwplot.imshow(image2, pnum=(1, 2, 2))
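Keypoint heatmaps of this kind are commonly built by placing a Gaussian bump at each keypoint. The sketch below is an illustration in plain Python, not the kwimage implementation. The helper name `soft_fill_sketch`, the use of `radius` as the Gaussian standard deviation, and max-blending of overlapping bumps are all assumptions made for this example.

```python
import math

def soft_fill_sketch(shape, xy_points, radius=5.0):
    """Render a heatmap with one Gaussian bump per (x, y) keypoint."""
    height, width = shape
    heatmap = [[0.0] * width for _ in range(height)]
    for x, y in xy_points:
        for row in range(height):
            for col in range(width):
                dist_sq = (col - x) ** 2 + (row - y) ** 2
                value = math.exp(-dist_sq / (2.0 * radius ** 2))
                # Overlapping bumps are blended with max so peaks stay at 1.
                heatmap[row][col] = max(heatmap[row][col], value)
    return heatmap

# Points outside the image still contribute their tails near the border.
heatmap = soft_fill_sketch((8, 8), [(2, 3), (-3, 4)], radius=1.5)
peak = heatmap[3][2]
```

Looping over every pixel for every point keeps the sketch short. A practical version would only touch a window around each keypoint.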
- draw_on(self, image=None, fill_value=1, coord_axes=[1, 0], interp='bilinear')[source]¶
Note
unlike other methods, the defaults assume x/y internal data
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
In other words the i-th entry in coord_axes specifies which row-major spatial dimension the i-th column of a coordinate corresponds to. The index is the coordinate dimension and the value is the axes dimension.
- Returns
image with coordinates drawn on it
- Return type
ndarray
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.coords import * # NOQA >>> s = 256 >>> self = Coords.random(10, meta={'shape': (s, s)}).scale(s) >>> self.data[0] = [10, 10] >>> self.data[1] = [20, 40] >>> image = np.zeros((s, s)) >>> fill_value = 1 >>> image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='bilinear') >>> # image = self.draw_on(image, fill_value, coord_axes=[0, 1], interp='nearest') >>> # image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='bilinear') >>> # image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='nearest') >>> # xdoc: +REQUIRES(--show) >>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5, coord_axes=[1, 0])
- draw(self, color='blue', ax=None, alpha=None, coord_axes=[1, 0], radius=1, setlim=False)[source]¶
Note
unlike other methods, the defaults assume x/y internal data
- Parameters
setlim (bool) – if True, ensures the limits of the axes contain the drawn coordinates
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
- Returns
drawn matplotlib objects
- Return type
List[mpl.collections.PatchCollection]
Example
>>> # xdoc: +REQUIRES(module:kwplot)
>>> from kwimage.structs.coords import * # NOQA
>>> self = Coords.random(10)
>>> # xdoc: +REQUIRES(--show)
>>> self.draw(radius=3.0, setlim=True)
>>> import kwplot
>>> kwplot.autompl()
>>> self.draw(radius=3.0)
kwimage.structs.detections¶
Structure for efficient access and modification of bounding boxes with associated scores and class labels. Builds on top of the kwimage.Boxes structure.
Also can optionally incorporate kwimage.PolygonList for segmentation masks and kwimage.PointsList for keypoints.
- If you want to visualize boxes and scores you can do this:
>>> import numpy as np
>>> # Given data
>>> data = np.random.rand(10, 4) * 224
>>> scores = np.random.rand(10,)
>>> class_idxs = np.random.randint(0, 3, size=10)
>>> classes = ['class1', 'class2', 'class3']
>>> #
>>> # Wrap your data with a Detections object
>>> import kwimage
>>> dets = kwimage.Detections(
>>>     boxes=kwimage.Boxes(data, format='xywh'),
>>>     scores=scores,
>>>     class_idxs=class_idxs,
>>>     classes=classes,
>>> )
>>> dets.draw()
>>> import matplotlib.pyplot as plt
>>> plt.gca().set_xlim(0, 224)
>>> plt.gca().set_ylim(0, 224)
Non critical methods for visualizing detections
Non critical methods for algorithmic manipulation of detections
Container for holding and manipulating multiple detections.
Hacking in unit tests as doctests the file itself so it is easy to move to
Construct semantic segmentation detection targets from annotations in
- class kwimage.structs.detections._DetDrawMixin[source]¶
Non critical methods for visualizing detections
- draw(self, color='blue', alpha=None, labels=True, centers=False, lw=2, fill=False, ax=None, radius=5, kpts=True, sseg=True, setlim=False, boxes=True)[source]¶
Draws boxes using matplotlib
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> self = Detections.random(num=10, scale=512.0, rng=0, classes=['a', 'b', 'c']) >>> self.boxes.translate((-128, -128), inplace=True) >>> image = (np.random.rand(256, 256) * 255).astype(np.uint8) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> fig = kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(image) >>> # xdoc: +REQUIRES(--show) >>> self.draw(color='blue', alpha=None) >>> # xdoc: +REQUIRES(--show) >>> for o in fig.findobj(): # http://matplotlib.1069221.n5.nabble.com/How-to-turn-off-all-clipping-td1813.html >>> o.set_clip_on(False) >>> kwplot.show_if_requested()
- draw_on(self, image, color='blue', alpha=None, labels=True, radius=5, kpts=True, sseg=True, boxes=True, ssegkw=None, label_loc='top_left', thickness=2)[source]¶
Draws boxes directly on the image using OpenCV
- Parameters
image (ndarray[uint8]) – must be in uint8 format
color (str | ColorLike | List[ColorLike]) – one color for all boxes or a list of colors for each box
alpha (float) – Transparency of overlay. Can be a scalar or a list for each box
labels (bool | str | List[str]) – if True, use category names as the labels. See _make_labels for details. Otherwise a manually specified text label for each box.
boxes (bool) – if True draw the boxes
kpts (bool) – if True draw the keypoints
sseg (bool) – if True draw the segmentations
ssegkw (dict) – extra arguments passed to segmentations.draw_on
radius (float) – passed to keypoints.draw_on
label_loc (str) – indicates where labels (if specified) should be drawn. passed to boxes.draw_on
thickness (int, default=2) – rectangle thickness, negative values will draw a filled rectangle. passed to boxes.draw_on
- Returns
image with labeled boxes drawn on it
- Return type
ndarray[uint8]
- CommandLine:
xdoctest -m kwimage.structs.detections _DetDrawMixin.draw_on:1 --profile --show
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> self = Detections.random(num=10, scale=512, rng=0) >>> image = (np.random.rand(512, 512) * 255).astype(np.uint8) >>> image2 = self.draw_on(image, color='blue') >>> # xdoc: +REQUIRES(--show) >>> kwplot.figure(fnum=2000, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image2) >>> kwplot.show_if_requested()
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.detections import * # NOQA >>> import kwplot >>> self = Detections.random(num=10, scale=512, rng=0) >>> image = (np.random.rand(512, 512) * 255).astype(np.uint8) >>> image2 = self.draw_on(image, color='classes') >>> # xdoc: +REQUIRES(--show) >>> kwplot.figure(fnum=2000, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image2) >>> kwplot.show_if_requested()
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(--profile) >>> import kwplot >>> self = Detections.random(num=100, scale=512, rng=0, keypoints=True, segmentations=True) >>> image = (np.random.rand(512, 512) * 255).astype(np.uint8) >>> image2 = self.draw_on(image, color='blue') >>> # xdoc: +REQUIRES(--show) >>> kwplot.figure(fnum=2000, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image2) >>> kwplot.show_if_requested()
- Ignore:
import xdev globals().update(xdev.get_func_kwargs(kwimage.Detections.draw_on))
- _make_colors(self, color)[source]¶
Handles special settings of color.
If color == ‘classes’, then choose a distinct color for each category
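One simple way to obtain a distinct color for each category is to space hues evenly around the color wheel. The sketch below uses only the standard library. The helper name is hypothetical, and kwimage's actual color selection may differ.

```python
import colorsys

def distinct_class_colors(classes):
    """Map each class name to an RGB tuple with evenly spaced hues."""
    num = max(len(classes), 1)
    return {
        name: colorsys.hsv_to_rgb(index / num, 1.0, 1.0)
        for index, name in enumerate(classes)
    }

classes = ['a', 'b', 'c']
class_idxs = [0, 2, 2, 1]
class_colors = distinct_class_colors(classes)
# One color per detection, looked up through its class index.
box_colors = [class_colors[classes[idx]] for idx in class_idxs]
```

Detections that share a class index end up with the same color, so the per-box color list can be passed wherever a list of colors is accepted.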
- class kwimage.structs.detections._DetAlgoMixin[source]¶
Non critical methods for algorithmic manipulation of detections
- non_max_supression(self, thresh=0.0, perclass=False, impl='auto', daq=False, device_id=None)[source]¶
Find high scoring minimally overlapping detections
- Parameters
thresh (float) – iou threshold between 0 and 1. A box is removed if it overlaps with a previously chosen box by more than this threshold. Higher values are more permissive (more boxes are returned). A value of 0 means that returned boxes will have no overlap.
perclass (bool) – if True, works on a per-class basis
impl (str) – nms implementation to use
daq (Bool | Dict) – if False, uses regular nms, otherwise uses a divide and conquer algorithm. If daq is a Dict, then it is used as the kwargs to kwimage.daq_spatial_nms
device_id – try not to use. Only used if impl is gpu
- Returns
indices of boxes to keep
- Return type
ndarray[int]
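The thresholding rule described above can be sketched as a greedy loop. This is a plain-Python illustration under stated assumptions (boxes in ltrb format, no per-class handling, hypothetical helper names `iou` and `greedy_nms`). It is not the backend that kwimage dispatches to.

```python
def iou(box1, box2):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    inter_w = max(0.0, min(box1[2], box2[2]) - max(box1[0], box2[0]))
    inter_h = max(0.0, min(box1[3], box2[3]) - max(box1[1], box2[1]))
    inter = inter_w * inter_h
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(ltrb, scores, thresh=0.0):
    """Return indices of boxes to keep, highest score first."""
    order = sorted(range(len(ltrb)), key=lambda i: scores[i], reverse=True)
    keep = []
    for idx in order:
        # Drop a box if it overlaps an already chosen box by more than thresh.
        if all(iou(ltrb[idx], ltrb[k]) <= thresh for k in keep):
            keep.append(idx)
    return keep

ltrb = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
keep_strict = greedy_nms(ltrb, scores, thresh=0.0)
keep_loose = greedy_nms(ltrb, scores, thresh=0.9)
```

With thresh=0.0 the second box is dropped because it overlaps the first. With thresh=0.9 all three survive, which matches the note that higher thresholds are more permissive.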
- non_max_supress(self, thresh=0.0, perclass=False, impl='auto', daq=False)[source]¶
Convenience method. Like non_max_supression, but returns the detections that remain after suppression instead of the indices to keep.
- rasterize(self, bg_size, input_dims, soften=1, tf_data_to_img=None, img_dims=None, exclude=[])[source]¶
Ambiguous conversion from a Detections object to a Heatmap.
- SeeAlso:
Heatmap.detect
- Returns
raster-space detections.
- Return type
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> from kwimage.structs.detections import * # NOQA >>> self, iminfo, sampler = Detections.demo() >>> image = iminfo['imdata'][:] >>> input_dims = iminfo['imdata'].shape[0:2] >>> bg_size = [100, 100] >>> heatmap = self.rasterize(bg_size, input_dims) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, pnum=(2, 2, 1)) >>> heatmap.draw(invert=True) >>> kwplot.figure(fnum=1, pnum=(2, 2, 2)) >>> kwplot.imshow(heatmap.draw_on(image)) >>> kwplot.figure(fnum=1, pnum=(2, 1, 2)) >>> kwplot.imshow(heatmap.draw_stacked())
- class kwimage.structs.detections.Detections(data=None, meta=None, datakeys=None, metakeys=None, checks=True, **kwargs)[source]¶
Bases: ubelt.NiceRepr, _DetAlgoMixin, _DetDrawMixin
Container for holding and manipulating multiple detections.
- Variables
data (Dict) – dictionary containing corresponding lists. The length of each list is the number of detections. This contains the bounding boxes, confidence scores, and class indices. Details of the most common keys and types are as follows:
- boxes (kwimage.Boxes[ArrayLike]): multiple bounding boxes
- scores (ArrayLike): associated scores
- class_idxs (ArrayLike): associated class indices
- segmentations (ArrayLike): segmentation masks for each box; members can be Mask or MultiPolygon.
- keypoints (ArrayLike): keypoints for each box. Members should be Points.
Additional custom keys may be specified as long as (a) the values are array-like and the first axis corresponds to the standard data values and (b) the custom keys are listed in the datakeys kwargs when constructing the Detections.
meta (Dict) – This contains contextual information about the detections. This includes the class names, which can be indexed into via the class indexes.
Example
>>> import numpy as np
>>> import kwimage
>>> dets = kwimage.Detections(
>>>     # there are expected keys that do not need registration
>>>     boxes=kwimage.Boxes.random(3),
>>>     class_idxs=[0, 1, 1],
>>>     classes=['a', 'b'],
>>>     # custom data attrs must align with boxes
>>>     myattr1=np.random.rand(3),
>>>     myattr2=np.random.rand(3, 2, 8),
>>>     # there are no restrictions on metadata
>>>     mymeta='a custom metadata string',
>>>     # Note that any key not in kwimage.Detections.__datakeys__ or
>>>     # kwimage.Detections.__metakeys__ must be registered at the
>>>     # time of construction.
>>>     datakeys=['myattr1', 'myattr2'],
>>>     metakeys=['mymeta'],
>>>     checks=True,
>>> )
>>> print('dets = {}'.format(dets))
dets = <Detections(3)>
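The alignment requirement on data values can be pictured with a small container. This is an illustrative sketch only. `TinyDetections` and its checks are hypothetical and far simpler than kwimage.Detections.

```python
class TinyDetections:
    """Holds per-detection lists in `data` and shared context in `meta`."""

    def __init__(self, data, meta=None):
        lengths = {key: len(val) for key, val in data.items()}
        # Every data value must have one entry per detection.
        if len(set(lengths.values())) > 1:
            raise ValueError('misaligned data: {}'.format(lengths))
        self.data = dict(data)
        self.meta = dict(meta or {})

    def __len__(self):
        return len(next(iter(self.data.values()), []))

    def class_names(self):
        # Class indices index into the class names stored in meta.
        return [self.meta['classes'][i] for i in self.data['class_idxs']]

dets = TinyDetections(
    data={'boxes': [(0, 0, 5, 5), (2, 2, 4, 4), (1, 1, 3, 3)],
          'class_idxs': [0, 1, 1],
          'myattr1': [0.1, 0.2, 0.3]},
    meta={'classes': ['a', 'b'], 'mymeta': 'a custom metadata string'},
)
```

Keeping per-detection values in one dictionary and shared context in another is what lets operations like sorting or subsetting apply uniformly to every data key while leaving the metadata untouched.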
- __datakeys__ = ['boxes', 'scores', 'class_idxs', 'probs', 'weights', 'keypoints', 'segmentations'][source]¶
- classmethod coerce(cls, data=None, **kwargs)[source]¶
The “try-anything to get what I want” constructor
- Parameters
data
**kwargs – currently boxes and cnames
Example
>>> from kwimage.structs.detections import * # NOQA >>> import kwimage >>> kwargs = dict( >>> boxes=kwimage.Boxes.random(4), >>> cnames=['a', 'b', 'c', 'c'], >>> ) >>> data = {} >>> self = kwimage.Detections.coerce(data, **kwargs)
- classmethod from_coco_annots(cls, anns, cats=None, classes=None, kp_classes=None, shape=None, dset=None)[source]¶
Create a Detections object from a list of coco-like annotations.
- Parameters
anns (List[Dict]) – list of coco-like annotation objects
dset (CocoDataset) – if specified, cats, classes, and kp_classes are ignored.
cats (List[Dict]) – coco-format category information. Used only if dset is not specified.
classes (ndsampler.CategoryTree) – category tree with coco class info. Used only if dset is not specified.
kp_classes (ndsampler.CategoryTree) – keypoint category tree with coco keypoint class info. Used only if dset is not specified.
shape (tuple) – shape of parent image
- Returns
a detections object
- Return type
Example
>>> from kwimage.structs.detections import * # NOQA >>> # xdoctest: +REQUIRES(--module:ndsampler) >>> anns = [{ >>> 'id': 0, >>> 'image_id': 1, >>> 'category_id': 2, >>> 'bbox': [2, 3, 10, 10], >>> 'keypoints': [4.5, 4.5, 2], >>> 'segmentation': { >>> 'counts': '_11a04M2O0O20N101N3L_5', >>> 'size': [20, 20], >>> }, >>> }] >>> dataset = { >>> 'images': [], >>> 'annotations': [], >>> 'categories': [ >>> {'id': 0, 'name': 'background'}, >>> {'id': 2, 'name': 'class1', 'keypoints': ['spot']} >>> ] >>> } >>> #import ndsampler >>> #dset = ndsampler.CocoDataset(dataset) >>> cats = dataset['categories'] >>> dets = Detections.from_coco_annots(anns, cats)
Example
>>> # xdoctest: +REQUIRES(--module:ndsampler) >>> # Test case with no category information >>> from kwimage.structs.detections import * # NOQA >>> anns = [{ >>> 'id': 0, >>> 'image_id': 1, >>> 'category_id': None, >>> 'bbox': [2, 3, 10, 10], >>> 'prob': [.1, .9], >>> }] >>> cats = [ >>> {'id': 0, 'name': 'background'}, >>> {'id': 2, 'name': 'class1'} >>> ] >>> dets = Detections.from_coco_annots(anns, cats)
Example
>>> import kwimage >>> # xdoctest: +REQUIRES(--module:ndsampler) >>> import ndsampler >>> sampler = ndsampler.CocoSampler.demo('photos') >>> iminfo, anns = sampler.load_image_with_annots(1) >>> shape = iminfo['imdata'].shape[0:2] >>> kp_classes = sampler.dset.keypoint_categories() >>> dets = kwimage.Detections.from_coco_annots( >>> anns, sampler.dset.dataset['categories'], sampler.catgraph, >>> kp_classes, shape=shape)
- to_coco(self, cname_to_cat=None, style='orig', image_id=None, dset=None)[source]¶
Converts this set of detections into coco-like annotation dictionaries.
Notes
Not all aspects of the MS-COCO format can be accurately represented, so some liberties are taken. The MS-COCO standard defines that annotations should specify a category_id field, but in some cases this information is not available so we will populate a ‘category_name’ field if possible and in the worst case fall back to ‘category_index’.
Additionally, detections may contain additional information beyond the MS-COCO standard, and this information (e.g. weight, prob, score) is added as foreign fields.
- Parameters
cname_to_cat – currently ignored.
style (str, default=’orig’) – either ‘orig’ (for the original coco format) or ‘new’ for the more general kwcoco-style coco format.
image_id (int, default=None) – if specified, populates the image_id field of each annotation
dset (CocoDataset, default=None) – if specified, attempts to populate the category_id field to be compatible with this coco dataset.
- Yields
dict – coco-like annotation structures
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> from kwimage.structs.detections import * >>> self = Detections.demo()[0] >>> cname_to_cat = None >>> list(self.to_coco())
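The fallback order described in the notes (category_id, then category_name, then category_index) could look roughly like the generator below. The helper name, its arguments, and the `name_to_id` mapping are hypothetical stand-ins for illustration, not the to_coco signature.

```python
def to_coco_sketch(boxes_xywh, class_idxs, scores, classes=None, name_to_id=None):
    """Yield coco-like annotation dicts, one per detection."""
    for box, cidx, score in zip(boxes_xywh, class_idxs, scores):
        ann = {'bbox': list(box), 'score': score}  # score is a non-standard field
        name = classes[cidx] if classes is not None else None
        if name is not None and name_to_id and name in name_to_id:
            ann['category_id'] = name_to_id[name]
        elif name is not None:
            ann['category_name'] = name
        else:
            ann['category_index'] = cidx
        yield ann

boxes = [(2, 3, 10, 10), (5, 5, 4, 4)]
with_ids = list(to_coco_sketch(boxes, [1, 0], [0.9, 0.4],
                               classes=['background', 'class1'],
                               name_to_id={'class1': 2}))
without_names = list(to_coco_sketch(boxes, [1, 0], [0.9, 0.4]))
```

Because the function yields dictionaries lazily, callers that want a list wrap the call in list(), as the example above does with to_coco.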
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)[source]¶
Spatially warp the detections.
Example
>>> import skimage >>> transform = skimage.transform.AffineTransform(scale=(2, 3), translation=(4, 5)) >>> self = Detections.random(2) >>> new = self.warp(transform) >>> assert new.boxes == self.boxes.warp(transform) >>> assert new != self
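Warping a box by an affine transform can be done by transforming its corners and taking their bounding box. A minimal sketch, assuming ltrb boxes and a 3x3 homogeneous matrix given as nested lists (`warp_ltrb` is a hypothetical helper, not a kwimage function):

```python
def warp_ltrb(box, matrix):
    """Warp a (left, top, right, bottom) box and return its bounding box."""
    left, top, right, bottom = box
    corners = [(left, top), (right, top), (right, bottom), (left, bottom)]
    warped = []
    for x, y in corners:
        # Apply the affine part of the homogeneous matrix to each corner.
        new_x = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        new_y = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        warped.append((new_x, new_y))
    xs = [pt[0] for pt in warped]
    ys = [pt[1] for pt in warped]
    return (min(xs), min(ys), max(xs), max(ys))

# Matrix form of scale=(2, 3) with translation=(4, 5)
matrix = [[2, 0, 4],
          [0, 3, 5],
          [0, 0, 1]]
new_box = warp_ltrb((1, 1, 3, 2), matrix)
```

For a pure scale and translation the warped box is exact. With rotation or shear the result is only the axis-aligned box enclosing the warped corners.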
- scale(self, factor, output_dims=None, inplace=False)[source]¶
Spatially scale the detections.
Example
>>> import skimage >>> transform = skimage.transform.AffineTransform(scale=(2, 3), translation=(4, 5)) >>> self = Detections.random(2) >>> new = self.warp(transform) >>> assert new.boxes == self.boxes.warp(transform) >>> assert new != self
- translate(self, offset, output_dims=None, inplace=False)[source]¶
Spatially translate the detections.
Example
>>> import skimage >>> self = Detections.random(2) >>> new = self.translate(10)
- classmethod concatenate(cls, dets)[source]¶
- Parameters
dets (Sequence[Detections]) – list of detections to concatenate
- Returns
stacked detections
- Return type
Example
>>> self = Detections.random(2)
>>> other = Detections.random(3)
>>> dets = [self, other]
>>> new = Detections.concatenate(dets)
>>> assert new.num_boxes() == 5

>>> self = Detections.random(2, segmentations=True)
>>> other = Detections.random(3, segmentations=True)
>>> dets = [self, other]
>>> new = Detections.concatenate(dets)
>>> assert new.num_boxes() == 5
- argsort(self, reverse=True)[source]¶
Sorts detection indices by descending (or ascending) scores
- Returns
sorted indices
- Return type
ndarray[int]
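Sorting by score reduces to computing an index order and applying it to every aligned field. A minimal plain-Python sketch of the index step (`argsort_scores` is a hypothetical stand-in, not the method itself):

```python
def argsort_scores(scores, reverse=True):
    """Return indices that order scores descending (or ascending)."""
    return sorted(range(len(scores)), key=lambda idx: scores[idx], reverse=reverse)

scores = [0.36, 0.70, 0.06, 0.67]
descending = argsort_scores(scores)
ascending = argsort_scores(scores, reverse=False)
# Applying the index order to any aligned list sorts it consistently.
sorted_scores = [scores[idx] for idx in descending]
```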
- sort(self, reverse=True)[source]¶
Sorts detections by descending (or ascending) scores
- Returns
sorted copy of self
- Return type
- compress(self, flags, axis=0)[source]¶
Returns a subset where corresponding locations are True.
- Parameters
flags (ndarray[bool]) – mask marking selected items
- Returns
subset of self
- Return type
- CommandLine:
xdoctest -m kwimage.structs.detections Detections.compress
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> dets = kwimage.Detections.random(keypoints='dense') >>> flags = np.random.rand(len(dets)) > 0.5 >>> subset = dets.compress(flags) >>> assert len(subset) == flags.sum() >>> subset = dets.tensor().compress(flags) >>> assert len(subset) == flags.sum()
- take(self, indices, axis=0)[source]¶
Returns a subset specified by indices
- Parameters
indices (ndarray[int]) – indices to select
- Returns
subset of self
- Return type
Example
>>> import kwimage >>> dets = kwimage.Detections(boxes=kwimage.Boxes.random(10)) >>> subset = dets.take([2, 3, 5, 7]) >>> assert len(subset) == 4 >>> # xdoctest: +REQUIRES(module:torch) >>> subset = dets.tensor().take([2, 3, 5, 7]) >>> assert len(subset) == 4
- __getitem__(self, index)[source]¶
Fancy slicing / subset / indexing.
Note: scalar indices are always coerced into index lists of length 1.
Example
>>> import kwimage
>>> import kwarray
>>> dets = kwimage.Detections(boxes=kwimage.Boxes.random(10))
>>> indices = [2, 3, 5, 7]
>>> flags = kwarray.boolmask(indices, len(dets))
>>> assert dets[flags].data == dets[indices].data
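The relationship between boolean-mask selection (compress), index selection (take), and indexing can be sketched on plain lists. The helpers below are hypothetical and only mirror the behaviors described for these methods, including the coercion of a scalar index into a length-1 index list.

```python
def take(items, indices):
    """Select items by integer position."""
    return [items[idx] for idx in indices]

def compress(items, flags):
    """Select items whose corresponding flag is True."""
    return [item for item, flag in zip(items, flags) if flag]

def getitem(items, index):
    """Dispatch on the index type: scalar, boolean mask, or index list."""
    if isinstance(index, int) and not isinstance(index, bool):
        index = [index]  # scalar indices become index lists of length 1
    if all(isinstance(i, bool) for i in index):
        return compress(items, index)
    return take(items, index)

items = ['box{}'.format(i) for i in range(10)]
indices = [2, 3, 5, 7]
flags = [i in indices for i in range(len(items))]
by_mask = getitem(items, flags)
by_index = getitem(items, indices)
single = getitem(items, 4)
```

A mask and the index list derived from it select the same subset, and a scalar index still returns a container rather than a bare element.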
- numpy(self)[source]¶
Converts tensors to numpy. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Detections.random(3).tensor() >>> newself = self.numpy() >>> self.scores[0] = 0 >>> assert newself.scores[0] == 0 >>> self.scores[0] = 1 >>> assert self.scores[0] == 1 >>> self.numpy().numpy()
- tensor(self, device=ub.NoParam)[source]¶
Converts numpy to tensors. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.detections import * >>> self = Detections.random(3) >>> newself = self.tensor() >>> self.scores[0] = 0 >>> assert newself.scores[0] == 0 >>> self.scores[0] = 1 >>> assert self.scores[0] == 1 >>> self.tensor().tensor()
- classmethod random(cls, num=10, scale=1.0, classes=3, keypoints=False, segmentations=False, tensor=False, rng=None)[source]¶
Creates dummy data, suitable for use in tests and benchmarks
- Parameters
num (int) – number of boxes
scale (float | tuple, default=1.0) – bounding image size
classes (int | Sequence) – list of class labels or number of classes
keypoints (bool, default=False) – if True include random keypoints for each box.
segmentations (bool, default=False) – if True include random segmentations for each box.
tensor (bool, default=False) – determines backend. DEPRECATED. Call tensor on resulting object instead.
rng (np.random.RandomState) – random state
Example
>>> import kwimage >>> dets = kwimage.Detections.random(keypoints='jagged') >>> dets.data['keypoints'].data[0].data >>> dets.data['keypoints'].meta >>> dets = kwimage.Detections.random(keypoints='dense') >>> dets = kwimage.Detections.random(keypoints='dense', segmentations=True).scale(1000) >>> # xdoctest:+REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets.draw(setlim=True)
Example
>>> import kwimage >>> dets = kwimage.Detections.random( >>> keypoints='jagged', segmentations=True, rng=0).scale(1000) >>> print('dets = {}'.format(dets)) dets = <Detections(10)> >>> dets.data['boxes'].quantize(inplace=True) >>> print('dets.data = {}'.format(ub.repr2( >>> dets.data, nl=1, with_dtype=False, strvals=True))) dets.data = { 'boxes': <Boxes(xywh, array([[548, 544, 55, 172], [423, 645, 15, 247], [791, 383, 173, 146], [ 71, 87, 498, 839], [ 20, 832, 759, 39], [461, 780, 518, 20], [118, 639, 26, 306], [264, 414, 258, 361], [ 18, 568, 439, 50], [612, 616, 332, 66]], dtype=int32))>, 'class_idxs': [1, 2, 0, 0, 2, 0, 0, 0, 0, 0], 'keypoints': <PointsList(n=10)>, 'scores': [0.3595079 , 0.43703195, 0.6976312 , 0.06022547, 0.66676672, 0.67063787,0.21038256, 0.1289263 , 0.31542835, 0.36371077], 'segmentations': <SegmentationList(n=10)>, } >>> # xdoctest:+REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets.draw(setlim=True)
Example
>>> # Boxes position/shape within 0-1 space should be uniform. >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> fig = kwplot.figure(fnum=1, doclf=True) >>> fig.gca().set_xlim(0, 128) >>> fig.gca().set_ylim(0, 128) >>> import kwimage >>> kwimage.Detections.random(num=10, segmentations=True).scale(128).draw()
- kwimage.structs.detections._dets_to_fcmaps(dets, bg_size, input_dims, bg_idx=0, pmin=0.6, pmax=1.0, soft=True, exclude=[])[source]¶
Construct semantic segmentation detection targets from annotations in dictionary format.
Rasterize detections.
- Parameters
dets (kwimage.Detections)
bg_size (tuple) – size (W, H) to predict for backgrounds
input_dims (tuple) – window H, W
- Returns
- with keys
size : 2D ndarray containing the W,H of the object
dxdy : 2D ndarray containing the x,y offset of the object
cidx : 2D ndarray containing the class index of the object
- Return type
- Ignore:
import xdev globals().update(xdev.get_func_kwargs(_dets_to_fcmaps))
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> from kwimage.structs.detections import * # NOQA >>> from kwimage.structs.detections import _dets_to_fcmaps >>> import kwimage >>> import ndsampler >>> sampler = ndsampler.CocoSampler.demo('photos') >>> iminfo, anns = sampler.load_image_with_annots(1) >>> image = iminfo['imdata'] >>> input_dims = image.shape[0:2] >>> kp_classes = sampler.dset.keypoint_categories() >>> dets = kwimage.Detections.from_coco_annots( >>> anns, sampler.dset.dataset['categories'], >>> sampler.catgraph, kp_classes, shape=input_dims) >>> bg_size = [100, 100] >>> bg_idxs = sampler.catgraph.index('background') >>> fcn_target = _dets_to_fcmaps(dets, bg_size, input_dims, bg_idxs) >>> fcn_target.keys() >>> print('fcn_target: ' + ub.repr2(ub.map_vals(lambda x: x.shape, fcn_target), nl=1)) fcn_target: { 'cidx': (512, 512), 'class_probs': (10, 512, 512), 'dxdy': (2, 512, 512), 'kpts': (2, 7, 512, 512), 'kpts_ignore': (7, 512, 512), 'size': (2, 512, 512), } >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> size_mask = fcn_target['size'] >>> dxdy_mask = fcn_target['dxdy'] >>> cidx_mask = fcn_target['cidx'] >>> kpts_mask = fcn_target['kpts'] >>> def _vizmask(dxdy_mask): >>> dx, dy = dxdy_mask >>> mag = np.sqrt(dx ** 2 + dy ** 2) >>> mag /= (mag.max() + 1e-9) >>> mask = (cidx_mask != 0).astype(np.float32) >>> angle = np.arctan2(dy, dx) >>> orimask = kwplot.make_orimask(angle, mask, alpha=mag) >>> vecmask = kwplot.make_vector_field( >>> dx, dy, stride=4, scale=0.1, thickness=1, tipLength=.2, >>> line_type=16) >>> return [vecmask, orimask] >>> vecmask, orimask = _vizmask(dxdy_mask) >>> raster = kwimage.overlay_alpha_layers( >>> [vecmask, orimask, image], keepalpha=False) >>> raster = dets.draw_on((raster * 255).astype(np.uint8), >>> labels=True, alpha=None) >>> kwplot.imshow(raster) >>> kwplot.show_if_requested()
raster = (kwimage.overlay_alpha_layers(_vizmask(kpts_mask[:, 5]) + [image], keepalpha=False) * 255).astype(np.uint8)
kwplot.imshow(raster, pnum=(1, 3, 2), fnum=1)
raster = (kwimage.overlay_alpha_layers(_vizmask(kpts_mask[:, 6]) + [image], keepalpha=False) * 255).astype(np.uint8)
kwplot.imshow(raster, pnum=(1, 3, 3), fnum=1)
raster = (kwimage.overlay_alpha_layers(_vizmask(dxdy_mask) + [image], keepalpha=False) * 255).astype(np.uint8)
raster = dets.draw_on(raster, labels=True, alpha=None)
kwplot.imshow(raster, pnum=(1, 3, 1), fnum=1)
raster = kwimage.overlay_alpha_layers([vecmask, orimask, image], keepalpha=False)
raster = dets.draw_on((raster * 255).astype(np.uint8), labels=True, alpha=None)
kwplot.imshow(raster)
kwplot.show_if_requested()
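The size, dxdy, and cidx targets listed for _dets_to_fcmaps can be pictured with a small rasterizer. This sketch makes several assumptions that are not confirmed by the documentation above: boxes are xywh, the offset points from each pixel to its box center, later boxes overwrite earlier ones, and maps are stored as H x W nested lists of tuples rather than (2, H, W) arrays. The function name is hypothetical.

```python
def dets_to_maps_sketch(boxes_xywh, class_idxs, dims, bg_idx=0):
    """Rasterize boxes into per-pixel class, size, and offset maps."""
    height, width = dims
    cidx = [[bg_idx] * width for _ in range(height)]
    size = [[(0.0, 0.0)] * width for _ in range(height)]
    dxdy = [[(0.0, 0.0)] * width for _ in range(height)]
    for (x, y, w, h), cls in zip(boxes_xywh, class_idxs):
        center_x, center_y = x + w / 2.0, y + h / 2.0
        for row in range(max(0, int(y)), min(height, int(y + h))):
            for col in range(max(0, int(x)), min(width, int(x + w))):
                cidx[row][col] = cls
                size[row][col] = (float(w), float(h))
                # Offset from this pixel to the center of its box.
                dxdy[row][col] = (center_x - col, center_y - row)
    return {'cidx': cidx, 'size': size, 'dxdy': dxdy}

target = dets_to_maps_sketch([(1, 1, 4, 2)], [2], dims=(4, 6))
```

Pixels outside every box keep the background index and zero-valued size and offset entries.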
kwimage.structs.heatmap¶
- [ ] Remove doctest dependency on ndsampler?
- [ ] Remove the datakeys that try to define what a heatmap should represent (e.g. class_probs, keypoints, etc…) and instead just focus on a data structure that stores a [C, H, W] or [H, W] tensor?
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/heatmap.py __doc__
Example
>>> # xdoctest: +REQUIRES(module:ndsampler)
>>> # xdoctest: +REQUIRES(--mask)
>>> from kwimage.structs.heatmap import * # NOQA
>>> import kwimage
>>> import ndsampler
>>> sampler = ndsampler.CocoSampler.demo('shapes')
>>> iminfo, anns = sampler.load_image_with_annots(1)
>>> image = iminfo['imdata']
>>> input_dims = image.shape[0:2]
>>> kp_classes = sampler.dset.keypoint_categories()
>>> dets = kwimage.Detections.from_coco_annots(
>>> anns, sampler.dset.dataset['categories'],
>>> sampler.catgraph, kp_classes, shape=input_dims)
>>> bg_size = [100, 100]
>>> heatmap = dets.rasterize(bg_size, input_dims, soften=2)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.figure(fnum=1, doclf=True)
>>> kwplot.imshow(image)
>>> heatmap.draw(invert=True, kpts=[0, 1, 2, 3, 4])
Example
>>> # xdoctest: +REQUIRES(module:ndsampler)
>>> # xdoctest: +REQUIRES(--mask)
>>> from kwimage.structs.heatmap import * # NOQA
>>> from kwimage.structs.detections import _dets_to_fcmaps
>>> import kwimage
>>> import ndsampler
>>> sampler = ndsampler.CocoSampler.demo('shapes')
>>> iminfo, anns = sampler.load_image_with_annots(1)
>>> image = iminfo['imdata']
>>> input_dims = image.shape[0:2]
>>> kp_classes = sampler.dset.keypoint_categories()
>>> dets = kwimage.Detections.from_coco_annots(
>>> anns, sampler.dset.dataset['categories'],
>>> sampler.catgraph, kp_classes, shape=input_dims)
>>> bg_size = [100, 100]
>>> bg_idxs = sampler.catgraph.index('background')
>>> fcn_target = _dets_to_fcmaps(dets, bg_size, input_dims, bg_idxs)
>>> fcn_target.keys()
>>> print('fcn_target: ' + ub.repr2(ub.map_vals(lambda x: x.shape, fcn_target), nl=1))
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> size_mask = fcn_target['size']
>>> dxdy_mask = fcn_target['dxdy']
>>> cidx_mask = fcn_target['cidx']
>>> kpts_mask = fcn_target['kpts']
>>> def _vizmask(dxdy_mask):
>>>     dx, dy = dxdy_mask
>>>     mag = np.sqrt(dx ** 2 + dy ** 2)
>>>     mag /= (mag.max() + 1e-9)
>>>     mask = (cidx_mask != 0).astype(np.float32)
>>>     angle = np.arctan2(dy, dx)
>>>     orimask = kwplot.make_orimask(angle, mask, alpha=mag)
>>>     vecmask = kwplot.make_vector_field(
>>>         dx, dy, stride=4, scale=0.1, thickness=1, tipLength=.2,
>>>         line_type=16)
>>>     return [vecmask, orimask]
>>> vecmask, orimask = _vizmask(dxdy_mask)
>>> raster = kwimage.overlay_alpha_layers(
>>>     [vecmask, orimask, image], keepalpha=False)
>>> raster = dets.draw_on((raster * 255).astype(np.uint8),
>>>                       labels=True, alpha=None)
>>> kwplot.imshow(raster)
>>> kwplot.show_if_requested()
mixin methods for drawing heatmap details
mixin method having to do with warping and aligning heatmaps
Algorithmic operations on heatmaps
Keeps track of a downscaled heatmap and how to transform it to overlay the
Directly convert a one-channel probability map into a Detections object.
Smooths the probability map, but preserves the magnitude of the peaks.
Removes the translation component of a transform
Compute the geometric mean along the specified axis.
- class kwimage.structs.heatmap._HeatmapDrawMixin[source]¶
Bases: object
mixin methods for drawing heatmap details
- colorize(self, channel=None, invert=False, with_alpha=1.0, interpolation='linear', imgspace=False, cmap=None)[source]¶
Creates a colorized version of a heatmap channel suitable for visualization
- Parameters
channel (int | str) – index of category to visualize, or a special code indicating how to visualize multiple classes.
imgspace (bool, default=False) – colorize the image after warping into the image space.
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/heatmap.py _HeatmapDrawMixin.colorize --show
Example
>>> # xdoctest: +REQUIRES(module:kwplot) >>> self = Heatmap.random(rng=0, dims=(32, 32)) >>> colormask1 = self.colorize(0, imgspace=False) >>> colormask2 = self.colorize(0, imgspace=True) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(colormask1, pnum=(1, 2, 1), fnum=1, title='output space') >>> kwplot.imshow(colormask2, pnum=(1, 2, 2), fnum=1, title='image space') >>> kwplot.show_if_requested()
Example
>>> # xdoctest: +REQUIRES(module:kwplot) >>> self = Heatmap.random(rng=0, dims=(32, 32)) >>> colormask1 = self.colorize('diameter', imgspace=False) >>> colormask2 = self.colorize('diameter', imgspace=True) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(colormask1, pnum=(1, 2, 1), fnum=1, title='output space') >>> kwplot.imshow(colormask2, pnum=(1, 2, 2), fnum=1, title='image space') >>> kwplot.show_if_requested()
- Ignore:
>>> # xdoctest: +REQUIRES(module:kwplot) >>> self = Heatmap.random(rng=0, dims=(32, 32)) >>> self.data['class_energy'] = (self.data['class_probs'] - .5) * 10 >>> colormask1 = self.colorize('class_energy_color', imgspace=False) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(colormask1, fnum=1, title='output space') >>> kwplot.show_if_requested()
- draw_stacked(self, image=None, dsize=(224, 224), ignore_class_idxs={}, top=None, chosen_cxs=None)[source]¶
Draws per-class probabilities and stacks them into a single image
Example
>>> # xdoctest: +REQUIRES(module:kwplot) >>> self = Heatmap.random(rng=0, dims=(32, 32)) >>> stacked = self.draw_stacked() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(stacked)
- draw(self, channel=None, image=None, imgspace=None, **kwargs)[source]¶
Accepts the same args as draw_on, but uses matplotlib
- Parameters
channel (int | str) – category index to visualize, or special key
- draw_on(self, image=None, channel=None, invert=False, with_alpha=1.0, interpolation='linear', vecs=False, kpts=None, imgspace=None)[source]¶
Overlays a heatmap channel on top of an image
- Parameters
image (ndarray) – image to draw on, if unspecified one is created.
channel (int | str) – category index to visualize, or special key. Special keys are: class_idx, class_probs
imgspace (bool, default=False) – colorize the image after warping into the image space.
Todo
- [ ] Find a way to visualize offset, diameter, and class_probs
either individually or all at the same time
Example
>>> # xdoctest: +REQUIRES(module:kwplot) >>> import kwarray >>> import kwimage >>> image = kwimage.grab_test_image('astro') >>> probs = kwimage.gaussian_patch(image.shape[0:2])[None, :] >>> probs = probs / probs.max() >>> class_probs = kwarray.ArrayAPI.cat([probs, 1 - probs], axis=0) >>> self = kwimage.Heatmap(class_probs=class_probs, offset=5 * np.random.randn(2, *probs.shape[1:])) >>> toshow = self.draw_on(image, 0, vecs=True, with_alpha=0.85) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(toshow)
Example
>>> # xdoctest: +REQUIRES(module:kwplot) >>> # xdoctest: +REQUIRES(module:ndsampler) >>> import kwimage >>> self = kwimage.Heatmap.random(dims=(200, 200), dets='coco', keypoints=True) >>> image = kwimage.grab_test_image('astro') >>> toshow = self.draw_on(image, 0, vecs=False, with_alpha=0.85) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(toshow)
Example
>>> # xdoctest: +REQUIRES(module:kwplot) >>> # xdoctest: +REQUIRES(module:ndsampler) >>> import kwimage >>> self = kwimage.Heatmap.random(dims=(200, 200), dets='coco', keypoints=True) >>> kpts = [6] >>> self = self.warp(self.tf_data_to_img.params) >>> image = kwimage.grab_test_image('astro') >>> image = kwimage.ensure_alpha_channel(image) >>> toshow = self.draw_on(image, 0, with_alpha=0.85, kpts=kpts) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(toshow)
Example
>>> # xdoctest: +REQUIRES(module:kwplot) >>> # xdoctest: +REQUIRES(module:ndsampler) >>> import kwimage >>> mask = np.random.rand(32, 32) >>> self = kwimage.Heatmap( >>> class_probs=mask, >>> img_dims=mask.shape[0:2], >>> tf_data_to_img=np.eye(3), >>> ) >>> canvas = self.draw_on() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(canvas)
import xdev globals().update(xdev.get_func_kwargs(Heatmap.draw_on))
- class kwimage.structs.heatmap._HeatmapWarpMixin[source]¶
Bases:
object
Mixin methods having to do with warping and aligning heatmaps
- _align_other(self, other)[source]¶
Warp another Heatmap (with the same underlying imgdims) into the same space as this heatmap. This lets us perform elementwise operations on the two heatmaps (like geometric mean).
- Parameters
other (Heatmap) – the heatmap to align with self
- Returns
warped version of other that aligns with self.
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Heatmap.random((120, 130), img_dims=(200, 210), classes=2, nblips=10, rng=0) >>> other = Heatmap.random((60, 70), img_dims=(200, 210), classes=2, nblips=10, rng=1) >>> other2 = self._align_other(other) >>> assert self.shape != other.shape >>> assert self.shape == other2.shape >>> # xdoctest: +REQUIRES(--show) >>> kwplot.autompl() >>> kwplot.imshow(self.colorize(0, imgspace=False), fnum=1, pnum=(3, 2, 1)) >>> kwplot.imshow(self.colorize(1, imgspace=False), fnum=1, pnum=(3, 2, 2)) >>> kwplot.imshow(other.colorize(0, imgspace=False), fnum=1, pnum=(3, 2, 3)) >>> kwplot.imshow(other.colorize(1, imgspace=False), fnum=1, pnum=(3, 2, 4))
- _align(self, mask, interpolation='linear')[source]¶
Align a linear combination of heatmap channels with the original image
DEPRECATED
- upscale(self, channel=None, interpolation='linear')[source]¶
Warp the heatmap with the image dimensions
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Heatmap.random(rng=0, dims=(32, 32)) >>> colormask = self.upscale()
- warp(self, mat=None, input_dims=None, output_dims=None, interpolation='linear', modify_spatial_coords=True, int_interpolation='nearest', mat_is_xy=True, version=None)[source]¶
Warp all spatial maps. If the map contains spatial data, that data is also warped (ignoring the translation component).
- Parameters
mat (ArrayLike) – transformation matrix
input_dims (tuple) – unused, only exists for compatibility
output_dims (tuple) – size of the output heatmap
interpolation (str) – see kwimage.warp_tensor
int_interpolation (str) – interpolation used for integer types (should be nearest)
mat_is_xy (bool, default=True) – set to false if the matrix is in yx space instead of xy space
- Returns
this heatmap warped into a new spatial dimension
- Return type
- Ignore:
# Verify that swapping rows 0 and 1 and then swapping columns 0 and 1
# produces a matrix that works with permuted coordinates. It does.
import sympy
a, b, c, d, e, f, g, h, i, x, y, z = sympy.symbols('a, b, c, d, e, f, g, h, i, x, y, z')
M1 = sympy.Matrix([[a, b, c], [d, e, f], [g, h, i]])
M2 = sympy.Matrix([[e, d, f], [b, a, c], [h, g, i]])
xy = sympy.Matrix([[x], [y], [z]])
yx = sympy.Matrix([[y], [x], [z]])
R1 = M1.multiply(xy)
R2 = M2.multiply(yx)
R3 = sympy.Matrix([[R1[1]], [R1[0]], [R1[2]],])
assert R2 == R3
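The symbolic check above can also be reproduced numerically. The following is a minimal sketch, assuming plain NumPy and a hypothetical helper named swap_xy_matrix (it is not part of kwimage). It illustrates what a matrix expressed in yx space looks like relative to its xy counterpart, which is the situation mat_is_xy=False describes; it is not the library's implementation.

```python
import numpy as np


def swap_xy_matrix(mat):
    """Hypothetical helper: swap rows 0/1, then columns 0/1, of a 3x3 matrix."""
    mat = np.asarray(mat, dtype=float)
    perm = [1, 0, 2]
    return mat[perm][:, perm]


rng = np.random.RandomState(0)
M_xy = rng.rand(3, 3)            # arbitrary matrix acting on (x, y, 1) points
M_yx = swap_xy_matrix(M_xy)      # same transform, acting on (y, x, 1) points

pt_xy = np.array([2.0, 5.0, 1.0])
pt_yx = pt_xy[[1, 0, 2]]         # the same point with x and y swapped

out_xy = M_xy @ pt_xy
out_yx = M_yx @ pt_yx

# The yx result is the xy result with its first two coordinates swapped
assert np.allclose(out_yx, out_xy[[1, 0, 2]])
```

Applying the permutation on both sides is equivalent to conjugating the matrix with the xy/yx permutation, which is why the two results agree.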
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.heatmap import * # NOQA >>> self = Heatmap.random(rng=0, keypoints=True) >>> S = 3.0 >>> mat = np.eye(3) * S >>> mat[-1, -1] = 1 >>> newself = self.warp(mat, np.array(self.dims) * S).numpy() >>> assert newself.offset.shape[0] == 2 >>> assert newself.diameter.shape[0] == 2 >>> f1 = newself.offset.max() / self.offset.max() >>> assert f1 == S >>> f2 = newself.diameter.max() / self.diameter.max() >>> assert f2 == S
Example
>>> import kwimage >>> # xdoctest: +REQUIRES(module:ndsampler) >>> self = kwimage.Heatmap.random(dims=(100, 100), dets='coco', keypoints=True) >>> image = np.zeros(self.img_dims) >>> # xdoctest: +REQUIRES(module:kwplot) >>> toshow = self.draw_on(image, 1, vecs=True, with_alpha=0.85) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(toshow)
- class kwimage.structs.heatmap._HeatmapAlgoMixin[source]¶
Bases:
object
Algorithmic operations on heatmaps
- classmethod combine(cls, heatmaps, root_index=None, dtype=np.float32)[source]¶
Combine multiple heatmaps into a single heatmap.
- Parameters
heatmaps (Sequence[Heatmap]) – multiple heatmaps to combine into one
root_index (int) – which heatmap in the sequence to align other heatmaps with
- Returns
the combined heatmap
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.heatmap import * # NOQA >>> a = Heatmap.random((120, 130), img_dims=(200, 210), classes=2, nblips=10, rng=0) >>> b = Heatmap.random((60, 70), img_dims=(200, 210), classes=2, nblips=10, rng=1) >>> c = Heatmap.random((40, 30), img_dims=(200, 210), classes=2, nblips=10, rng=1) >>> heatmaps = [a, b, c] >>> newself = Heatmap.combine(heatmaps, root_index=2) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(a.colorize(0, imgspace=1), fnum=1, pnum=(4, 2, 1)) >>> kwplot.imshow(a.colorize(1, imgspace=1), fnum=1, pnum=(4, 2, 2)) >>> kwplot.imshow(b.colorize(0, imgspace=1), fnum=1, pnum=(4, 2, 3)) >>> kwplot.imshow(b.colorize(1, imgspace=1), fnum=1, pnum=(4, 2, 4)) >>> kwplot.imshow(c.colorize(0, imgspace=1), fnum=1, pnum=(4, 2, 5)) >>> kwplot.imshow(c.colorize(1, imgspace=1), fnum=1, pnum=(4, 2, 6)) >>> kwplot.imshow(newself.colorize(0, imgspace=1), fnum=1, pnum=(4, 2, 7)) >>> kwplot.imshow(newself.colorize(1, imgspace=1), fnum=1, pnum=(4, 2, 8)) >>> # xdoctest: +REQUIRES(--show) >>> kwplot.imshow(a.colorize('offset', imgspace=1), fnum=2, pnum=(4, 1, 1)) >>> kwplot.imshow(b.colorize('offset', imgspace=1), fnum=2, pnum=(4, 1, 2)) >>> kwplot.imshow(c.colorize('offset', imgspace=1), fnum=2, pnum=(4, 1, 3)) >>> kwplot.imshow(newself.colorize('offset', imgspace=1), fnum=2, pnum=(4, 1, 4)) >>> # xdoctest: +REQUIRES(--show) >>> kwplot.imshow(a.colorize('diameter', imgspace=1), fnum=3, pnum=(4, 1, 1)) >>> kwplot.imshow(b.colorize('diameter', imgspace=1), fnum=3, pnum=(4, 1, 2)) >>> kwplot.imshow(c.colorize('diameter', imgspace=1), fnum=3, pnum=(4, 1, 3)) >>> kwplot.imshow(newself.colorize('diameter', imgspace=1), fnum=3, pnum=(4, 1, 4))
- detect(self, channel, invert=False, min_score=0.01, num_min=10, max_dims=None, min_dims=None, dim_thresh_space='image')[source]¶
Lossy conversion from a Heatmap to a Detections object.
For efficiency, the detections are returned in the same space as the heatmap, which is usually some downsampled version of the image space. This is because it is more efficient to transform the detections into image-space after non-max suppression is applied.
- Parameters
channel (int | ArrayLike[*DIMS]) – class index to detect objects in. Alternatively, channel can be a custom probability map as long as its dimensions agree with the heatmap.
invert (bool, default=False) – if True, inverts the probabilities in the chosen channel. (Useful if you have a background channel but want to detect foreground objects).
min_score (float, default=0.01) – probability threshold required for a pixel to be converted into a detection.
num_min (int, default=10) – always return at least num_min of the highest scoring detections even if they aren’t above the min_score threshold.
max_dims (Tuple[int, int]) – maximum height / width of detections. By default these are expected to be in image-space.
min_dims (Tuple[int, int]) – minimum height / width of detections. By default these are expected to be in image-space.
dim_thresh_space (str, default='image') – When dim_thresh_space=='native', dimension thresholds (e.g. min_dims and max_dims) are specified in the native heatmap space (i.e. usually a downsampled space). If dim_thresh_space=='image', then dimension thresholds are interpreted in the original image space.
- Returns
raw detections.
Note that these detections will not have class_idx populated.
It is the user's responsibility to run non-max suppression on these results to remove duplicate detections.
- Return type
- SeeAlso:
Detections.rasterize
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> from kwimage.structs.heatmap import * # NOQA >>> import ndsampler >>> self = Heatmap.random(rng=2, dims=(32, 32)) >>> dets = self.detect(channel=0, max_dims=7, num_min=None) >>> img_dets = dets.warp(self.tf_data_to_img) >>> assert img_dets.boxes.to_xywh().width.max() <= 7 >>> assert img_dets.boxes.to_xywh().height.max() <= 7 >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets1 = dets.sort().take(range(30)) >>> colormask1 = self.colorize(0, imgspace=False) >>> kwplot.imshow(colormask1, pnum=(1, 2, 1), fnum=1, title='output space') >>> dets1.draw() >>> # Transform heatmap and detections into image space. >>> dets2 = dets1.warp(self.tf_data_to_img) >>> colormask2 = self.colorize(0, imgspace=True) >>> kwplot.imshow(colormask2, pnum=(1, 2, 2), fnum=1, title='image space') >>> dets2.draw()
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> from kwimage.structs.heatmap import * # NOQA >>> import ndsampler >>> catgraph = ndsampler.CategoryTree.demo() >>> class_energy = torch.rand(len(catgraph), 32, 32) >>> class_probs = catgraph.hierarchical_softmax(class_energy, dim=0) >>> self = Heatmap.random(rng=0, dims=(32, 32), classes=catgraph, keypoints=True) >>> print(ub.repr2(ub.map_vals(lambda x: x.shape, self.data), nl=1)) >>> self.data['class_probs'] = class_probs.numpy() >>> channel = catgraph.index('background') >>> dets = self.detect(channel, invert=True) >>> class_idx, scores = catgraph.decision(dets.probs, dim=1) >>> dets.data['class_idx'] = class_idx >>> dets.data['scores'] = scores >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets1 = dets.sort().take(range(10)) >>> colormask1 = self.colorize(0, imgspace=False) >>> kwplot.imshow(colormask1, pnum=(1, 2, 1), fnum=1, title='output space') >>> dets1.draw(radius=1.0) >>> # Transform heatmap and detections into image space. >>> colormask2 = self.colorize(0, imgspace=True) >>> dets2 = dets1.warp(self.tf_data_to_img) >>> kwplot.imshow(colormask2, pnum=(1, 2, 2), fnum=1, title='image space') >>> dets2.draw(radius=1.0)
- class kwimage.structs.heatmap.Heatmap(data=None, meta=None, **kwargs)[source]¶
Bases:
kwimage.structs._generic.Spatial
,_HeatmapDrawMixin
,_HeatmapWarpMixin
,_HeatmapAlgoMixin
Keeps track of a downscaled heatmap and how to transform it to overlay the original input image. Heatmaps generally are used to estimate class probabilities at each pixel. This data structure additionally contains logic to augment pixels with offset (dydx) and scale (diameter) information.
- Variables
data (Dict[str, ArrayLike]) –
dictionary containing spatially aligned heatmap data. Valid keys are as follows.
- class_probs (ArrayLike[C, H, W] | ArrayLike[C, D, H, W]):
A probability map for each class. C is the number of classes.
- offset (ArrayLike[2, H, W] | ArrayLike[3, D, H, W], optional):
object center position offset in y,x / t,y,x coordinates
- diameter (ArrayLike[2, H, W] | ArrayLike[3, D, H, W], optional):
object bounding box sizes in h,w / d,h,w coordinates
- keypoints (ArrayLike[2, K, H, W] | ArrayLike[3, K, D, H, W], optional):
y/x offsets for K different keypoint classes
meta (Dict[str, object]) –
dictionary containing miscellaneous metadata about the heatmap data. Valid keys are as follows.
- img_dims (Tuple[H, W] | Tuple[D, H, W]):
original image dimension
- tf_data_to_img (skimage.transform._geometric.GeometricTransform):
transformation matrix (typically similarity or affine) that projects the given heatmap onto the image dimensions such that the image and heatmap are spatially aligned.
- classes (List[str] | ndsampler.CategoryTree):
information about which index in data['class_probs'] corresponds to which semantic class.
dims (Tuple) – dimensions of the heatmap (see img_dims for the original image dimensions).
**kwargs – any key that is accepted by the data or meta dictionaries can be specified as a keyword argument to this class and it will be properly placed in the appropriate internal dictionary.
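To make the role of the data-to-image transform concrete, here is a minimal sketch of how heatmap coordinates could be projected into image coordinates. It assumes the transform is available as a plain 3x3 affine matrix; the values mirror the scale of 8 and translation of -18 used in the example for this class, and project_points is a hypothetical helper, not a kwimage function.

```python
import numpy as np

# Assumed 3x3 affine matrix: scale by 8, then translate by -18 in x and y
tf_data_to_img = np.array([
    [8.0, 0.0, -18.0],
    [0.0, 8.0, -18.0],
    [0.0, 0.0, 1.0],
])


def project_points(mat, xy):
    """Hypothetical helper: apply a 3x3 matrix to an Nx2 array of xy points."""
    xy = np.asarray(xy, dtype=float)
    homog = np.concatenate([xy, np.ones((len(xy), 1))], axis=1)
    out = homog @ mat.T
    # Divide by the homogeneous coordinate (it is 1 for affine matrices)
    return out[:, :2] / out[:, 2:3]


# Corners and an interior pixel of a 32x32 heatmap, in xy order
heatmap_xy = np.array([[0, 0], [4, 2], [31, 31]])
image_xy = project_points(tf_data_to_img, heatmap_xy)
print(image_xy)
```

Each heatmap pixel lands 8 image pixels apart, so a 32x32 heatmap spans roughly the extent of a 220x220 image once the offset is accounted for.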
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/heatmap.py Heatmap --show
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.heatmap import * # NOQA >>> import kwimage >>> class_probs = kwimage.grab_test_image(dsize=(32, 32), space='gray')[None, ] / 255.0 >>> img_dims = (220, 220) >>> tf_data_to_img = skimage.transform.AffineTransform(translation=(-18, -18), scale=(8, 8)) >>> self = Heatmap(class_probs=class_probs, img_dims=img_dims, >>> tf_data_to_img=tf_data_to_img) >>> aligned = self.upscale() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(aligned[0]) >>> kwplot.show_if_requested()
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> self = Heatmap.random() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self.draw()
- __datakeys__ = ['class_probs', 'offset', 'diameter', 'keypoints', 'class_idx', 'class_energy'][source]¶
- property _impl(self)[source]¶
Returns the internal tensor/numpy ArrayAPI implementation
- Returns
kwarray.ArrayAPI
- classmethod random(cls, dims=(10, 10), classes=3, diameter=True, offset=True, keypoints=False, img_dims=None, dets=None, nblips=10, noise=0.0, rng=None)[source]¶
Creates dummy data, suitable for use in tests and benchmarks
- Parameters
dims (Tuple) – dimensions of the heatmap
img_dims (Tuple) – dimensions of the image the heatmap corresponds to
Example
>>> from kwimage.structs.heatmap import * # NOQA >>> self = Heatmap.random((128, 128), img_dims=(200, 200), >>> classes=3, nblips=10, rng=0, noise=0.1) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(self.colorize(0, imgspace=0), fnum=1, pnum=(1, 4, 1), doclf=1) >>> kwplot.imshow(self.colorize(1, imgspace=0), fnum=1, pnum=(1, 4, 2)) >>> kwplot.imshow(self.colorize(2, imgspace=0), fnum=1, pnum=(1, 4, 3)) >>> kwplot.imshow(self.colorize(3, imgspace=0), fnum=1, pnum=(1, 4, 4))
- Ignore:
self.detect(0).sort().non_max_supress()[-np.arange(1, 4)].draw() from kwimage.structs.heatmap import * # NOQA import xdev globals().update(xdev.get_func_kwargs(Heatmap.random))
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> import kwimage >>> self = kwimage.Heatmap.random(dims=(50, 200), dets='coco', >>> keypoints=True) >>> image = np.zeros(self.img_dims) >>> # xdoctest: +REQUIRES(module:kwplot) >>> toshow = self.draw_on(image, 1, vecs=True, kpts=0, with_alpha=0.85) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(toshow)
- Ignore:
>>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(image) >>> dets.draw() >>> dets.data['keypoints'].draw(radius=6) >>> dets.data['segmentations'].draw()
>>> self.draw()
- kwimage.structs.heatmap._prob_to_dets(probs, diameter=None, offset=None, class_probs=None, keypoints=None, min_score=0.01, num_min=10, max_dims=None, min_dims=None)[source]¶
Directly convert a one-channel probability map into a Detections object.
Helper for Heatmap.detect
It does this by converting each pixel above a threshold in a probability map to a detection with a specified diameter.
- Parameters
probs (ArrayLike[H, W]) – likelihood that each particular pixel should be detected as an object.
diameter (ArrayLike[2, H, W] | Tuple) – H, W sizes for the bounding box at each pixel location. If passed as a tuple, then all boxes receive that diameter.
offset (Tuple | ArrayLike[2, H, W], default=0) – Y, X offsets from the pixel location to the bounding box center. If passed as a tuple, then all boxes receive that offset.
class_probs (ArrayLike[C, H, W], optional) – probabilities for each class at each pixel location. If specified, this will populate the probs attribute of the returned Detections object.
keypoints (ArrayLike[2, K, H, W], optional) – Keypoint predictions for all keypoint classes
min_score (float, default=0.01) – probability threshold required for a pixel to be converted into a detection.
num_min (int, default=10) – always return at least num_min of the highest scoring detections even if they aren’t above the min_score threshold.
- Returns
- raw detections. It is the user's responsibility to
run non-max suppression on these results to remove duplicate detections.
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> rng = np.random.RandomState(0) >>> probs = rng.rand(3, 3).astype(np.float32) >>> min_score = .5 >>> diameter = [10, 10] >>> dets = _prob_to_dets(probs, diameter, min_score=min_score) >>> assert dets.boxes.data.dtype.kind == 'f' >>> assert len(dets) == 9 >>> dets = _prob_to_dets(torch.FloatTensor(probs), diameter, min_score=min_score) >>> assert dets.boxes.data.dtype.is_floating_point >>> assert len(dets) == 9
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> from kwimage.structs.heatmap import * >>> from kwimage.structs.heatmap import _prob_to_dets >>> heatmap = kwimage.Heatmap.random(rng=0, dims=(3, 3), keypoints=True) >>> # Try with numpy >>> min_score = .5 >>> dets = _prob_to_dets(heatmap.class_probs[0], heatmap.diameter, >>> heatmap.offset, heatmap.class_probs, >>> heatmap.data['keypoints'], >>> min_score) >>> assert dets.boxes.data.dtype.kind == 'f' >>> assert 'keypoints' in dets.data >>> dets_np = dets >>> # Try with torch >>> heatmap = heatmap.tensor() >>> dets = _prob_to_dets(heatmap.class_probs[0], heatmap.diameter, >>> heatmap.offset, heatmap.class_probs, >>> heatmap.data['keypoints'], >>> min_score) >>> assert dets.boxes.data.dtype.is_floating_point >>> assert len(dets) == len(dets_np) >>> dets_torch = dets >>> assert np.all(dets_torch.numpy().boxes.data == dets_np.boxes.data)
- Ignore:
import kwil kwil.autompl() dets.draw(setlim=True, radius=.1)
Example
>>> heatmap = Heatmap.random(rng=0, dims=(3, 3), diameter=1) >>> probs = heatmap.class_probs[0] >>> diameter = heatmap.diameter >>> offset = heatmap.offset >>> class_probs = heatmap.class_probs >>> min_score = 0.5 >>> dets = _prob_to_dets(probs, diameter, offset, class_probs, None, min_score)
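The pixel-to-detection conversion described for this function can be sketched with NumPy alone. This is an illustration under stated assumptions, not the kwimage implementation: prob_to_boxes_sketch is a hypothetical name, the threshold is assumed to be a strict comparison, the diameter is assumed to be a single (height, width) tuple, and boxes are returned as center-x, center-y, width, height rows rather than as a Detections object.

```python
import numpy as np


def prob_to_boxes_sketch(probs, diameter, min_score=0.01, num_min=10):
    """Hypothetical stand-in: one box per pixel that passes the threshold."""
    probs = np.asarray(probs, dtype=np.float32)
    # Assumption: a pixel is kept when its probability is strictly above min_score
    flags = probs > min_score
    if num_min is not None and flags.sum() < num_min:
        # Too few pixels passed: fall back to the num_min highest scoring pixels
        k = min(num_min, probs.size)
        top = np.argsort(probs.ravel())[::-1][:k]
        flags = np.zeros(probs.shape, dtype=bool)
        flags.flat[top] = True
    ys, xs = np.where(flags)
    scores = probs[ys, xs]
    # Assumption: every box shares the same (height, width) diameter
    box_h, box_w = diameter
    n = len(xs)
    cxywh = np.stack(
        [xs, ys, np.full(n, box_w), np.full(n, box_h)], axis=1
    ).astype(np.float32)
    return cxywh, scores


rng = np.random.RandomState(0)
probs = rng.rand(3, 3).astype(np.float32)
# With the default num_min=10, all 9 pixels are returned despite min_score=0.5
boxes, scores = prob_to_boxes_sketch(probs, diameter=(10, 10), min_score=0.5)
# With num_min=None, only pixels above the threshold survive
strict_boxes, strict_scores = prob_to_boxes_sketch(
    probs, diameter=(10, 10), min_score=0.5, num_min=None)
```

The fallback branch is why the first example for this function can assert nine detections on a 3x3 map even though some probabilities are below the threshold.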
- kwimage.structs.heatmap.smooth_prob(prob, k=3, inplace=False, eps=1e-09)[source]¶
Smooths the probability map, but preserves the magnitude of the peaks.
Notes
Even if inplace is true, we still need to make a copy of the input array; however, we do ensure that it is cleaned up before we leave the function scope.
sigma=0.8 @ k=3, sigma=1.1 @ k=5, sigma=1.4 @ k=7
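The sigma and kernel-size pairs listed above coincide with the rule OpenCV documents for deriving a Gaussian sigma when only the kernel size is given. Whether smooth_prob relies on that rule internally is an assumption; the sketch below only shows that the formula reproduces the listed values. The function name default_sigma is hypothetical.

```python
def default_sigma(k):
    """Sigma derived from an odd kernel size k, following OpenCV's documented rule."""
    return 0.3 * ((k - 1) * 0.5 - 1) + 0.8


# Round to two decimals to hide floating point noise
sigmas = {k: round(default_sigma(k), 2) for k in (3, 5, 7)}
print(sigmas)
```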
- kwimage.structs.heatmap._remove_translation(tf)[source]¶
Removes the translation component of a transform
Todo
[ ] Is this possible in more general cases? E.g. projective transforms?
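For the affine case, removing the translation component amounts to zeroing the last column of the top two rows. The following is a minimal sketch assuming the transform is given as a 3x3 NumPy matrix; remove_translation_sketch is a hypothetical name, and the sketch does not address the projective case raised in the Todo.

```python
import numpy as np


def remove_translation_sketch(mat):
    """Hypothetical helper: zero the translation terms of a 3x3 affine matrix."""
    mat = np.array(mat, dtype=float)  # np.array copies, so the input is untouched
    mat[0:2, 2] = 0.0
    return mat


# Scale by 3 and translate by (5, 7)
affine = np.array([
    [3.0, 0.0, 5.0],
    [0.0, 3.0, 7.0],
    [0.0, 0.0, 1.0],
])
linear_only = remove_translation_sketch(affine)

# An offset vector should be scaled but not shifted
offset = np.array([1.0, 2.0, 1.0])
warped_offset = linear_only @ offset
```

This matters for data such as offsets and diameters, which encode relative displacements and sizes rather than absolute positions.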
- kwimage.structs.heatmap._gmean(a, axis=0, clobber=False)[source]¶
Compute the geometric mean along the specified axis.
Modification of the scipy.stats.mstats.gmean method to be more memory efficient
- Example
>>> rng = np.random.RandomState(0) >>> C, H, W = 8, 32, 32 >>> axis = 0 >>> a = rng.rand(2, C, H, W) >>> _gmean(a)
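As a reference point, the geometric mean along an axis can be written as the exponential of the mean of the logarithms. The sketch below is a straightforward NumPy version under that definition; gmean_sketch is a hypothetical name and it makes no attempt at the memory optimizations mentioned above.

```python
import numpy as np


def gmean_sketch(a, axis=0):
    """Geometric mean along an axis: exp(mean(log(a)))."""
    a = np.asarray(a, dtype=float)
    return np.exp(np.log(a).mean(axis=axis))


rng = np.random.RandomState(0)
C, H, W = 8, 32, 32
a = rng.rand(2, C, H, W)
result = gmean_sketch(a)

# For two inputs the geometric mean is the square root of their product
expected = np.sqrt(a[0] * a[1])
```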
kwimage.structs.mask
¶Data structure for Binary Masks
Structure for efficient encoding of per-annotation segmentation masks. Based on efficient cython/C code in the cocoapi [1].
References
- 1
https://github.com/nightrome/cocostuffapi/blob/master/PythonAPI/pycocotools/_mask.pyx
- 2
https://github.com/nightrome/cocostuffapi/blob/master/common/maskApi.c
- 3
https://github.com/nightrome/cocostuffapi/blob/master/common/maskApi.h
- 4
https://github.com/nightrome/cocostuffapi/blob/master/PythonAPI/pycocotools/mask.py
- Goals:
The goal of this file is to create a data structure that lets the developer seamlessly convert between:
(1) raw binary uint8 masks
(2) memory-efficient compressed run-length-encodings of binary segmentation masks
(3) convex polygons
(4) convex hull polygons
(5) bounding box
It is not there yet, and the API is subject to change in order to better accomplish these goals.
Notes
IN THIS FILE ONLY: size corresponds to a h/w tuple to be compatible with the coco semantics. Everywhere else in this repo, size uses opencv semantics which are w/h.
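The two orderings in the note above are easy to confuse, so here is a small sketch that derives both from the same array. The variable names coco_size and dsize are chosen for illustration only.

```python
import numpy as np

# A mask with 5 rows (height) and 9 columns (width)
mask = np.zeros((5, 9), dtype=np.uint8)

height, width = mask.shape[0:2]

# COCO semantics, used in this file: size is (height, width)
coco_size = [height, width]

# OpenCV semantics, used elsewhere: size is (width, height)
dsize = (width, height)
```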
Manages a single segmentation mask and can convert to and from |
|
Store and manipulate multiple masks, usually within the same image |
- class kwimage.structs.mask.Mask(data=None, format=None)[source]¶
Bases:
ubelt.NiceRepr
,_MaskConversionMixin
,_MaskConstructorMixin
,_MaskTransformMixin
,_MaskDrawMixin
Manages a single segmentation mask and can convert to and from multiple formats including:
bytes_rle - byte encoded run length encoding
array_rle - raw run length encoding
c_mask - c-style binary mask
f_mask - fortran-style binary mask
Example
>>> # xdoc: +REQUIRES(--mask) >>> # a ms-coco style compressed bytes rle segmentation >>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'} >>> mask = Mask(segmentation, 'bytes_rle') >>> # convert to binary numpy representation >>> binary_mask = mask.to_c_mask().data >>> print(ub.repr2(binary_mask.tolist(), nl=1, nobr=1)) [0, 0, 0, 1, 1, 1, 1, 1, 0], [0, 0, 1, 1, 1, 0, 0, 0, 0], [0, 0, 1, 1, 1, 1, 1, 1, 0], [0, 0, 1, 1, 1, 0, 1, 1, 0], [0, 0, 1, 1, 1, 0, 1, 1, 0],
- classmethod random(Mask, rng=None, shape=(32, 32))[source]¶
Create a random binary mask object
- Parameters
rng (int | RandomState | None) – the random seed
shape (Tuple[int, int]) – the height / width of the returned mask
- Returns
the random mask
- Return type
Example
>>> import kwimage >>> mask = kwimage.Mask.random() >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> mask.draw() >>> kwplot.show_if_requested()
- classmethod demo(cls)[source]¶
Demo mask with holes and disjoint shapes
- Returns
the demo mask
- Return type
- copy(self)[source]¶
Performs a deep copy of the mask data
- Returns
the copied mask
- Return type
Example
>>> self = Mask.random(shape=(8, 8), rng=0) >>> other = self.copy() >>> assert other.data is not self.data
- union(self, *others)[source]¶
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to union
- Returns
the unioned mask
- Return type
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(2)] >>> mask = Mask.union(*masks) >>> print(mask.area) >>> masks = [m.to_c_mask() for m in masks] >>> mask = Mask.union(*masks) >>> print(mask.area)
>>> masks = [m.to_bytes_rle() for m in masks] >>> mask = Mask.union(*masks) >>> print(mask.area)
- Benchmark:
import ubelt as ub
ti = ub.Timerit(100, bestof=10, verbose=2)
masks = [Mask.random(shape=(172, 172), rng=i) for i in range(2)]
for timer in ti.reset('native rle union'):
    masks = [m.to_bytes_rle() for m in masks]
    with timer:
        mask = Mask.union(*masks)
for timer in ti.reset('native cmask union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*masks)
for timer in ti.reset('cmask->rle union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*[m.to_bytes_rle() for m in masks])
- intersection(self, *others)[source]¶
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to intersect
- Returns
the intersection of the masks
- Return type
Example
>>> n = 3 >>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(n)] >>> items = masks >>> mask = Mask.intersection(*masks) >>> areas = [item.area for item in items] >>> print('areas = {!r}'.format(areas)) >>> print(mask.area) >>> print(Mask.intersection(*masks).area / Mask.union(*masks).area)
- property area(self)[source]¶
Returns the number of non-zero pixels
- Returns
the number of non-zero pixels
- Return type
Example
>>> self = Mask.demo() >>> self.area 150
- get_patch(self)[source]¶
Extract the patch with non-zero data
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.random(shape=(8, 8), rng=0) >>> self.get_patch()
- get_xywh(self)[source]¶
Gets the bounding xywh box coordinates of this mask
- Returns
- x, y, w, h: Note we don't use a Boxes object because
a general singular version does not yet exist.
- Return type
ndarray
Example
>>> # xdoc: +REQUIRES(--mask) >>> self = Mask.random(shape=(8, 8), rng=0) >>> self.get_xywh().tolist() >>> self = Mask.random(rng=0).translate((10, 10)) >>> self.get_xywh().tolist()
Example
>>> # test empty case >>> import kwimage >>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask') >>> assert self.get_xywh().tolist() == [0, 0, 0, 0]
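The bounding box of a binary mask can be sketched directly from the coordinates of its non-zero pixels. This is a minimal NumPy illustration, not the library's code: mask_xywh_sketch is a hypothetical name, and the pixel-inclusive width and height (max - min + 1) are an assumption about the convention. The empty case returns zeros, matching the assertion in the example above.

```python
import numpy as np


def mask_xywh_sketch(mask):
    """Hypothetical helper: x, y, w, h of the non-zero region of a 2D mask."""
    ys, xs = np.where(np.asarray(mask))
    if len(xs) == 0:
        # No pixels are on: return an empty box
        return np.array([0, 0, 0, 0])
    x1, y1 = xs.min(), ys.min()
    x2, y2 = xs.max(), ys.max()
    # Assumption: width and height count pixels inclusively
    return np.array([x1, y1, x2 - x1 + 1, y2 - y1 + 1])


data = np.zeros((8, 8), dtype=np.uint8)
data[2:5, 1:7] = 1   # rows 2..4, columns 1..6
xywh = mask_xywh_sketch(data)

empty_xywh = mask_xywh_sketch(np.empty((0, 0), dtype=np.uint8))
```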
- Ignore:
>>> import kwimage >>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask') >>> x_coords = np.array([621, 752]) >>> y_coords = np.array([366, 292]) >>> self.data[y_coords, x_coords] = 1 >>> self.get_xywh()
>>> # References: >>> # https://stackoverflow.com/questions/33281957/faster-alternative-to-numpy-where >>> # https://answers.opencv.org/question/4183/what-is-the-best-way-to-find-bounding-box-for-binary-mask/ >>> import timerit >>> ti = timerit.Timerit(100, bestof=10, verbose=2) >>> for timer in ti.reset('time'): >>> with timer: >>> y_coords, x_coords = np.where(self.data) >>> # >>> for timer in ti.reset('time'): >>> with timer: >>> cv2.findNonZero(data)
self.data = np.random.rand(800, 700) > 0.5
import timerit
ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('time'):
    with timer:
        y_coords, x_coords = np.where(self.data)
#
for timer in ti.reset('time'):
    with timer:
        data = np.ascontiguousarray(self.data).astype(np.uint8)
        cv2_coords = cv2.findNonZero(data)
>>> poly = self.to_multi_polygon()
- get_polygon(self)[source]¶
DEPRECATED: USE to_multi_polygon
Returns a list of (x,y)-coordinate lists. The length of the list is equal to the number of disjoint regions in the mask.
- Returns
- polygon around each connected component of the
mask. Each ndarray is an Nx2 array of xy points.
- Return type
List[ndarray]
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.random(shape=(8, 8), rng=0) >>> polygons = self.get_polygon() >>> print('polygons = ' + ub.repr2(polygons)) >>> polygons = self.get_polygon() >>> self = self.to_bytes_rle() >>> other = Mask.from_polygons(polygons, self.shape) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> image = np.ones(self.shape) >>> image = self.draw_on(image, color='blue') >>> image = other.draw_on(image, color='red') >>> kwplot.imshow(image)
- polygons = [
np.array([[6, 4],[7, 4]], dtype=np.int32), np.array([[0, 1],[0, 3],[2, 3],[2, 1]], dtype=np.int32),
]
- to_mask(self, dims=None)[source]¶
Converts to a mask object (which does nothing because this already is a mask object!)
- Returns
kwimage.Mask
- to_multi_polygon(self)[source]¶
Returns a MultiPolygon object fit around this raster including disjoint pieces and holes.
- Returns
vectorized representation
- Return type
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.demo() >>> self = self.scale(5) >>> multi_poly = self.to_multi_polygon() >>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(--show) >>> self.draw(color='red') >>> multi_poly.scale(1.1).draw(color='blue')
>>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> image = np.ones(self.shape) >>> image = self.draw_on(image, color='blue') >>> #image = other.draw_on(image, color='red') >>> kwplot.imshow(image) >>> multi_poly.draw()
Example
>>> import kwimage >>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask') >>> poly = self.to_multi_polygon() >>> poly.to_multi_polygon()
Example
# Corner case, only two pixels are on >>> import kwimage >>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask') >>> x_coords = np.array([621, 752]) >>> y_coords = np.array([366, 292]) >>> self.data[y_coords, x_coords] = 1 >>> poly = self.to_multi_polygon()
poly.to_mask(self.shape).data.sum()
self.to_array_rle().to_c_mask().data.sum() temp.to_c_mask().data.sum()
Example
>>> # TODO: how do we correctly handle the 1 or 2 point to a poly >>> # case? >>> import kwimage >>> data = np.zeros((8, 8), dtype=np.uint8) >>> data[0, 3:5] = 1 >>> data[7, 3:5] = 1 >>> data[3:5, 0:2] = 1 >>> self = kwimage.Mask.coerce(data) >>> polys = self.to_multi_polygon() >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(data) >>> polys.draw(border=True, linewidth=5, alpha=0.5, radius=0.2)
- get_convex_hull(self)[source]¶
Returns a list of xy points around the convex hull of this mask
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask) >>> self = Mask.random(shape=(8, 8), rng=0) >>> polygons = self.get_convex_hull() >>> print('polygons = ' + ub.repr2(polygons)) >>> other = Mask.from_polygons(polygons, self.shape)
- iou(self, other)[source]¶
The area of intersection over the area of union
Todo
- [ ] Write plural Masks version of this class, which should
be able to perform this operation more efficiently.
- CommandLine:
xdoctest -m kwimage.structs.mask Mask.iou
Example
>>> # xdoc: +REQUIRES(--mask) >>> self = Mask.demo() >>> other = self.translate(1) >>> iou = self.iou(other) >>> print('iou = {:.4f}'.format(iou)) iou = 0.0830 >>> iou2 = self.intersection(other).area / self.union(other).area >>> print('iou2 = {:.4f}'.format(iou2))
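For dense binary arrays, intersection over union reduces to counting pixels. The sketch below is a NumPy illustration of that definition and does not use the run-length-encoded backend; mask_iou_sketch is a hypothetical name, and returning 0.0 when both masks are empty is an assumption.

```python
import numpy as np


def mask_iou_sketch(mask1, mask2):
    """Hypothetical helper: IoU of two binary masks with the same shape."""
    m1 = np.asarray(mask1).astype(bool)
    m2 = np.asarray(mask2).astype(bool)
    union = np.logical_or(m1, m2).sum()
    if union == 0:
        # Assumption: two empty masks have an IoU of zero
        return 0.0
    inter = np.logical_and(m1, m2).sum()
    return float(inter) / float(union)


a = np.zeros((8, 8), dtype=np.uint8)
b = np.zeros((8, 8), dtype=np.uint8)
a[2:6, 2:6] = 1   # 16 pixels
b[3:7, 3:7] = 1   # 16 pixels, shifted by one in each direction

# The overlap is 3x3 = 9 pixels, the union is 16 + 16 - 9 = 23 pixels
iou = mask_iou_sketch(a, b)
```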
- classmethod coerce(Mask, data, dims=None)[source]¶
Attempts to auto-inspect the format of the data and convert it to a Mask
- Parameters
data – the data to coerce
dims (Tuple) – the height / width of the source image; required for certain formats like polygons
- Returns
the constructed mask object
- Return type
Example
>>> # xdoc: +REQUIRES(--mask) >>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'} >>> polygon = [ >>> [np.array([[3, 0],[2, 1],[2, 4],[4, 4],[4, 3],[7, 0]])], >>> [np.array([[2, 1],[2, 2],[4, 2],[4, 1]])], >>> ] >>> dims = (9, 5) >>> mask = (np.random.rand(32, 32) > .5).astype(np.uint8) >>> Mask.coerce(polygon, dims).to_bytes_rle() >>> Mask.coerce(segmentation).to_bytes_rle() >>> Mask.coerce(mask).to_bytes_rle()
- to_coco(self, style='orig')[source]¶
Convert the Mask to a COCO json representation based on the current format.
A COCO mask is formatted as a run-length-encoding (RLE), of which there are two variants: (1) an array RLE, which is slightly more readable and extensible, and (2) a bytes RLE, which is slightly more concise. The returned format will depend on the current format of the Mask object. If it is in “bytes_rle” format, it will be returned in that format, otherwise it will be converted to the “array_rle” format and returned as such.
- Parameters
style (str) – Does nothing for this particular method, exists for API compatibility and if alternate encoding styles are implemented in the future.
- Returns
- either a bytes-rle or array-rle encoding, depending
on the current mask format. The keys in this dictionary are as follows:
counts (List[int] | str): the array or bytes rle encoding
- size (Tuple[int]): the height and width of the encoded mask
see note.
- shape (Tuple[int]): only present in array-rle mode. This
is also the height/width of the underlying encoded array. This exists for semantic consistency with other kwimage conventions, and is not part of the original coco spec.
- order (str): only present in array-rle mode.
Either C or F, indicating if counts is arranged in row-major or column-major order. For COCO-compatibility this is always returned in F (column-major) order.
- binary (bool): only present in array-rle mode.
For COCO-compatibility this is always returned as True, indicating the mask only contains binary 0 or 1 values.
- Return type
Note
The output dictionary will contain a key named “size”, this is the only location in kwimage where “size” refers to a tuple in (height/width) order, in order to be backwards compatible with the original coco spec. In all other locations in kwimage a “size” will refer to a (width/height) ordered tuple.
- SeeAlso:
kwimage.im_runlen.encode_run_length - backend function that does array-style run length encoding.
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.demo() >>> coco_data1 = self.toformat('array_rle').to_coco() >>> coco_data2 = self.toformat('bytes_rle').to_coco() >>> print('coco_data1 = {}'.format(ub.repr2(coco_data1, nl=1))) >>> print('coco_data2 = {}'.format(ub.repr2(coco_data2, nl=1))) coco_data1 = { 'binary': True, 'counts': [47, 5, 3, 1, 14, ... 1, 4, 19, 141], 'order': 'F', 'shape': (23, 32), 'size': (23, 32), } coco_data2 = { 'counts': '_153L;4EL...ON3060L0N060L0Nb0Y4', 'size': [23, 32], }
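For readers unfamiliar with the array-style encoding, the following sketch shows the idea: flatten the mask in column-major (F) order and record alternating run lengths, starting with the run of zeros (which may have length 0). This is a pure-Python illustration, not the `encode_run_length` backend; `encode_rle_column_major` is a hypothetical name and only the `counts` and `size` keys are produced.

```python
def encode_rle_column_major(mask):
    h = len(mask)
    w = len(mask[0])
    # Column-major flattening: walk down each column in turn
    flat = [1 if mask[r][c] else 0 for c in range(w) for r in range(h)]
    counts = []
    current = 0  # runs alternate 0s, 1s, 0s, ... starting with zeros
    run = 0
    for v in flat:
        if v == current:
            run += 1
        else:
            counts.append(run)
            current = v
            run = 1
    counts.append(run)
    # "size" is (height, width), matching the note above
    return {'counts': counts, 'size': [h, w]}


mask = [[0, 1],
        [1, 1],
        [0, 0]]
rle = encode_rle_column_major(mask)
```

The run lengths always sum to height * width, which is a cheap sanity check when decoding.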
- class kwimage.structs.mask.MaskList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Store and manipulate multiple masks, usually within the same image
- to_polygon_list(self)[source]¶
Converts all mask objects to multi-polygon objects
- Returns
kwimage.PolygonList
kwimage.structs.points¶
Points | Stores multiple keypoints for a single object. |
PointsList | Stores a list of Points, each item usually corresponds to a different object. |
- class kwimage.structs.points._PointsWarpMixin[source]¶
- _warp_imgaug(self, augmenter, input_dims, inplace=False)[source]¶
Warps by applying an augmenter from the imgaug library
- Parameters
augmenter (imgaug.augmenters.Augmenter)
input_dims (Tuple) – h/w of the input image
inplace (bool, default=False) – if True, modifies data inplace
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.points import * # NOQA >>> import imgaug >>> input_dims = (10, 10) >>> self = Points.random(10).scale(input_dims) >>> augmenter = imgaug.augmenters.Fliplr(p=1) >>> new = self._warp_imgaug(augmenter, input_dims)
>>> self = Points(xy=(np.random.rand(10, 2) * 10).astype(int)) >>> augmenter = imgaug.augmenters.Fliplr(p=1) >>> new = self._warp_imgaug(augmenter, input_dims)
>>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> plt = kwplot.autoplt() >>> kwplot.figure(fnum=1, doclf=True) >>> ax = plt.gca() >>> ax.set_xlim(0, 10) >>> ax.set_ylim(0, 10) >>> self.draw(color='red', alpha=.4, radius=0.1) >>> new.draw(color='blue', alpha=.4, radius=0.1)
- to_imgaug(self, input_dims)[source]¶
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.points import * # NOQA >>> pts = Points.random(10) >>> input_dims = (10, 10) >>> kpoi = pts.to_imgaug(input_dims)
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)[source]¶
Generalized coordinate transform.
- Parameters
transform (GeometricTransform | ArrayLike | Augmenter | callable) – scikit-image transform, a 3x3 transformation matrix, an imgaug Augmenter, or generic callable which transforms an NxD ndarray.
input_dims (Tuple) – shape of the image these objects correspond to (only needed / used when transform is an imgaug augmenter)
output_dims (Tuple) – unused, only exists for compatibility
inplace (bool, default=False) – if True, modifies data inplace
Example
>>> from kwimage.structs.points import * # NOQA >>> import skimage.transform >>> self = Points.random(10, rng=0) >>> transform = skimage.transform.AffineTransform(scale=(2, 2)) >>> new = self.warp(transform) >>> assert np.all(new.xy == self.scale(2).xy)
- Doctest:
>>> self = Points.random(10, rng=0) >>> assert np.all(self.warp(np.eye(3)).xy == self.xy) >>> assert np.all(self.warp(np.eye(2)).xy == self.xy)
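The 3x3 matrix case can be sketched in plain Python: each point is lifted to homogeneous coordinates, multiplied by the matrix, and divided by the resulting w. This illustrates the math only and is not kwimage's code path; `warp_points` is a hypothetical helper.

```python
def warp_points(xy, matrix):
    # matrix is a 3x3 nested list; xy is a list of (x, y) pairs
    out = []
    for x, y in xy:
        xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        out.append((xh / w, yh / w))
    return out


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
scale2 = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]
pts = [(1.0, 2.0), (0.5, 0.25)]
same = warp_points(pts, identity)
doubled = warp_points(pts, scale2)
```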
- scale(self, factor, output_dims=None, inplace=False)[source]¶
Scale the points by a factor
- Parameters
factor (float or Tuple[float, float]) – scale factor as either a scalar or a (sf_x, sf_y) tuple.
output_dims (Tuple) – unused in non-raster spatial structures
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10, rng=0) >>> new = self.scale(10) >>> assert new.xy.max() <= 10
- translate(self, offset, output_dims=None, inplace=False)[source]¶
Shift the points
- Parameters
offset (float or Tuple[float]) – translation amount as either a scalar or a (t_x, t_y) tuple.
output_dims (Tuple) – unused in non-raster spatial structures
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10, rng=0) >>> new = self.translate(10) >>> assert new.xy.min() >= 10 >>> assert new.xy.max() <= 11
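Both scale and translate accept either a scalar or an (x, y) pair. A minimal sketch of that convention on plain lists of points follows; the helpers `_as_pair`, `scale_points`, and `translate_points` are hypothetical and only mirror the documented argument forms.

```python
def _as_pair(value):
    # A scalar applies to both axes; a pair is (x, y)
    if isinstance(value, (int, float)):
        return (float(value), float(value))
    vx, vy = value
    return (float(vx), float(vy))


def scale_points(xy, factor):
    sx, sy = _as_pair(factor)
    return [(x * sx, y * sy) for x, y in xy]


def translate_points(xy, offset):
    tx, ty = _as_pair(offset)
    return [(x + tx, y + ty) for x, y in xy]


pts = [(0.5, 0.25)]
scaled = scale_points(pts, (2, 4))
shifted = translate_points(pts, 10)
```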
- class kwimage.structs.points.Points(data=None, meta=None, datakeys=None, metakeys=None, **kwargs)[source]¶
Bases:
kwimage.structs._generic.Spatial, _PointsWarpMixin
Stores multiple keypoints for a single object.
This stores both the geometry and the class metadata if available
- Ignore:
- meta = {
"names": ['head', 'nose', 'tail'], "skeleton": [(0, 1), (0, 2)],
}
Example
>>> from kwimage.structs.points import * # NOQA >>> xy = np.random.rand(10, 2) >>> pts = Points(xy=xy) >>> print('pts = {!r}'.format(pts))
- classmethod random(Points, num=1, classes=None, rng=None)[source]¶
Makes random points; typically for testing purposes
Example
>>> import kwimage >>> self = kwimage.Points.random(classes=[1, 2, 3]) >>> self.data >>> print('self.data = {!r}'.format(self.data))
- tensor(self, device=ub.NoParam)[source]¶
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10) >>> self.tensor()
- round(self, inplace=False)[source]¶
Rounds data to the nearest integer
- Parameters
inplace (bool, default=False) – if True, modifies this object
Example
>>> import kwimage >>> self = kwimage.Points.random(3).scale(10) >>> self.round()
- numpy(self)[source]¶
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10) >>> self.tensor().numpy().tensor().numpy()
- draw_on(self, image, color='white', radius=None, copy=False)[source]¶
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/points.py Points.draw_on --show
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> s = 128 >>> image = np.zeros((s, s)) >>> self = Points.random(10).scale(s) >>> image = self.draw_on(image) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5) >>> kwplot.show_if_requested()
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> s = 128 >>> image = np.zeros((s, s)) >>> self = Points.random(10).scale(s) >>> image = self.draw_on(image, radius=3, color='distinct') >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5, color='classes') >>> kwplot.show_if_requested()
Example
>>> import kwimage >>> import numpy as np >>> s = 32 >>> self = kwimage.Points.random(10).scale(s) >>> color = 'blue' >>> # Test drawing on all channel + dtype combinations >>> im3 = np.zeros((s, s, 3), dtype=np.float32) >>> im_chans = { >>> 'im3': im3, >>> 'im1': kwimage.convert_colorspace(im3, 'rgb', 'gray'), >>> 'im4': kwimage.convert_colorspace(im3, 'rgb', 'rgba'), >>> } >>> inputs = {} >>> for k, im in im_chans.items(): >>> inputs[k + '_01'] = (kwimage.ensure_float01(im.copy()), {'radius': None}) >>> inputs[k + '_255'] = (kwimage.ensure_uint255(im.copy()), {'radius': None}) >>> outputs = {} >>> for k, v in inputs.items(): >>> im, kw = v >>> outputs[k] = self.draw_on(im, color=color, **kw) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=2, doclf=True) >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nCols=2, nRows=len(inputs)) >>> for k in inputs.keys(): >>> kwplot.imshow(inputs[k][0], fnum=2, pnum=pnum_(), title=k) >>> kwplot.imshow(outputs[k], fnum=2, pnum=pnum_(), title=k) >>> kwplot.show_if_requested()
- draw(self, color='blue', ax=None, alpha=None, radius=1, **kwargs)[source]¶
TODO: can use kwplot.draw_points
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> pts = Points.random(10) >>> # xdoc: +REQUIRES(--show) >>> pts.draw(radius=0.01)
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10, classes=['a', 'b', 'c']) >>> self.draw(radius=0.01, color='classes')
- compress(self, flags, axis=0, inplace=False)[source]¶
Filters items based on a boolean criterion
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4) >>> flags = [1, 0, 1, 1] >>> other = self.compress(flags) >>> assert len(self) == 4 >>> assert len(other) == 3
>>> # xdoctest: +REQUIRES(module:torch) >>> other = self.tensor().compress(flags) >>> assert len(other) == 3
- take(self, indices, axis=0, inplace=False)[source]¶
Takes a subset of items at specific indices
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4) >>> indices = [1, 3] >>> other = self.take(indices) >>> assert len(self) == 4 >>> assert len(other) == 2
>>> # xdoctest: +REQUIRES(module:torch) >>> other = self.tensor().take(indices) >>> assert len(other) == 2
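The difference between compress (boolean flags, one per item) and take (explicit indices) can be sketched on a plain list. The two functions below are hypothetical stand-ins that only mirror the selection semantics along axis 0.

```python
def compress(items, flags):
    # Keep items whose corresponding flag is truthy
    return [item for item, flag in zip(items, flags) if flag]


def take(items, indices):
    # Keep items at the given positions, in the given order
    return [items[i] for i in indices]


points = [(0, 0), (1, 1), (2, 2), (3, 3)]
kept = compress(points, [1, 0, 1, 1])
picked = take(points, [1, 3])
```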
- to_coco(self, style='orig')[source]¶
Converts to an mscoco-like representation
Note
items that are usually id-references to other objects may need to be rectified.
- Parameters
style (str) – either orig, new, new-id, or new-name
- Returns
mscoco-like representation
- Return type
Dict
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4, classes=['a', 'b']) >>> orig = self.to_coco(style='orig') >>> print('orig = {!r}'.format(orig)) >>> new_name = self.to_coco(style='new-name') >>> print('new_name = {}'.format(ub.repr2(new_name, nl=-1))) >>> # xdoctest: +REQUIRES(module:ndsampler) >>> import ndsampler >>> self.meta['classes'] = ndsampler.CategoryTree.coerce(self.meta['classes']) >>> new_id = self.to_coco(style='new-id') >>> print('new_id = {}'.format(ub.repr2(new_id, nl=-1)))
- classmethod from_coco(cls, coco_kpts, class_idxs=None, classes=None, warn=False)[source]¶
- Parameters
coco_kpts (list | dict) – either the original list keypoint encoding or the new dict keypoint encoding.
class_idxs (list) – only needed if using old style
classes (list | CategoryTree) – list of all keypoint category names
warn (bool, default=False) – if True raise warnings
Example
>>> from kwimage.structs.points import * # NOQA >>> import ubelt as ub >>> classes = ['mouth', 'left-hand', 'right-hand'] >>> coco_kpts = [ >>> {'xy': (0, 0), 'visible': 2, 'keypoint_category': 'left-hand'}, >>> {'xy': (1, 2), 'visible': 2, 'keypoint_category': 'mouth'}, >>> ] >>> Points.from_coco(coco_kpts, classes=classes) >>> # Test without classes >>> Points.from_coco(coco_kpts) >>> # Test without any category info >>> coco_kpts2 = [ub.dict_diff(d, {'keypoint_category'}) for d in coco_kpts] >>> Points.from_coco(coco_kpts2) >>> # Test with category instead of keypoint_category >>> coco_kpts3 = [ub.map_keys(lambda x: x.replace('keypoint_', ''), d) for d in coco_kpts] >>> Points.from_coco(coco_kpts3) >>> # >>> # Old style >>> coco_kpts = [0, 0, 2, 0, 1, 2] >>> Points.from_coco(coco_kpts) >>> # Fail case >>> coco_kpts4 = [{'xy': [4686.5, 1341.5], 'category': 'dot'}] >>> Points.from_coco(coco_kpts4, classes=[])
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> import ndsampler >>> classes = ndsampler.CategoryTree.from_coco([ >>> {'name': 'mouth', 'id': 2}, {'name': 'left-hand', 'id': 3}, {'name': 'right-hand', 'id': 5} >>> ]) >>> coco_kpts = [ >>> {'xy': (0, 0), 'visible': 2, 'keypoint_category_id': 5}, >>> {'xy': (1, 2), 'visible': 2, 'keypoint_category_id': 2}, >>> ] >>> pts = Points.from_coco(coco_kpts, classes=classes) >>> assert pts.data['class_idxs'].tolist() == [2, 0]
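The old-style encoding used above (`[0, 0, 2, 0, 1, 2]`) is the original COCO flat list of `x, y, v` triples, where `v` is the visibility flag. A small sketch of how such a list splits into coordinates and flags is shown below; `split_flat_keypoints` is a hypothetical helper and says nothing about how `from_coco` stores the result internally.

```python
def split_flat_keypoints(flat):
    # Original COCO keypoints are stored as [x1, y1, v1, x2, y2, v2, ...]
    if len(flat) % 3 != 0:
        raise ValueError('expected a multiple of 3 values')
    xy = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 3)]
    visible = [flat[i + 2] for i in range(0, len(flat), 3)]
    return xy, visible


xy, visible = split_flat_keypoints([0, 0, 2, 0, 1, 2])
```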
- class kwimage.structs.points.PointsList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Stores a list of Points, each item usually corresponds to a different object.
Notes
TODO: when the data is homogeneous we can use a more efficient representation, otherwise we have to use heterogeneous storage.
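kwimage.structs.polygon¶
Polygon | Represents a single polygon as a set of exterior boundary points and a list of internal polygons representing holes. |
MultiPolygon | Data structure for storing multiple polygons (typically related to the same underlying but potentially disjoint object) |
PolygonList | Stores and allows manipulation of multiple polygons, usually within the same image. |
_is_clockwise(verts) | References |
_order_vertices(verts) | References |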
- class kwimage.structs.polygon._PolyArrayBackend[source]¶
- class kwimage.structs.polygon._PolyWarpMixin[source]¶
- _warp_imgaug(self, augmenter, input_dims, inplace=False)[source]¶
Warps by applying an augmenter from the imgaug library
- Parameters
augmenter (imgaug.augmenters.Augmenter)
input_dims (Tuple) – h/w of the input image
inplace (bool, default=False) – if True, modifies data inplace
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.polygon import * # NOQA >>> import imgaug >>> input_dims = np.array((10, 10)) >>> self = Polygon.random(10, n_holes=1, rng=0).scale(input_dims) >>> augmenter = imgaug.augmenters.Fliplr(p=1) >>> new = self._warp_imgaug(augmenter, input_dims) >>> assert np.allclose(self.data['exterior'].data[:, 1], new.data['exterior'].data[:, 1]) >>> assert np.allclose(input_dims[0] - self.data['exterior'].data[:, 0], new.data['exterior'].data[:, 0])
>>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> from matplotlib import pyplot as plt >>> ax = plt.gca() >>> ax.set_xlim(0, 10) >>> ax.set_ylim(0, 10) >>> self.draw(color='red', alpha=.4) >>> new.draw(color='blue', alpha=.4)
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)[source]¶
Generalized coordinate transform.
- Parameters
transform (GeometricTransform | ArrayLike | Augmenter | callable) – scikit-image transform, a 3x3 transformation matrix, an imgaug Augmenter, or generic callable which transforms an NxD ndarray.
input_dims (Tuple) – shape of the image these objects correspond to (only needed / used when transform is an imgaug augmenter)
output_dims (Tuple) – unused, only exists for compatibility
inplace (bool, default=False) – if True, modifies data inplace
Example
>>> from kwimage.structs.polygon import * # NOQA >>> import skimage.transform >>> self = Polygon.random() >>> transform = skimage.transform.AffineTransform(scale=(2, 2)) >>> new = self.warp(transform)
- Doctest:
>>> # xdoctest: +REQUIRES(module:imgaug) >>> self = Polygon.random() >>> import imgaug >>> augmenter = imgaug.augmenters.Fliplr(p=1) >>> new = self.warp(augmenter, input_dims=(1, 1)) >>> print('new = {!r}'.format(new.data)) >>> print('self = {!r}'.format(self.data)) >>> #assert np.all(self.warp(np.eye(3)).exterior == self.exterior) >>> #assert np.all(self.warp(np.eye(2)).exterior == self.exterior)
- scale(self, factor, about=None, output_dims=None, inplace=False)[source]¶
Scale a polygon by a factor
- Parameters
factor (float or Tuple[float, float]) – scale factor as either a scalar or a (sf_x, sf_y) tuple.
about (Tuple | None) – if unspecified scales about the origin (0, 0), otherwise the scaling is about this point.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(10, rng=0) >>> new = self.scale(10)
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(10, rng=0).translate((0.5)) >>> new = self.scale(1.5, about='center') >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> self.draw(color='red', alpha=0.5) >>> new.draw(color='blue', alpha=0.5, setlim=True)
- translate(self, offset, output_dims=None, inplace=False)[source]¶
Shift the polygon up/down left/right
- Parameters
offset (float or Tuple[float]) – translation amount as either a scalar or a (t_x, t_y) tuple.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(10, rng=0) >>> new = self.translate(10)
- rotate(self, theta, about=None, output_dims=None, inplace=False)[source]¶
Rotate the polygon
- Parameters
theta (float) – rotation angle in radians
about (Tuple | None | str) – if unspecified rotates about the origin (0, 0). If “center” then rotate around the center of this polygon. Otherwise the rotation is about a custom specified point.
output_dims (Tuple) – unused in non-raster spatial structures
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(10, rng=0) >>> new = self.rotate(np.pi / 2, about='center') >>> new2 = self.rotate(np.pi / 2) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> self.draw(color='red', alpha=0.5) >>> new.draw(color='blue', alpha=0.5)
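Rotating about a point other than the origin amounts to shifting that point to the origin, rotating, and shifting back. The sketch below shows this on a plain list of vertices; `rotate_points` is a hypothetical helper, and the pivot must be passed explicitly (how kwimage computes "center" is not described here). Note that in image coordinates, where y points down, a positive angle appears clockwise on screen.

```python
import math


def rotate_points(xy, theta, about=(0.0, 0.0)):
    # theta is in radians; about is the pivot point
    cx, cy = about
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for x, y in xy:
        dx, dy = x - cx, y - cy
        out.append((cx + c * dx - s * dy, cy + s * dx + c * dy))
    return out


about_origin = rotate_points([(1.0, 0.0)], math.pi / 2)
about_pivot = rotate_points([(2.0, 1.0)], math.pi / 2, about=(1.0, 1.0))
```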
- class kwimage.structs.polygon.Polygon(data=None, meta=None, datakeys=None, metakeys=None, **kwargs)[source]¶
Bases:
kwimage.structs._generic.Spatial, _PolyArrayBackend, _PolyWarpMixin, ubelt.NiceRepr
Represents a single polygon as a set of exterior boundary points and a list of internal polygons representing holes.
By convention exterior boundaries should be counterclockwise and interior holes should be clockwise.
Example
>>> import kwimage >>> data = { >>> 'exterior': np.array([[13, 1], [13, 19], [25, 19], [25, 1]]), >>> 'interiors': [ >>> np.array([[13, 13], [14, 12], [24, 12], [25, 13], [25, 18], >>> [24, 19], [14, 19], [13, 18]]), >>> np.array([[13, 2], [14, 1], [24, 1], [25, 2], [25, 11], >>> [24, 12], [14, 12], [13, 11]])] >>> } >>> self = kwimage.Polygon(**data) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self.draw(setlim=True)
Example
>>> import kwimage >>> self = kwimage.Polygon.random( >>> n=5, n_holes=1, convex=False, rng=0) >>> print('self = {}'.format(self)) self = <Polygon({ 'exterior': <Coords(data= array([[0.30371392, 0.97195856], [0.24372304, 0.60568445], [0.21408694, 0.34884262], [0.5799477 , 0.44020379], [0.83720288, 0.78367234]]))>, 'interiors': [<Coords(data= array([[0.50164209, 0.83520279], [0.25835064, 0.40313428], [0.28778562, 0.74758761], [0.30341266, 0.93748088]]))>], })> >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self.draw(setlim=True)
- classmethod circle(cls, xy, r, resolution=64)[source]¶
Create a circular polygon
Example
>>> xy = (0.5, 0.5) >>> r = .3 >>> poly = Polygon.circle(xy, r)
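A circular polygon is simply a ring of points sampled at evenly spaced angles. The sketch below assumes `resolution` is the number of distinct vertices and that the first vertex is not repeated at the end; neither detail is stated above, and `circle_polygon` is a hypothetical stand-in for illustration.

```python
import math


def circle_polygon(xy, r, resolution=64):
    # Sample `resolution` points counterclockwise (in a y-up frame)
    cx, cy = xy
    step = 2 * math.pi / resolution
    return [(cx + r * math.cos(i * step), cy + r * math.sin(i * step))
            for i in range(resolution)]


ring = circle_polygon((0.5, 0.5), 0.3)
```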
- classmethod random(cls, n=6, n_holes=0, convex=True, tight=False, rng=None)[source]¶
- Parameters
n (int) – number of points in the polygon (must be 3 or more)
n_holes (int) – number of holes
tight (bool, default=False) – fits the minimum and maximum points between 0 and 1
convex (bool, default=True) – if True, forces the resulting polygon to be convex (may remove exterior points)
- CommandLine:
xdoctest -m kwimage.structs.polygon Polygon.random
Example
>>> rng = None >>> n = 4 >>> n_holes = 1 >>> cls = Polygon >>> self = Polygon.random(n=n, rng=rng, n_holes=n_holes, convex=1) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> self.draw()
References
https://gis.stackexchange.com/questions/207731/random-multipolygon
https://stackoverflow.com/questions/8997099/random-polygon
https://stackoverflow.com/questions/27548363/from-voronoi-tessellation-to-shapely-polygons
https://stackoverflow.com/questions/8997099/algorithm-to-generate-random-2d-polygon
- to_mask(self, dims=None)[source]¶
Convert this polygon to a mask
Todo
[ ] currently not efficient
- Parameters
dims (Tuple) – height and width of the output mask
- Returns
kwimage.Mask
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1).scale(128) >>> mask = self.to_mask((128, 128)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> mask.draw(color='blue') >>> mask.to_multi_polygon().draw(color='red', alpha=.5)
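Conceptually, rasterizing a polygon with holes marks every pixel that lies inside the exterior ring and outside all interior rings. The slow but simple sketch below tests each pixel center with an even-odd ray cast. Sampling at pixel centers is an assumption of this sketch, the helper names are hypothetical, and an optimized rasterizer may treat boundary pixels differently.

```python
def point_in_ring(x, y, ring):
    # Even-odd rule: count edge crossings of a ray going in +x from (x, y)
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(exterior, interiors, dims):
    h, w = dims  # dims is (height, width)
    mask = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            x, y = c + 0.5, r + 0.5  # pixel center
            if point_in_ring(x, y, exterior) and not any(
                    point_in_ring(x, y, hole) for hole in interiors):
                mask[r][c] = 1
    return mask


exterior = [(1, 1), (5, 1), (5, 5), (1, 5)]
hole = [(2, 2), (4, 2), (4, 4), (2, 4)]
mask = rasterize(exterior, [hole], dims=(6, 6))
```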
- to_relative_mask(self)[source]¶
Returns a translated mask such that the mask dimensions are minimal.
In other words, we move the polygon all the way to the top-left and return a mask just big enough to fit the polygon.
- Returns
kwimage.Mask
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random().scale(8).translate((100, 100)) >>> mask = self.to_relative_mask() >>> assert mask.shape <= (8, 8) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> mask.draw(color='blue') >>> mask.to_multi_polygon().draw(color='red', alpha=.5)
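The "move to the top-left" step can be sketched as: find the polygon's bounding extent, subtract the minimum corner from every vertex, and size the mask to the remaining extent. The rounding policy below (floor the minimum, ceil the maximum) is an assumption of this sketch, and `relative_frame` is a hypothetical helper.

```python
import math


def relative_frame(exterior):
    xs = [x for x, _ in exterior]
    ys = [y for _, y in exterior]
    # Assumption: snap the offset to whole pixels
    x_min, y_min = math.floor(min(xs)), math.floor(min(ys))
    width = math.ceil(max(xs)) - x_min
    height = math.ceil(max(ys)) - y_min
    shifted = [(x - x_min, y - y_min) for x, y in exterior]
    return shifted, (height, width)


exterior = [(100.5, 102.0), (104.0, 102.0), (104.0, 107.2)]
shifted, dims = relative_frame(exterior)
```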
- fill(self, image, value=1)[source]¶
Fill in an image in place based on this polygon.
- Parameters
image (ndarray) – image to draw on
value (int | Tuple[int], default=1) – value to fill in with
- Returns
the image that has been modified in place
- Return type
ndarray
- _to_cv_countours(self)[source]¶
OpenCV polygon representation, which is a list of points. Holes are represented implicitly: they appear when another polygon is drawn over an existing polygon via cv2.fillPoly.
- Returns
- where each ndarray is of shape [N, 1, 2],
where N is the number of points on the boundary, the middle dimension is always 1, and the trailing dimension represents x and y coordinates respectively.
- Return type
List[ndarray]
- classmethod coerce(Polygon, data)[source]¶
Try to automatically determine the format of the input polygon and coerce it into a kwimage.Polygon.
- Parameters
data (object) – some type of data that can be interpreted as a polygon.
- Returns
kwimage.Polygon
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> self.coerce(self) >>> self.coerce(self.exterior) >>> self.coerce(self.exterior.data) >>> self.coerce(self.data) >>> self.coerce(self.to_geojson())
- classmethod from_shapely(Polygon, geom)[source]¶
Convert a shapely polygon to a kwimage.Polygon
- Parameters
geom (shapely.geometry.polygon.Polygon) – a shapely polygon
- Returns
kwimage.Polygon
- classmethod from_wkt(Polygon, data)[source]¶
Convert a WKT string to a kwimage.Polygon
- Parameters
data (str) – a WKT polygon string
- Returns
kwimage.Polygon
Example
>>> import kwimage >>> data = 'POLYGON ((0.11 0.61, 0.07 0.588, 0.015 0.50, 0.11 0.61))' >>> self = kwimage.Polygon.from_wkt(data) >>> assert len(self.exterior) == 4
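For orientation, a WKT polygon string is a parenthesized list of rings, each a comma-separated list of `x y` pairs, with the first ring being the exterior. The minimal parser below handles only this simple 2D `POLYGON` form and is not how kwimage reads WKT; `parse_wkt_polygon` is a hypothetical helper, and a real application would normally delegate to a geometry library.

```python
def parse_wkt_polygon(text):
    text = text.strip()
    prefix = 'POLYGON'
    if not text.upper().startswith(prefix):
        raise ValueError('not a POLYGON string')
    body = text[len(prefix):].strip()
    if not (body.startswith('((') and body.endswith('))')):
        raise ValueError('malformed ring list')
    rings = []
    # Rings are separated by "),"; strip the leftover "(" of later rings
    for ring_text in body[2:-2].split('),'):
        ring_text = ring_text.strip().lstrip('(')
        ring = [tuple(float(v) for v in pair.split())
                for pair in ring_text.split(',')]
        rings.append(ring)
    return rings[0], rings[1:]


data = 'POLYGON ((0.11 0.61, 0.07 0.588, 0.015 0.50, 0.11 0.61))'
exterior, holes = parse_wkt_polygon(data)
```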
- classmethod from_geojson(Polygon, data_geojson)[source]¶
Convert a geojson polygon to a kwimage.Polygon
- Parameters
data_geojson (dict) – geojson data
References
https://geojson.org/geojson-spec.html
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=2) >>> data_geojson = self.to_geojson() >>> new = Polygon.from_geojson(data_geojson)
- to_shapely(self)[source]¶
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(module:shapely) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1) >>> self = self.scale(100) >>> geom = self.to_shapely() >>> print('geom = {!r}'.format(geom))
- to_geojson(self)[source]¶
Converts polygon to a geojson structure
- Returns
Dict[str, object]
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> print(self.to_geojson())
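For reference, the GeoJSON specification represents a polygon as a dict with `type` "Polygon" and `coordinates` holding a list of closed rings, exterior first and holes after. The sketch below builds such a dict from plain point lists. It follows the specification rather than kwimage's exact output, and `polygon_to_geojson` is a hypothetical helper.

```python
import json


def polygon_to_geojson(exterior, interiors=()):
    def close(ring):
        # GeoJSON linear rings repeat the first position at the end
        ring = [[float(x), float(y)] for x, y in ring]
        if ring and ring[0] != ring[-1]:
            ring.append(list(ring[0]))
        return ring

    return {
        'type': 'Polygon',
        'coordinates': [close(exterior)] + [close(h) for h in interiors],
    }


geo = polygon_to_geojson(
    exterior=[(0, 0), (4, 0), (4, 4), (0, 4)],
    interiors=[[(1, 1), (1, 2), (2, 2), (2, 1)]],
)
text = json.dumps(geo)
```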
- to_wkt(self)[source]¶
Convert a kwimage.Polygon to WKT string
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> print(self.to_wkt())
- classmethod from_coco(cls, data, dims=None)[source]¶
Accepts either new-style or old-style coco polygons
- bounding_box(self)[source]¶
Returns an axis-aligned bounding box for the segmentation
- Returns
kwimage.Boxes
- bounding_box_polygon(self)[source]¶
Returns an axis-aligned bounding polygon for the segmentation.
Notes
This Polygon will be a Box, not a convex hull! Use shapely for convex hulls.
- Returns
kwimage.Polygon
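The distinction in the note can be sketched directly: an axis-aligned bounding box depends only on the coordinate extremes, so its four corners generally enclose more area than the convex hull. The helpers below are hypothetical and operate on a plain list of exterior points.

```python
def bounding_box_ltrb(exterior):
    xs = [x for x, _ in exterior]
    ys = [y for _, y in exterior]
    return (min(xs), min(ys), max(xs), max(ys))


def bounding_box_corners(exterior):
    # Four corners of the axis-aligned box, as a polygon ring
    x1, y1, x2, y2 = bounding_box_ltrb(exterior)
    return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]


triangle = [(0, 0), (4, 1), (2, 3)]
ltrb = bounding_box_ltrb(triangle)
corners = bounding_box_corners(triangle)
```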
- clip(self, x_min, y_min, x_max, y_max, inplace=False)[source]¶
Clip polygon to image boundaries.
Example
>>> from kwimage.structs.polygon import * >>> self = Polygon.random().scale(10).translate(-1) >>> self2 = self.clip(1, 1, 3, 3) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self2.draw(setlim=True)
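One classic way to clip a polygon to a rectangle is the Sutherland-Hodgman algorithm, sketched below for a single ring. This is an illustration of the operation, not necessarily the method kwimage uses; `clip_polygon` is a hypothetical helper, it ignores holes, and for concave inputs it can produce degenerate overlapping edges.

```python
def clip_polygon(points, x_min, y_min, x_max, y_max):
    def clip_edge(pts, inside, intersect):
        # Keep the part of the ring on the inside of one boundary line
        out = []
        for i in range(len(pts)):
            cur, prev = pts[i], pts[i - 1]
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))
        return out

    def x_cut(x_edge):
        def cut(p, q):
            t = (x_edge - p[0]) / (q[0] - p[0])
            return (x_edge, p[1] + t * (q[1] - p[1]))
        return cut

    def y_cut(y_edge):
        def cut(p, q):
            t = (y_edge - p[1]) / (q[1] - p[1])
            return (p[0] + t * (q[0] - p[0]), y_edge)
        return cut

    pts = list(points)
    boundaries = [
        (lambda p: p[0] >= x_min, x_cut(x_min)),
        (lambda p: p[0] <= x_max, x_cut(x_max)),
        (lambda p: p[1] >= y_min, y_cut(y_min)),
        (lambda p: p[1] <= y_max, y_cut(y_max)),
    ]
    for inside, intersect in boundaries:
        if not pts:
            break  # polygon lies entirely outside the rectangle
        pts = clip_edge(pts, inside, intersect)
    return pts


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
clipped = clip_polygon(square, 1, 1, 3, 3)
```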
- draw_on(self, image, color='blue', fill=True, border=False, alpha=1.0, copy=False)[source]¶
Rasterizes a polygon on an image. See draw for a vectorized matplotlib version.
- Parameters
image (ndarray) – image to raster polygon on.
color (str | tuple) – data coercible to a color
fill (bool, default=True) – fill the interior of the polygon
border (bool, default=False) – draw the border of the polygon
alpha (float, default=1.0) – polygon transparency (setting alpha < 1 makes this function much slower).
copy (bool, default=False) – if False only copies if necessary
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1).scale(128) >>> image = np.zeros((128, 128), dtype=np.float32) >>> image = self.draw_on(image) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(image, fnum=1)
Example
>>> import kwimage >>> import numpy as np >>> color = 'blue' >>> self = kwimage.Polygon.random(n_holes=1).scale(128) >>> image = np.zeros((128, 128), dtype=np.float32) >>> # Test drawing on all channel + dtype combinations >>> im3 = np.random.rand(128, 128, 3) >>> im_chans = { >>> 'im3': im3, >>> 'im1': kwimage.convert_colorspace(im3, 'rgb', 'gray'), >>> 'im4': kwimage.convert_colorspace(im3, 'rgb', 'rgba'), >>> } >>> inputs = {} >>> for k, im in im_chans.items(): >>> inputs[k + '_01'] = (kwimage.ensure_float01(im.copy()), {'alpha': None}) >>> inputs[k + '_255'] = (kwimage.ensure_uint255(im.copy()), {'alpha': None}) >>> inputs[k + '_01_a'] = (kwimage.ensure_float01(im.copy()), {'alpha': 0.5}) >>> inputs[k + '_255_a'] = (kwimage.ensure_uint255(im.copy()), {'alpha': 0.5}) >>> outputs = {} >>> for k, v in inputs.items(): >>> im, kw = v >>> outputs[k] = self.draw_on(im, color=color, **kw) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=2, doclf=True) >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nCols=2, nRows=len(inputs)) >>> for k in inputs.keys(): >>> kwplot.imshow(inputs[k][0], fnum=2, pnum=pnum_(), title=k) >>> kwplot.imshow(outputs[k], fnum=2, pnum=pnum_(), title=k) >>> kwplot.show_if_requested()
- draw(self, color='blue', ax=None, alpha=1.0, radius=1, setlim=False, border=False, linewidth=2)[source]¶
Draws polygon in a matplotlib axes. See draw_on for in-memory image modification.
- Parameters
color (str | Tuple) – coercible color
alpha (float) – fill transparency
setlim (bool) – if True, modify the x and y limits of the matplotlib axes such that the polygon can be seen.
border (bool, default=False) – if True, draws an edge border on the polygon.
linewidth (float) – width of the border
Todo
[ ] Rework arguments in favor of matplotlib standards
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1) >>> self = self.scale(100) >>> # xdoc: +REQUIRES(--show) >>> self.draw() >>> import kwplot >>> kwplot.autompl() >>> from matplotlib import pyplot as plt >>> kwplot.figure(fnum=2) >>> self.draw(setlim=True)
- _ensure_vertex_order(self, inplace=False)[source]¶
Fixes vertex ordering so the exterior ring is CCW and the interior rings are CW.
Example
>>> import kwimage >>> self = kwimage.Polygon.random(n=3, n_holes=2, rng=0) >>> print('self = {!r}'.format(self)) >>> new = self._ensure_vertex_order() >>> print('new = {!r}'.format(new))
>>> self = kwimage.Polygon.random(n=3, n_holes=2, rng=0).swap_axes() >>> print('self = {!r}'.format(self)) >>> new = self._ensure_vertex_order() >>> print('new = {!r}'.format(new))
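Vertex order is commonly checked with the shoelace formula: the sign of the signed area tells whether a ring winds counterclockwise or clockwise. The sketch below uses the mathematical y-up convention, in which positive area means counterclockwise; in image coordinates with y pointing down the visual sense is flipped, and which convention kwimage adopts internally is not stated here. The helper names are hypothetical.

```python
def signed_area(verts):
    # Shoelace formula; positive for counterclockwise rings in a y-up frame
    total = 0.0
    n = len(verts)
    for i in range(n):
        x1, y1 = verts[i]
        x2, y2 = verts[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ensure_vertex_order(exterior, interiors):
    # Exterior ring counterclockwise, interior rings clockwise
    if signed_area(exterior) < 0:
        exterior = exterior[::-1]
    interiors = [h[::-1] if signed_area(h) > 0 else h for h in interiors]
    return exterior, interiors


ccw_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
cw_square = ccw_square[::-1]
hole = [(0.25, 0.25), (0.75, 0.25), (0.75, 0.75), (0.25, 0.75)]
fixed_exterior, fixed_holes = ensure_vertex_order(cw_square, [hole])
```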
- kwimage.structs.polygon._is_clockwise(verts)[source]¶
References
- Ignore:
verts = poly.data['exterior'].data[::-1]
- kwimage.structs.polygon._order_vertices(verts)[source]¶
References
- Ignore:
verts = poly.data['exterior'].data[::-1]
- class kwimage.structs.polygon.MultiPolygon[source]¶
Bases:
kwimage.structs._generic.ObjectList
Data structure for storing multiple polygons (typically related to the same underlying but potentially disjoint object)
- Variables
data (List[Polygon]) –
- classmethod random(self, n=3, n_holes=0, rng=None, tight=False)[source]¶
Create a random MultiPolygon
- Returns
MultiPolygon
- fill(self, image, value=1)[source]¶
Fill in an image in place based on this multi-polygon.
- Parameters
image (ndarray) – image to draw on (inplace)
value (int | Tuple[int], default=1) – value to fill in with
- Returns
the image that has been modified in place
- Return type
ndarray
- bounding_box(self)[source]¶
Return the bounding box of the multi polygon
- Returns
- a Boxes object with one box that encloses all
polygons
- Return type
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(rng=0, n=10) >>> boxes = self.to_boxes() >>> sub_boxes = [d.to_boxes() for d in self.data] >>> areas1 = np.array([s.intersection(boxes).area[0] for s in sub_boxes]) >>> areas2 = np.array([s.area[0] for s in sub_boxes]) >>> assert np.allclose(areas1, areas2)
- to_mask(self, dims=None)[source]¶
Returns a mask object indicating the regions occupied by this multipolygon
Example
>>> from kwimage.structs.polygon import * # NOQA >>> s = 100 >>> self = MultiPolygon.random(rng=0).scale(s) >>> dims = (s, s) >>> mask = self.to_mask(dims)
>>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> from matplotlib import pyplot as plt >>> ax = plt.gca() >>> ax.set_xlim(0, s) >>> ax.set_ylim(0, s) >>> self.draw(color='red', alpha=.4) >>> mask.draw(color='blue', alpha=.4)
- to_relative_mask(self)[source]¶
Returns a translated mask such that the mask dimensions are minimal.
In other words, we move the polygon all the way to the top-left and return a mask just big enough to fit the polygon.
- Returns
Mask
- classmethod coerce(cls, data, dims=None)[source]¶
Attempts to construct a MultiPolygon instance from the input data
See Mask.coerce
- to_shapely(self)[source]¶
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(module:shapely) >>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(rng=0) >>> geom = self.to_shapely() >>> print('geom = {!r}'.format(geom))
- classmethod from_shapely(MultiPolygon, geom)[source]¶
Convert a shapely polygon or multipolygon to a kwimage.MultiPolygon
- classmethod from_geojson(MultiPolygon, data_geojson)[source]¶
Convert a geojson polygon or multipolygon to a kwimage.MultiPolygon
Example
>>> import kwimage >>> orig = kwimage.MultiPolygon.random() >>> data_geojson = orig.to_geojson() >>> self = kwimage.MultiPolygon.from_geojson(data_geojson)
- classmethod from_coco(cls, data, dims=None)[source]¶
Accepts either new-style or old-style coco multi-polygons
- class kwimage.structs.polygon.PolygonList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Stores and allows manipulation of multiple polygons, usually within the same image.
- to_geojson(self, as_collection=False)[source]¶
Converts a list of polygons/multipolygons to a geojson structure
- Parameters
as_collection (bool) – if True, wraps the polygon geojson items in a geojson feature collection, otherwise just return a list of items.
- Returns
items or geojson data
- Return type
List[Dict] | Dict
Example
>>> import kwimage >>> import ubelt as ub >>> data = [kwimage.Polygon.random(), >>> kwimage.Polygon.random(n_holes=1), >>> kwimage.MultiPolygon.random(n_holes=1), >>> kwimage.MultiPolygon.random()] >>> self = kwimage.PolygonList(data) >>> geojson = self.to_geojson(as_collection=True) >>> items = self.to_geojson(as_collection=False) >>> print('geojson = {}'.format(ub.repr2(geojson, nl=-2, precision=1))) >>> print('items = {}'.format(ub.repr2(items, nl=-2, precision=1)))
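The difference between the two return shapes can be illustrated with plain dictionaries. This is a sketch of the standard GeoJSON FeatureCollection layout, not a dump of what kwimage emits; the helper wrap_feature_collection and the empty properties dict are assumptions made for illustration.

```python
def wrap_feature_collection(geometries):
    """Hypothetical helper: wrap GeoJSON geometry dicts in a FeatureCollection."""
    features = [
        # Each geometry becomes one Feature; properties are left empty here.
        {'type': 'Feature', 'geometry': geom, 'properties': {}}
        for geom in geometries
    ]
    return {'type': 'FeatureCollection', 'features': features}


# A plain list of geometry items (roughly the as_collection=False case)
items = [
    {'type': 'Polygon',
     'coordinates': [[[0, 0], [4, 0], [4, 4], [0, 4], [0, 0]]]},
    {'type': 'MultiPolygon',
     'coordinates': [[[[5, 5], [6, 5], [6, 6], [5, 5]]]]},
]
# The same items wrapped in one dictionary (roughly as_collection=True)
collection = wrap_feature_collection(items)
print(collection['type'], len(collection['features']))
```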
kwimage.structs.segmentation
¶Generic segmentation object that can use either a Mask or (Multi)Polygon backend.
Inherit from this class and define |
|
Either holds a MultiPolygon, Polygon, or Mask |
|
Store and manipulate multiple segmentations (masks or polygons), usually |
|
Attempts to auto-inspect the format of segmentation data |
- class kwimage.structs.segmentation._WrapperObject[source]¶
Bases:
ubelt.NiceRepr
Inherit from this class and define __nice__ to “nicely” print your objects.
Defines __str__ and __repr__ in terms of the __nice__ function. Classes that inherit from NiceRepr should redefine __nice__. If the inheriting class has a __len__ method, then the default __nice__ method will return its length.
Example
>>> import ubelt as ub >>> class Foo(ub.NiceRepr): ... def __nice__(self): ... return 'info' >>> foo = Foo() >>> assert str(foo) == '<Foo(info)>' >>> assert repr(foo).startswith('<Foo(info) at ')
Example
>>> import ubelt as ub >>> class Bar(ub.NiceRepr): ... pass >>> bar = Bar() >>> import pytest >>> with pytest.warns(RuntimeWarning) as record: >>> assert 'object at' in str(bar) >>> assert 'object at' in repr(bar)
Example
>>> import ubelt as ub >>> class Baz(ub.NiceRepr): ... def __len__(self): ... return 5 >>> baz = Baz() >>> assert str(baz) == '<Baz(5)>'
Example
>>> import ubelt as ub >>> # If your nice message has a bug, it shouldn't bring down the house >>> class Foo(ub.NiceRepr): ... def __nice__(self): ... assert False >>> foo = Foo() >>> import pytest >>> with pytest.warns(RuntimeWarning) as record: >>> print('foo = {!r}'.format(foo)) foo = <...Foo ...>
- class kwimage.structs.segmentation.Segmentation(data, format=None)[source]¶
Bases:
_WrapperObject
Either holds a MultiPolygon, Polygon, or Mask
- Parameters
data (object) – the underlying object
format (str) – either ‘mask’, ‘polygon’, or ‘multipolygon’
- class kwimage.structs.segmentation.SegmentationList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Store and manipulate multiple segmentations (masks or polygons), usually within the same image
- kwimage.structs.segmentation._coerce_coco_segmentation(data, dims=None)[source]¶
Attempts to auto-inspect the format of segmentation data
- Parameters
data – the data to coerce. Recognized inputs map to formats as follows:
2D-C-ndarray -> C_MASK
2D-F-ndarray -> F_MASK
Dict(counts=bytes) -> BYTES_RLE
Dict(counts=ndarray) -> ARRAY_RLE
Dict(exterior=ndarray) -> ARRAY_RLE
# List[List[int]] -> Polygon
List[int] -> Polygon
List[Dict] -> MultiPolygon
dims (Tuple) – height / width of the source image; required for certain formats like polygons
- Returns
Mask | Polygon | MultiPolygon - depending on which is appropriate
Example
>>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'} >>> dims = (9, 5) >>> raw_mask = (np.random.rand(32, 32) > .5).astype(np.uint8) >>> _coerce_coco_segmentation(segmentation) >>> _coerce_coco_segmentation(raw_mask)
>>> coco_polygon = [ >>> np.array([[3, 0],[2, 1],[2, 4],[4, 4],[4, 3],[7, 0]]), >>> np.array([[2, 1],[2, 2],[4, 2],[4, 1]]), >>> ] >>> self = _coerce_coco_segmentation(coco_polygon, dims) >>> print('self = {!r}'.format(self)) >>> coco_polygon = [ >>> np.array([[3, 0],[2, 1],[2, 4],[4, 4],[4, 3],[7, 0]]), >>> ] >>> self = _coerce_coco_segmentation(coco_polygon, dims) >>> print('self = {!r}'.format(self))
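The auto-inspection described above amounts to dispatching on the type and keys of the input. The sketch below covers only the dict-with-counts and flat-list cases from the parameter description; guess_segmentation_format is a hypothetical name, treating a str the same as bytes is an assumption, and ndarray inputs are deliberately left out so the snippet has no dependencies.

```python
def guess_segmentation_format(data):
    """Hypothetical dispatcher: return a format label for a few input shapes."""
    if isinstance(data, dict) and 'counts' in data:
        # Assumption: compressed RLE counts arrive as bytes or str,
        # uncompressed RLE counts arrive as a sequence of integers.
        if isinstance(data['counts'], (bytes, str)):
            return 'BYTES_RLE'
        return 'ARRAY_RLE'
    if isinstance(data, list) and len(data) > 0:
        if all(isinstance(item, dict) for item in data):
            return 'MultiPolygon'
        if all(isinstance(item, (int, float)) for item in data):
            return 'Polygon'  # flat [x1, y1, x2, y2, ...]
    raise TypeError('unrecognized segmentation data: {!r}'.format(type(data)))


print(guess_segmentation_format({'size': [5, 9], 'counts': ';?1B10O30O4'}))
print(guess_segmentation_format([0, 0, 4, 0, 4, 4]))
```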
Package Contents¶
Converts boxes between different formats as long as the last dimension |
|
A data structure to store n-dimensional coordinate geometry. |
|
Container for holding and manipulating multiple detections. |
|
Keeps track of a downscaled heatmap and how to transform it to overlay the |
|
Manages a single segmentation mask and can convert to and from |
|
Store and manipulate multiple masks, usually within the same image |
|
Stores multiple keypoints for a single object. |
|
Stores a list of Points, each item usually corresponds to a different object. |
|
Data structure for storing multiple polygons (typically related to the same |
|
Represents a single polygon as set of exterior boundary points and a list |
|
Stores and allows manipulation of multiple polygons, usually within the |
|
Either holds a MultiPolygon, Polygon, or Mask |
|
Store and manipulate multiple segmentations (masks or polygons), usually |
|
Smooths the probability map, but preserves the magnitude of the peaks. |
- class kwimage.structs.Boxes(data, format=None, check=True)[source]¶
Bases:
_BoxConversionMixins
,_BoxPropertyMixins
,_BoxTransformMixins
,_BoxDrawMixins
,ubelt.NiceRepr
Converts boxes between different formats as long as the last dimension contains 4 coordinates and the format is specified.
This is a convenience class, and should not store the data for very long. The general idiom should be create class, convert data, and then get the raw data and let the class be garbage collected. This will help ensure that your code is portable and understandable if this class is not available.
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> import kwimage >>> import numpy as np >>> # Given an array / tensor that represents one or more boxes >>> data = np.array([[ 0, 0, 10, 10], >>> [ 5, 5, 50, 50], >>> [20, 0, 30, 10]]) >>> # The kwimage.Boxes data structure is a thin fast wrapper >>> # that provides methods for operating on the boxes. >>> # It requires that the user explicitly provide a code that denotes >>> # the format of the boxes (i.e. what each column represents) >>> boxes = kwimage.Boxes(data, 'ltrb') >>> # This means that there is no ambiguity about box format >>> # The representation string of the Boxes object demonstrates this >>> print('boxes = {!r}'.format(boxes)) boxes = <Boxes(ltrb, array([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]]))> >>> # if you pass this data around. You can convert to other formats >>> # For docs on available format codes see :class:`BoxFormat`. >>> # In this example we will convert (left, top, right, bottom) >>> # to (left-x, top-y, width, height). >>> boxes.toformat('xywh') <Boxes(xywh, array([[ 0, 0, 10, 10], [ 5, 5, 45, 45], [20, 0, 10, 10]]))> >>> # In addition to format conversion there are other operations >>> # We can quickly (using a C-backend) find IoUs >>> ious = boxes.ious(boxes) >>> print('{}'.format(ub.repr2(ious, nl=1, precision=2, with_dtype=False))) np.array([[1. , 0.01, 0. ], [0.01, 1. , 0.02], [0. , 0.02, 1. ]]) >>> # We can ask for the area of each box >>> print('boxes.area = {}'.format(ub.repr2(boxes.area, nl=0, with_dtype=False))) boxes.area = np.array([[ 100],[2025],[ 100]]) >>> # We can ask for the center of each box >>> print('boxes.center = {}'.format(ub.repr2(boxes.center, nl=1, with_dtype=False))) boxes.center = ( np.array([[ 5. ],[27.5],[25. ]]), np.array([[ 5. ],[27.5],[ 5. 
]]), ) >>> # We can translate / scale the boxes >>> boxes.translate((10, 10)).scale(100) <Boxes(ltrb, array([[1000., 1000., 2000., 2000.], [1500., 1500., 6000., 6000.], [3000., 1000., 4000., 2000.]]))> >>> # We can clip the bounding boxes >>> boxes.translate((10, 10)).scale(100).clip(1200, 1200, 1700, 1800) <Boxes(ltrb, array([[1200., 1200., 1700., 1800.], [1500., 1500., 1700., 1800.], [1700., 1200., 1700., 1800.]]))> >>> # We can perform arbitrary warping of the boxes >>> # (note that if the transform is not axis aligned, the axis aligned >>> # bounding box of the transform result will be returned) >>> transform = np.array([[-0.83907153, 0.54402111, 0. ], >>> [-0.54402111, -0.83907153, 0. ], >>> [ 0. , 0. , 1. ]]) >>> boxes.warp(transform) <Boxes(ltrb, array([[ -8.3907153 , -13.8309264 , 5.4402111 , 0. ], [-39.23347095, -69.154632 , 23.00569785, -6.9154632 ], [-25.1721459 , -24.7113486 , -11.3412195 , -10.8804222 ]]))> >>> # Note, that we can transform the box to a Polygon for more >>> # accurate warping. >>> transform = np.array([[-0.83907153, 0.54402111, 0. ], >>> [-0.54402111, -0.83907153, 0. ], >>> [ 0. , 0. , 1. ]]) >>> warped_polys = boxes.to_polygons().warp(transform) >>> print(ub.repr2(warped_polys.data, sv=1)) [ <Polygon({ 'exterior': <Coords(data= array([[ 0. , 0. ], [ 5.4402111, -8.3907153], [ -2.9505042, -13.8309264], [ -8.3907153, -5.4402111], [ 0. , 0. ]]))>, 'interiors': [], })>, <Polygon({ 'exterior': <Coords(data= array([[ -1.4752521 , -6.9154632 ], [ 23.00569785, -44.67368205], [-14.752521 , -69.154632 ], [-39.23347095, -31.39641315], [ -1.4752521 , -6.9154632 ]]))>, 'interiors': [], })>, <Polygon({ 'exterior': <Coords(data= array([[-16.7814306, -10.8804222], [-11.3412195, -19.2711375], [-19.7319348, -24.7113486], [-25.1721459, -16.3206333], [-16.7814306, -10.8804222]]))>, 'interiors': [], })>, ] >>> # The kwimage.Boxes data structure is also convertable to >>> # several alternative data structures, like shapely, coco, and imgaug. 
>>> print(ub.repr2(boxes.to_shapely(), sv=1)) [ POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0)), POLYGON ((5 5, 5 50, 50 50, 50 5, 5 5)), POLYGON ((20 0, 20 10, 30 10, 30 0, 20 0)), ] >>> # xdoctest: +REQUIRES(module:imgaug) >>> print(ub.repr2(boxes[0:1].to_imgaug(shape=(100, 100)), sv=1)) BoundingBoxesOnImage([BoundingBox(x1=0.0000, y1=0.0000, x2=10.0000, y2=10.0000, label=None)], shape=(100, 100)) >>> # xdoctest: -REQUIRES(module:imgaug) >>> print(ub.repr2(list(boxes.to_coco()), sv=1)) [ [0, 0, 10, 10], [5, 5, 45, 45], [20, 0, 10, 10], ] >>> # Finally, when you are done with your boxes object, you can >>> # unwrap the raw data by using the ``.data`` attribute >>> # all operations are done on this data, which gives the >>> # kwimage.Boxes data structure almost no overhead when >>> # inserted into existing code. >>> print('boxes.data =\n{}'.format(ub.repr2(boxes.data, nl=1))) boxes.data = np.array([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]], dtype=np.int64) >>> # xdoctest: +REQUIRES(module:torch) >>> # This data structure was designed for use with both torch >>> # and numpy, the underlying data can be either an array or tensor. >>> boxes.tensor() <Boxes(ltrb, tensor([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]]))> >>> boxes.numpy() <Boxes(ltrb, array([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]]))>
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> # Demo of conversion methods >>> import kwimage >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh') <Boxes(xywh, array([[25, 30, 15, 10]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_xywh() <Boxes(xywh, array([[25, 30, 15, 10]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_cxywh() <Boxes(cxywh, array([[32.5, 35. , 15. , 10. ]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_ltrb() <Boxes(ltrb, array([[25, 30, 40, 40]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').scale(2).to_ltrb() <Boxes(ltrb, array([[50., 60., 80., 80.]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> kwimage.Boxes(torch.FloatTensor([[25, 30, 15, 20]]), 'xywh').scale(.1).to_ltrb() <Boxes(ltrb, tensor([[ 2.5000, 3.0000, 4.0000, 5.0000]]))>
Notes
In the following examples we show cases where Boxes can hold a single 1-dimensional box array. This is a holdover from an older codebase, and some functions may assume that the input is at least 2-D. Thus when representing a single bounding box it is best practice to view it as a list of 1 box. While many functions will work in the 1-D case, not all functions have been tested and thus we cannot guarantee correctness.
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes([25, 30, 15, 10], 'xywh') <Boxes(xywh, array([25, 30, 15, 10]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_xywh() <Boxes(xywh, array([25, 30, 15, 10]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_cxywh() <Boxes(cxywh, array([32.5, 35. , 15. , 10. ]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_ltrb() <Boxes(ltrb, array([25, 30, 40, 40]))> >>> Boxes([25, 30, 15, 10], 'xywh').scale(2).to_ltrb() <Boxes(ltrb, array([50., 60., 80., 80.]))> >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes(torch.FloatTensor([[25, 30, 15, 20]]), 'xywh').scale(.1).to_ltrb() <Boxes(ltrb, tensor([[ 2.5000, 3.0000, 4.0000, 5.0000]]))>
Example
>>> datas = [ >>> [1, 2, 3, 4], >>> [[1, 2, 3, 4], [4, 5, 6, 7]], >>> [[[1, 2, 3, 4], [4, 5, 6, 7]]], >>> ] >>> formats = BoxFormat.cannonical >>> for format1 in formats: >>> for data in datas: >>> self = box1 = Boxes(data, format1) >>> for format2 in formats: >>> box2 = box1.toformat(format2) >>> back = box2.toformat(format1) >>> assert box1 == back
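The arithmetic behind the format codes used in the examples above is small enough to write out by hand. The following dependency-free sketch uses hypothetical helper names and covers only the xywh, ltrb, and cxywh codes; it is meant to show what each column means, not how kwimage implements the conversions.

```python
def xywh_to_ltrb(box):
    # (left-x, top-y, width, height) -> (left, top, right, bottom)
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xywh_to_cxywh(box):
    # (left-x, top-y, width, height) -> (center-x, center-y, width, height)
    x, y, w, h = box
    return [x + w / 2, y + h / 2, w, h]


def ltrb_to_xywh(box):
    # (left, top, right, bottom) -> (left-x, top-y, width, height)
    l, t, r, b = box
    return [l, t, r - l, b - t]


box = [25, 30, 15, 10]
ltrb = xywh_to_ltrb(box)
cxywh = xywh_to_cxywh(box)
print(ltrb)   # [25, 30, 40, 40]
print(cxywh)  # [32.5, 35.0, 15, 10]
```

The printed values agree with the to_ltrb and to_cxywh outputs shown in the conversion examples, and converting back with ltrb_to_xywh recovers the original box.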
- __getitem__(self, index)¶
- __eq__(self, other)¶
Tests equality of two Boxes objects
Example
>>> box0 = box1 = Boxes([[1, 2, 3, 4]], 'xywh') >>> box2 = Boxes(box0.data, 'ltrb') >>> box3 = Boxes([[0, 2, 3, 4]], box0.format) >>> box4 = Boxes(box0.data, box2.format) >>> assert box0 == box1 >>> assert not box0 == box2 >>> assert not box2 == box3 >>> assert box2 == box4
- __len__(self)¶
- __nice__(self)¶
- __repr__(self)¶
Return repr(self).
- classmethod random(Boxes, num=1, scale=1.0, format=BoxFormat.XYWH, anchors=None, anchor_std=1.0 / 6, tensor=False, rng=None)¶
Makes random boxes; typically for testing purposes
- Parameters
num (int) – number of boxes to generate
scale (float | Tuple[float, float]) – size of imgdims
format (str) – format of boxes to be created (e.g. ltrb, xywh)
anchors (ndarray) – normalized width / heights of anchor boxes to perturb and randomly place. (must be in range 0-1)
anchor_std (float) – magnitude of noise applied to anchor shapes
tensor (bool) – if True, returns boxes in tensor format
rng (None | int | RandomState) – initial random seed
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes.random(3, rng=0, scale=100) <Boxes(xywh, array([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes.random(3, rng=0, scale=100).tensor() <Boxes(xywh, tensor([[ 54, 54, 6, 17], [ 42, 64, 1, 25], [ 79, 38, 17, 14]]))> >>> anchors = np.array([[.5, .5], [.3, .3]]) >>> Boxes.random(3, rng=0, scale=100, anchors=anchors) <Boxes(xywh, array([[ 2, 13, 51, 51], [32, 51, 32, 36], [36, 28, 23, 26]]))>
Example
>>> # Boxes position/shape within 0-1 space should be uniform. >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> fig = kwplot.figure(fnum=1, doclf=True) >>> fig.gca().set_xlim(0, 128) >>> fig.gca().set_ylim(0, 128) >>> import kwimage >>> kwimage.Boxes.random(num=10).scale(128).draw()
- copy(self)¶
- classmethod concatenate(cls, boxes, axis=0)¶
Concatenates multiple boxes together
- Parameters
boxes (Sequence[Boxes]) – list of boxes to concatenate
axis (int, default=0) – axis to stack on
- Returns
stacked boxes
- Return type
Example
>>> boxes = [Boxes.random(3) for _ in range(3)] >>> new = Boxes.concatenate(boxes) >>> assert len(new) == 9 >>> assert np.all(new.data[3:6] == boxes[1].data)
Example
>>> boxes = [Boxes.random(3) for _ in range(3)] >>> boxes[0].data = boxes[0].data[0] >>> boxes[1].data = boxes[0].data[0:0] >>> new = Boxes.concatenate(boxes) >>> assert len(new) == 4 >>> # xdoctest: +REQUIRES(module:torch) >>> new = Boxes.concatenate([b.tensor() for b in boxes]) >>> assert len(new) == 4
- compress(self, flags, axis=0, inplace=False)¶
Filters boxes based on a boolean criterion
- Parameters
flags (ArrayLike[bool]) – true for items to be kept
axis (int) – you usually want this to be 0
inplace (bool) – if True, modifies this object
Example
>>> self = Boxes([[25, 30, 15, 10]], 'ltrb') >>> self.compress([True]) <Boxes(ltrb, array([[25, 30, 15, 10]]))> >>> self.compress([False]) <Boxes(ltrb, array([], shape=(0, 4), dtype=int64))>
- take(self, idxs, axis=0, inplace=False)¶
Takes a subset of items at specific indices
- Parameters
idxs (ArrayLike[int]) – indexes of items to take
axis (int) – you usually want this to be 0
inplace (bool) – if True, modifies this object
Example
>>> self = Boxes([[25, 30, 15, 10]], 'ltrb') >>> self.take([0]) <Boxes(ltrb, array([[25, 30, 15, 10]]))> >>> self.take([]) <Boxes(ltrb, array([], shape=(0, 4), dtype=int64))>
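compress and take are two ways of selecting rows: one by boolean flag, one by integer index. A minimal sketch on plain lists, with hypothetical function names that mirror the method names, shows the distinction.

```python
rows = [[0, 0, 10, 10], [5, 5, 50, 50], [20, 0, 30, 10]]


def compress(rows, flags):
    # Keep the rows whose flag is true; flags must align with rows.
    return [row for row, flag in zip(rows, flags) if flag]


def take(rows, idxs):
    # Pick rows by position; order and repetition follow idxs.
    return [rows[i] for i in idxs]


kept = compress(rows, [True, False, True])
picked = take(rows, [2, 0, 0])
print(kept)
print(picked)
```

Note that take can reorder and duplicate rows, while compress always preserves the original order.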
- is_tensor(self)¶
is the backend fueled by torch?
- is_numpy(self)¶
is the backend fueled by numpy?
- _impl(self)¶
returns the kwarray.ArrayAPI implementation for the data
Example
>>> assert Boxes.random().numpy()._impl.is_numpy >>> # xdoctest: +REQUIRES(module:torch) >>> assert Boxes.random().tensor()._impl.is_tensor
- property device(self)¶
If the backend is torch returns the data device, otherwise None
- astype(self, dtype)¶
Changes the type of the internal array used to represent the boxes
Notes
this operation is not inplace
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes.random(3, 100, rng=0).tensor().astype('int32') <Boxes(xywh, tensor([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]], dtype=torch.int32))> >>> Boxes.random(3, 100, rng=0).numpy().astype('int32') <Boxes(xywh, array([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]], dtype=int32))> >>> Boxes.random(3, 100, rng=0).tensor().astype('float32') >>> Boxes.random(3, 100, rng=0).numpy().astype('float32')
- round(self, inplace=False)¶
Rounds data coordinates to the nearest integer.
This operation is applied directly to the box coordinates, so its output will depend on the format the boxes are stored in.
- Parameters
inplace (bool, default=False) – if True, modifies this object
- SeeAlso:
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0).scale(10) >>> new = self.round() >>> print('self = {!r}'.format(self)) >>> print('new = {!r}'.format(new)) self = <Boxes(xywh, array([[5.48813522, 5.44883192, 0.53949833, 1.70306146], [4.23654795, 6.4589411 , 0.13932407, 2.45878875], [7.91725039, 3.83441508, 1.71937704, 1.45453393]]))> new = <Boxes(xywh, array([[5., 5., 1., 2.], [4., 6., 0., 2.], [8., 4., 2., 1.]]))>
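To see why the result depends on the storage format, compare rounding the xywh values directly with rounding the corners and converting back. This is a plain-Python sketch with hypothetical helper names; the input is the second box from the example above.

```python
def round_xywh(box):
    # Round each stored value directly.
    return [round(v) for v in box]


def round_via_ltrb(box):
    # Convert to corners, round the corners, then convert back to xywh.
    x, y, w, h = box
    l, t, r, b = (round(v) for v in (x, y, x + w, y + h))
    return [l, t, r - l, b - t]


box = [4.23654795, 6.4589411, 0.13932407, 2.45878875]
direct = round_xywh(box)
via_corners = round_via_ltrb(box)
print(direct)       # [4, 6, 0, 2]
print(via_corners)  # [4, 6, 0, 3]
```

The height differs because rounding the bottom edge (8.92 to 9) is not the same as rounding the height (2.46 to 2).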
- quantize(self, inplace=False, dtype=np.int32)¶
Converts the box to integer coordinates.
This operation takes the floor of the left side and the ceil of the right side. Thus the area of the box will never decrease.
- Parameters
inplace (bool, default=False) – if True, modifies this object
dtype (type) – type to cast as
- SeeAlso:
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0).scale(10) >>> new = self.quantize() >>> print('self = {!r}'.format(self)) >>> print('new = {!r}'.format(new)) self = <Boxes(xywh, array([[5.48813522, 5.44883192, 0.53949833, 1.70306146], [4.23654795, 6.4589411 , 0.13932407, 2.45878875], [7.91725039, 3.83441508, 1.71937704, 1.45453393]]))> new = <Boxes(xywh, array([[5, 5, 2, 3], [4, 6, 1, 3], [7, 3, 3, 3]], dtype=int32))>
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0) >>> orig = self.copy() >>> self.quantize(inplace=True) >>> assert np.any(self.data != orig.data)
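The floor/ceil rule can be reproduced by hand. The sketch below assumes the rule is applied to the corners (floor for left and top, ceil for right and bottom) and that the result is converted back to xywh; quantize_xywh is a hypothetical helper, not the library function.

```python
import math


def quantize_xywh(box):
    x, y, w, h = box
    # Floor the top-left corner and ceil the bottom-right corner, so the
    # integer box always covers the original one.
    l, t = math.floor(x), math.floor(y)
    r, b = math.ceil(x + w), math.ceil(y + h)
    return [l, t, r - l, b - t]


boxes = [
    [5.48813522, 5.44883192, 0.53949833, 1.70306146],
    [4.23654795, 6.4589411, 0.13932407, 2.45878875],
    [7.91725039, 3.83441508, 1.71937704, 1.45453393],
]
quantized = [quantize_xywh(box) for box in boxes]
print(quantized)  # [[5, 5, 2, 3], [4, 6, 1, 3], [7, 3, 3, 3]]
```

Under these assumptions the output matches the quantized boxes printed in the first example above.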
- numpy(self)¶
Converts tensors to numpy. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(3).tensor() >>> newself = self.numpy() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- tensor(self, device=ub.NoParam)¶
Converts numpy to tensors. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(3) >>> # xdoctest: +REQUIRES(module:torch) >>> newself = self.tensor() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- ious(self, other, bias=0, impl='auto', mode=None)¶
Intersection over union.
Compute IOUs (intersection area over union area) between these boxes and another set of boxes. This is a symmetric measure of similarity between boxes.
Todo
- [ ] Add pairwise flag to toggle between one-vs-one and all-vs-all
computation. I.E. Add option for componentwise calculation.
- Parameters
other (Boxes) – boxes to compare IoUs against
bias (int, default=0) – either 0 or 1, does TL=BR have area of 0 or 1?
impl (str, default=’auto’) – code to specify the implementation used to compute IoUs. Can be either torch, py, c, or auto. Efficiency and the exact result will vary by implementation, but they will always be close. Some implementations only accept certain data types (e.g. impl=’c’, only accepts float32 numpy arrays). See ~/code/kwimage/dev/bench_bbox.py for benchmark details. On my system the torch impl was fastest (when the data was on the GPU).
mode – deprecated, use impl
- SeeAlso:
iooas - for a measure of coverage between boxes
Examples
>>> import kwimage >>> self = kwimage.Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = kwimage.Boxes(np.array([6, 2, 20, 10]), 'ltrb') >>> overlaps = self.ious(other, bias=1).round(2) >>> assert np.all(np.isclose(overlaps, [0.21, 0.63, 0.04])), repr(overlaps)
Examples
>>> import kwimage >>> boxes1 = kwimage.Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = kwimage.Boxes(np.array([[6, 2, 20, 10], >>> [100, 200, 300, 300]]), 'ltrb') >>> overlaps = boxes1.ious(other) >>> print('{}'.format(ub.repr2(overlaps, precision=2, nl=1))) np.array([[0.18, 0. ], [0.61, 0. ], [0. , 0. ]]...)
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes(np.empty(0), 'xywh').ious(Boxes(np.empty(4), 'xywh')).shape (0,) >>> #Boxes(np.empty(4), 'xywh').ious(Boxes(np.empty(0), 'xywh')).shape >>> Boxes(np.empty((0, 4)), 'xywh').ious(Boxes(np.empty((0, 4)), 'xywh')).shape (0, 0) >>> Boxes(np.empty((1, 4)), 'xywh').ious(Boxes(np.empty((0, 4)), 'xywh')).shape (1, 0) >>> Boxes(np.empty((0, 4)), 'xywh').ious(Boxes(np.empty((1, 4)), 'xywh')).shape (0, 1)
Examples
>>> # xdoctest: +REQUIRES(module:torch) >>> formats = BoxFormat.cannonical >>> istensors = [False, True] >>> results = {} >>> for format in formats: >>> for tensor in istensors: >>> boxes1 = Boxes.random(5, scale=10.0, rng=0, format=format, tensor=tensor) >>> boxes2 = Boxes.random(7, scale=10.0, rng=1, format=format, tensor=tensor) >>> ious = boxes1.ious(boxes2) >>> results[(format, tensor)] = ious >>> results = {k: v.numpy() if torch.is_tensor(v) else v for k, v in results.items() } >>> results = {k: v.tolist() for k, v in results.items()} >>> print(ub.repr2(results, sk=True, precision=3, nl=2)) >>> from functools import partial >>> assert ub.allsame(results.values(), partial(np.allclose, atol=1e-07))
- Ignore:
>>> # does this work with backprop? >>> # xdoctest: +REQUIRES(module:torch) >>> import torch >>> import kwimage >>> num = 1000 >>> true_boxes = kwimage.Boxes.random(num).tensor() >>> inputs = torch.rand(num, 10) >>> regress = torch.nn.Linear(10, 4) >>> energy = regress(inputs) >>> energy.retain_grad() >>> outputs = energy.sigmoid() >>> outputs.retain_grad() >>> out_boxes = kwimage.Boxes(outputs, 'cxywh') >>> ious = out_boxes.ious(true_boxes) >>> loss = ious.sum() >>> loss.backward()
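For reference, the IoU of two ltrb boxes and the effect of the bias parameter can be worked out in a few lines of plain Python. This is an illustrative sketch with hypothetical function names, not any of the kwimage backends; it assumes bias is added to both the width and the height.

```python
def box_area(box, bias=0):
    l, t, r, b = box
    return (r - l + bias) * (b - t + bias)


def isect_area(a, b, bias=0):
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    w = min(a[2], b[2]) - max(a[0], b[0]) + bias
    h = min(a[3], b[3]) - max(a[1], b[1]) + bias
    return max(w, 0) * max(h, 0)


def iou(a, b, bias=0):
    inter = isect_area(a, b, bias)
    union = box_area(a, bias) + box_area(b, bias) - inter
    return inter / union if union > 0 else 0.0


boxes = [[0, 0, 10, 10], [10, 0, 20, 10], [20, 0, 30, 10]]
other = [6, 2, 20, 10]
with_bias = [round(iou(box, other, bias=1), 2) for box in boxes]
no_bias = [round(iou(box, other, bias=0), 2) for box in boxes]
print(with_bias)  # [0.21, 0.63, 0.04]
print(no_bias)    # [0.18, 0.61, 0.0]
```

With bias=1 a box whose top-left equals its bottom-right has area 1, which is why the third box still has a small overlap with other; with bias=0 the two only share an edge and the IoU is zero.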
- iooas(self, other, bias=0)¶
Intersection over other area.
This is an asymmetric measure of coverage: how much of the “other” boxes is covered by these boxes. It is the area of intersection between each pair of boxes divided by the area of the “other” boxes.
- SeeAlso:
ious - for a measure of similarity between boxes
- Parameters
other (Boxes) – boxes to compare IoOA against
bias (int, default=0) – either 0 or 1, does TL=BR have area of 0 or 1?
Examples
>>> self = Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = Boxes(np.array([[6, 2, 20, 10], [0, 0, 0, 3]]), 'xywh') >>> coverage = self.iooas(other, bias=0).round(2) >>> print('coverage = {!r}'.format(coverage))
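The asymmetry is easiest to see with one box nested inside another. The sketch below is plain Python over ltrb boxes; iooa is a hypothetical helper name and the handling of zero-area boxes (returning 0.0) is an assumption.

```python
def iooa(box, other, bias=0):
    """Intersection area divided by the area of `other` (ltrb boxes)."""
    w = min(box[2], other[2]) - max(box[0], other[0]) + bias
    h = min(box[3], other[3]) - max(box[1], other[1]) + bias
    inter = max(w, 0) * max(h, 0)
    other_area = (other[2] - other[0] + bias) * (other[3] - other[1] + bias)
    # Assumed convention: a zero-area `other` box yields zero coverage.
    return inter / other_area if other_area > 0 else 0.0


big = [0, 0, 10, 10]
small = [2, 2, 4, 4]
print(iooa(big, small))  # 1.0, the small box is fully covered
print(iooa(small, big))  # 0.04, only 4% of the big box is covered
```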
- isect_area(self, other, bias=0)¶
Intersection part of intersection over union computation
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> self = Boxes.random(5, scale=10.0, rng=0, format='ltrb') >>> other = Boxes.random(3, scale=10.0, rng=1, format='ltrb') >>> isect = self.isect_area(other, bias=0) >>> ious_v1 = isect / ((self.area + other.area.T) - isect) >>> ious_v2 = self.ious(other, bias=0) >>> assert np.allclose(ious_v1, ious_v2)
- intersection(self, other)¶
Componentwise intersection between two sets of Boxes
Intersections of boxes are always boxes, so this works.
- Returns
intersected boxes
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> other = self.translate(1) >>> new = self.intersection(other) >>> new_area = np.nan_to_num(new.area).ravel() >>> alt_area = np.diag(self.isect_area(other)) >>> close = np.isclose(new_area, alt_area) >>> assert np.all(close)
- union_hull(self, other)¶
Componentwise hull union between two sets of Boxes
NOTE: convert to polygon to do a real union.
- Returns
unioned boxes
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> other = self.translate(1) >>> new = self.union_hull(other) >>> new_area = np.nan_to_num(new.area).ravel()
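The componentwise intersection and the hull union are both simple min/max operations on ltrb corners. The sketch below uses hypothetical helpers on single boxes; returning None for disjoint boxes is an assumed convention for illustration only.

```python
def intersection(a, b):
    l, t = max(a[0], b[0]), max(a[1], b[1])
    r, bottom = min(a[2], b[2]), min(a[3], b[3])
    if r < l or bottom < t:
        return None  # assumed convention: no overlap
    return [l, t, r, bottom]


def union_hull(a, b):
    # Smallest axis-aligned box containing both inputs.
    return [min(a[0], b[0]), min(a[1], b[1]),
            max(a[2], b[2]), max(a[3], b[3])]


def area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


a = [0, 0, 10, 10]
b = [5, 5, 15, 15]
isect = intersection(a, b)
hull = union_hull(a, b)
true_union_area = area(a) + area(b) - area(isect)
print(isect, hull)                  # [5, 5, 10, 10] [0, 0, 15, 15]
print(area(hull), true_union_area)  # 225 175
```

The hull area (225) exceeds the true union area (175), which is why a polygon conversion is needed for an exact union.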
- bounding_box(self)¶
Returns the box that bounds all of the contained boxes
- Returns
a single box
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> new = self.bounding_box() >>> print('new = {!r}'.format(new))
- contains(self, other)¶
Determine if points are completely contained by these boxes
- Parameters
other (Points) – points to test for containment. TODO: support generic data types
- Returns
- N x M boolean matrix indicating which box
contains which points, where N is the number of boxes and M is the number of points.
- Return type
flags (ArrayLike)
Examples
>>> import kwimage >>> self = kwimage.Boxes.random(10).scale(10).round() >>> other = kwimage.Points.random(10).scale(10).round() >>> flags = self.contains(other) >>> flags = self.contains(self.xy_center) >>> assert np.all(np.diag(flags))
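The shape of the returned matrix can be illustrated with nested lists: one row per box and one column per point. This sketch uses a hypothetical contains function on ltrb boxes and assumes points on the boundary count as contained.

```python
def contains(boxes, points):
    # flags[i][j] is True when box i contains point j (boundary inclusive,
    # which is an assumption of this sketch).
    return [
        [l <= x <= r and t <= y <= b for (x, y) in points]
        for (l, t, r, b) in boxes
    ]


boxes = [[0, 0, 10, 10], [20, 0, 30, 10]]
points = [(5, 5), (25, 5), (15, 5)]
flags = contains(boxes, points)
print(flags)  # [[True, False, False], [False, True, False]]
```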
- view(self, *shape)¶
Passthrough method to view or reshape
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(6, scale=10.0, rng=0, format='xywh').tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4] >>> self = Boxes.random(6, scale=10.0, rng=0, format='ltrb').tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4]
- class kwimage.structs.Coords(data=None, meta=None)[source]¶
Bases:
kwimage.structs._generic.Spatial
,ubelt.NiceRepr
A data structure to store n-dimensional coordinate geometry.
Currently it is up to the user to maintain what coordinate system this geometry belongs to.
Note
This class was designed to hold coordinates in r/c format, but in general this class is agnostic to dimension ordering as long as you are consistent. However, there are two places where this matters: (1) drawing and (2) gdal/imgaug-warping. In these places we will assume x/y for legacy reasons. This may change in the future.
The term axes with respect to Coords always refers to the final numpy axis. In other words the final numpy-axis represents ALL of the coordinate-axes.
- CommandLine:
xdoctest -m kwimage.structs.coords Coords
Example
>>> from kwimage.structs.coords import * # NOQA >>> import kwarray >>> rng = kwarray.ensure_rng(0) >>> self = Coords.random(num=4, dim=3, rng=rng) >>> print('self = {}'.format(self)) self = <Coords(data= array([[0.5488135 , 0.71518937, 0.60276338], [0.54488318, 0.4236548 , 0.64589411], [0.43758721, 0.891773 , 0.96366276], [0.38344152, 0.79172504, 0.52889492]]))> >>> matrix = rng.rand(4, 4) >>> self.warp(matrix) <Coords(data= array([[0.71037426, 1.25229659, 1.39498435], [0.60799503, 1.26483447, 1.42073131], [0.72106004, 1.39057144, 1.38757508], [0.68384299, 1.23914654, 1.29258196]]))> >>> self.translate(3, inplace=True) <Coords(data= array([[3.5488135 , 3.71518937, 3.60276338], [3.54488318, 3.4236548 , 3.64589411], [3.43758721, 3.891773 , 3.96366276], [3.38344152, 3.79172504, 3.52889492]]))> >>> self.translate(3, inplace=True) <Coords(data= array([[6.5488135 , 6.71518937, 6.60276338], [6.54488318, 6.4236548 , 6.64589411], [6.43758721, 6.891773 , 6.96366276], [6.38344152, 6.79172504, 6.52889492]]))> >>> self.scale(2) <Coords(data= array([[13.09762701, 13.43037873, 13.20552675], [13.08976637, 12.8473096 , 13.29178823], [12.87517442, 13.783546 , 13.92732552], [12.76688304, 13.58345008, 13.05778984]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> self.tensor() >>> self.tensor().tensor().numpy().numpy() >>> self.numpy() >>> #self.draw_on()
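The example above warps 3-dimensional coordinates with a 4x4 matrix. One common reading of that call, assumed here rather than taken from the kwimage source, is a homogeneous transform: append a 1 to each point, multiply by the matrix, and divide by the last component. A dependency-free 2-D sketch with a hypothetical warp_points helper:

```python
def warp_points(points, matrix):
    """Apply a (D+1) x (D+1) homogeneous matrix to D-dimensional points."""
    warped = []
    for pt in points:
        homog = list(pt) + [1.0]
        out = [sum(m * v for m, v in zip(row, homog)) for row in matrix]
        scale = out[-1]
        # Divide by the homogeneous component to return to D dimensions.
        warped.append([v / scale for v in out[:-1]])
    return warped


# Scale by 2, then translate by (3, 4)
matrix = [[2.0, 0.0, 3.0],
          [0.0, 2.0, 4.0],
          [0.0, 0.0, 1.0]]
warped = warp_points([[1.0, 1.0], [0.0, 5.0]], matrix)
print(warped)  # [[5.0, 6.0], [3.0, 14.0]]
```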
- __repr__¶
- __nice__(self)¶
- __len__(self)¶
- property dtype(self)¶
- property dim(self)¶
- property shape(self)¶
- copy(self)¶
- classmethod random(Coords, num=1, dim=2, rng=None, meta=None)¶
Makes random coordinates; typically for testing purposes
- is_numpy(self)¶
- is_tensor(self)¶
- compress(self, flags, axis=0, inplace=False)¶
Filters items based on a boolean criterion
- Parameters
flags (ArrayLike[bool]) – true for items to be kept
axis (int) – you usually want this to be 0
inplace (bool, default=False) – if True, modifies this object
- Returns
filtered coords
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> self.compress([True] * len(self)) >>> self.compress([False] * len(self)) <Coords(data=array([], shape=(0, 2), dtype=float64))> >>> # xdoctest: +REQUIRES(module:torch) >>> self = self.tensor() >>> self.compress([True] * len(self)) >>> self.compress([False] * len(self))
- take(self, indices, axis=0, inplace=False)¶
Takes a subset of items at specific indices
- Parameters
indices (ArrayLike[int]) – indexes of items to take
axis (int) – you usually want this to be 0
inplace (bool, default=False) – if True, modifies this object
- Returns
filtered coords
- Return type
Example
>>> self = Coords(np.array([[25, 30, 15, 10]])) >>> self.take([0]) <Coords(data=array([[25, 30, 15, 10]]))> >>> self.take([]) <Coords(data=array([], shape=(0, 4), dtype=int64))>
- astype(self, dtype, inplace=False)¶
Changes the data type
- Parameters
dtype – new type
inplace (bool, default=False) – if True, modifies this object
- Returns
modified coordinates
- Return type
- round(self, inplace=False)¶
Rounds data to the nearest integer
- Parameters
inplace (bool, default=False) – if True, modifies this object
Example
>>> import kwimage >>> self = kwimage.Coords.random(3).scale(10) >>> self.round()
- view(self, *shape)¶
Passthrough method to view or reshape
- Parameters
*shape – new shape of the data
- Returns
modified coordinates
- Return type
Example
>>> self = Coords.random(6, dim=4).numpy() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4] >>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(6, dim=4).tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4]
- classmethod concatenate(cls, coords, axis=0)¶
Concatenates lists of coordinates together
- Parameters
coords (Sequence[Coords]) – list of coords to concatenate
axis (int, default=0) – axis to stack on
- Returns
stacked coords
- Return type
- CommandLine:
xdoctest -m kwimage.structs.coords Coords.concatenate
Example
>>> coords = [Coords.random(3) for _ in range(3)] >>> new = Coords.concatenate(coords) >>> assert len(new) == 9 >>> assert np.all(new.data[3:6] == coords[1].data)
- property device(self)¶
If the backend is torch returns the data device, otherwise None
- property _impl(self)¶
Returns the internal tensor/numpy ArrayAPI implementation
- tensor(self, device=ub.NoParam)¶
Converts numpy to tensors. Does not change memory if possible.
- Returns
modified coordinates
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(3).numpy() >>> newself = self.tensor() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- numpy(self)¶
Converts tensors to numpy. Does not change memory if possible.
- Returns
modified coordinates
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(3).tensor() >>> newself = self.numpy() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- reorder_axes(self, new_order, inplace=False)¶
Change the ordering of the coordinate axes.
- Parameters
new_order (Tuple[int]) – new_order[i] should specify which axes in the original coordinates should be mapped to the i-th position in the returned axes.
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Note
This is the ordering of the “columns” in the final numpy axis, not the ordering of the numpy axes themselves.
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords(data=np.array([ >>> [7, 11], >>> [13, 17], >>> [21, 23], >>> ])) >>> new = self.reorder_axes((1, 0)) >>> print('new = {!r}'.format(new)) new = <Coords(data= array([[11, 7], [17, 13], [23, 21]]))>
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> new = self.reorder_axes((1, 0)) >>> # Remapping using 1, 0 reverses the axes >>> assert np.all(new.data[:, 0] == self.data[:, 1]) >>> assert np.all(new.data[:, 1] == self.data[:, 0]) >>> # Remapping using 0, 1 does nothing >>> eye = self.reorder_axes((0, 1)) >>> assert np.all(eye.data == self.data) >>> # Remapping using 0, 0, destroys the 1-th column >>> bad = self.reorder_axes((0, 0)) >>> assert np.all(bad.data[:, 0] == self.data[:, 0]) >>> assert np.all(bad.data[:, 1] == self.data[:, 0])
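The column mapping described above can be sketched in plain Python. This is only an illustration of the rule "column i of the result is column new_order[i] of the input"; `reorder_columns` is a hypothetical helper, not part of kwimage.

```python
def reorder_columns(points, new_order):
    """Column i of the result is column new_order[i] of the input."""
    return [[row[j] for j in new_order] for row in points]


points = [[7, 11], [13, 17], [21, 23]]

swapped = reorder_columns(points, (1, 0))      # reverse the two columns
identity = reorder_columns(points, (0, 1))     # leave the data unchanged
duplicated = reorder_columns(points, (0, 0))   # column 1 is lost
```

As in the doctest, repeating an index duplicates that column and discards the other one.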
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)¶
Generalized coordinate transform.
- Parameters
transform (GeometricTransform | ArrayLike | Augmenter | callable) – scikit-image transform, a 3x3 transformation matrix, an imgaug Augmenter, or generic callable which transforms an NxD ndarray.
input_dims (Tuple) – shape of the image these objects correspond to (only needed / used when transform is an imgaug augmenter)
output_dims (Tuple) – unused in non-raster structures, only exists for compatibility.
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Notes
Let D = self.dims
- transformation matrices can be either:
(D + 1) x (D + 1) # for homog
D x D # for scale / rotate
D x (D + 1) # for affine
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> transform = skimage.transform.AffineTransform(scale=(2, 2)) >>> new = self.warp(transform) >>> assert np.all(new.data == self.scale(2).data)
- Doctest:
>>> self = Coords.random(10, rng=0) >>> assert np.all(self.warp(np.eye(3)).data == self.data) >>> assert np.all(self.warp(np.eye(2)).data == self.data)
- Doctest:
>>> # xdoctest: +REQUIRES(module:osgeo) >>> from osgeo import osr >>> wgs84_crs = osr.SpatialReference() >>> wgs84_crs.ImportFromEPSG(4326) >>> dst_crs = osr.SpatialReference() >>> dst_crs.ImportFromEPSG(2927) >>> transform = osr.CoordinateTransformation(wgs84_crs, dst_crs) >>> self = Coords.random(10, rng=0) >>> new = self.warp(transform) >>> assert np.all(new.data != self.data)
>>> # Alternative using generic func >>> def _gdal_coord_tranform(pts): ... return np.array([transform.TransformPoint(x, y, 0)[0:2] ... for x, y in pts]) >>> alt = self.warp(_gdal_coord_tranform) >>> assert np.all(alt.data != self.data) >>> assert np.all(alt.data == new.data)
- Doctest:
>>> # can use a generic function >>> def func(xy): ... return np.zeros_like(xy) >>> self = Coords.random(10, rng=0) >>> assert np.all(self.warp(func).data == 0)
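The three matrix shapes listed in the notes can be applied to D-dimensional points as sketched below. This is a plain-Python illustration under the usual conventions (a trailing homogeneous 1 for matrices with a translation column, and a divide by the last component for a full homogeneous matrix); `apply_matrix` is a hypothetical helper, and how kwimage dispatches on the shape internally is an assumption.

```python
def apply_matrix(points, matrix):
    """Apply a DxD, Dx(D+1), or (D+1)x(D+1) matrix to D-dimensional points."""
    dim = len(points[0])
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = []
    for pt in points:
        # Append a homogeneous 1 when the matrix has a translation column.
        vec = list(pt) + [1.0] if n_cols == dim + 1 else list(pt)
        res = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        if n_rows == dim + 1:
            # Full homogeneous matrix: divide by the last component.
            w = res[-1]
            res = [r / w for r in res[:-1]]
        out.append(res)
    return out


points = [[1.0, 2.0]]
scale_only = apply_matrix(points, [[2.0, 0.0], [0.0, 2.0]])          # D x D
affine = apply_matrix(points, [[2.0, 0.0, 4.0], [0.0, 3.0, 5.0]])    # D x (D + 1)
homog = apply_matrix(points, [[2.0, 0.0, 4.0],
                              [0.0, 3.0, 5.0],
                              [0.0, 0.0, 1.0]])                      # (D + 1) x (D + 1)
```

The affine and homogeneous forms give the same result here because the last row of the homogeneous matrix is (0, 0, 1).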
- _warp_imgaug(self, augmenter, input_dims, inplace=False)¶
Warps by applying an augmenter from the imgaug library
Note
We are assuming you are using X/Y coordinates here.
- Parameters
augmenter (imgaug.augmenters.Augmenter)
input_dims (Tuple) – h/w of the input image
inplace (bool, default=False) – if True, modifies data inplace
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/coords.py Coords._warp_imgaug
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> import imgaug >>> input_dims = (10, 10) >>> self = Coords.random(10).scale(input_dims) >>> augmenter = imgaug.augmenters.Fliplr(p=1) >>> new = self._warp_imgaug(augmenter, input_dims) >>> # y coordinate should not change >>> assert np.allclose(self.data[:, 1], new.data[:, 1]) >>> assert np.allclose(input_dims[0] - self.data[:, 0], new.data[:, 0])
>>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> from matplotlib import pyplot as plt >>> ax = plt.gca() >>> ax.set_xlim(0, input_dims[0]) >>> ax.set_ylim(0, input_dims[1]) >>> self.draw(color='red', alpha=.4, radius=0.1) >>> new.draw(color='blue', alpha=.4, radius=0.1)
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> import imgaug >>> input_dims = (32, 32) >>> inplace = 0 >>> self = Coords.random(1000, rng=142).scale(input_dims).scale(.8) >>> self.data = self.data.astype(np.int32).astype(np.float32) >>> augmenter = imgaug.augmenters.CropAndPad(px=(-4, 4), keep_size=1).to_deterministic() >>> new = self._warp_imgaug(augmenter, input_dims) >>> # Change should be linear >>> norm1 = (self.data - self.data.min(axis=0)) / (self.data.max(axis=0) - self.data.min(axis=0)) >>> norm2 = (new.data - new.data.min(axis=0)) / (new.data.max(axis=0) - new.data.min(axis=0)) >>> diff = norm1 - norm2 >>> assert np.allclose(diff, 0, atol=1e-6, rtol=1e-4) >>> #assert np.allclose(self.data[:, 1], new.data[:, 1]) >>> #assert np.allclose(input_dims[0] - self.data[:, 0], new.data[:, 0]) >>> # xdoc: +REQUIRES(--show) >>> import kwimage >>> im = kwimage.imresize(kwimage.grab_test_image(), dsize=input_dims[::-1]) >>> new_im = augmenter.augment_image(im) >>> import kwplot >>> plt = kwplot.autoplt() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(im, pnum=(1, 2, 1), fnum=1) >>> self.draw(color='red', alpha=.8, radius=0.5) >>> kwplot.imshow(new_im, pnum=(1, 2, 2), fnum=1) >>> new.draw(color='blue', alpha=.8, radius=0.5, coord_axes=[1, 0])
- to_imgaug(self, input_dims)¶
Translate to an imgaug object
- Returns
imgaug data structure
- Return type
imgaug.KeypointsOnImage
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10) >>> input_dims = (10, 10) >>> kpoi = self.to_imgaug(input_dims) >>> new = Coords.from_imgaug(kpoi) >>> assert np.allclose(new.data, self.data)
- classmethod from_imgaug(cls, kpoi)¶
- scale(self, factor, about=None, output_dims=None, inplace=False)¶
Scale coordinates by a factor
- Parameters
factor (float or Tuple[float, float]) – scale factor as either a scalar or per-dimension tuple.
about (Tuple | None) – if unspecified scales about the origin (0, 0), otherwise the scaling is about this point.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> new = self.scale(10) >>> assert new.data.max() <= 10
>>> self = Coords.random(10, rng=0) >>> self.data = (self.data * 10).astype(int) >>> new = self.scale(10) >>> assert new.data.dtype.kind == 'i' >>> new = self.scale(10.0) >>> assert new.data.dtype.kind == 'f'
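Scaling about a point is conventionally computed as `about + factor * (point - about)`. The sketch below shows that formula in plain Python, with either a scalar or a per-dimension factor; `scale_about` is a hypothetical helper and whether kwimage computes it exactly this way is an assumption.

```python
def scale_about(points, factor, about=None):
    """Scale points by a scalar or per-dimension factor about a fixed point."""
    dim = len(points[0])
    factors = factor if isinstance(factor, (tuple, list)) else (factor,) * dim
    center = about if about is not None else (0,) * dim
    return [
        tuple(c + f * (p - c) for p, f, c in zip(pt, factors, center))
        for pt in points
    ]


pts = [(1, 2), (3, 4)]
about_origin = scale_about(pts, 10)                   # integer factor keeps integers
as_float = scale_about(pts, 10.0)                     # float factor promotes to float
about_point = scale_about(pts, (2, 3), about=(1, 2))  # (1, 2) is the fixed point
```

The integer-versus-float behaviour mirrors the dtype check in the doctest above: an integer factor leaves integer data integral, while a float factor produces floats.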
- translate(self, offset, output_dims=None, inplace=False)¶
Shift the coordinates
- Parameters
offset (float or Tuple[float]) – translation offset as either a scalar or a per-dimension tuple.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=3, rng=0) >>> new = self.translate(10) >>> assert new.data.min() >= 10 >>> assert new.data.max() <= 11 >>> Coords.random(3, dim=3, rng=0) >>> Coords.random(3, dim=3, rng=0).translate((1, 2, 3))
- rotate(self, theta, about=None, output_dims=None, inplace=False)¶
Rotate the coordinates about a point.
- Parameters
theta (float) – rotation angle in radians
about (Tuple | None) – if unspecified rotates about the origin (0, 0), otherwise the rotation is about this point.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Todo
[ ] Generalized ND Rotations?
References
https://math.stackexchange.com/questions/197772/gen-rot-matrix
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=2, rng=0) >>> theta = np.pi / 2 >>> new = self.rotate(theta)
>>> # Test rotate agrees with warp >>> sin_ = np.sin(theta) >>> cos_ = np.cos(theta) >>> rot_ = np.array([[cos_, -sin_], [sin_, cos_]]) >>> new2 = self.warp(rot_) >>> assert np.allclose(new.data, new2.data)
>>> # >>> # Rotate about a custom point >>> theta = np.pi / 2 >>> new3 = self.rotate(theta, about=(0.5, 0.5)) >>> # >>> # Rotate about the center of mass >>> about = self.data.mean(axis=0) >>> new4 = self.rotate(theta, about=about) >>> # xdoc: +REQUIRES(--show) >>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> plt = kwplot.autoplt() >>> self.draw(radius=0.01, color='blue', alpha=.5, coord_axes=[1, 0], setlim='grow') >>> plt.gca().set_aspect('equal') >>> new3.draw(radius=0.01, color='red', alpha=.5, coord_axes=[1, 0], setlim='grow')
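Rotating about a custom point amounts to shifting that point to the origin, applying the 2D rotation matrix used in the doctest above, and shifting back. The following plain-Python sketch shows this under that assumption; `rotate_about` is a hypothetical helper, not the library's code.

```python
import math


def rotate_about(points, theta, about=(0.0, 0.0)):
    """Rotate 2D points by theta radians about the given point."""
    ax, ay = about
    cos_, sin_ = math.cos(theta), math.sin(theta)
    out = []
    for x, y in points:
        # Shift so the pivot is at the origin, rotate, then shift back.
        dx, dy = x - ax, y - ay
        out.append((ax + cos_ * dx - sin_ * dy,
                    ay + sin_ * dx + cos_ * dy))
    return out


pts = [(1.0, 0.0), (0.5, 0.5)]
quarter_turn = rotate_about(pts, math.pi / 2)
pivoted = rotate_about(pts, math.pi / 2, about=(0.5, 0.5))
```

The pivot itself is a fixed point of the rotation, which is a quick sanity check for any `about` handling.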
- _rectify_about(self, about)¶
Ensures that about returns a specified point. Allows for special keys like center to be used.
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=2, rng=0)
- fill(self, image, value, coord_axes=None, interp='bilinear')¶
Sets sub-coordinate locations in a grid to a particular value
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
- Returns
image with coordinates rasterized on it
- Return type
ndarray
- soft_fill(self, image, coord_axes=None, radius=5)¶
Used for drawing keypoint truth in heatmaps
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
In other words the i-th entry in coord_axes specifies which row-major spatial dimension the i-th column of a coordinate corresponds to. The index is the coordinate dimension and the value is the axes dimension.
- Returns
image with coordinates rasterized on it
- Return type
ndarray
References
https://stackoverflow.com/questions/54726703/generating-keypoint-heatmaps-in-tensorflow
Example
>>> from kwimage.structs.coords import * # NOQA >>> s = 64 >>> self = Coords.random(10, meta={'shape': (s, s)}).scale(s) >>> # Put points on edges to verify "edge cases" >>> self.data[1] = [0, 0] # top left >>> self.data[2] = [s, s] # bottom right >>> self.data[3] = [0, s + 10] # bottom left >>> self.data[4] = [-3, s // 2] # middle left >>> self.data[5] = [s + 1, -1] # top right >>> # Put points in the middle to verify overlap blending >>> self.data[6] = [32.5, 32.5] # middle >>> self.data[7] = [34.5, 34.5] # middle >>> fill_value = 1 >>> coord_axes = [1, 0] >>> radius = 10 >>> image1 = np.zeros((s, s)) >>> self.soft_fill(image1, coord_axes=coord_axes, radius=radius) >>> radius = 3.0 >>> image2 = np.zeros((s, s)) >>> self.soft_fill(image2, coord_axes=coord_axes, radius=radius) >>> # xdoc: +REQUIRES(--show) >>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(image1, pnum=(1, 2, 1)) >>> kwplot.imshow(image2, pnum=(1, 2, 2))
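The coord_axes convention (the index is the coordinate dimension, the value is the image axis) can be sketched in plain Python as follows. This covers only the axis mapping, not the soft Gaussian fill itself, and `to_image_index` is a hypothetical helper introduced for illustration.

```python
def to_image_index(point, coord_axes):
    """Place the i-th coordinate column on image axis coord_axes[i]."""
    index = [None] * len(coord_axes)
    for coord_dim, image_axis in enumerate(coord_axes):
        index[image_axis] = point[coord_dim]
    return tuple(index)


# A 3-row (height) by 4-column (width) image stored row-major.
image = [[0] * 4 for _ in range(3)]

# x/y data: x selects the column (axis 1) and y selects the row (axis 0).
x, y = 3, 1
row, col = to_image_index((x, y), coord_axes=[1, 0])
image[row][col] = 1

# r/c data is already in image-axis order, so [0, 1] is the identity mapping.
same = to_image_index((1, 3), coord_axes=[0, 1])
```

Both calls address the same pixel, which is why x/y data needs `[1, 0]` while r/c data needs `[0, 1]`.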
- draw_on(self, image=None, fill_value=1, coord_axes=[1, 0], interp='bilinear')¶
Note
unlike other methods, the defaults assume x/y internal data
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
In other words the i-th entry in coord_axes specifies which row-major spatial dimension the i-th column of a coordinate corresponds to. The index is the coordinate dimension and the value is the axes dimension.
- Returns
image with coordinates drawn on it
- Return type
ndarray
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.coords import * # NOQA >>> s = 256 >>> self = Coords.random(10, meta={'shape': (s, s)}).scale(s) >>> self.data[0] = [10, 10] >>> self.data[1] = [20, 40] >>> image = np.zeros((s, s)) >>> fill_value = 1 >>> image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='bilinear') >>> # image = self.draw_on(image, fill_value, coord_axes=[0, 1], interp='nearest') >>> # image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='bilinear') >>> # image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='nearest') >>> # xdoc: +REQUIRES(--show) >>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5, coord_axes=[1, 0])
- draw(self, color='blue', ax=None, alpha=None, coord_axes=[1, 0], radius=1, setlim=False)¶
Note
unlike other methods, the defaults assume x/y internal data
- Parameters
setlim (bool) – if True, ensures the limits of the axes contain the drawn coordinates
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
- Returns
drawn matplotlib objects
- Return type
List[mpl.collections.PatchCollection]
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10) >>> # xdoc: +REQUIRES(--show) >>> self.draw(radius=3.0, setlim=True) >>> import kwplot >>> kwplot.autompl() >>> self.draw(radius=3.0)
- class kwimage.structs.Detections(data=None, meta=None, datakeys=None, metakeys=None, checks=True, **kwargs)[source]¶
Bases: ubelt.NiceRepr, _DetAlgoMixin, _DetDrawMixin
Container for holding and manipulating multiple detections.
- Variables
data (Dict) – dictionary containing corresponding lists. The length of each list is the number of detections. This contains the bounding boxes, confidence scores, and class indices. Details of the most common keys and types are as follows:
- boxes (kwimage.Boxes[ArrayLike]): multiple bounding boxes
- scores (ArrayLike): associated scores
- class_idxs (ArrayLike): associated class indices
- segmentations (ArrayLike): segmentation masks for each box. Members can be Mask or MultiPolygon.
- keypoints (ArrayLike): keypoints for each box. Members should be Points.
Additional custom keys may be specified as long as (a) the values are array-like and the first axis corresponds to the standard data values and (b) the custom keys are listed in the datakeys kwargs when constructing the Detections.
meta (Dict) – This contains contextual information about the detections. This includes the class names, which can be indexed into via the class indexes.
Example
>>> import kwimage >>> dets = kwimage.Detections( >>> # there are expected keys that do not need registration >>> boxes=kwimage.Boxes.random(3), >>> class_idxs=[0, 1, 1], >>> classes=['a', 'b'], >>> # custom data attrs must align with boxes >>> myattr1=np.random.rand(3), >>> myattr2=np.random.rand(3, 2, 8), >>> # there are no restrictions on metadata >>> mymeta='a custom metadata string', >>> # Note that any key not in kwimage.Detections.__datakeys__ or >>> # kwimage.Detections.__metakeys__ must be registered at the >>> # time of construction. >>> datakeys=['myattr1', 'myattr2'], >>> metakeys=['mymeta'], >>> checks=True, >>> ) >>> print('dets = {}'.format(dets)) dets = <Detections(3)>
- __datakeys__ = ['boxes', 'scores', 'class_idxs', 'probs', 'weights', 'keypoints', 'segmentations']¶
- __metakeys__ = ['classes']¶
- __nice__(self)¶
- __len__(self)¶
- copy(self)¶
Returns a deep copy of this Detections object
- classmethod coerce(cls, data=None, **kwargs)¶
The “try-anything to get what I want” constructor
- Parameters
data
**kwargs – currently boxes and cnames
Example
>>> from kwimage.structs.detections import * # NOQA >>> import kwimage >>> kwargs = dict( >>> boxes=kwimage.Boxes.random(4), >>> cnames=['a', 'b', 'c', 'c'], >>> ) >>> data = {} >>> self = kwimage.Detections.coerce(data, **kwargs)
- classmethod from_coco_annots(cls, anns, cats=None, classes=None, kp_classes=None, shape=None, dset=None)¶
Create a Detections object from a list of coco-like annotations.
- Parameters
anns (List[Dict]) – list of coco-like annotation objects
dset (CocoDataset) – if specified, cats, classes, and kp_classes are ignored.
cats (List[Dict]) – coco-format category information. Used only if dset is not specified.
classes (ndsampler.CategoryTree) – category tree with coco class info. Used only if dset is not specified.
kp_classes (ndsampler.CategoryTree) – keypoint category tree with coco keypoint class info. Used only if dset is not specified.
shape (tuple) – shape of parent image
- Returns
a detections object
- Return type
Example
>>> from kwimage.structs.detections import * # NOQA >>> # xdoctest: +REQUIRES(--module:ndsampler) >>> anns = [{ >>> 'id': 0, >>> 'image_id': 1, >>> 'category_id': 2, >>> 'bbox': [2, 3, 10, 10], >>> 'keypoints': [4.5, 4.5, 2], >>> 'segmentation': { >>> 'counts': '_11a04M2O0O20N101N3L_5', >>> 'size': [20, 20], >>> }, >>> }] >>> dataset = { >>> 'images': [], >>> 'annotations': [], >>> 'categories': [ >>> {'id': 0, 'name': 'background'}, >>> {'id': 2, 'name': 'class1', 'keypoints': ['spot']} >>> ] >>> } >>> #import ndsampler >>> #dset = ndsampler.CocoDataset(dataset) >>> cats = dataset['categories'] >>> dets = Detections.from_coco_annots(anns, cats)
Example
>>> # xdoctest: +REQUIRES(--module:ndsampler) >>> # Test case with no category information >>> from kwimage.structs.detections import * # NOQA >>> anns = [{ >>> 'id': 0, >>> 'image_id': 1, >>> 'category_id': None, >>> 'bbox': [2, 3, 10, 10], >>> 'prob': [.1, .9], >>> }] >>> cats = [ >>> {'id': 0, 'name': 'background'}, >>> {'id': 2, 'name': 'class1'} >>> ] >>> dets = Detections.from_coco_annots(anns, cats)
Example
>>> import kwimage >>> # xdoctest: +REQUIRES(--module:ndsampler) >>> import ndsampler >>> sampler = ndsampler.CocoSampler.demo('photos') >>> iminfo, anns = sampler.load_image_with_annots(1) >>> shape = iminfo['imdata'].shape[0:2] >>> kp_classes = sampler.dset.keypoint_categories() >>> dets = kwimage.Detections.from_coco_annots( >>> anns, sampler.dset.dataset['categories'], sampler.catgraph, >>> kp_classes, shape=shape)
- to_coco(self, cname_to_cat=None, style='orig', image_id=None, dset=None)¶
Converts this set of detections into coco-like annotation dictionaries.
Notes
Not all aspects of the MS-COCO format can be accurately represented, so some liberties are taken. The MS-COCO standard defines that annotations should specify a category_id field, but in some cases this information is not available so we will populate a ‘category_name’ field if possible and in the worst case fall back to ‘category_index’.
Additionally, detections may contain additional information beyond the MS-COCO standard, and this information (e.g. weight, prob, score) is added as foreign fields.
- Parameters
cname_to_cat – currently ignored.
style (str, default=’orig’) – either ‘orig’ (for the original coco format) or ‘new’ for the more general kwcoco-style coco format.
image_id (int, default=None) – if specified, populates the image_id field of each annotation
dset (CocoDataset, default=None) – if specified, attempts to populate the category_id field to be compatible with this coco dataset.
- Yields
dict – coco-like annotation structures
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> from kwimage.structs.detections import * >>> self = Detections.demo()[0] >>> cname_to_cat = None >>> list(self.to_coco())
- property boxes(self)¶
- property class_idxs(self)¶
- property scores(self)¶
typically only populated for predicted detections
- property probs(self)¶
typically only populated for predicted detections
- property weights(self)¶
typically only populated for groundtruth detections
- property classes(self)¶
- num_boxes(self)¶
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)¶
Spatially warp the detections.
Example
>>> import skimage >>> transform = skimage.transform.AffineTransform(scale=(2, 3), translation=(4, 5)) >>> self = Detections.random(2) >>> new = self.warp(transform) >>> assert new.boxes == self.boxes.warp(transform) >>> assert new != self
- scale(self, factor, output_dims=None, inplace=False)¶
Spatially scale the detections.
Example
>>> self = Detections.random(2) >>> new = self.scale(2)
- translate(self, offset, output_dims=None, inplace=False)¶
Spatially translate the detections.
Example
>>> import skimage >>> self = Detections.random(2) >>> new = self.translate(10)
- classmethod concatenate(cls, dets)¶
- Parameters
dets (Sequence[Detections]) – list of detections to concatenate
- Returns
stacked detections
- Return type
Example
>>> self = Detections.random(2) >>> other = Detections.random(3) >>> dets = [self, other] >>> new = Detections.concatenate(dets) >>> assert new.num_boxes() == 5
>>> self = Detections.random(2, segmentations=True) >>> other = Detections.random(3, segmentations=True) >>> dets = [self, other] >>> new = Detections.concatenate(dets) >>> assert new.num_boxes() == 5
- argsort(self, reverse=True)¶
Sorts detection indices by descending (or ascending) scores
- Returns
sorted indices
- Return type
ndarray[int]
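Sorting indices by score, descending by default, can be sketched in plain Python as below. `argsort_scores` is a hypothetical helper that only illustrates the ordering; tie handling in kwimage is not covered here.

```python
def argsort_scores(scores, reverse=True):
    """Return indices ordering the scores, descending when reverse is True."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=reverse)


scores = [0.36, 0.70, 0.06, 0.67]

descending = argsort_scores(scores)
ascending = argsort_scores(scores, reverse=False)

# Indexing with the result yields the scores in sorted order.
sorted_scores = [scores[i] for i in descending]
```

Taking the detections at these indices gives the same result as a sort by score.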
- sort(self, reverse=True)¶
Sorts detections by descending (or ascending) scores
- Returns
sorted copy of self
- Return type
- compress(self, flags, axis=0)¶
Returns a subset where corresponding locations are True.
- Parameters
flags (ndarray[bool]) – mask marking selected items
- Returns
subset of self
- Return type
- CommandLine:
xdoctest -m kwimage.structs.detections Detections.compress
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> dets = kwimage.Detections.random(keypoints='dense') >>> flags = np.random.rand(len(dets)) > 0.5 >>> subset = dets.compress(flags) >>> assert len(subset) == flags.sum() >>> subset = dets.tensor().compress(flags) >>> assert len(subset) == flags.sum()
- take(self, indices, axis=0)¶
Returns a subset specified by indices
- Parameters
indices (ndarray[int]) – indices to select
- Returns
subset of self
- Return type
Example
>>> import kwimage >>> dets = kwimage.Detections(boxes=kwimage.Boxes.random(10)) >>> subset = dets.take([2, 3, 5, 7]) >>> assert len(subset) == 4 >>> # xdoctest: +REQUIRES(module:torch) >>> subset = dets.tensor().take([2, 3, 5, 7]) >>> assert len(subset) == 4
- __getitem__(self, index)¶
Fancy slicing / subset / indexing.
Note: scalar indices are always coerced into index lists of length 1.
Example
>>> import kwimage >>> import kwarray >>> dets = kwimage.Detections(boxes=kwimage.Boxes.random(10)) >>> indices = [2, 3, 5, 7] >>> flags = kwarray.boolmask(indices, len(dets)) >>> assert dets[flags].data == dets[indices].data
- property device(self)¶
If the backend is torch returns the data device, otherwise None
- is_tensor(self)¶
is the backend fueled by torch?
- is_numpy(self)¶
is the backend fueled by numpy?
- numpy(self)¶
Converts tensors to numpy. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Detections.random(3).tensor() >>> newself = self.numpy() >>> self.scores[0] = 0 >>> assert newself.scores[0] == 0 >>> self.scores[0] = 1 >>> assert self.scores[0] == 1 >>> self.numpy().numpy()
- property dtype(self)¶
- tensor(self, device=ub.NoParam)¶
Converts numpy to tensors. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.detections import * >>> self = Detections.random(3) >>> newself = self.tensor() >>> self.scores[0] = 0 >>> assert newself.scores[0] == 0 >>> self.scores[0] = 1 >>> assert self.scores[0] == 1 >>> self.tensor().tensor()
- classmethod demo(Detections)¶
- classmethod random(cls, num=10, scale=1.0, classes=3, keypoints=False, segmentations=False, tensor=False, rng=None)¶
Creates dummy data, suitable for use in tests and benchmarks
- Parameters
num (int) – number of boxes
scale (float | tuple, default=1.0) – bounding image size
classes (int | Sequence) – list of class labels or number of classes
keypoints (bool, default=False) – if True include random keypoints for each box.
segmentations (bool, default=False) – if True include random segmentations for each box.
tensor (bool, default=False) – determines backend. DEPRECATED. Call tensor on resulting object instead.
rng (np.random.RandomState) – random state
Example
>>> import kwimage >>> dets = kwimage.Detections.random(keypoints='jagged') >>> dets.data['keypoints'].data[0].data >>> dets.data['keypoints'].meta >>> dets = kwimage.Detections.random(keypoints='dense') >>> dets = kwimage.Detections.random(keypoints='dense', segmentations=True).scale(1000) >>> # xdoctest:+REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets.draw(setlim=True)
Example
>>> import kwimage >>> dets = kwimage.Detections.random( >>> keypoints='jagged', segmentations=True, rng=0).scale(1000) >>> print('dets = {}'.format(dets)) dets = <Detections(10)> >>> dets.data['boxes'].quantize(inplace=True) >>> print('dets.data = {}'.format(ub.repr2( >>> dets.data, nl=1, with_dtype=False, strvals=True))) dets.data = { 'boxes': <Boxes(xywh, array([[548, 544, 55, 172], [423, 645, 15, 247], [791, 383, 173, 146], [ 71, 87, 498, 839], [ 20, 832, 759, 39], [461, 780, 518, 20], [118, 639, 26, 306], [264, 414, 258, 361], [ 18, 568, 439, 50], [612, 616, 332, 66]], dtype=int32))>, 'class_idxs': [1, 2, 0, 0, 2, 0, 0, 0, 0, 0], 'keypoints': <PointsList(n=10)>, 'scores': [0.3595079 , 0.43703195, 0.6976312 , 0.06022547, 0.66676672, 0.67063787,0.21038256, 0.1289263 , 0.31542835, 0.36371077], 'segmentations': <SegmentationList(n=10)>, } >>> # xdoctest:+REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets.draw(setlim=True)
Example
>>> # Boxes position/shape within 0-1 space should be uniform. >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> fig = kwplot.figure(fnum=1, doclf=True) >>> fig.gca().set_xlim(0, 128) >>> fig.gca().set_ylim(0, 128) >>> import kwimage >>> kwimage.Detections.random(num=10, segmentations=True).scale(128).draw()
- class kwimage.structs.Heatmap(data=None, meta=None, **kwargs)[source]¶
Bases:
kwimage.structs._generic.Spatial, _HeatmapDrawMixin, _HeatmapWarpMixin, _HeatmapAlgoMixin
Keeps track of a downscaled heatmap and how to transform it to overlay the original input image. Heatmaps generally are used to estimate class probabilities at each pixel. This data structure additionally contains logic to augment each pixel with offset (dydx) and scale (diameter) information.
- Variables
data (Dict[str, ArrayLike]) –
dictionary containing spatially aligned heatmap data. Valid keys are as follows.
- class_probs (ArrayLike[C, H, W] | ArrayLike[C, D, H, W]):
A probability map for each class. C is the number of classes.
- offset (ArrayLike[2, H, W] | ArrayLike[3, D, H, W], optional):
object center position offset in y,x / t,y,x coordinates
- diameter (ArrayLike[2, H, W] | ArrayLike[3, D, H, W], optional):
object bounding box sizes in h,w / d,h,w coordinates
- keypoints (ArrayLike[2, K, H, W] | ArrayLike[3, K, D, H, W], optional):
y/x offsets for K different keypoint classes
meta (Dict) – dictionary containing miscellaneous metadata about the heatmap data. Valid keys are as follows.
- img_dims (Tuple[H, W] | Tuple[D, H, W]):
original image dimension
- tf_data_to_img (skimage.transform._geometric.GeometricTransform):
transformation matrix (typically similarity or affine) that projects the given heatmap onto the image dimensions such that the image and heatmap are spatially aligned.
- classes (List[str] | ndsampler.CategoryTree):
information about which index in data[‘class_probs’] corresponds to which semantic class.
dims (Tuple) – dimensions of the heatmap (see img_dims for the original image dimensions).
**kwargs – any key that is accepted by the data or meta dictionaries can be specified as a keyword argument to this class and it will be properly placed in the appropriate internal dictionary.
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/heatmap.py Heatmap --show
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.heatmap import * # NOQA >>> import kwimage >>> class_probs = kwimage.grab_test_image(dsize=(32, 32), space='gray')[None, ] / 255.0 >>> img_dims = (220, 220) >>> tf_data_to_img = skimage.transform.AffineTransform(translation=(-18, -18), scale=(8, 8)) >>> self = Heatmap(class_probs=class_probs, img_dims=img_dims, >>> tf_data_to_img=tf_data_to_img) >>> aligned = self.upscale() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(aligned[0]) >>> kwplot.show_if_requested()
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> self = Heatmap.random() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self.draw()
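The role of tf_data_to_img in the first example can be worked through by hand. The sketch below applies the same scale (8, 8) and translation (-18, -18) to heatmap coordinates in plain Python; `data_to_img` is a hypothetical helper standing in for the affine transform, not a kwimage function.

```python
def data_to_img(x, y, scale=(8, 8), translation=(-18, -18)):
    """Map a heatmap (x, y) location into image coordinates."""
    return (scale[0] * x + translation[0], scale[1] * y + translation[1])


heatmap_size = 32   # the heatmap in the example is 32 x 32
img_size = 220      # img_dims in the example is (220, 220)

top_left = data_to_img(0, 0)
bottom_right = data_to_img(heatmap_size, heatmap_size)

# How far the projected heatmap extends beyond the image on each side.
overhang_low = 0 - top_left[0]
overhang_high = bottom_right[0] - img_size
```

With these numbers the 32-cell heatmap projects onto a 256-pixel span that overhangs the 220-pixel image by 18 pixels on each side, which is consistent with the translation of -18.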
- __datakeys__ = ['class_probs', 'offset', 'diameter', 'keypoints', 'class_idx', 'class_energy']¶
- __metakeys__ = ['img_dims', 'tf_data_to_img', 'classes', 'kp_classes']¶
- __spatialkeys__ = ['offset', 'diameter', 'keypoints']¶
- __nice__(self)¶
- __getitem__(self, index)¶
- __len__(self)¶
- property shape(self)¶
- property bounds(self)¶
- property dims(self)¶
space-time dimensions of this heatmap
- is_numpy(self)¶
- is_tensor(self)¶
- property _impl(self)¶
Returns the internal tensor/numpy ArrayAPI implementation
- Returns
kwarray.ArrayAPI
- classmethod random(cls, dims=(10, 10), classes=3, diameter=True, offset=True, keypoints=False, img_dims=None, dets=None, nblips=10, noise=0.0, rng=None)¶
Creates dummy data, suitable for use in tests and benchmarks
- Parameters
dims (Tuple) – dimensions of the heatmap
img_dims (Tuple) – dimensions of the image the heatmap corresponds to
Example
>>> from kwimage.structs.heatmap import * # NOQA >>> self = Heatmap.random((128, 128), img_dims=(200, 200), >>> classes=3, nblips=10, rng=0, noise=0.1) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(self.colorize(0, imgspace=0), fnum=1, pnum=(1, 4, 1), doclf=1) >>> kwplot.imshow(self.colorize(1, imgspace=0), fnum=1, pnum=(1, 4, 2)) >>> kwplot.imshow(self.colorize(2, imgspace=0), fnum=1, pnum=(1, 4, 3)) >>> kwplot.imshow(self.colorize(3, imgspace=0), fnum=1, pnum=(1, 4, 4))
- Ignore:
self.detect(0).sort().non_max_supress()[-np.arange(1, 4)].draw() from kwimage.structs.heatmap import * # NOQA import xdev globals().update(xdev.get_func_kwargs(Heatmap.random))
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> import kwimage >>> self = kwimage.Heatmap.random(dims=(50, 200), dets='coco', >>> keypoints=True) >>> image = np.zeros(self.img_dims) >>> # xdoctest: +REQUIRES(module:kwplot) >>> toshow = self.draw_on(image, 1, vecs=True, kpts=0, with_alpha=0.85) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(toshow)
- Ignore:
>>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(image) >>> dets.draw() >>> dets.data['keypoints'].draw(radius=6) >>> dets.data['segmentations'].draw()
>>> self.draw()
- property class_probs(self)¶
- property offset(self)¶
- property diameter(self)¶
- property img_dims(self)¶
- property tf_data_to_img(self)¶
- property classes(self)¶
- numpy(self)¶
Converts underlying data to numpy arrays
- tensor(self, device=ub.NoParam)¶
Converts underlying data to torch tensors
- kwimage.structs.smooth_prob(prob, k=3, inplace=False, eps=1e-09)[source]¶
Smooths the probability map, but preserves the magnitude of the peaks.
Notes
Even if inplace is True, a copy of the input array is still required; however, the copy is cleaned up before the function returns.
sigma=0.8 @ k=3, sigma=1.1 @ k=5, sigma=1.4 @ k=7
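The sigma values listed above agree with the default that OpenCV documents for cv2.getGaussianKernel when no sigma is supplied: sigma = 0.3 * ((k - 1) * 0.5 - 1) + 0.8. The sketch below only reproduces that arithmetic. The helper name default_sigma_for_kernel is hypothetical (it is not a kwimage function), and whether smooth_prob derives sigma this way internally is an assumption.

```python
def default_sigma_for_kernel(k):
    """
    Hypothetical helper, not part of kwimage.

    Applies the default-sigma formula documented for OpenCV's
    getGaussianKernel to an odd kernel size ``k``.
    """
    return 0.3 * ((k - 1) * 0.5 - 1) + 0.8


# Print the sigma associated with each kernel size mentioned in the notes.
for k in (3, 5, 7):
    print('k={} -> sigma={:.1f}'.format(k, default_sigma_for_kernel(k)))
```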
- class kwimage.structs.Mask(data=None, format=None)[source]¶
Bases:
ubelt.NiceRepr
,_MaskConversionMixin
,_MaskConstructorMixin
,_MaskTransformMixin
,_MaskDrawMixin
Manages a single segmentation mask and can convert to and from multiple formats including:
bytes_rle - byte encoded run length encoding
array_rle - raw run length encoding
c_mask - c-style binary mask
f_mask - fortran-style binary mask
Example
>>> # xdoc: +REQUIRES(--mask)
>>> # a ms-coco style compressed bytes rle segmentation
>>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
>>> mask = Mask(segmentation, 'bytes_rle')
>>> # convert to binary numpy representation
>>> binary_mask = mask.to_c_mask().data
>>> print(ub.repr2(binary_mask.tolist(), nl=1, nobr=1))
[0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
- property dtype(self)¶
- __nice__(self)¶
- classmethod random(Mask, rng=None, shape=(32, 32))¶
Create a random binary mask object
- Parameters
rng (int | RandomState | None) – the random seed
shape (Tuple[int, int]) – the height / width of the returned mask
- Returns
the random mask
- Return type
Example
>>> import kwimage >>> mask = kwimage.Mask.random() >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> mask.draw() >>> kwplot.show_if_requested()
- classmethod demo(cls)¶
Demo mask with holes and disjoint shapes
- Returns
the demo mask
- Return type
- copy(self)¶
Performs a deep copy of the mask data
- Returns
the copied mask
- Return type
Example
>>> self = Mask.random(shape=(8, 8), rng=0) >>> other = self.copy() >>> assert other.data is not self.data
- union(self, *others)¶
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to union
- Returns
the unioned mask
- Return type
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(2)] >>> mask = Mask.union(*masks) >>> print(mask.area) >>> masks = [m.to_c_mask() for m in masks] >>> mask = Mask.union(*masks) >>> print(mask.area)
>>> masks = [m.to_bytes_rle() for m in masks] >>> mask = Mask.union(*masks) >>> print(mask.area)
- Benchmark:
import ubelt as ub
ti = ub.Timerit(100, bestof=10, verbose=2)
masks = [Mask.random(shape=(172, 172), rng=i) for i in range(2)]
for timer in ti.reset('native rle union'):
    masks = [m.to_bytes_rle() for m in masks]
    with timer:
        mask = Mask.union(*masks)
for timer in ti.reset('native cmask union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*masks)
for timer in ti.reset('cmask->rle union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*[m.to_bytes_rle() for m in masks])
- intersection(self, *others)¶
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to intersect
- Returns
the intersection of the masks
- Return type
Example
>>> n = 3 >>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(n)] >>> items = masks >>> mask = Mask.intersection(*masks) >>> areas = [item.area for item in items] >>> print('areas = {!r}'.format(areas)) >>> print(mask.area) >>> print(Mask.intersection(*masks).area / Mask.union(*masks).area)
- property shape(self)¶
- property area(self)¶
Returns the number of non-zero pixels
- Returns
the number of non-zero pixels
- Return type
Example
>>> self = Mask.demo()
>>> self.area
150
- get_patch(self)¶
Extract the patch with non-zero data
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.random(shape=(8, 8), rng=0) >>> self.get_patch()
- get_xywh(self)¶
Gets the bounding xywh box coordinates of this mask
- Returns
- x, y, w, h: Note that we don't use a Boxes object because
a general singular version does not yet exist.
- Return type
ndarray
Example
>>> # xdoc: +REQUIRES(--mask) >>> self = Mask.random(shape=(8, 8), rng=0) >>> self.get_xywh().tolist() >>> self = Mask.random(rng=0).translate((10, 10)) >>> self.get_xywh().tolist()
Example
>>> # test empty case >>> import kwimage >>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask') >>> assert self.get_xywh().tolist() == [0, 0, 0, 0]
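The bounding-box computation described for get_xywh can be sketched in pure Python on a nested-list mask. This is an illustrative sketch, not kwimage's implementation: the function name mask_xywh is hypothetical, and the inclusive pixel count (width = max - min + 1) is an assumption that may differ from the library's convention. The empty case mirrors the documented [0, 0, 0, 0] result.

```python
def mask_xywh(mask):
    """
    Hypothetical helper: bounding x, y, w, h of the non-zero pixels in a
    binary mask given as a list of rows.

    Assumption: width and height count pixels inclusively.
    """
    # Row indices that contain at least one non-zero pixel.
    ys = [r for r, row in enumerate(mask) if any(row)]
    # Column indices of every non-zero pixel.
    xs = [c for row in mask for c, v in enumerate(row) if v]
    if not xs:
        # Matches the documented behavior for an empty mask.
        return [0, 0, 0, 0]
    x, y = min(xs), min(ys)
    w = max(xs) - x + 1
    h = max(ys) - y + 1
    return [x, y, w, h]


demo = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
print(mask_xywh(demo))
```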
- Ignore:
>>> import kwimage >>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask') >>> x_coords = np.array([621, 752]) >>> y_coords = np.array([366, 292]) >>> self.data[y_coords, x_coords] = 1 >>> self.get_xywh()
>>> # References: >>> # https://stackoverflow.com/questions/33281957/faster-alternative-to-numpy-where >>> # https://answers.opencv.org/question/4183/what-is-the-best-way-to-find-bounding-box-for-binary-mask/ >>> import timerit >>> ti = timerit.Timerit(100, bestof=10, verbose=2) >>> for timer in ti.reset('time'): >>> with timer: >>> y_coords, x_coords = np.where(self.data) >>> # >>> for timer in ti.reset('time'): >>> with timer: >>> cv2.findNonZero(data)
self.data = np.random.rand(800, 700) > 0.5
import timerit
ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('time'):
    with timer:
        y_coords, x_coords = np.where(self.data)
#
for timer in ti.reset('time'):
    with timer:
        data = np.ascontiguousarray(self.data).astype(np.uint8)
        cv2_coords = cv2.findNonZero(data)
>>> poly = self.to_multi_polygon()
- get_polygon(self)¶
DEPRECATED: USE to_multi_polygon
Returns a list of (x,y)-coordinate lists. The length of the list is equal to the number of disjoint regions in the mask.
- Returns
- polygon around each connected component of the
mask. Each ndarray is an Nx2 array of xy points.
- Return type
List[ndarray]
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.random(shape=(8, 8), rng=0) >>> polygons = self.get_polygon() >>> print('polygons = ' + ub.repr2(polygons)) >>> polygons = self.get_polygon() >>> self = self.to_bytes_rle() >>> other = Mask.from_polygons(polygons, self.shape) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> image = np.ones(self.shape) >>> image = self.draw_on(image, color='blue') >>> image = other.draw_on(image, color='red') >>> kwplot.imshow(image)
- polygons = [
np.array([[6, 4],[7, 4]], dtype=np.int32), np.array([[0, 1],[0, 3],[2, 3],[2, 1]], dtype=np.int32),
]
- to_mask(self, dims=None)¶
Converts to a mask object (which does nothing because this already is a mask object!)
- Returns
kwimage.Mask
- to_boxes(self)¶
Returns the bounding box of the mask.
- Returns
kwimage.Boxes
- to_multi_polygon(self)¶
Returns a MultiPolygon object fit around this raster including disjoint pieces and holes.
- Returns
vectorized representation
- Return type
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.demo() >>> self = self.scale(5) >>> multi_poly = self.to_multi_polygon() >>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(--show) >>> self.draw(color='red') >>> multi_poly.scale(1.1).draw(color='blue')
>>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> image = np.ones(self.shape) >>> image = self.draw_on(image, color='blue') >>> #image = other.draw_on(image, color='red') >>> kwplot.imshow(image) >>> multi_poly.draw()
Example
>>> import kwimage >>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask') >>> poly = self.to_multi_polygon() >>> poly.to_multi_polygon()
Example
>>> # Corner case, only two pixels are on
>>> import kwimage
>>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask')
>>> x_coords = np.array([621, 752])
>>> y_coords = np.array([366, 292])
>>> self.data[y_coords, x_coords] = 1
>>> poly = self.to_multi_polygon()
poly.to_mask(self.shape).data.sum()
self.to_array_rle().to_c_mask().data.sum() temp.to_c_mask().data.sum()
Example
>>> # TODO: how do we correctly handle the 1 or 2 point to a poly >>> # case? >>> import kwimage >>> data = np.zeros((8, 8), dtype=np.uint8) >>> data[0, 3:5] = 1 >>> data[7, 3:5] = 1 >>> data[3:5, 0:2] = 1 >>> self = kwimage.Mask.coerce(data) >>> polys = self.to_multi_polygon() >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(data) >>> polys.draw(border=True, linewidth=5, alpha=0.5, radius=0.2)
- get_convex_hull(self)¶
Returns a list of xy points around the convex hull of this mask
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask) >>> self = Mask.random(shape=(8, 8), rng=0) >>> polygons = self.get_convex_hull() >>> print('polygons = ' + ub.repr2(polygons)) >>> other = Mask.from_polygons(polygons, self.shape)
- iou(self, other)¶
The area of intersection over the area of union
Todo
- [ ] Write plural Masks version of this class, which should
be able to perform this operation more efficiently.
- CommandLine:
xdoctest -m kwimage.structs.mask Mask.iou
Example
>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.demo()
>>> other = self.translate(1)
>>> iou = self.iou(other)
>>> print('iou = {:.4f}'.format(iou))
iou = 0.0830
>>> iou2 = self.intersection(other).area / self.union(other).area
>>> print('iou2 = {:.4f}'.format(iou2))
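The ratio that iou reports, the area of the intersection divided by the area of the union, can be illustrated on plain nested lists. This is a minimal sketch under stated assumptions: mask_iou is a hypothetical name, both masks are assumed to have the same shape, and returning 0.0 when both masks are empty is a choice made here rather than documented kwimage behavior.

```python
def mask_iou(a, b):
    """
    Hypothetical helper: intersection-over-union of two binary masks given
    as equally sized lists of rows.
    """
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            if va and vb:
                inter += 1  # pixel is on in both masks
            if va or vb:
                union += 1  # pixel is on in at least one mask
    # Assumption: define the IoU of two empty masks as 0.0.
    return inter / union if union else 0.0


a = [[1, 1, 0],
     [0, 0, 0]]
b = [[0, 1, 1],
     [0, 0, 0]]
print('iou = {:.4f}'.format(mask_iou(a, b)))
```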
- classmethod coerce(Mask, data, dims=None)¶
Attempts to auto-inspect the format of the data and convert it to a Mask
- Parameters
data – the data to coerce
dims (Tuple) – height / width of the source image; required for certain formats like polygons
- Returns
the constructed mask object
- Return type
Example
>>> # xdoc: +REQUIRES(--mask) >>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'} >>> polygon = [ >>> [np.array([[3, 0],[2, 1],[2, 4],[4, 4],[4, 3],[7, 0]])], >>> [np.array([[2, 1],[2, 2],[4, 2],[4, 1]])], >>> ] >>> dims = (9, 5) >>> mask = (np.random.rand(32, 32) > .5).astype(np.uint8) >>> Mask.coerce(polygon, dims).to_bytes_rle() >>> Mask.coerce(segmentation).to_bytes_rle() >>> Mask.coerce(mask).to_bytes_rle()
- _to_coco(self)¶
use to_coco instead
- to_coco(self, style='orig')¶
Convert the Mask to a COCO json representation based on the current format.
A COCO mask is formatted as a run-length-encoding (RLE), of which there are two variants: (1) an array RLE, which is slightly more readable and extensible, and (2) a bytes RLE, which is slightly more concise. The returned format will depend on the current format of the Mask object. If it is in “bytes_rle” format, it will be returned in that format, otherwise it will be converted to the “array_rle” format and returned as such.
- Parameters
style (str) – Does nothing for this particular method, exists for API compatibility and if alternate encoding styles are implemented in the future.
- Returns
- either a bytes-rle or array-rle encoding, depending
on the current mask format. The keys in this dictionary are as follows:
counts (List[int] | str): the array or bytes rle encoding
- size (Tuple[int]): the height and width of the encoded mask
see note.
- shape (Tuple[int]): only present in array-rle mode. This
is also the height/width of the underlying encoded array. This exists for semantic consistency with other kwimage conventions, and is not part of the original coco spec.
- order (str): only present in array-rle mode.
Either C or F, indicating if counts is arranged in row-major or column-major order. For COCO-compatibility this is always returned in F (column-major) order.
- binary (bool): only present in array-rle mode.
For COCO-compatibility this is always returned as True, indicating the mask only contains binary 0 or 1 values.
- Return type
Note
The output dictionary will contain a key named “size”, this is the only location in kwimage where “size” refers to a tuple in (height/width) order, in order to be backwards compatible with the original coco spec. In all other locations in kwimage a “size” will refer to a (width/height) ordered tuple.
- SeeAlso:
- func
kwimage.im_runlen.encode_run_length - backend function that does array-style run length encoding.
Example
>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.demo()
>>> coco_data1 = self.toformat('array_rle').to_coco()
>>> coco_data2 = self.toformat('bytes_rle').to_coco()
>>> print('coco_data1 = {}'.format(ub.repr2(coco_data1, nl=1)))
>>> print('coco_data2 = {}'.format(ub.repr2(coco_data2, nl=1)))
coco_data1 = {
    'binary': True,
    'counts': [47, 5, 3, 1, 14, ... 1, 4, 19, 141],
    'order': 'F',
    'shape': (23, 32),
    'size': (23, 32),
}
coco_data2 = {
    'counts': '_153L;4EL...ON3060L0N060L0Nb0Y4',
    'size': [23, 32],
}
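To make the array-style encoding concrete, the sketch below run-length encodes a small binary mask in column-major ('F') order, with the first count giving the number of leading zeros, and stores 'size' as (height, width) as described in the note above. It is a pure-Python illustration, not the kwimage.im_runlen backend; the function names are hypothetical and only the 'counts' and 'size' keys are produced.

```python
def encode_rle_column_major(mask):
    """
    Hypothetical helper: array-style RLE of a binary mask (list of rows).

    Pixels are visited column by column, and the counts alternate between
    runs of zeros and runs of ones, starting with zeros.
    """
    h = len(mask)
    w = len(mask[0]) if h else 0
    flat = [mask[r][c] for c in range(w) for r in range(h)]
    counts = []
    prev = 0  # the first run always counts zeros (it may have length 0)
    run = 0
    for v in flat:
        if v == prev:
            run += 1
        else:
            counts.append(run)
            run = 1
            prev = v
    counts.append(run)
    return {'counts': counts, 'size': [h, w]}


def decode_rle_column_major(rle):
    """Inverse of encode_rle_column_major; returns a list of rows."""
    h, w = rle['size']
    flat = []
    value = 0
    for count in rle['counts']:
        flat.extend([value] * count)
        value = 1 - value
    # Undo the column-major flattening.
    return [[flat[c * h + r] for c in range(w)] for r in range(h)]


demo_mask = [[0, 1, 1],
             [1, 1, 0]]
demo_rle = encode_rle_column_major(demo_mask)
print(demo_rle)
```

Decoding demo_rle with decode_rle_column_major returns the original rows, which is a quick way to check that the height-first 'size' ordering is being interpreted consistently.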
- class kwimage.structs.MaskList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Store and manipulate multiple masks, usually within the same image
- to_polygon_list(self)¶
Converts all mask objects to multi-polygon objects
- Returns
kwimage.PolygonList
- to_segmentation_list(self)¶
Converts all items to segmentation objects
- Returns
kwimage.SegmentationList
- to_mask_list(self)¶
returns this object
- Returns
kwimage.MaskList
- class kwimage.structs.Points(data=None, meta=None, datakeys=None, metakeys=None, **kwargs)[source]¶
Bases:
kwimage.structs._generic.Spatial
,_PointsWarpMixin
Stores multiple keypoints for a single object.
This stores both the geometry and the class metadata if available
- Ignore:
meta = {
    "names": ['head', 'nose', 'tail'],
    "skeleton": [(0, 1), (0, 2)],
}
Example
>>> from kwimage.structs.points import * # NOQA >>> xy = np.random.rand(10, 2) >>> pts = Points(xy=xy) >>> print('pts = {!r}'.format(pts))
- __datakeys__ = ['xy', 'class_idxs', 'visible']¶
- __metakeys__ = ['classes']¶
- __repr__¶
- __nice__(self)¶
- __len__(self)¶
- property shape(self)¶
- property xy(self)¶
- classmethod random(Points, num=1, classes=None, rng=None)¶
Makes random points; typically for testing purposes
Example
>>> import kwimage >>> self = kwimage.Points.random(classes=[1, 2, 3]) >>> self.data >>> print('self.data = {!r}'.format(self.data))
- is_numpy(self)¶
- is_tensor(self)¶
- _impl(self)¶
- tensor(self, device=ub.NoParam)¶
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10) >>> self.tensor()
- round(self, inplace=False)¶
Rounds data to the nearest integer
- Parameters
inplace (bool, default=False) – if True, modifies this object
Example
>>> import kwimage >>> self = kwimage.Points.random(3).scale(10) >>> self.round()
- numpy(self)¶
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10) >>> self.tensor().numpy().tensor().numpy()
- draw_on(self, image, color='white', radius=None, copy=False)¶
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/points.py Points.draw_on --show
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> s = 128 >>> image = np.zeros((s, s)) >>> self = Points.random(10).scale(s) >>> image = self.draw_on(image) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5) >>> kwplot.show_if_requested()
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> s = 128 >>> image = np.zeros((s, s)) >>> self = Points.random(10).scale(s) >>> image = self.draw_on(image, radius=3, color='distinct') >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5, color='classes') >>> kwplot.show_if_requested()
Example
>>> import kwimage
>>> s = 32
>>> self = kwimage.Points.random(10).scale(s)
>>> color = 'blue'
>>> # Test drawing on all channel + dtype combinations
>>> im3 = np.zeros((s, s, 3), dtype=np.float32)
>>> im_chans = {
>>>     'im3': im3,
>>>     'im1': kwimage.convert_colorspace(im3, 'rgb', 'gray'),
>>>     'im4': kwimage.convert_colorspace(im3, 'rgb', 'rgba'),
>>> }
>>> inputs = {}
>>> for k, im in im_chans.items():
>>>     inputs[k + '_01'] = (kwimage.ensure_float01(im.copy()), {'radius': None})
>>>     inputs[k + '_255'] = (kwimage.ensure_uint255(im.copy()), {'radius': None})
>>> outputs = {}
>>> for k, v in inputs.items():
>>>     im, kw = v
>>>     outputs[k] = self.draw_on(im, color=color, **kw)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.figure(fnum=2, doclf=True)
>>> kwplot.autompl()
>>> pnum_ = kwplot.PlotNums(nCols=2, nRows=len(inputs))
>>> for k in inputs.keys():
>>>     kwplot.imshow(inputs[k][0], fnum=2, pnum=pnum_(), title=k)
>>>     kwplot.imshow(outputs[k], fnum=2, pnum=pnum_(), title=k)
>>> kwplot.show_if_requested()
- draw(self, color='blue', ax=None, alpha=None, radius=1, **kwargs)¶
TODO: can use kwplot.draw_points
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> pts = Points.random(10) >>> # xdoc: +REQUIRES(--show) >>> pts.draw(radius=0.01)
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10, classes=['a', 'b', 'c']) >>> self.draw(radius=0.01, color='classes')
- compress(self, flags, axis=0, inplace=False)¶
Filters items based on a boolean criterion
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4) >>> flags = [1, 0, 1, 1] >>> other = self.compress(flags) >>> assert len(self) == 4 >>> assert len(other) == 3
>>> # xdoctest: +REQUIRES(module:torch) >>> other = self.tensor().compress(flags) >>> assert len(other) == 3
- take(self, indices, axis=0, inplace=False)¶
Takes a subset of items at specific indices
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4) >>> indices = [1, 3] >>> other = self.take(indices) >>> assert len(self) == 4 >>> assert len(other) == 2
>>> # xdoctest: +REQUIRES(module:torch) >>> other = self.tensor().take(indices) >>> assert len(other) == 2
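compress and take select items in two different ways: compress keeps the positions whose flag is truthy, while take keeps the positions named by an index list. The sketch below shows the same semantics on a plain Python list; the helper names are hypothetical and this is not how Points implements them.

```python
def compress_items(items, flags):
    """Hypothetical helper: keep items whose corresponding flag is truthy."""
    return [item for item, flag in zip(items, flags) if flag]


def take_items(items, indices):
    """Hypothetical helper: keep items at the requested positions, in order."""
    return [items[i] for i in indices]


points = ['p0', 'p1', 'p2', 'p3']
print(compress_items(points, [1, 0, 1, 1]))
print(take_items(points, [1, 3]))
```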
- classmethod concatenate(cls, points, axis=0)¶
- to_coco(self, style='orig')¶
Converts to an mscoco-like representation
Note
items that are usually id-references to other objects may need to be rectified.
- Parameters
style (str) – either orig, new, new-id, or new-name
- Returns
mscoco-like representation
- Return type
Dict
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4, classes=['a', 'b']) >>> orig = self._to_coco(style='orig') >>> print('orig = {!r}'.format(orig)) >>> new_name = self._to_coco(style='new-name') >>> print('new_name = {}'.format(ub.repr2(new_name, nl=-1))) >>> # xdoctest: +REQUIRES(module:ndsampler) >>> import ndsampler >>> self.meta['classes'] = ndsampler.CategoryTree.coerce(self.meta['classes']) >>> new_id = self._to_coco(style='new-id') >>> print('new_id = {}'.format(ub.repr2(new_id, nl=-1)))
- _to_coco(self, style='orig')¶
See to_coco
- classmethod coerce(cls, data)¶
Attempt to coerce data into a Points object
- classmethod _from_coco(cls, coco_kpts, class_idxs=None, classes=None)¶
- classmethod from_coco(cls, coco_kpts, class_idxs=None, classes=None, warn=False)¶
- Parameters
coco_kpts (list | dict) – either the original list keypoint encoding or the new dict keypoint encoding.
class_idxs (list) – only needed if using old style
classes (list | CategoryTree) – list of all keypoint category names
warn (bool, default=False) – if True raise warnings
Example
>>> ##
>>> classes = ['mouth', 'left-hand', 'right-hand']
>>> coco_kpts = [
>>>     {'xy': (0, 0), 'visible': 2, 'keypoint_category': 'left-hand'},
>>>     {'xy': (1, 2), 'visible': 2, 'keypoint_category': 'mouth'},
>>> ]
>>> Points.from_coco(coco_kpts, classes=classes)
>>> # Test without classes
>>> Points.from_coco(coco_kpts)
>>> # Test without any category info
>>> coco_kpts2 = [ub.dict_diff(d, {'keypoint_category'}) for d in coco_kpts]
>>> Points.from_coco(coco_kpts2)
>>> # Test with category instead of keypoint_category
>>> coco_kpts3 = [ub.map_keys(lambda x: x.replace('keypoint_', ''), d) for d in coco_kpts]
>>> Points.from_coco(coco_kpts3)
>>> #
>>> # Old style
>>> coco_kpts = [0, 0, 2, 0, 1, 2]
>>> Points.from_coco(coco_kpts)
>>> # Fail case
>>> coco_kpts4 = [{'xy': [4686.5, 1341.5], 'category': 'dot'}]
>>> Points.from_coco(coco_kpts4, classes=[])
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> import ndsampler >>> classes = ndsampler.CategoryTree.from_coco([ >>> {'name': 'mouth', 'id': 2}, {'name': 'left-hand', 'id': 3}, {'name': 'right-hand', 'id': 5} >>> ]) >>> coco_kpts = [ >>> {'xy': (0, 0), 'visible': 2, 'keypoint_category_id': 5}, >>> {'xy': (1, 2), 'visible': 2, 'keypoint_category_id': 2}, >>> ] >>> pts = Points.from_coco(coco_kpts, classes=classes) >>> assert pts.data['class_idxs'].tolist() == [2, 0]
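The old-style encoding used in the first example, coco_kpts = [0, 0, 2, 0, 1, 2], is the original COCO keypoint layout: a flat list of (x, y, visibility) triplets. A minimal sketch of how such a list can be split apart follows. parse_flat_keypoints is a hypothetical helper, not the code path from_coco uses, and it ignores class information entirely.

```python
def parse_flat_keypoints(flat):
    """
    Hypothetical helper: split a flat [x1, y1, v1, x2, y2, v2, ...] list
    into a list of (x, y) pairs and a list of visibility flags.
    """
    if len(flat) % 3 != 0:
        raise ValueError('expected a multiple of 3 values')
    xy = [(flat[i], flat[i + 1]) for i in range(0, len(flat), 3)]
    visible = [flat[i + 2] for i in range(0, len(flat), 3)]
    return xy, visible


xy, visible = parse_flat_keypoints([0, 0, 2, 0, 1, 2])
print('xy = {!r}'.format(xy))
print('visible = {!r}'.format(visible))
```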
- class kwimage.structs.PointsList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Stores a list of Points, each item usually corresponds to a different object.
Notes
# TODO: when the data is homogeneous we can use a more efficient # representation, otherwise we have to use heterogeneous storage.
- class kwimage.structs.MultiPolygon[source]¶
Bases:
kwimage.structs._generic.ObjectList
Data structure for storing multiple polygons (typically related to the same underlying but potentially disjoint object)
- Variables
data (List[Polygon]) –
- classmethod random(self, n=3, n_holes=0, rng=None, tight=False)¶
Create a random MultiPolygon
- Returns
MultiPolygon
- fill(self, image, value=1)¶
Fills in an image in place based on this multi-polygon.
- Parameters
image (ndarray) – image to draw on (inplace)
value (int | Tuple[int], default=1) – value to fill in with
- Returns
the image that has been modified in place
- Return type
ndarray
- to_multi_polygon(self)¶
- to_boxes(self)¶
Deprecated: lossy conversion; use ‘bounding_box’ instead
- bounding_box(self)¶
Return the bounding box of the multi polygon
- Returns
- a Boxes object with one box that encloses all
polygons
- Return type
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(rng=0, n=10) >>> boxes = self.to_boxes() >>> sub_boxes = [d.to_boxes() for d in self.data] >>> areas1 = np.array([s.intersection(boxes).area[0] for s in sub_boxes]) >>> areas2 = np.array([s.area[0] for s in sub_boxes]) >>> assert np.allclose(areas1, areas2)
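The property checked in the example, that one box encloses every sub-polygon, follows from taking the extreme coordinates over all points of all polygons. The sketch below shows that computation on plain point lists. enclosing_ltrb is a hypothetical helper, the left-top-right-bottom tuple is an illustrative output format rather than a Boxes object, and holes are ignored because they cannot extend the bounds.

```python
def enclosing_ltrb(polygons):
    """
    Hypothetical helper: (left, top, right, bottom) of the smallest
    axis-aligned box containing every point of every polygon.

    Each polygon is a list of (x, y) pairs. Raises ValueError if there are
    no points at all, because min() of an empty sequence is undefined.
    """
    xs = [x for poly in polygons for x, y in poly]
    ys = [y for poly in polygons for x, y in poly]
    return (min(xs), min(ys), max(xs), max(ys))


polys = [
    [(0, 0), (2, 0), (2, 1)],
    [(5, 5), (6, 7), (4, 6)],
]
print(enclosing_ltrb(polys))
```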
- to_mask(self, dims=None)¶
Returns a mask object indicating the regions occupied by this multipolygon
Example
>>> from kwimage.structs.polygon import * # NOQA >>> s = 100 >>> self = MultiPolygon.random(rng=0).scale(s) >>> dims = (s, s) >>> mask = self.to_mask(dims)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.figure(fnum=1, doclf=True)
>>> from matplotlib import pyplot as plt
>>> ax = plt.gca()
>>> ax.set_xlim(0, s)
>>> ax.set_ylim(0, s)
>>> self.draw(color='red', alpha=.4)
>>> mask.draw(color='blue', alpha=.4)
- to_relative_mask(self)¶
Returns a translated mask such that the mask dimensions are minimal.
In other words, we move the polygon all the way to the top-left and return a mask just big enough to fit the polygon.
- Returns
Mask
- classmethod coerce(cls, data, dims=None)¶
Attempts to construct a MultiPolygon instance from the input data
See Mask.coerce
- to_shapely(self)¶
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(module:shapely) >>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(rng=0) >>> geom = self.to_shapely() >>> print('geom = {!r}'.format(geom))
- classmethod from_shapely(MultiPolygon, geom)¶
Convert a shapely polygon or multipolygon to a kwimage.MultiPolygon
- classmethod from_geojson(MultiPolygon, data_geojson)¶
Convert a geojson polygon or multipolygon to a kwimage.MultiPolygon
Example
>>> import kwimage >>> orig = kwimage.MultiPolygon.random() >>> data_geojson = orig.to_geojson() >>> self = kwimage.MultiPolygon.from_geojson(data_geojson)
- to_geojson(self)¶
Converts polygon to a geojson structure
- classmethod from_coco(cls, data, dims=None)¶
Accepts either new-style or old-style coco multi-polygons
- _to_coco(self, style='orig')¶
- to_coco(self, style='orig')¶
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(1, rng=0) >>> self.to_coco()
- swap_axes(self, inplace=False)¶
- class kwimage.structs.Polygon(data=None, meta=None, datakeys=None, metakeys=None, **kwargs)[source]¶
Bases:
kwimage.structs._generic.Spatial
,_PolyArrayBackend
,_PolyWarpMixin
,ubelt.NiceRepr
Represents a single polygon as a set of exterior boundary points and a list of internal polygons representing holes.
By convention exterior boundaries should be counterclockwise and interior holes should be clockwise.
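One common way to check the winding direction of a ring is the sign of its shoelace (signed) area. The sketch below is a generic illustration with a hypothetical function name: it treats a positive area as counterclockwise in a y-up coordinate system. In image coordinates, where y increases downward, the same point order appears visually reversed, and which frame kwimage assumes is not stated here.

```python
def signed_area(points):
    """
    Hypothetical helper: shoelace signed area of a ring of (x, y) points.

    Positive means counterclockwise when the y axis points up.
    """
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the ring
        total += x0 * y1 - x1 * y0
    return total / 2.0


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(signed_area(square))
print(signed_area(square[::-1]))
```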
Example
>>> import kwimage >>> data = { >>> 'exterior': np.array([[13, 1], [13, 19], [25, 19], [25, 1]]), >>> 'interiors': [ >>> np.array([[13, 13], [14, 12], [24, 12], [25, 13], [25, 18], >>> [24, 19], [14, 19], [13, 18]]), >>> np.array([[13, 2], [14, 1], [24, 1], [25, 2], [25, 11], >>> [24, 12], [14, 12], [13, 11]])] >>> } >>> self = kwimage.Polygon(**data) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self.draw(setlim=True)
Example
>>> import kwimage
>>> self = kwimage.Polygon.random(
>>>     n=5, n_holes=1, convex=False, rng=0)
>>> print('self = {}'.format(self))
self = <Polygon({
    'exterior': <Coords(data=
                    array([[0.30371392, 0.97195856],
                           [0.24372304, 0.60568445],
                           [0.21408694, 0.34884262],
                           [0.5799477 , 0.44020379],
                           [0.83720288, 0.78367234]]))>,
    'interiors': [<Coords(data=
                     array([[0.50164209, 0.83520279],
                            [0.25835064, 0.40313428],
                            [0.28778562, 0.74758761],
                            [0.30341266, 0.93748088]]))>],
})>
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> self.draw(setlim=True)
- __datakeys__ = ['exterior', 'interiors']¶
- __metakeys__ = ['classes']¶
- property exterior(self)¶
- property interiors(self)¶
- __nice__(self)¶
- classmethod circle(cls, xy, r, resolution=64)¶
Create a circular polygon
Example
>>> xy = (0.5, 0.5) >>> r = .3 >>> poly = Polygon.circle(xy, r)
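A circular polygon is typically approximated by sampling points at evenly spaced angles. The sketch below illustrates that idea with only the standard library. circle_points is a hypothetical helper, and details such as the starting angle, the point order, and whether the first point is repeated at the end are assumptions that may not match Polygon.circle.

```python
import math


def circle_points(xy, r, resolution=64):
    """
    Hypothetical helper: ``resolution`` points evenly spaced on the circle
    of radius ``r`` centered at ``xy``, starting at angle 0.
    """
    cx, cy = xy
    points = []
    for i in range(resolution):
        theta = 2.0 * math.pi * i / resolution
        points.append((cx + r * math.cos(theta), cy + r * math.sin(theta)))
    return points


pts = circle_points((0.5, 0.5), 0.3)
print(len(pts))
```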
- classmethod random(cls, n=6, n_holes=0, convex=True, tight=False, rng=None)¶
- Parameters
n (int) – number of points in the polygon (must be 3 or more)
n_holes (int) – number of holes
tight (bool, default=False) – fits the minimum and maximum points between 0 and 1
convex (bool, default=True) – forces the resulting polygon to be convex (may remove exterior points)
- CommandLine:
xdoctest -m kwimage.structs.polygon Polygon.random
Example
>>> rng = None >>> n = 4 >>> n_holes = 1 >>> cls = Polygon >>> self = Polygon.random(n=n, rng=rng, n_holes=n_holes, convex=1) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> self.draw()
References
https://gis.stackexchange.com/questions/207731/random-multipolygon
https://stackoverflow.com/questions/8997099/random-polygon
https://stackoverflow.com/questions/27548363/from-voronoi-tessellation-to-shapely-polygons
https://stackoverflow.com/questions/8997099/algorithm-to-generate-random-2d-polygon
- _impl(self)¶
- to_mask(self, dims=None)¶
Convert this polygon to a mask
Todo
[ ] currently not efficient
- Parameters
dims (Tuple) – height and width of the output mask
- Returns
kwimage.Mask
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1).scale(128) >>> mask = self.to_mask((128, 128)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> mask.draw(color='blue') >>> mask.to_multi_polygon().draw(color='red', alpha=.5)
- to_relative_mask(self)¶
Returns a translated mask such that the mask dimensions are minimal.
In other words, we move the polygon all the way to the top-left and return a mask just big enough to fit the polygon.
- Returns
kwimage.Mask
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random().scale(8).translate(100, 100) >>> mask = self.to_relative_mask() >>> assert mask.shape <= (8, 8) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> mask.draw(color='blue') >>> mask.to_multi_polygon().draw(color='red', alpha=.5)
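The idea of moving the polygon to the top-left and sizing the mask to fit can be sketched as a translation by the minimum x and y, followed by rounding the remaining extent up to whole pixels. This is an illustration under assumptions: relative_offset_and_dims is a hypothetical helper, and the exact rounding kwimage applies when choosing the mask dimensions is not specified in this reference.

```python
import math


def relative_offset_and_dims(points):
    """
    Hypothetical helper: translate (x, y) points so the minimum x and y
    become 0, and return the shifted points with a (height, width) that is
    just large enough to contain them.
    """
    xs = [x for x, y in points]
    ys = [y for x, y in points]
    min_x, min_y = min(xs), min(ys)
    shifted = [(x - min_x, y - min_y) for x, y in points]
    # Assumption: round the extent up to a whole number of pixels.
    width = math.ceil(max(xs) - min_x)
    height = math.ceil(max(ys) - min_y)
    return shifted, (height, width)


shifted, dims = relative_offset_and_dims([(100, 100), (108, 100), (104, 106.5)])
print(shifted, dims)
```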
- fill(self, image, value=1)¶
Fills in an image in place based on this polygon.
- Parameters
image (ndarray) – image to draw on
value (int | Tuple[int], default=1) – value to fill in with
- Returns
the image that has been modified in place
- Return type
ndarray
- _to_cv_countours(self)¶
OpenCV polygon representation, which is a list of points. Holes are implicitly represented when another polygon is drawn over an existing polygon via cv2.fillPoly.
- Returns
- where each ndarray is of shape [N, 1, 2],
where N is the number of points on the boundary, the middle dimension is always 1, and the trailing dimension represents x and y coordinates respectively.
- Return type
List[ndarray]
- classmethod coerce(Polygon, data)¶
Try to automatically determine the format of the input polygon and coerce it into a kwimage.Polygon.
- Parameters
data (object) – some type of data that can be interpreted as a polygon.
- Returns
kwimage.Polygon
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> self.coerce(self) >>> self.coerce(self.exterior) >>> self.coerce(self.exterior.data) >>> self.coerce(self.data) >>> self.coerce(self.to_geojson())
- classmethod from_shapely(Polygon, geom)¶
Convert a shapely polygon to a kwimage.Polygon
- Parameters
geom (shapely.geometry.polygon.Polygon) – a shapely polygon
- Returns
kwimage.Polygon
- classmethod from_wkt(Polygon, data)¶
Convert a WKT string to a kwimage.Polygon
- Parameters
data (str) – a WKT polygon string
- Returns
kwimage.Polygon
Example
>>> import kwimage >>> data = 'POLYGON ((0.11 0.61, 0.07 0.588, 0.015 0.50, 0.11 0.61))' >>> self = kwimage.Polygon.from_wkt(data) >>> assert len(self.exterior) == 4
- classmethod from_geojson(Polygon, data_geojson)¶
Convert a geojson polygon to a kwimage.Polygon
- Parameters
data_geojson (dict) – geojson data
References
https://geojson.org/geojson-spec.html
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=2) >>> data_geojson = self.to_geojson() >>> new = Polygon.from_geojson(data_geojson)
- to_shapely(self)¶
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(module:shapely) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1) >>> self = self.scale(100) >>> geom = self.to_shapely() >>> print('geom = {!r}'.format(geom))
- to_geojson(self)¶
Converts polygon to a geojson structure
- Returns
Dict[str, object]
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> print(self.to_geojson())
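In the GeoJSON specification, a Polygon geometry stores its coordinates as a list of linear rings: the exterior first, then any holes, with each ring closed by repeating its first position. The sketch below builds such a dictionary from plain point lists. polygon_to_geojson is a hypothetical helper and the exact output of to_geojson may differ in detail.

```python
def polygon_to_geojson(exterior, interiors=()):
    """
    Hypothetical helper: build a GeoJSON-style Polygon dictionary from an
    exterior ring and optional hole rings, each a list of (x, y) pairs.
    """
    def close_ring(ring):
        ring = [list(pt) for pt in ring]
        # GeoJSON linear rings repeat the first position at the end.
        if ring and ring[0] != ring[-1]:
            ring.append(list(ring[0]))
        return ring

    coordinates = [close_ring(exterior)]
    coordinates.extend(close_ring(hole) for hole in interiors)
    return {'type': 'Polygon', 'coordinates': coordinates}


geo = polygon_to_geojson(
    [(0, 0), (4, 0), (4, 4), (0, 4)],
    interiors=[[(1, 1), (1, 2), (2, 2)]],
)
print(geo)
```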
- to_wkt(self)¶
Convert a kwimage.Polygon to WKT string
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> print(self.to_wkt())
- classmethod from_coco(cls, data, dims=None)¶
Accepts either new-style or old-style coco polygons
- _to_coco(self, style='orig')¶
- to_coco(self, style='orig')¶
- Returns
coco-style polygons
- Return type
List | Dict
- to_multi_polygon(self)¶
- to_boxes(self)¶
Deprecated: lossy conversion; use ‘bounding_box’ instead
- property centroid(self)¶
- bounding_box(self)¶
Returns an axis-aligned bounding box for the segmentation
- Returns
kwimage.Boxes
- bounding_box_polygon(self)¶
Returns an axis-aligned bounding polygon for the segmentation.
Notes
This Polygon will be a Box, not a convex hull! Use shapely for convex hulls.
- Returns
kwimage.Polygon
- copy(self)¶
- clip(self, x_min, y_min, x_max, y_max, inplace=False)¶
Clip polygon to image boundaries.
Example
>>> from kwimage.structs.polygon import * >>> self = Polygon.random().scale(10).translate(-1) >>> self2 = self.clip(1, 1, 3, 3) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self2.draw(setlim=True)
- draw_on(self, image, color='blue', fill=True, border=False, alpha=1.0, copy=False)¶
Rasterizes a polygon on an image. See draw for a vectorized matplotlib version.
- Parameters
image (ndarray) – image to raster polygon on.
color (str | tuple) – data coercible to a color
fill (bool, default=True) – fill the interior of the polygon
border (bool, default=False) – draw the border of the polygon
alpha (float, default=1.0) – polygon transparency (setting alpha < 1 makes this function much slower).
copy (bool, default=False) – if False only copies if necessary
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1).scale(128) >>> image = np.zeros((128, 128), dtype=np.float32) >>> image = self.draw_on(image) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(image, fnum=1)
Example
>>> import kwimage >>> import numpy as np >>> color = 'blue' >>> self = kwimage.Polygon.random(n_holes=1).scale(128) >>> image = np.zeros((128, 128), dtype=np.float32) >>> # Test drawing on all channel + dtype combinations >>> im3 = np.random.rand(128, 128, 3) >>> im_chans = { >>> 'im3': im3, >>> 'im1': kwimage.convert_colorspace(im3, 'rgb', 'gray'), >>> 'im4': kwimage.convert_colorspace(im3, 'rgb', 'rgba'), >>> } >>> inputs = {} >>> for k, im in im_chans.items(): >>> inputs[k + '_01'] = (kwimage.ensure_float01(im.copy()), {'alpha': None}) >>> inputs[k + '_255'] = (kwimage.ensure_uint255(im.copy()), {'alpha': None}) >>> inputs[k + '_01_a'] = (kwimage.ensure_float01(im.copy()), {'alpha': 0.5}) >>> inputs[k + '_255_a'] = (kwimage.ensure_uint255(im.copy()), {'alpha': 0.5}) >>> outputs = {} >>> for k, v in inputs.items(): >>> im, kw = v >>> outputs[k] = self.draw_on(im, color=color, **kw) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=2, doclf=True) >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nCols=2, nRows=len(inputs)) >>> for k in inputs.keys(): >>> kwplot.imshow(inputs[k][0], fnum=2, pnum=pnum_(), title=k) >>> kwplot.imshow(outputs[k], fnum=2, pnum=pnum_(), title=k) >>> kwplot.show_if_requested()
- draw(self, color='blue', ax=None, alpha=1.0, radius=1, setlim=False, border=False, linewidth=2)¶
Draws polygon in a matplotlib axes. See draw_on for in-memory image modification.
- Parameters
color (str | Tuple) – coercible color
alpha (float) – fill transparency
setlim (bool) – if True, modify the x and y limits of the matplotlib axes so that they contain the polygon and it can be seen.
border (bool, default=False) – if True, draws an edge border on the polygon.
linewidth (float, default=2) – width of the border
Todo
[ ] Rework arguments in favor of matplotlib standards
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1) >>> self = self.scale(100) >>> # xdoc: +REQUIRES(--show) >>> self.draw() >>> import kwplot >>> kwplot.autompl() >>> from matplotlib import pyplot as plt >>> kwplot.figure(fnum=2) >>> self.draw(setlim=True)
- _ensure_vertex_order(self, inplace=False)¶
Fixes vertex ordering so the exterior ring is CCW and the interior rings are CW.
Example
>>> import kwimage >>> self = kwimage.Polygon.random(n=3, n_holes=2, rng=0) >>> print('self = {!r}'.format(self)) >>> new = self._ensure_vertex_order() >>> print('new = {!r}'.format(new))
>>> self = kwimage.Polygon.random(n=3, n_holes=2, rng=0).swap_axes() >>> print('self = {!r}'.format(self)) >>> new = self._ensure_vertex_order() >>> print('new = {!r}'.format(new))
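Ring orientation is commonly tested with the signed area from the shoelace formula. The sketch below shows that technique in pure Python; signed_area and ensure_ccw are hypothetical helpers, and whether kwimage uses this exact test, or measures orientation with the y-axis pointing up or down, is not stated here.

```python
def signed_area(ring):
    """Shoelace formula for an open ring of (x, y) vertices.

    The result is positive for counterclockwise rings when the y-axis
    points up; the sign flips in image coordinates, where y points down.
    """
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


def ensure_ccw(ring):
    """Return the ring reversed if it is clockwise, otherwise unchanged."""
    return ring if signed_area(ring) >= 0 else ring[::-1]


square_cw = [(0, 0), (0, 1), (1, 1), (1, 0)]
square_ccw = ensure_ccw(square_cw)
```

Swapping the x and y axes mirrors the polygon, which reverses its orientation; this may be why the second example above repeats the check after swap_axes.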
- class kwimage.structs.PolygonList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Stores and allows manipulation of multiple polygons, usually within the same image.
- to_mask_list(self, dims=None)¶
Converts all items to masks
- to_polygon_list(self)¶
- to_segmentation_list(self)¶
Converts all items to segmentation objects
- swap_axes(self, inplace=False)¶
- to_geojson(self, as_collection=False)¶
Converts a list of polygons/multipolygons to a geojson structure
- Parameters
as_collection (bool) – if True, wraps the polygon geojson items in a geojson feature collection, otherwise just return a list of items.
- Returns
items or geojson data
- Return type
List[Dict] | Dict
Example
>>> import kwimage >>> data = [kwimage.Polygon.random(), >>> kwimage.Polygon.random(n_holes=1), >>> kwimage.MultiPolygon.random(n_holes=1), >>> kwimage.MultiPolygon.random()] >>> self = kwimage.PolygonList(data) >>> geojson = self.to_geojson(as_collection=True) >>> items = self.to_geojson(as_collection=False) >>> print('geojson = {}'.format(ub.repr2(geojson, nl=-2, precision=1))) >>> print('items = {}'.format(ub.repr2(items, nl=-2, precision=1)))
- class kwimage.structs.Segmentation(data, format=None)[source]¶
Bases:
_WrapperObject
Either holds a MultiPolygon, Polygon, or Mask
- Parameters
data (object) – the underlying object
format (str) – either ‘mask’, ‘polygon’, or ‘multipolygon’
- classmethod random(cls, rng=None)¶
Example
>>> self = Segmentation.random() >>> print('self = {!r}'.format(self)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> self.draw() >>> kwplot.show_if_requested()
- to_multi_polygon(self)¶
- to_mask(self, dims=None)¶
- property meta(self)¶
- classmethod coerce(cls, data, dims=None)¶
- class kwimage.structs.SegmentationList[source]¶
Bases:
kwimage.structs._generic.ObjectList
Store and manipulate multiple segmentations (masks or polygons), usually within the same image
- to_polygon_list(self)¶
Converts all mask objects to multi-polygon objects
- to_mask_list(self, dims=None)¶
Converts all items to mask objects
- to_segmentation_list(self)¶
- classmethod coerce(cls, data)¶
Interpret data as a list of Segmentations
Submodules¶
kwimage.im_alphablend¶
Module Contents¶
|
Stacks a sequence of layers on top of one another. The first item is the |
|
Places img1 on top of img2 respecting alpha channels. |
|
|
|
Core alpha blending algorithm |
|
Uglier but faster(? maybe not) version of the core alpha blending algorithm |
|
Alternative. Not well optimized |
|
Alternative. Not well optimized |
|
Returns the input image with 4 channels. |
- kwimage.im_alphablend.overlay_alpha_layers(layers, keepalpha=True, dtype=np.float32)[source]¶
Stacks a sequence of layers on top of one another. The first item is the topmost layer and the last item is the bottommost layer.
- Parameters
layers (Sequence[ndarray]) – stack of images
keepalpha (bool) – if False, the alpha channel is removed after blending
dtype (np.dtype) – format for blending computation (defaults to float32)
- Returns
raster: the blended images
- Return type
ndarray
References
http://stackoverflow.com/questions/25182421/overlay-numpy-alpha https://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending
Example
>>> import kwimage >>> keys = ['astro', 'carl', 'stars'] >>> layers = [kwimage.grab_test_image(k, dsize=(100, 100)) for k in keys] >>> layers = [kwimage.ensure_alpha_channel(g, alpha=.5) for g in layers] >>> stacked = overlay_alpha_layers(layers) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(stacked) >>> kwplot.show_if_requested()
- kwimage.im_alphablend.overlay_alpha_images(img1, img2, keepalpha=True, dtype=np.float32, impl='inplace')[source]¶
Places img1 on top of img2 respecting alpha channels. Works like Photoshop layers with opacity.
- Parameters
img1 (ndarray) – top image to overlay over img2
img2 (ndarray) – base image to superimpose on
keepalpha (bool) – if False, the alpha channel is removed after blending
dtype (np.dtype) – format for blending computation (defaults to float32)
impl (str, default=inplace) – code specifying the backend implementation
- Returns
raster: the blended images
- Return type
ndarray
Todo
[ ] Make fast C++ version of this function
References
http://stackoverflow.com/questions/25182421/overlay-numpy-alpha https://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending
Example
>>> import kwimage >>> img1 = kwimage.grab_test_image('astro', dsize=(100, 100)) >>> img2 = kwimage.grab_test_image('carl', dsize=(100, 100)) >>> img1 = kwimage.ensure_alpha_channel(img1, alpha=.5) >>> img3 = overlay_alpha_images(img1, img2) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img3) >>> kwplot.show_if_requested()
- kwimage.im_alphablend._alpha_blend_simple(rgb1, alpha1, rgb2, alpha2)[source]¶
Core alpha blending algorithm
- SeeAlso:
_alpha_blend_inplace - alternative implementation
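The blending functions in this module cite the standard "over" operator from the alpha compositing reference above. The sketch below writes that operator out for non-premultiplied RGBA float images and folds it over a list of layers, topmost first. It is an illustration of the formula, not the kwimage implementation; alpha_over, stack_layers, and solid are hypothetical names.

```python
from functools import reduce

import numpy as np


def alpha_over(top, bottom):
    """Composite one RGBA float image over another (values in [0, 1])."""
    rgb1, alpha1 = top[..., :3], top[..., 3:4]
    rgb2, alpha2 = bottom[..., :3], bottom[..., 3:4]
    # Fraction of the bottom layer still visible beneath the top layer.
    alpha2_visible = alpha2 * (1.0 - alpha1)
    alpha3 = alpha1 + alpha2_visible
    numer = rgb1 * alpha1 + rgb2 * alpha2_visible
    # Where both layers are fully transparent, divide by 1 instead of 0.
    rgb3 = numer / np.where(alpha3 > 0, alpha3, 1.0)
    return np.concatenate([rgb3, alpha3], axis=-1)


def stack_layers(layers):
    """Blend a sequence of RGBA layers, topmost layer first."""
    return reduce(alpha_over, layers)


def solid(rgb, alpha, shape=(2, 2)):
    """Build a constant-colored RGBA test image."""
    pixel = np.array(list(rgb) + [alpha], dtype=np.float64)
    return np.broadcast_to(pixel, shape + (4,)).copy()


red = solid((1, 0, 0), 0.5)
green = solid((0, 1, 0), 0.5)
blue = solid((0, 0, 1), 1.0)
stacked = stack_layers([red, green, blue])
```

Because the "over" operator is associative, folding from the top or from the bottom gives the same result as long as the layer order is preserved.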
- kwimage.im_alphablend._alpha_blend_inplace(rgb1, alpha1, rgb2, alpha2)[source]¶
Uglier but faster(? maybe not) version of the core alpha blending algorithm using preallocation and in-place computation where possible.
- SeeAlso:
_alpha_blend_simple - alternative implementation
Example
>>> rng = np.random.RandomState(0) >>> rgb1, rgb2 = rng.rand(10, 10, 3), rng.rand(10, 10, 3) >>> alpha1, alpha2 = rng.rand(10, 10), rng.rand(10, 10) >>> f1, f2 = _alpha_blend_inplace(rgb1, alpha1, rgb2, alpha2) >>> s1, s2 = _alpha_blend_simple(rgb1, alpha1, rgb2, alpha2) >>> assert np.all(f1 == s1) and np.all(f2 == s2) >>> alpha1, alpha2 = np.zeros((10, 10)), np.zeros((10, 10)) >>> f1, f2 = _alpha_blend_inplace(rgb1, alpha1, rgb2, alpha2) >>> s1, s2 = _alpha_blend_simple(rgb1, alpha1, rgb2, alpha2) >>> assert np.all(f1 == s1) and np.all(f2 == s2)
- kwimage.im_alphablend._alpha_blend_numexpr1(rgb1, alpha1, rgb2, alpha2)[source]¶
Alternative. Not well optimized
- kwimage.im_alphablend._alpha_blend_numexpr2(rgb1, alpha1, rgb2, alpha2)[source]¶
Alternative. Not well optimized
- kwimage.im_alphablend.ensure_alpha_channel(img, alpha=1.0, dtype=np.float32, copy=False)[source]¶
Returns the input image with 4 channels.
- Parameters
img (ndarray) – an image with shape [H, W], [H, W, 1], [H, W, 3], or [H, W, 4].
alpha (float, default=1.0) – default value for missing alpha channel
dtype (type, default=np.float32) – a numpy floating type
copy (bool, default=False) – always copy if True, else copy if needed.
- Returns
an image with specified dtype with shape [H, W, 4].
- Raises
ValueError - if the input image does not have 1, 3, or 4 input channels – or if the image cannot be converted into a float01 representation
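The channel handling described above can be sketched in a few lines of numpy. add_alpha_channel is a hypothetical helper, not the kwimage function: it assumes the input already holds floats in the range 0-1 and always returns float32.

```python
import numpy as np


def add_alpha_channel(img, alpha=1.0):
    """Return a float32 [H, W, 4] image from a 1, 3, or 4 channel image.

    Illustrative only: assumes `img` already holds floats in [0, 1].
    """
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 2:
        # Give gray images an explicit channel axis.
        img = img[:, :, None]
    num_chan = img.shape[2]
    if num_chan == 4:
        return img
    if num_chan == 1:
        # Replicate the gray channel into R, G, and B.
        img = np.repeat(img, 3, axis=2)
    elif num_chan != 3:
        raise ValueError(
            'expected 1, 3, or 4 channels, got {}'.format(num_chan))
    alpha_chan = np.full(img.shape[:2] + (1,), alpha, dtype=np.float32)
    return np.concatenate([img, alpha_chan], axis=2)


gray = np.zeros((4, 5))
rgba = add_alpha_channel(gray, alpha=0.5)
```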
kwimage.im_color¶
Module Contents¶
Used for converting a single color between spaces and encodings. |
|
|
|
Uses colormath to convert colors |
- kwimage.im_color._colormath_convert(src_color, src_space, dst_space)[source]¶
Uses colormath to convert colors
Example
>>> # xdoctest: +REQUIRES(module:colormath) >>> import kwimage >>> src_color = kwimage.Color('turquoise').as01() >>> print('src_color = {}'.format(ub.repr2(src_color, nl=0, precision=2))) >>> src_space = 'rgb' >>> dst_space = 'lab' >>> lab_color = _colormath_convert(src_color, src_space, dst_space) >>> print('lab_color = {}'.format(ub.repr2(lab_color, nl=0, precision=2))) lab_color = (78.11, -70.09, -9.33) >>> rgb_color = _colormath_convert(lab_color, 'lab', 'rgb') >>> print('rgb_color = {}'.format(ub.repr2(rgb_color, nl=0, precision=2))) rgb_color = (0.29, 0.88, 0.81) >>> hsv_color = _colormath_convert(lab_color, 'lab', 'hsv') >>> print('hsv_color = {}'.format(ub.repr2(hsv_color, nl=0, precision=2))) hsv_color = (175.39, 1.00, 0.88)
- class kwimage.im_color.Color(color, alpha=None, space=None)[source]¶
Bases:
ubelt.NiceRepr
Used for converting a single color between spaces and encodings. This should only be used when handling small numbers of colors (e.g. 1); don’t use this to represent an image.
move to colorutil?
- Parameters
space (str) – colorspace of wrapped color. Assume RGB if not specified and it cannot be inferred
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/im_color.py Color
Example
>>> print(Color('g')) >>> print(Color('orangered')) >>> print(Color('#AAAAAA').as255()) >>> print(Color([0, 255, 0])) >>> print(Color([1, 1, 1.])) >>> print(Color([1, 1, 1])) >>> print(Color(Color([1, 1, 1])).as255()) >>> print(Color(Color([1., 0, 1, 0])).ashex()) >>> print(Color([1, 1, 1], alpha=255)) >>> print(Color([1, 1, 1], alpha=255, space='lab'))
- _forimage(self, image, space='rgb')[source]¶
Experimental function.
Create a numeric color tuple that agrees with the format of the input image (i.e. float or int, with 3 or 4 channels).
- Parameters
image (ndarray) – image to return color for
space (str, default=rgb) – colorspace of the input image.
Example
>>> img_f3 = np.zeros([8, 8, 3], dtype=np.float32) >>> img_u3 = np.zeros([8, 8, 3], dtype=np.uint8) >>> img_f4 = np.zeros([8, 8, 4], dtype=np.float32) >>> img_u4 = np.zeros([8, 8, 4], dtype=np.uint8) >>> Color('red')._forimage(img_f3) (1.0, 0.0, 0.0) >>> Color('red')._forimage(img_f4) (1.0, 0.0, 0.0, 1.0) >>> Color('red')._forimage(img_u3) (255, 0, 0) >>> Color('red')._forimage(img_u4) (255, 0, 0, 255) >>> Color('red', alpha=0.5)._forimage(img_f4) (1.0, 0.0, 0.0, 0.5) >>> Color('red', alpha=0.5)._forimage(img_u4) (255, 0, 0, 127)
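The float and integer encodings in the example differ only by a factor of 255. A pure-Python sketch of the conversions, with hypothetical helper names: truncating (rather than rounding) is an assumption chosen because it reproduces the 127 shown for alpha=0.5 above.

```python
def to_255(color01):
    """Scale floats in [0, 1] to integers in [0, 255], truncating."""
    return tuple(int(c * 255) for c in color01)


def to_01(color255):
    """Scale integers in [0, 255] to floats in [0, 1]."""
    return tuple(c / 255.0 for c in color255)


def to_hex(color01):
    """Format a float color as a lowercase hex string."""
    return '#' + ''.join('{:02x}'.format(c) for c in to_255(color01))


red_half_alpha = (1.0, 0.0, 0.0, 0.5)
print(to_255(red_half_alpha))
print(to_hex((1.0, 0.0, 0.0)))
```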
- classmethod _is_base255(Color, channels)[source]¶
There is one corner case where all pixels are 1 or less.
- classmethod _string_to_01(Color, color)[source]¶
mplutil.Color._string_to_01(‘green’) mplutil.Color._string_to_01(‘red’)
- classmethod named_colors(cls)[source]¶
- Returns
names of colors that Color accepts
- Return type
List[str]
Example
>>> import kwimage >>> named_colors = kwimage.Color.named_colors() >>> color_lut = {name: kwimage.Color(name).as01() for name in named_colors} >>> # xdoctest: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.autompl() >>> canvas = kwplot.make_legend_img(color_lut) >>> kwplot.imshow(canvas)
kwimage.im_core¶
Not sure how to best classify these functions
Module Contents¶
|
Returns the number of color channels in an image. |
|
Ensure that an image is encoded using a float32 properly |
|
Ensure that an image is encoded using a uint8 properly. Either |
|
Broadcasts image arrays so they can have elementwise operations applied |
|
helper for make_channels_comparable |
|
Ensures that there are 3 channels in the image |
|
Rebalance pixel intensities via contrast stretching. |
|
Normalize data intensities using heuristics to help put sensor data with |
|
Allows slices with out-of-bound coordinates. Any out of bounds coordinate |
|
Applies requested padding to an extracted data slice. |
|
Embeds a “padded-slice” inside known data dimension. |
- kwimage.im_core.num_channels(img)[source]¶
Returns the number of color channels in an image.
Assumes images are 2D and the channels are the trailing dimension. Returns 1 in the case with no trailing channel dimension, otherwise simply returns img.shape[2].
- Parameters
img (ndarray) – an image with 2 or 3 dimensions.
- Returns
the number of color channels (usually 1, 3, or 4)
- Return type
int
Example
>>> H = W = 3 >>> assert num_channels(np.empty((W, H))) == 1 >>> assert num_channels(np.empty((W, H, 1))) == 1 >>> assert num_channels(np.empty((W, H, 3))) == 3 >>> assert num_channels(np.empty((W, H, 4))) == 4 >>> assert num_channels(np.empty((W, H, 2))) == 2
- kwimage.im_core.ensure_float01(img, dtype=np.float32, copy=True)[source]¶
Ensure that an image is encoded using a float32 properly
- Parameters
img (ndarray) – an image in uint255 or float01 format. Other formats will raise errors.
dtype (type, default=np.float32) – a numpy floating type
copy (bool, default=True) – always copy if True, else copy if needed.
- Returns
an array of floats in the range 0-1
- Return type
ndarray
- Raises
ValueError – if the image type is integer and not in [0-255]
Example
>>> ensure_float01(np.array([[0, .5, 1.0]])) array([[0. , 0.5, 1. ]], dtype=float32) >>> ensure_float01(np.array([[0, 1, 200]])) array([[0..., 0.0039..., 0.784...]], dtype=float32)
- kwimage.im_core.ensure_uint255(img, copy=True)[source]¶
Ensure that an image is properly encoded using uint8.
- Parameters
img (ndarray) – an image in uint255 or float01 format. Other formats will raise errors.
copy (bool, default=True) – always copy if True, else copy if needed.
- Returns
an array of bytes in the range 0-255
- Return type
ndarray
- Raises
ValueError – if the image type is float and not in [0-1]
ValueError – if the image type is integer and not in [0-255]
Example
>>> ensure_uint255(np.array([[0, .5, 1.0]])) array([[ 0, 127, 255]], dtype=uint8) >>> ensure_uint255(np.array([[0, 1, 200]])) array([[ 0, 1, 200]], dtype=uint8)
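The two conversions above amount to scaling by 255 plus a range check. A minimal numpy sketch under that assumption; to_float01 and to_uint255 are hypothetical stand-ins, and truncation when converting to uint8 is inferred from the 127 in the example rather than confirmed.

```python
import numpy as np


def to_float01(img):
    """Integers in [0, 255] become float32 in [0, 1]; floats pass through."""
    img = np.asarray(img)
    if img.dtype.kind in 'ui':
        if img.min() < 0 or img.max() > 255:
            raise ValueError('integer image is not in [0, 255]')
        return img.astype(np.float32) / 255.0
    return img.astype(np.float32)


def to_uint255(img):
    """Floats in [0, 1] become uint8 in [0, 255]; integers are range checked."""
    img = np.asarray(img)
    if img.dtype.kind == 'f':
        if img.min() < 0 or img.max() > 1:
            raise ValueError('float image is not in [0, 1]')
        # astype truncates toward zero, so 0.5 * 255 = 127.5 becomes 127.
        return (img * 255).astype(np.uint8)
    if img.min() < 0 or img.max() > 255:
        raise ValueError('integer image is not in [0, 255]')
    return img.astype(np.uint8)


as_bytes = to_uint255(np.array([[0, .5, 1.0]]))
as_floats = to_float01(np.array([[0, 1, 200]]))
```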
- kwimage.im_core.make_channels_comparable(img1, img2, atleast3d=False)[source]¶
Broadcasts image arrays so they can have elementwise operations applied
- Parameters
img1 (ndarray) – first image
img2 (ndarray) – second image
atleast3d (bool, default=False) – if true we ensure that the channel dimension exists (only relevant for 1-channel images)
Example
>>> import itertools as it >>> wh_basis = [(5, 5), (3, 5), (5, 3), (1, 1), (1, 3), (3, 1)] >>> for w, h in wh_basis: >>> shape_basis = [(w, h), (w, h, 1), (w, h, 3)] >>> # Test all permutations of shape inputs >>> for shape1, shape2 in it.product(shape_basis, shape_basis): >>> print('* input shapes: %r, %r' % (shape1, shape2)) >>> img1 = np.empty(shape1) >>> img2 = np.empty(shape2) >>> img1, img2 = make_channels_comparable(img1, img2) >>> print('... output shapes: %r, %r' % (img1.shape, img2.shape)) >>> elem = (img1 + img2) >>> print('... elem(+) shape: %r' % (elem.shape,)) >>> assert elem.size == img1.size, 'outputs should have same size' >>> assert img1.size == img2.size, 'new imgs should have same size' >>> print('--------')
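The idea of making channels comparable can be sketched as promoting both images to three dimensions and replicating a single channel where needed. This is a simplified illustration, not the kwimage logic: match_channels is a hypothetical helper that always returns 3D arrays and does not handle 3-versus-4 channel pairs.

```python
import numpy as np


def match_channels(img1, img2):
    """Give two images with equal height and width the same channel count."""
    # Promote 2D images to [H, W, 1] so both have a channel axis.
    if img1.ndim == 2:
        img1 = img1[:, :, None]
    if img2.ndim == 2:
        img2 = img2[:, :, None]
    c1, c2 = img1.shape[2], img2.shape[2]
    if c1 != c2:
        if c1 == 1:
            img1 = np.repeat(img1, c2, axis=2)
        elif c2 == 1:
            img2 = np.repeat(img2, c1, axis=2)
        else:
            raise ValueError(
                'cannot match {} and {} channels'.format(c1, c2))
    return img1, img2


out1, out2 = match_channels(np.zeros((5, 3)), np.ones((5, 3, 3)))
elem = out1 + out2
```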
- kwimage.im_core.atleast_3channels(arr, copy=True)[source]¶
Ensures that there are 3 channels in the image
- Parameters
arr (ndarray[N, M, …]) – the image
copy (bool) – Always copies if True, if False, then copies only when the size of the array must change.
- Returns
with shape (N, M, C), where C in {3, 4}
- Return type
ndarray
- Doctest:
>>> assert atleast_3channels(np.zeros((10, 10))).shape[-1] == 3 >>> assert atleast_3channels(np.zeros((10, 10, 1))).shape[-1] == 3 >>> assert atleast_3channels(np.zeros((10, 10, 3))).shape[-1] == 3 >>> assert atleast_3channels(np.zeros((10, 10, 4))).shape[-1] == 4
- kwimage.im_core.normalize(arr, mode='linear', alpha=None, beta=None, out=None)[source]¶
Rebalance pixel intensities via contrast stretching.
By default linearly stretches pixel intensities to minimum and maximum values.
Notes
DEPRECATED: this function has been moved to kwarray.normalize.
- kwimage.im_core.normalize_intensity(imdata, return_info=False, nodata=None, axis=None, dtype=np.float32)[source]¶
Normalize data intensities using heuristics to help put sensor data with extremely high or low contrast into a visible range.
This function is designed with an emphasis on getting something that is reasonable for visualization.
- Parameters
imdata (ndarray) – raw intensity data
return_info (bool, default=False) – if True, return information about the chosen normalization heuristic.
nodata – A value representing nodata to leave unchanged during normalization, for example 0
dtype – can be float32 or float64
- Returns
a floating point array with values between 0 and 1.
- Return type
ndarray
Example
>>> from kwimage.im_core import * # NOQA >>> import ubelt as ub >>> import kwimage >>> import kwarray >>> s = 512 >>> bit_depth = 11 >>> dtype = np.uint16 >>> max_val = int(2 ** bit_depth) >>> min_val = int(0) >>> rng = kwarray.ensure_rng(0) >>> background = np.random.randint(min_val, max_val, size=(s, s), dtype=dtype) >>> poly1 = kwimage.Polygon.random(rng=rng).scale(s / 2) >>> poly2 = kwimage.Polygon.random(rng=rng).scale(s / 2).translate(s / 2) >>> forground = np.zeros_like(background, dtype=np.uint8) >>> forground = poly1.fill(forground, value=255) >>> forground = poly2.fill(forground, value=122) >>> forground = (kwimage.ensure_float01(forground) * max_val).astype(dtype) >>> imdata = background + forground >>> normed, info = normalize_intensity(imdata, return_info=True) >>> print('info = {}'.format(ub.repr2(info, nl=1))) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(imdata, pnum=(1, 2, 1), fnum=1) >>> kwplot.imshow(normed, pnum=(1, 2, 2), fnum=1)
Example
>>> from kwimage.im_core import * # NOQA >>> import ubelt as ub >>> import kwimage >>> # Test on an image that is already normalized to test how it >>> # degrades >>> imdata = kwimage.grab_test_image() >>> normed, info = normalize_intensity(imdata, return_info=True) >>> print('info = {}'.format(ub.repr2(info, nl=1))) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(imdata, pnum=(1, 2, 1), fnum=1) >>> kwplot.imshow(normed, pnum=(1, 2, 2), fnum=1)
- kwimage.im_core.padded_slice(data, in_slice, pad=None, padkw=None, return_info=False)[source]¶
Allows slices with out-of-bound coordinates. Any out of bounds coordinate will be sampled via padding.
DEPRECATED in favor of the version in kwarray (slices are more array-ish than image-ish).
Note
Negative slices have a different meaning here than they usually do. Normally, they indicate a wrap-around or a reversed stride, but here they index into out-of-bounds space (which depends on the pad mode). For example a slice of -2:1 literally samples two pixels to the left of the data and one pixel from the data, so you get two padded values and one data value.
- Parameters
data (Sliceable[T]) – data to slice into. Any channels must be the last dimension.
in_slice (slice | Tuple[slice, …]) – slice for each dimension
ndim (int) – number of spatial dimensions
pad (List[int|Tuple]) – additional padding of the slice
padkw (Dict) – if unspecified defaults to {'mode': 'constant'}
return_info (bool, default=False) – if True, return extra information about the transform.
- SeeAlso:
_padded_slice_embed - finds the embedded slice and padding
_padded_slice_apply - applies padding to sliced data
- Returns
- data_sliced: subregion of the input data (possibly with padding, depending on whether the original slice went out of bounds)
- Tuple[Sliceable, Dict] (if return_info is True):
data_sliced : as above
transform : information on how to return to the original coordinates
- Currently a dict containing:
- st_dims: a list indicating the low and high space-time coordinate values of the returned data slice.
The structure of this dictionary may change in the future.
- Return type
Sliceable
Example
>>> data = np.arange(5) >>> in_slice = [slice(-2, 7)]
>>> data_sliced = padded_slice(data, in_slice) >>> print(ub.repr2(data_sliced, with_dtype=False)) np.array([0, 0, 0, 1, 2, 3, 4, 0, 0])
>>> data_sliced = padded_slice(data, in_slice, pad=(3, 3)) >>> print(ub.repr2(data_sliced, with_dtype=False)) np.array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0])
>>> data_sliced = padded_slice(data, slice(3, 4), pad=[(1, 0)]) >>> print(ub.repr2(data_sliced, with_dtype=False)) np.array([2, 3])
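The note on negative slices can be made concrete by comparing ordinary numpy indexing with a padded slice in one dimension. The sketch below pads the whole array first and then shifts the indices, which is simple but wasteful for large data; padded_slice_1d is a hypothetical helper, not the kwimage or kwarray implementation, and it ignores the pad and padkw arguments.

```python
import numpy as np


def padded_slice_1d(data, start, stop):
    """Slice a 1D array where out-of-bounds indices read zero padding."""
    pad_low = max(0, -start)
    pad_high = max(0, stop - len(data))
    padded = np.pad(data, (pad_low, pad_high), mode='constant')
    # Shift the slice so index 0 of the original data lands at pad_low.
    return padded[start + pad_low:stop + pad_low]


data = np.arange(1, 6)                      # [1, 2, 3, 4, 5]
wrapped = data[-2:1]                        # numpy wraps -2 to index 3
sampled = padded_slice_1d(data, -2, 1)      # two pad values, one data value
wide = padded_slice_1d(np.arange(5), -2, 7)
```

Here wrapped is empty because numpy reads the slice as 3:1, while sampled holds two zeros followed by the first data value.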
- kwimage.im_core._padded_slice_apply(data_clipped, data_slice, extra_padding, padkw=None)[source]¶
Applies requested padding to an extracted data slice.
- kwimage.im_core._padded_slice_embed(in_slice, data_dims, pad=None)[source]¶
Embeds a “padded-slice” inside known data dimension.
Returns the valid data portion of the slice with extra padding for regions outside of the available dimension.
Given a slice for each dimension, the image dimensions, and a padding, get the corresponding slice from the image and any extra padding needed to achieve the requested window size.
- Parameters
in_slice (Tuple[slice]) – a tuple of slices to apply to each data dimension.
data_dims (Tuple[int]) – n-dimension data sizes (e.g. 2d height, width)
pad (List[int | Tuple]) – extra padding applied to the (left, right) sides, or to both sides, of each slice dimension
- Returns
- data_slice - Tuple[slice] a slice that can be applied to an array with shape data_dims. This slice will not correspond to the full window size if the requested slice is out of bounds.
- extra_padding - extra padding needed after slicing to achieve the requested window size.
- Return type
Tuple
Example
>>> # Case where slice is inside the data dims on left edge >>> from kwimage.im_core import * # NOQA >>> in_slice = (slice(0, 10), slice(0, 10)) >>> data_dims = [300, 300] >>> pad = [10, 5] >>> a, b = _padded_slice_embed(in_slice, data_dims, pad) >>> print('data_slice = {!r}'.format(a)) >>> print('extra_padding = {!r}'.format(b)) data_slice = (slice(0, 20, None), slice(0, 15, None)) extra_padding = [(10, 0), (5, 0)]
Example
>>> # Case where slice is bigger than the image >>> in_slice = (slice(-10, 400), slice(-10, 400)) >>> data_dims = [300, 300] >>> pad = [10, 5] >>> a, b = _padded_slice_embed(in_slice, data_dims, pad) >>> print('data_slice = {!r}'.format(a)) >>> print('extra_padding = {!r}'.format(b)) data_slice = (slice(0, 300, None), slice(0, 300, None)) extra_padding = [(20, 110), (15, 105)]
Example
>>> # Case where slice is inside the image >>> in_slice = (slice(10, 40), slice(10, 40)) >>> data_dims = [300, 300] >>> pad = None >>> a, b = _padded_slice_embed(in_slice, data_dims, pad) >>> print('data_slice = {!r}'.format(a)) >>> print('extra_padding = {!r}'.format(b)) data_slice = (slice(10, 40, None), slice(10, 40, None)) extra_padding = [(0, 0), (0, 0)]
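The outputs in the three examples above follow from a small amount of arithmetic per dimension: grow the window by the padding, clip it to the data, and report whatever was clipped away. A pure-Python sketch under the assumption that every slice has explicit integer bounds and a step of 1; embed_slice is a hypothetical stand-in, not the kwimage function.

```python
def embed_slice(in_slice, data_dims, pad=None):
    """Clip padded slices to the data and report the padding still needed."""
    if pad is None:
        pad = [0] * len(data_dims)
    data_slice = []
    extra_padding = []
    for sl, dim, p in zip(in_slice, data_dims, pad):
        # An int pads both sides equally; a pair pads (low, high).
        p_low, p_high = (p, p) if isinstance(p, int) else p
        low = sl.start - p_low
        high = sl.stop + p_high
        # Clip the padded window to the valid index range.
        valid_low = min(max(low, 0), dim)
        valid_high = min(max(high, 0), dim)
        data_slice.append(slice(valid_low, valid_high))
        extra_padding.append((valid_low - low, high - valid_high))
    return tuple(data_slice), extra_padding


left_edge = embed_slice((slice(0, 10), slice(0, 10)), [300, 300], [10, 5])
oversized = embed_slice((slice(-10, 400), slice(-10, 400)), [300, 300], [10, 5])
inside = embed_slice((slice(10, 40), slice(10, 40)), [300, 300])
```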
kwimage.im_cv2¶
Wrappers around cv2 functions
Note: all functions in kwimage work with RGB input by default instead of BGR.
Module Contents¶
|
Converts interpolation into flags suitable for cv2 functions |
|
Converts border_mode into flags suitable for cv2 functions |
|
DEPRECATED and removed: use imresize instead |
|
Crop an image about a specified point, padding if necessary. |
|
Resize an image based on a scale factor, final size, or size and aspect |
|
Converts colorspace of img. |
|
|
|
Creates a 2D gaussian patch with a specific size and sigma |
|
Applies an affine transformation to an image with optional antialiasing. |
|
Helper for warp_affine |
|
Split an image into pieces smaller than cv2’s limit, perform cv2.warpAffine on each piece, |
|
Does a partial downscale with antialiasing and prepares for a final |
|
Compute a gaussian to mitigate aliasing for a requested downsample |
|
Downsamples by (2 ** k)x with antialiasing |
- kwimage.im_cv2._coerce_interpolation(interpolation, default=cv2.INTER_LANCZOS4, grow_default=cv2.INTER_LANCZOS4, shrink_default=cv2.INTER_AREA, scale=None)[source]¶
Converts interpolation into flags suitable for cv2 functions
- Parameters
interpolation (int or str) – string or cv2-style interpolation type
default (int) – cv2 flag to use if interpolation is None and scale is None.
grow_default (int) – cv2 flag to use if interpolation is None and scale is greater than or equal to 1.
shrink_default (int) – cv2 flag to use if interpolation is None and scale is less than 1.
scale (float) – indicate if the interpolation will be used to scale the image.
- Returns
- flag specifying the interpolation type that can be passed to functions like cv2.resize, cv2.warpAffine, etc…
- Return type
int
Example
>>> flag = _coerce_interpolation('linear') >>> assert flag == cv2.INTER_LINEAR >>> flag = _coerce_interpolation(cv2.INTER_LINEAR) >>> assert flag == cv2.INTER_LINEAR >>> flag = _coerce_interpolation('auto', default='lanczos') >>> assert flag == cv2.INTER_LANCZOS4 >>> flag = _coerce_interpolation(None, default='lanczos') >>> assert flag == cv2.INTER_LANCZOS4 >>> flag = _coerce_interpolation('auto', shrink_default='area', scale=0.1) >>> assert flag == cv2.INTER_AREA >>> flag = _coerce_interpolation('auto', grow_default='cubic', scale=10.) >>> assert flag == cv2.INTER_CUBIC >>> # xdoctest: +REQUIRES(module:pytest) >>> import pytest >>> with pytest.raises(TypeError): >>> _coerce_interpolation(3.4) >>> import pytest >>> with pytest.raises(KeyError): >>> _coerce_interpolation('foobar')
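The selection rules described above (explicit value first, then a default that depends on whether the image grows or shrinks) can be sketched without OpenCV. The integer codes below are the values of the corresponding cv2.INTER_* constants written out by hand; coerce_interpolation is a hypothetical stand-in, not the kwimage function, and it takes string defaults only.

```python
# Values of cv2.INTER_NEAREST, INTER_LINEAR, INTER_CUBIC, INTER_AREA, and
# INTER_LANCZOS4, hard-coded so the sketch runs without cv2 installed.
INTERPOLATION_CODES = {
    'nearest': 0,
    'linear': 1,
    'cubic': 2,
    'area': 3,
    'lanczos': 4,
}


def coerce_interpolation(interpolation, default='lanczos',
                         grow_default='lanczos', shrink_default='area',
                         scale=None):
    """Resolve a name, code, None, or 'auto' to an integer flag."""
    if interpolation is None or interpolation == 'auto':
        if scale is None:
            interpolation = default
        elif scale >= 1:
            interpolation = grow_default
        else:
            interpolation = shrink_default
    if isinstance(interpolation, str):
        # Unknown names raise KeyError, as in the documented example.
        return INTERPOLATION_CODES[interpolation]
    if isinstance(interpolation, int):
        return interpolation
    raise TypeError(
        'expected a str or int, got {!r}'.format(interpolation))


shrink_flag = coerce_interpolation('auto', scale=0.1)
grow_flag = coerce_interpolation('auto', grow_default='cubic', scale=10.)
```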
- kwimage.im_cv2._coerce_border(border_mode, default=cv2.BORDER_CONSTANT)[source]¶
Converts border_mode into flags suitable for cv2 functions
- Parameters
border_mode (int or str) – string or cv2-style border mode type
- Returns
- flag specifying the borderMode type that can be passed to functions like cv2.warpAffine, etc…
- Return type
int
Example
>>> flag = _coerce_border('constant') >>> assert flag == cv2.BORDER_CONSTANT >>> flag = _coerce_border(cv2.BORDER_CONSTANT) >>> assert flag == cv2.BORDER_CONSTANT >>> flag = _coerce_border(None, default='reflect') >>> assert flag == cv2.BORDER_REFLECT >>> # xdoctest: +REQUIRES(module:pytest) >>> import pytest >>> with pytest.raises(TypeError): >>> _coerce_border(3.4) >>> import pytest >>> with pytest.raises(KeyError): >>> _coerce_border('foobar')
- kwimage.im_cv2.imscale(img, scale, interpolation=None, return_scale=False)[source]¶
DEPRECATED and removed: use imresize instead
- kwimage.im_cv2.imcrop(img, dsize, about=None, origin=None, border_value=None, interpolation='nearest')[source]¶
Crop an image about a specified point, padding if necessary.
This is like PIL.Image.Image.crop with more convenient arguments, or cv2.getRectSubPix without the baked-in bilinear interpolation.
- Parameters
img (ndarray) – image to crop
dsize (Tuple[None | int, None | int]) – the desired width and height of the new image. If a dimension is None, then it is automatically computed to preserve aspect ratio. This can be larger than the original dims; if so, the cropped image is padded with border_value.
about (Tuple[str | int, str | int]) – the location to crop about. Mutually exclusive with origin. Defaults to top left. If ints (w,h) are provided, that will be the center of the cropped image. There are also string codes available:
- ‘lt’: make the top left point of the image the top left point of the cropped image. This is equivalent to img[:dsize[1], :dsize[0]], plus padding.
- ‘rb’: make the bottom right point of the image the bottom right point of the cropped image. This is equivalent to img[-dsize[1]:, -dsize[0]:], plus padding.
- ‘cc’: make the center of the image the center of the cropped image.
Any combination of these codes can be used, ex. ‘lb’, ‘ct’, (‘r’, 200), …
origin (Tuple[int, int] | None) – the origin of the crop in (x,y) order (same order as dsize/about). Mutually exclusive with about. Defaults to top left.
border_value (Numeric | Tuple | str, default=0) – any border value accepted by cv2.copyMakeBorder, ex. [255, 0, 0] (blue in BGR channel order, red in RGB). Default is 0.
interpolation (str, default=’nearest’) – Can be ‘nearest’, in which case integral cropping is used. Can also be ‘linear’, in which case cv2.getRectSubPix is used.
- Returns
the cropped image
- Return type
ndarray
- SeeAlso:
kwarray.padded_slice() - a similar function for working with “negative slices”.
Example
>>> import kwimage >>> import numpy as np >>> # >>> img = kwimage.grab_test_image('astro', dsize=(32, 32)) >>> # >>> # regular crop >>> new_img1 = kwimage.imcrop(img, dsize=(5,6)) >>> assert new_img1.shape == (6, 5, 3) >>> # >>> # padding for coords outside the image bounds >>> new_img2 = kwimage.imcrop(img, dsize=(5,6), >>> origin=(-1,0), border_value=[1, 0, 0]) >>> assert np.all(new_img2[:, 0] == [1, 0, 0]) >>> # >>> # codes for corner- and edge-centered cropping >>> new_img3 = kwimage.imcrop(img, dsize=(5,6), >>> about='cb') >>> # >>> # special code for bilinear interpolation >>> # with floating-point coordinates >>> new_img4 = kwimage.imcrop(img, dsize=(5,6), >>> about=(5.5, 8.5), interpolation='linear') >>> # >>> # use with bounding boxes >>> bbox = kwimage.Boxes.random(scale=5, rng=132).to_xywh().quantize() >>> origin, dsize = np.split(bbox.data[0], 2) >>> new_img5 = kwimage.imcrop(img, dsize=dsize, >>> origin=origin) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nSubplots=6) >>> kwplot.imshow(img, pnum=pnum_()) >>> kwplot.imshow(new_img1, pnum=pnum_()) >>> kwplot.imshow(new_img2, pnum=pnum_()) >>> kwplot.imshow(new_img3, pnum=pnum_()) >>> kwplot.imshow(new_img4, pnum=pnum_()) >>> kwplot.imshow(new_img5, pnum=pnum_()) >>> kwplot.show_if_requested()
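Integral cropping with an origin and padding, as in the second case of the example, can be sketched with plain numpy indexing. crop_with_padding is a hypothetical helper, not the kwimage implementation; it covers only the origin form (no about codes, no interpolation) and uses the same (x, y) ordering for dsize and origin.

```python
import numpy as np


def crop_with_padding(img, dsize, origin=(0, 0), border_value=0):
    """Crop a (w, h) window whose top-left corner is at origin (x, y)."""
    w, h = dsize
    x0, y0 = origin
    # Start from a canvas filled with the border value.
    out = np.full((h, w) + img.shape[2:], border_value, dtype=img.dtype)
    # Intersection of the crop window with the image, in image coordinates.
    ix0, iy0 = max(x0, 0), max(y0, 0)
    ix1 = min(x0 + w, img.shape[1])
    iy1 = min(y0 + h, img.shape[0])
    if ix1 > ix0 and iy1 > iy0:
        # Copy the overlapping region into its place on the canvas.
        out[iy0 - y0:iy1 - y0, ix0 - x0:ix1 - x0] = img[iy0:iy1, ix0:ix1]
    return out


img = np.arange(32 * 32 * 3).reshape(32, 32, 3)
plain = crop_with_padding(img, dsize=(5, 6))
shifted = crop_with_padding(img, dsize=(5, 6), origin=(-1, 0), border_value=-1)
```

In shifted, the first column lies outside the image and holds the border value, and the remaining columns hold the leftmost image pixels.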
- kwimage.im_cv2.imresize(img, scale=None, dsize=None, max_dim=None, min_dim=None, interpolation=None, grow_interpolation=None, letterbox=False, return_info=False, antialias=False)[source]¶
Resize an image based on a scale factor, final size, or size and aspect ratio.
Slightly more general than cv2.resize, allows for specification of either a scale factor, a final size, or the final size for a particular dimension.
- Parameters
img (ndarray) – image to resize
scale (float or Tuple[float, float]) – Desired floating point scale factor. If a tuple, the dimension ordering is x,y. Mutually exclusive with dsize, max_dim, and min_dim.
dsize (Tuple[int] | None) – The desired width and height of the new image. If a dimension is None, then it is automatically computed to preserve aspect ratio. Mutually exclusive with scale, max_dim, and min_dim.
max_dim (int) – New size of the maximum dimension; the other dimension is scaled to maintain aspect ratio. Mutually exclusive with scale, dsize, and min_dim.
min_dim (int) – New size of the minimum dimension; the other dimension is scaled to maintain aspect ratio. Mutually exclusive with scale, dsize, and max_dim.
interpolation (str | int) – The interpolation key or code (e.g. linear, lanczos). By default “area” is used if the image is shrinking and “lanczos” is used if the image is growing. Note, if this is explicitly set, then it will be used regardless of whether the image is growing or shrinking. Set grow_interpolation to change the default for an enlarging interpolation.
grow_interpolation (str | int, default=”lanczos”) – The interpolation key or code to use when the image is being enlarged. Does nothing if “interpolation” is explicitly given. If “interpolation” is not specified, “area” is used when shrinking.
letterbox (bool, default=False) – If used in conjunction with dsize, then the image is scaled and translated to fit in the center of the new image while maintaining aspect ratio. Zero padding is added if necessary.
return_info (bool, default=False) – if True returns information about the final transformation in a dictionary. If there is an offset, the scale is applied before the offset when transforming to the new resized space.
antialias (bool, default=False) – if True blurs to anti-alias before downsampling.
- Returns
the new image and optionally an info dictionary if return_info=True
- Return type
ndarray | Tuple[ndarray, Dict]
Example
>>> import kwimage >>> import numpy as np >>> # Test scale >>> img = np.zeros((16, 10, 3), dtype=np.uint8) >>> new_img, info = kwimage.imresize(img, scale=.85, >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [.8, 0.875] >>> # Test dsize without None >>> new_img, info = kwimage.imresize(img, dsize=(5, 12), >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.5 , 0.75] >>> # Test dsize with None >>> new_img, info = kwimage.imresize(img, dsize=(6, None), >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.6, 0.625] >>> # Test max_dim >>> new_img, info = kwimage.imresize(img, max_dim=6, >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.4 , 0.375] >>> # Test min_dim >>> new_img, info = kwimage.imresize(img, min_dim=6, >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.6 , 0.625]
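The scale values asserted above follow from rounding the target size to whole pixels and then dividing by the original size. The sketch below reproduces them in plain Python; resolve_resize is a hypothetical helper, and the use of Python's round is an assumption chosen because it matches the documented numbers, not a statement about how kwimage computes them.

```python
def resolve_resize(old_dsize, scale=None, dsize=None, max_dim=None, min_dim=None):
    """
    Hypothetical helper (not the kwimage implementation): turn exactly one of
    the mutually exclusive size arguments into a (width, height) and the
    effective per-axis scale.
    """
    given = [v is not None for v in (scale, dsize, max_dim, min_dim)]
    if sum(given) != 1:
        raise ValueError('specify exactly one of scale, dsize, max_dim, min_dim')
    old_w, old_h = old_dsize
    if max_dim is not None:
        scale = max_dim / max(old_w, old_h)
    elif min_dim is not None:
        scale = min_dim / min(old_w, old_h)
    if dsize is not None:
        new_w, new_h = dsize
        if new_w is None and new_h is None:
            raise ValueError('at least one entry of dsize must be given')
        # A None entry is filled in so the aspect ratio is preserved
        if new_w is None:
            new_w = round(old_w * new_h / old_h)
        elif new_h is None:
            new_h = round(old_h * new_w / old_w)
    else:
        sx, sy = scale if isinstance(scale, tuple) else (scale, scale)
        new_w, new_h = round(old_w * sx), round(old_h * sy)
    # The effective scale differs from the request because of pixel rounding
    return (new_w, new_h), (new_w / old_w, new_h / old_h)


# The example image has shape (16, 10, 3), i.e. width=10 and height=16
print(resolve_resize((10, 16), scale=0.85))
print(resolve_resize((10, 16), dsize=(6, None)))
print(resolve_resize((10, 16), max_dim=6))
```

Note that a requested scale of 0.85 ends up as (0.8, 0.875): the effective scale is whatever maps the old size onto the rounded new size.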
Example
>>> import kwimage >>> import numpy as np >>> # Test letterbox resize >>> img = np.ones((5, 10, 3), dtype=np.float32) >>> new_img, info = kwimage.imresize(img, dsize=(19, 19), >>> letterbox=True, >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['offset'].tolist() == [0, 4] >>> img = np.ones((10, 5, 3), dtype=np.float32) >>> new_img, info = kwimage.imresize(img, dsize=(19, 19), >>> letterbox=True, >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['offset'].tolist() == [4, 0]
>>> import kwimage >>> import numpy as np >>> # Test letterbox resize >>> img = np.random.rand(100, 200) >>> new_img, info = kwimage.imresize(img, dsize=(300, 300), letterbox=True, return_info=True)
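The offsets asserted in the letterbox examples can be derived by hand: pick the largest uniform scale that fits inside dsize, then center the scaled image. The sketch below is one way to compute this; letterbox_layout is a hypothetical helper, and the rounding and integer-division conventions are assumptions chosen to reproduce the documented offsets.

```python
def letterbox_layout(old_dsize, new_dsize):
    """
    Hypothetical helper (not part of kwimage): compute the uniform scale,
    the size of the scaled image, and its (x, y) offset inside the canvas.
    """
    old_w, old_h = old_dsize
    new_w, new_h = new_dsize
    # Largest scale for which both dimensions still fit in the canvas
    scale = min(new_w / old_w, new_h / old_h)
    fit_w = min(new_w, round(old_w * scale))
    fit_h = min(new_h, round(old_h * scale))
    # Center the scaled image; the remainder is padding
    offset = ((new_w - fit_w) // 2, (new_h - fit_h) // 2)
    return scale, (fit_w, fit_h), offset


# np.ones((5, 10, 3)) has width=10, height=5
print(letterbox_layout((10, 5), (19, 19))[2])  # (0, 4)
print(letterbox_layout((5, 10), (19, 19))[2])  # (4, 0)
```

As the return_info description states, the scale is applied before the offset when mapping points into the resized space.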
Example
>>> # Check aliasing >>> import kwimage >>> img = kwimage.grab_test_image('checkerboard') >>> img = kwimage.grab_test_image('astro') >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dsize = (14, 14) >>> dsize = (64, 64) >>> # When we set "grow_interpolation" for a "shrinking" resize it should >>> # still do the "area" interpolation to antialias the results. But if we >>> # use explicit interpolation it should alias. >>> pnum_ = kwplot.PlotNums(nSubplots=12, nCols=4) >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='area'), pnum=pnum_(), title='resize aa area') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='linear'), pnum=pnum_(), title='resize aa linear') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='nearest'), pnum=pnum_(), title='resize aa nearest') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='cubic'), pnum=pnum_(), title='resize aa cubic')
>>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='area'), pnum=pnum_(), title='resize aa grow area') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='linear'), pnum=pnum_(), title='resize aa grow linear') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='nearest'), pnum=pnum_(), title='resize aa grow nearest') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='cubic'), pnum=pnum_(), title='resize aa grow cubic')
>>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='area'), pnum=pnum_(), title='resize no-aa area') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='linear'), pnum=pnum_(), title='resize no-aa linear') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='nearest'), pnum=pnum_(), title='resize no-aa nearest') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='cubic'), pnum=pnum_(), title='resize no-aa cubic')
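The interaction between interpolation and grow_interpolation described in the parameter list reduces to a small decision rule. The sketch below states that rule explicitly; choose_interpolation is a hypothetical helper, and how kwimage decides that an image is "growing" (for example when only one axis is enlarged) is not covered here.

```python
def choose_interpolation(is_growing, interpolation=None,
                         grow_interpolation='lanczos'):
    """
    Hypothetical helper (not part of kwimage): pick the interpolation key
    following the rule documented for imresize.
    """
    if interpolation is not None:
        # An explicit choice always wins, growing or shrinking
        return interpolation
    # Otherwise "area" when shrinking, grow_interpolation when enlarging
    return grow_interpolation if is_growing else 'area'


print(choose_interpolation(is_growing=False, grow_interpolation='cubic'))  # area
print(choose_interpolation(is_growing=True, grow_interpolation='cubic'))   # cubic
print(choose_interpolation(is_growing=False, interpolation='nearest'))     # nearest
```

This is why the "grow" row of the aliasing example is expected to look like the "area" result: the resize there is a shrink, so grow_interpolation is not consulted.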
Todo
- [X] When interpolation is area and the number of channels > 4
cv2.resize will error but it is fine for linear interpolation
[ ] TODO: add padding options when letterbox=True
- kwimage.im_cv2.convert_colorspace(img, src_space, dst_space, copy=False, implicit=False, dst=None)[source]¶
Converts colorspace of img. Convenience function around cv2.cvtColor
- Parameters
img (ndarray) – image data with float32 or uint8 precision
src_space (str) – input image colorspace. (e.g. BGR, GRAY)
dst_space (str) – desired output colorspace. (e.g. RGB, HSV, LAB)
implicit (bool) –
- if False, the user must correctly specify if the input/output
colorspaces contain alpha channels.
- If True and the input image has an alpha channel, we modify
src_space and dst_space to ensure they both end with “A”.
dst (ndarray[uint8_t, ndim=2], optional) – inplace-output array.
- Returns
img - image data
- Return type
ndarray
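The implicit option described in the parameters can be read as a small rewrite of the colorspace codes. The sketch below illustrates that reading only; resolve_alpha_codes is a hypothetical helper, and treating "four channels" as "has an alpha channel" is an assumption made for the example.

```python
def resolve_alpha_codes(num_channels, src_space, dst_space, implicit=False):
    """
    Hypothetical helper (not part of kwimage): when implicit=True and the
    image has an alpha channel, make both colorspace codes end with "A".
    """
    src, dst = src_space.upper(), dst_space.upper()
    has_alpha = (num_channels == 4)  # assumption: 4 channels means alpha
    if implicit and has_alpha:
        if not src.endswith('A'):
            src += 'A'
        if not dst.endswith('A'):
            dst += 'A'
    return src, dst


print(resolve_alpha_codes(4, 'rgb', 'bgr', implicit=True))   # ('RGBA', 'BGRA')
print(resolve_alpha_codes(4, 'rgb', 'bgr', implicit=False))  # ('RGB', 'BGR')
```

With implicit=False the codes are left exactly as given, so the caller is responsible for writing 'RGBA' or 'BGRA' when the data carries alpha.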
Note
Note the LAB and HSV colorspaces in float do not go into the 0-1 range.
- For HSV the floating point range is:
0:360, 0:1, 0:1
- For LAB the floating point range is:
0:100, -86.1875:98.234375, -107.859375:94.46875 (Note, that some extreme combinations of a and b are not valid)
Example
>>> import numpy as np >>> convert_colorspace(np.array([[[0, 0, 1]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[0, 1, 0]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[1, 0, 0]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[1, 1, 1]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[0, 0, 1]]], dtype=np.float32), 'RGB', 'HSV')
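To make the float HSV range in the note concrete, the sketch below computes HSV for pure blue with the standard-library colorsys module and rescales the hue to degrees. colorsys is used here only as a stand-in to illustrate the 0:360, 0:1, 0:1 convention; it is not what convert_colorspace calls, and rgb01_to_hsv_float is a hypothetical name.

```python
import colorsys


def rgb01_to_hsv_float(r, g, b):
    """
    Hypothetical helper: HSV with hue in degrees and saturation / value in
    the 0-1 range, matching the float convention described in the note.
    """
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # colorsys returns hue in [0, 1)
    return (h * 360.0, s, v)


# Pure blue sits two thirds of the way around the hue circle
hue, sat, val = rgb01_to_hsv_float(0.0, 0.0, 1.0)
print(round(hue), sat, val)
```

The practical consequence is that float HSV or LAB output should not be clipped to [0, 1] or multiplied by 255 as if it were RGB.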
- Ignore:
# Check LAB output ranges
import itertools as it
s = 1
_iter = it.product(range(0, 256, s), range(0, 256, s), range(0, 256, s))
minvals = np.full(3, np.inf)
maxvals = np.full(3, -np.inf)
for r, g, b in ub.ProgIter(_iter, total=(256 // s) ** 3):
    img255 = np.array([[[r, g, b]]], dtype=np.uint8)
    img01 = (img255 / 255.0).astype(np.float32)
    lab = convert_colorspace(img01, 'rgb', 'lab')
    np.minimum(lab[0, 0], minvals, out=minvals)
    np.maximum(lab[0, 0], maxvals, out=maxvals)
print('minvals = {}'.format(ub.repr2(minvals, nl=0)))
print('maxvals = {}'.format(ub.repr2(maxvals, nl=0)))
- kwimage.im_cv2.gaussian_patch(shape=(7, 7), sigma=None)[source]¶
Creates a 2D gaussian patch with a specific size and sigma
- Parameters
shape (Tuple[int, int]) – patch height and width
sigma (float | Tuple[float, float]) – Gaussian standard deviation
References
http://docs.opencv.org/modules/imgproc/doc/filtering.html#getgaussiankernel
Todo
[ ] Look into this C-implementation
https://kwgitlab.kitware.com/computer-vision/heatmap/blob/master/heatmap/heatmap.c
- CommandLine:
xdoctest -m kwimage.im_cv2 gaussian_patch --show
Example
>>> import numpy as np >>> shape = (88, 24) >>> sigma = None # 1.0 >>> gausspatch = gaussian_patch(shape, sigma) >>> sum_ = gausspatch.sum() >>> assert np.all(np.isclose(sum_, 1.0)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> norm = (gausspatch - gausspatch.min()) / (gausspatch.max() - gausspatch.min()) >>> kwplot.imshow(norm) >>> kwplot.show_if_requested()
Example
>>> import numpy as np >>> shape = (24, 24) >>> sigma = 3.0 >>> gausspatch = gaussian_patch(shape, sigma) >>> sum_ = gausspatch.sum() >>> assert np.all(np.isclose(sum_, 1.0)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> norm = (gausspatch - gausspatch.min()) / (gausspatch.max() - gausspatch.min()) >>> kwplot.imshow(norm) >>> kwplot.show_if_requested()
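Both examples assert that the patch sums to 1. One way to see why is that a 2D gaussian patch can be built as the outer product of two normalized 1D kernels. The sketch below does this in plain Python; gaussian_patch_sketch is a hypothetical name, sigma must be given explicitly (the default used when sigma=None is not reproduced), and a sigma tuple is assumed to follow the same (height, width) order as shape.

```python
import math


def gaussian_patch_sketch(shape, sigma):
    """
    Hypothetical pure-Python sketch: a (height, width) gaussian patch built
    as the outer product of two 1D kernels, each normalized to sum to 1.
    """
    def kernel_1d(n, s):
        center = (n - 1) / 2
        weights = [math.exp(-((i - center) ** 2) / (2 * s ** 2))
                   for i in range(n)]
        total = sum(weights)
        return [w / total for w in weights]

    if isinstance(sigma, (int, float)):
        sigma = (sigma, sigma)
    sigma_y, sigma_x = sigma  # assumption: same order as shape
    col = kernel_1d(shape[0], sigma_y)
    row = kernel_1d(shape[1], sigma_x)
    return [[cy * cx for cx in row] for cy in col]


patch = gaussian_patch_sketch((24, 24), 3.0)
total = sum(sum(r) for r in patch)
print(len(patch), len(patch[0]), round(total, 6))
```

Because each 1D kernel sums to 1, the product of the two sums is also 1, which is the property the doctest checks.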
- kwimage.im_cv2.warp_affine(image, transform, dsize=None, antialias=False, interpolation='linear', border_mode=None, border_value=0, large_warp_dim=None, return_info=False)[source]¶
Applies an affine transformation to an image with optional antialiasing.
- Parameters
image (ndarray) – the input image as a numpy array. Note: this is passed directly to cv2, so it is best to ensure that it is contiguous and using a dtype that cv2 can handle.
transform (ndarray | Affine) – a coercible affine matrix. See kwimage.Affine for details on what can be coerced.
dsize (Tuple[int, int] | None | str, default=None) – An integer width and height tuple of the resulting “canvas” image. If None, then the input image size is used.
If specified as a string, dsize is computed based on the given heuristic.
If ‘positive’ (or ‘auto’), dsize is computed such that the positive coordinates of the warped image will fit in the new canvas. In this case, any pixel that maps to a negative coordinate will be clipped. This has the property that the input transformation is not modified.
If ‘content’ (or ‘max’), the transform is modified with an extra translation such that both the positive and negative coordinates of the warped image will fit in the new canvas.
antialias (bool, default=False) – if True, determines if the transform is downsampling and applies antialiasing via a gaussian blur.
interpolation (str, default=”linear”) – interpolation code or cv2 integer. Interpolation codes are linear, nearest, cubic, lanczos, and area.
border_mode (str) – Border code or cv2 integer. Border codes are constant, replicate, reflect, wrap, reflect101, and transparent.
border_value (int | float) – Used as the fill value if border_mode is constant. Otherwise this is ignored.
large_warp_dim (int | None | str, default=None) – If specified, perform the warp piecewise in chunks of the specified size. If “auto”, it is set to the maximum “short” value in numpy. This works around a limitation of cv2.warpAffine, which must have image dimensions < SHRT_MAX (=32767 in version 4.5.3)
return_info (bool, default=False) – if True, returns information about the operation. In the case where dsize=”content”, this includes the modified transformation.
- Returns
the warped image, or if return_info is True, the warped image and the info dictionary.
- Return type
ndarray | Tuple[ndarray, Dict]
Example
>>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro') >>> #image = kwimage.grab_test_image('checkerboard') >>> transform = Affine.random() @ Affine.scale(0.05) >>> transform = Affine.scale(0.02) >>> warped1 = warp_affine(image, transform, dsize='positive', antialias=1, interpolation='nearest') >>> warped2 = warp_affine(image, transform, dsize='positive', antialias=0) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nRows=1, nCols=2) >>> kwplot.imshow(warped1, pnum=pnum_(), title='antialias=True') >>> kwplot.imshow(warped2, pnum=pnum_(), title='antialias=False') >>> kwplot.show_if_requested()
Example
>>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro') >>> image = kwimage.grab_test_image('checkerboard') >>> transform = Affine.random() @ Affine.scale((.1, 1.2)) >>> warped1 = warp_affine(image, transform, dsize='positive', antialias=1) >>> warped2 = warp_affine(image, transform, dsize='positive', antialias=0) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nRows=1, nCols=2) >>> kwplot.imshow(warped1, pnum=pnum_(), title='antialias=True') >>> kwplot.imshow(warped2, pnum=pnum_(), title='antialias=False') >>> kwplot.show_if_requested()
Example
>>> # Test the case where the input data is empty or the target canvas >>> # is empty, this should be handled like boundary effects >>> import kwimage >>> image = np.random.rand(1, 1, 3) >>> transform = kwimage.Affine.random() >>> result = kwimage.warp_affine(image, transform, dsize=(0, 0)) >>> assert result.shape == (0, 0, 3) >>> # >>> empty_image = np.random.rand(0, 1, 3) >>> result = kwimage.warp_affine(empty_image, transform, dsize=(10, 10)) >>> assert result.shape == (10, 10, 3) >>> # >>> empty_image = np.random.rand(0, 1, 3) >>> result = kwimage.warp_affine(empty_image, transform, dsize=(10, 0)) >>> assert result.shape == (0, 10, 3)
Example
>>> # Demo difference between positive and content dsize >>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro', dsize=(512, 512)) >>> transform = Affine.coerce(offset=(-100, -50), scale=2, theta=0.1) >>> # When warping other images or geometry along with this image >>> # it is important to account for the modified transform when >>> # setting dsize='content'. If dsize='positive', the transform >>> # will remain unchanged wrt other aligned images / geometries. >>> poly = kwimage.Boxes([[350, 5, 130, 290]], 'xywh').to_polygons()[0] >>> # Apply the warping to the images >>> warped_pos, info_pos = warp_affine(image, transform, dsize='positive', return_info=True) >>> warped_con, info_con = warp_affine(image, transform, dsize='content', return_info=True) >>> assert info_pos['dsize'] == (919, 1072) >>> assert info_con['dsize'] == (1122, 1122) >>> assert info_pos['transform'] == transform >>> # Demo the correct and incorrect way to apply transforms >>> poly_pos = poly.warp(transform) >>> poly_con = poly.warp(info_con['transform']) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> # show original >>> kwplot.imshow(image, pnum=(1, 3, 1), title='original') >>> poly.draw(color='green', alpha=0.5, border=True) >>> # show positive warped >>> kwplot.imshow(warped_pos, pnum=(1, 3, 2), title='dsize=positive') >>> poly_pos.draw(color='purple', alpha=0.5, border=True) >>> # show content warped >>> ax = kwplot.imshow(warped_con, pnum=(1, 3, 3), title='dsize=content')[1] >>> poly_con.draw(color='dodgerblue', alpha=0.5, border=True) # correct >>> poly_pos.draw(color='orangered', alpha=0.5, border=True) # incorrect >>> cc = poly_con.to_shapely().centroid >>> cp = poly_pos.to_shapely().centroid >>> ax.text(cc.x, cc.y + 250, 'correctly transformed', color='dodgerblue', >>> backgroundcolor=(0, 0, 0, 0.7), horizontalalignment='center') >>> ax.text(cp.x, cp.y - 250, 'incorrectly transformed', color='orangered', >>> backgroundcolor=(0, 0, 0, 0.7), horizontalalignment='center') >>> kwplot.show_if_requested()
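The two dsize values asserted in this example can be recovered by warping the four image corners and looking at their extent. The sketch below does this with plain math; the helper names are hypothetical, and building the matrix as scale, then rotation, then translation is an assumption about what Affine.coerce produces, chosen because it reproduces the documented sizes.

```python
import math


def warped_extent(matrix, dsize):
    """Extent (min_x, min_y, max_x, max_y) of the warped image corners."""
    w, h = dsize
    (a, b, tx), (c, d, ty) = matrix
    xs, ys = [], []
    for x, y in [(0, 0), (w, 0), (0, h), (w, h)]:
        xs.append(a * x + b * y + tx)
        ys.append(c * x + d * y + ty)
    return min(xs), min(ys), max(xs), max(ys)


def positive_dsize(matrix, dsize):
    # Keep the transform as is; content at negative coordinates is clipped
    _, _, max_x, max_y = warped_extent(matrix, dsize)
    return (math.ceil(max_x), math.ceil(max_y)), matrix


def content_dsize(matrix, dsize):
    # Shift the transform so the whole warped image lands on the canvas
    min_x, min_y, max_x, max_y = warped_extent(matrix, dsize)
    (a, b, tx), (c, d, ty) = matrix
    shifted = ((a, b, tx - min_x), (c, d, ty - min_y))
    return (math.ceil(max_x - min_x), math.ceil(max_y - min_y)), shifted


# Assumed matrix for offset=(-100, -50), scale=2, theta=0.1
theta, scale, (tx, ty) = 0.1, 2.0, (-100, -50)
cos_, sin_ = math.cos(theta), math.sin(theta)
matrix = ((scale * cos_, -scale * sin_, tx),
          (scale * sin_, scale * cos_, ty))

print(positive_dsize(matrix, (512, 512))[0])  # (919, 1072)
print(content_dsize(matrix, (512, 512))[0])   # (1122, 1122)
```

The shifted matrix returned by content_dsize plays the role of info_con['transform'] in the example: any geometry warped alongside the image has to use it instead of the original transform.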
Example
>>> # Demo piecewise transform >>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro', dsize=(512, 512)) >>> transform = Affine.coerce(offset=(-100, -50), scale=2, theta=0.1) >>> warped_piecewise, info = warp_affine(image, transform, dsize='positive', return_info=True, large_warp_dim=32) >>> warped_normal, info = warp_affine(image, transform, dsize='positive', return_info=True, large_warp_dim=None) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(image, pnum=(1, 3, 1), title='original') >>> kwplot.imshow(warped_normal, pnum=(1, 3, 2), title='normal warp') >>> kwplot.imshow(warped_piecewise, pnum=(1, 3, 3), title='piecewise warp')
- kwimage.im_cv2._try_warp(image, transform_, large_warp_dim, dsize, max_dsize, new_origin, flags, borderMode, borderValue)[source]¶
Helper for warp_affine
- kwimage.im_cv2._large_warp(image, transform_, dsize, max_dsize, new_origin, flags, borderMode, borderValue, pieces_per_dim)[source]¶
Split an image into pieces smaller than cv2’s limit, perform cv2.warpAffine on each piece, and stitch them back together with minimal artifacts.
Example
>>> # xdoctest: +REQUIRES(--large_memory) >>> import kwimage >>> img = np.random.randint(255, size=(32767, 32767), dtype=np.uint8) >>> aff = kwimage.Affine.random() >>> import cv2 >>> # >>> # without this function >>> try: >>> res = kwimage.warp_affine(img, aff, large_warp_dim=None) >>> except cv2.error as e: >>> pass >>> # >>> # with this function >>> res = kwimage.warp_affine(img, aff, large_warp_dim='auto') >>> assert res.shape == img.shape >>> assert res.dtype == img.dtype
Example
>>> import kwimage >>> import cv2 >>> image = kwimage.grab_test_image('astro') >>> # Use wrapper function >>> transform = kwimage.Affine.coerce( >>> {'offset': (136.3946757082253, 0.0), >>> 'scale': (1.7740542832875767, 1.0314621286400032), >>> 'theta': 0.2612311452107956, >>> 'type': 'affine'}) >>> res, info = kwimage.warp_affine( >>> image, transform, dsize='content', return_info=True, >>> large_warp_dim=128) >>> # Explicit args for this function >>> transform = info['transform'] >>> new_origin = np.array((0, 0)) >>> max_dsize = (1015, 745) >>> dsize = max_dsize >>> res2 = _large_warp(image, transform, dsize, max_dsize, new_origin, >>> flags=cv2.INTER_LINEAR, borderMode=None, >>> borderValue=None, pieces_per_dim=2) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(res, pnum=(1, 2, 1)) >>> kwplot.imshow(res2, pnum=(1, 2, 2))
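The core idea of a piecewise warp is to cover the output canvas with tiles that each stay under a size limit. The sketch below shows only that tiling step; iter_tiles is a hypothetical helper, and the real function additionally has to warp each piece and stitch the results without seams, which is not shown.

```python
def iter_tiles(dsize, max_dim):
    """
    Hypothetical helper: yield (x0, y0, x1, y1) boxes that cover a canvas of
    size (width, height) using tiles no larger than max_dim on either side.
    """
    w, h = dsize
    for y0 in range(0, h, max_dim):
        for x0 in range(0, w, max_dim):
            yield (x0, y0, min(x0 + max_dim, w), min(y0 + max_dim, h))


tiles = list(iter_tiles((70, 40), max_dim=32))
print(len(tiles))  # 6
print(tiles[-1])   # (64, 32, 70, 40)
```

Each tile could then be warped with a transform translated by the tile origin, so that every individual call stays below the cv2.warpAffine dimension limit mentioned for large_warp_dim.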
- kwimage.im_cv2._prepare_downscale(image, sx, sy)[source]¶
Does a partial downscale with antialiasing and prepares for a final downsampling. Only downscales by factors of 2, any residual scaling to be done is returned.
Example
>>> s = 523 >>> image = np.random.rand(s, s) >>> sx = sy = 1 / 11 >>> downsampled, rx, ry = _prepare_downscale(image, sx, sy)
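The split between power-of-two downscaling and a residual scale can be sketched as follows. split_downscale is a hypothetical helper working on a single scale factor; how the real function counts halvings, and how it derives the residual from the actual downsampled shape, may differ.

```python
def split_downscale(scale):
    """
    Hypothetical helper: factor a downscale into a number of exact halvings
    plus a residual scale in the range (0.5, 1] that is left to apply.
    """
    if scale <= 0:
        raise ValueError('scale must be positive')
    num_halvings = 0
    residual = scale
    while residual <= 0.5:
        residual *= 2  # one antialiased halving absorbs a factor of 2
        num_halvings += 1
    return num_halvings, residual


# For the 1 / 11 scale used in the example: three halvings give 1 / 8,
# leaving a residual of 8 / 11 for the final resampling step
num_halvings, residual = split_downscale(1 / 11)
print(num_halvings, round(residual, 4))
```

Doing most of the reduction in halving steps means the final resampling only has to cover a factor close to 1, which is where aliasing is least severe.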
- kwimage.im_cv2._gauss_params(scale, k0=5, sigma0=1, fractional=True)[source]¶
Compute a gaussian to mitigate aliasing for a requested downsample
- Parameters
scale – requested downsample factor
k0 (int) – kernel size for one downsample operation
sigma0 (float) – sigma for one downsample operation
fractional (bool) – controls if we compute params for integer downsample ops
kwimage.im_demodata
¶
Module Contents¶
for dev use to update hashes of the demo images |
|
|
Ensures that the test image exists (this might use the network), reads it |
|
Ensures that the test image exists (this might use the network) and returns |
|
Creates a checkerboard image |
- kwimage.im_demodata._update_hashes()[source]¶
for dev use to update hashes of the demo images
- CommandLine:
xdoctest -m kwimage.im_demodata _update_hashes
xdoctest -m kwimage.im_demodata _update_hashes --require-hashes
- kwimage.im_demodata.grab_test_image(key='astro', space='rgb', dsize=None, interpolation='lanczos')[source]¶
Ensures that the test image exists (this might use the network), reads it and returns the image pixels.
- Parameters
key (str) – which test image to grab. Valid choices are:
astro - an astronaut
carl - Carl Sagan
paraview - ParaView logo
stars - picture of stars in the sky
airport - SkySat image of Beijing Capital International Airport on 18 February 2018
See kwimage.grab_test_image.keys for a full list.
space (str, default=’rgb’) – which colorspace to return in
dsize (Tuple[int, int], default=None) – if specified resizes image to this size
- Returns
the requested image
- Return type
ndarray
- CommandLine:
xdoctest -m kwimage.im_demodata grab_test_image
Example
>>> # xdoctest: +REQUIRES(--network) >>> import kwimage >>> for key in kwimage.grab_test_image.keys(): >>> print('attempt to grab key = {!r}'.format(key)) >>> kwimage.grab_test_image(key) >>> print('grabbed key = {!r}'.format(key)) >>> kwimage.grab_test_image('astro', dsize=(255, 255)).shape (255, 255, 3)
- kwimage.im_demodata.grab_test_image_fpath(key='astro')[source]¶
Ensures that the test image exists (this might use the network) and returns the cached filepath to the requested image.
- Parameters
key (str) – which test image to grab. Valid choices are:
astro - an astronaut
carl - Carl Sagan
paraview - ParaView logo
stars - picture of stars in the sky
- Returns
path to the requested image
- Return type
- CommandLine:
python -c "import kwimage; print(kwimage.grab_test_image_fpath('airport'))"
Example
>>> # xdoctest: +REQUIRES(--network) >>> import kwimage >>> for key in kwimage.grab_test_image.keys(): ... print('attempt to grab key = {!r}'.format(key)) ... kwimage.grab_test_image_fpath(key) ... print('grabbed grab key = {!r}'.format(key))
- kwimage.im_demodata.checkerboard(num_squares=8, dsize=(512, 512))[source]¶
Creates a checkerboard image
- Parameters
num_squares (int) – number of squares in a row
dsize (Tuple[int, int]) – width and height
References
https://stackoverflow.com/questions/2169478/how-to-make-a-checkerboard-in-numpy
Example
>>> from kwimage.im_demodata import * # NOQA >>> img = checkerboard()
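For readers who want to see what such a pattern looks like numerically, the sketch below builds a small checkerboard as nested lists. checkerboard_sketch is a hypothetical pure-Python stand-in; the values, dtype and exact square boundaries produced by kwimage's function are not assumed to match.

```python
def checkerboard_sketch(num_squares=8, dsize=(512, 512)):
    """
    Hypothetical pure-Python sketch: a (height x width) grid of 0 / 1 values
    with num_squares alternating squares along each axis.
    """
    w, h = dsize
    return [
        [((row * num_squares // h) + (col * num_squares // w)) % 2
         for col in range(w)]
        for row in range(h)
    ]


board = checkerboard_sketch(num_squares=4, dsize=(8, 8))
for line in board[:4]:
    print(line)
```

The parity of the square's row index plus its column index decides the colour, which is the same idea as the NumPy recipes in the referenced Stack Overflow question.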
kwimage.im_draw
¶
Module Contents¶
|
Draws multiline text on an image using opencv |
|
Draws classification label on an image. |
|
Draws boxes on an image. |
|
Draw line segments between pts1 and pts2 on an image. |
|
Determine if color applies a single color to all |
|
Colorizes a single-channel intensity mask (with an alpha channel) |
|
Makes a colormap in HSV space where the orientation changes color and mag |
|
Create an image representing a 2D vector field. |
|
Create an image representing a 2D vector field. |
- kwimage.im_draw.draw_text_on_image(img, text, org, return_info=False, **kwargs)[source]¶
Draws multiline text on an image using opencv
- Parameters
img (ndarray | None | dict) – Generally a numpy image to draw on (inplace). Otherwise a canvas will be constructed such that the text will fit. The user may specify a dictionary with keys width and height to have more control over the constructed canvas.
text (str) – text to draw
org (Tuple[int, int]) – The x, y location of the text string “anchor” in the image as specified by halign and valign. For instance, if valign=’bottom’, halign=’left’, this is the bottom left corner.
return_info (bool, default=False) – if True, also returns information about the positions the text was drawn on.
**kwargs –
color (tuple): default blue
thickness (int): defaults to 2
fontFace (int): defaults to cv2.FONT_HERSHEY_SIMPLEX
fontScale (float): defaults to 1.0
- valign (str, default=’bottom’):
either top, center, or bottom. NOTE: this default may change to “top” in the future.
- halign (str, default=’left’):
either left, center, or right
- border (dict | int):
If specified as an integer, draws a black border with that given thickness. If specified as a dictionary, draws a border with the specified parameters:
“color”: border color, defaults to “black”.
“thickness”: border thickness, defaults to 1.
- Returns
the image that was drawn on
- Return type
ndarray
Note
The image is modified inplace. If the image is non-contiguous then this returns a UMat instead of an ndarray, so be careful with that.
References
https://stackoverflow.com/questions/27647424/ https://stackoverflow.com/questions/51285616/opencvs-gettextsize-and-puttext-return-wrong-size-and-chop-letters-with-low
Example
>>> import kwimage >>> img = kwimage.grab_test_image(space='rgb') >>> img2 = kwimage.draw_text_on_image(img.copy(), 'FOOBAR', org=(0, 0), valign='top') >>> assert img2.shape == img.shape >>> assert np.any(img2 != img) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img2) >>> kwplot.show_if_requested()
Example
>>> import kwimage >>> # Test valign >>> img = kwimage.grab_test_image(space='rgb', dsize=(500, 500)) >>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(0, 0), valign='top', border=2) >>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(150, 0), valign='center', border=2) >>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(300, 0), valign='bottom', border=2) >>> # Test halign >>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(250, 100), halign='right', border=2) >>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(250, 250), halign='center', border=2) >>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(250, 400), halign='left', border=2) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img2) >>> kwplot.show_if_requested()
Example
>>> # Ensure the function works with float01 or uint255 images >>> import kwimage >>> img = kwimage.grab_test_image(space='rgb') >>> img = kwimage.ensure_float01(img) >>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(0, 0), valign='top', border=2)
Example
>>> # Test dictionary border >>> import kwimage >>> img = kwimage.draw_text_on_image(None, 'hello\neveryone', org=(100, 100), valign='top', halign='center', border={'color': 'green', 'thickness': 9}) >>> #img = kwimage.draw_text_on_image(None, 'hello\neveryone', org=(0, 0), valign='top') >>> #img = kwimage.draw_text_on_image(None, 'hello', org=(0, 60), valign='top', halign='center', border=0) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img) >>> kwplot.show_if_requested()
Example
>>> # Test dictionary image >>> import kwimage >>> img = kwimage.draw_text_on_image({'width': 300}, 'good\nPropogate', org=(150, 0), valign='top', halign='center', border={'color': 'green', 'thickness': 0}) >>> print('img.shape = {!r}'.format(img.shape)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img) >>> kwplot.show_if_requested()
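The role of org together with halign and valign can be illustrated with a little arithmetic: org names one anchor point of the text block, and the alignment codes say which point that is. The sketch below is an illustration of that geometry only; text_box_origin is a hypothetical helper, and the text size would in practice come from a measurement such as cv2.getTextSize.

```python
def text_box_origin(org, text_size, halign='left', valign='bottom'):
    """
    Hypothetical helper: given the anchor point, the (width, height) of the
    rendered text block and the alignment codes, return the top-left corner
    of the text block in image coordinates (y grows downward).
    """
    x, y = org
    w, h = text_size
    x_shift = {'left': 0, 'center': w / 2, 'right': w}[halign]
    y_shift = {'top': 0, 'center': h / 2, 'bottom': h}[valign]
    return (x - x_shift, y - y_shift)


# A 120x60 text block anchored at (250, 100)
print(text_box_origin((250, 100), (120, 60), halign='right', valign='bottom'))  # (130, 40)
print(text_box_origin((250, 100), (120, 60), halign='left', valign='top'))      # (250, 100)
```

With the default valign='bottom' and org=(0, 0), the text block would sit above the image, which is why most of the examples pass valign='top'.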
- kwimage.im_draw.draw_clf_on_image(im, classes, tcx=None, probs=None, pcx=None, border=1)[source]¶
Draws classification label on an image.
Works best with image chips sized between 200x200 and 500x500
- Parameters
im (ndarray) – the image
classes (Sequence | CategoryTree) – list of class names
tcx (int, default=None) – true class index if known
probs (ndarray) – predicted class probs for each class
pcx (int, default=None) – predicted class index. (if None but probs is specified uses argmax of probs)
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import torch >>> import kwarray >>> import kwimage >>> rng = kwarray.ensure_rng(0) >>> im = (rng.rand(300, 300) * 255).astype(np.uint8) >>> classes = ['cls_a', 'cls_b', 'cls_c'] >>> tcx = 1 >>> probs = rng.rand(len(classes)) >>> probs[tcx] = 0 >>> probs = torch.FloatTensor(probs).softmax(dim=0).numpy() >>> im1_ = kwimage.draw_clf_on_image(im, classes, tcx, probs) >>> probs[tcx] = .9 >>> probs = torch.FloatTensor(probs).softmax(dim=0).numpy() >>> im2_ = kwimage.draw_clf_on_image(im, classes, tcx, probs) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(im1_, colorspace='rgb', pnum=(1, 2, 1), fnum=1, doclf=True) >>> kwplot.imshow(im2_, colorspace='rgb', pnum=(1, 2, 2), fnum=1) >>> kwplot.show_if_requested()
- kwimage.im_draw.draw_boxes_on_image(img, boxes, color='blue', thickness=1, box_format=None, colorspace='rgb')[source]¶
Draws boxes on an image.
- Parameters
img (ndarray) – image to copy and draw on
boxes (kwimage.Boxes) – boxes to draw
colorspace (str) – string code of the input image colorspace
Example
>>> import kwimage >>> import numpy as np >>> img = np.zeros((10, 10, 3), dtype=np.uint8) >>> color = 'dodgerblue' >>> thickness = 1 >>> boxes = kwimage.Boxes([[1, 1, 8, 8]], 'ltrb') >>> img2 = draw_boxes_on_image(img, boxes, color, thickness) >>> assert tuple(img2[1, 1]) == (30, 144, 255) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() # xdoc: +SKIP >>> kwplot.figure(doclf=True, fnum=1) >>> kwplot.imshow(img2)
- kwimage.im_draw.draw_line_segments_on_image(img, pts1, pts2, color='blue', colorspace='rgb', thickness=1, **kwargs)[source]¶
Draw line segments between pts1 and pts2 on an image.
- Parameters
pts1 (ndarray) – xy coordinates of starting points
pts2 (ndarray) – corresponding xy coordinates of ending points
color (str | List) – color code or a list of colors for each line segment
colorspace (str, default=’rgb’) – colorspace of image
thickness (int, default=1)
lineType (int, default=cv2.LINE_AA)
- Returns
the modified image (inplace if possible)
- Return type
ndarray
Example
>>> from kwimage.im_draw import * # NOQA >>> pts1 = np.array([[2, 0], [2, 20], [2.5, 30]]) >>> pts2 = np.array([[10, 5], [30, 28], [100, 50]]) >>> img = np.ones((100, 100, 3), dtype=np.uint8) * 255 >>> color = 'blue' >>> colorspace = 'rgb' >>> img2 = draw_line_segments_on_image(img, pts1, pts2, thickness=2) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() # xdoc: +SKIP >>> kwplot.figure(doclf=True, fnum=1) >>> kwplot.imshow(img2)
Example
>>> import kwimage >>> pts1 = kwimage.Points.random(10).scale(512).xy >>> pts2 = kwimage.Points.random(10).scale(512).xy >>> img = np.ones((512, 512, 3), dtype=np.uint8) * 255 >>> color = kwimage.Color.distinct(10) >>> img2 = kwimage.draw_line_segments_on_image(img, pts1, pts2, color=color) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() # xdoc: +SKIP >>> kwplot.figure(doclf=True, fnum=1) >>> kwplot.imshow(img2)
- kwimage.im_draw._broadcast_colors(color, num, img, colorspace)[source]¶
Determine if color applies a single color to all num items, or if it is a list of colors for each item. Return as a list of colors for each item.
Todo
[ ] add as classmethod of kwimage.Color
Example
>>> img = (np.random.rand(512, 512, 3) * 255).astype(np.uint8) >>> colorspace = 'rgb' >>> color = color_str_list = ['red', 'green', 'blue'] >>> color_str = 'red' >>> num = 3 >>> print(_broadcast_colors(color_str_list, num, img, colorspace)) >>> print(_broadcast_colors(color_str, num, img, colorspace)) >>> colors_tuple_list = _broadcast_colors(color_str_list, num, img, colorspace) >>> print(_broadcast_colors(colors_tuple_list, num, img, colorspace)) >>> # >>> # FIXME: This case seems broken >>> colors_ndarray_list = np.array(_broadcast_colors(color_str_list, num, img, colorspace)) >>> print(_broadcast_colors(colors_ndarray_list, num, img, colorspace))
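The broadcasting rule itself is simple to state in code. The sketch below covers only the "one color or one per item" decision; broadcast_colors_sketch is a hypothetical helper that leaves color names untouched, whereas the real function also takes img and colorspace, presumably to resolve names into concrete values.

```python
def broadcast_colors_sketch(color, num):
    """
    Hypothetical helper: return a list with one color per item. A string or
    a flat sequence of numbers counts as a single color; anything else is
    treated as a per-item sequence and must have length num.
    """
    is_single = isinstance(color, str) or all(
        isinstance(c, (int, float)) for c in color)
    if is_single:
        return [color] * num
    colors = list(color)
    if len(colors) != num:
        raise ValueError('expected {} colors, got {}'.format(num, len(colors)))
    return colors


print(broadcast_colors_sketch('red', 3))
print(broadcast_colors_sketch((255, 0, 0), 2))
print(broadcast_colors_sketch(['red', 'green', 'blue'], 3))
```

A 2D array of colors is ambiguous under a rule like this only if its rows are not recognized as sequences, which may be related to the case flagged with FIXME in the example.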
- kwimage.im_draw.make_heatmask(probs, cmap='plasma', with_alpha=1.0, space='rgb', dsize=None)[source]¶
Colorizes a single-channel intensity mask (with an alpha channel)
- Parameters
probs (ndarray) – 2D probability map with values between 0 and 1
cmap (str) – mpl colormap
with_alpha (float) – between 0 and 1, uses probs as the alpha multiplied by this number.
space (str) – output colorspace
dsize (tuple) – if not None, then output is resized to W,H=dsize
- SeeAlso:
kwimage.overlay_alpha_images
Example
>>> # xdoc: +REQUIRES(module:matplotlib) >>> probs = np.tile(np.linspace(0, 1, 10), (10, 1)) >>> heatmask = make_heatmask(probs, with_alpha=0.8, dsize=(100, 100)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.imshow(heatmask, fnum=1, doclf=True, colorspace='rgb') >>> kwplot.show_if_requested()
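The relationship between probs, with_alpha, and the output alpha channel can be illustrated with NumPy alone. This is a sketch under stated assumptions: `heatmask_sketch` is a hypothetical name, and a simple blue-to-red blend stands in for the matplotlib colormap that the real function uses.

```python
import numpy as np


def heatmask_sketch(probs, with_alpha=1.0):
    """Colorize a 2D probability map into an RGBA float image (illustrative)."""
    probs = np.clip(np.asarray(probs, dtype=np.float32), 0, 1)
    # Stand-in for a matplotlib colormap: blend from blue (0) to red (1).
    low = np.array([0.0, 0.0, 1.0], dtype=np.float32)
    high = np.array([1.0, 0.0, 0.0], dtype=np.float32)
    rgb = (1 - probs)[..., None] * low + probs[..., None] * high
    # The alpha channel is the probability scaled by ``with_alpha``.
    alpha = probs * with_alpha
    return np.dstack([rgb, alpha])


probs = np.tile(np.linspace(0, 1, 10), (10, 1))
heatmask = heatmask_sketch(probs, with_alpha=0.8)
print(heatmask.shape)
print(heatmask[0, -1])
```

Low-probability pixels end up nearly transparent, which is what makes the result suitable for overlaying on another image.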
- kwimage.im_draw.make_orimask(radians, mag=None, alpha=1.0)[source]¶
Makes a colormap in HSV space where the orientation changes color and mag changes the saturation/value.
- Parameters
radians (ndarray) – orientation in radians
mag (ndarray) – magnitude (must be normalized between 0 and 1)
alpha (float | ndarray) – if False or None, the image is returned without alpha; if a float, mag is scaled by this and used as the alpha channel; if an ndarray, it is explicitly set as the alpha channel
- Returns
an rgb / rgba image in 0-1 space
- Return type
ndarray[float32]
- SeeAlso:
kwimage.overlay_alpha_images
Example
>>> # xdoc: +REQUIRES(module:matplotlib) >>> x, y = np.meshgrid(np.arange(64), np.arange(64)) >>> dx, dy = x - 32, y - 32 >>> radians = np.arctan2(dx, dy) >>> mag = np.sqrt(dx ** 2 + dy ** 2) >>> orimask = make_orimask(radians, mag) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.imshow(orimask, fnum=1, doclf=True, colorspace='rgb') >>> kwplot.show_if_requested()
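The idea of mapping orientation to hue and magnitude to saturation/value can be shown for a single pixel with the standard library. This is a minimal sketch, not necessarily the exact mapping kwimage uses; `orientation_to_rgb` is a hypothetical name.

```python
import colorsys
import math


def orientation_to_rgb(radians, mag=1.0):
    """Map an angle to a hue and a magnitude in [0, 1] to saturation/value."""
    # Wrap the angle into [0, 2*pi) and rescale it to a hue in [0, 1).
    hue = (radians % (2 * math.pi)) / (2 * math.pi)
    return colorsys.hsv_to_rgb(hue, mag, mag)


print(orientation_to_rgb(0.0))          # hue 0 at full magnitude is red
print(orientation_to_rgb(math.pi, 0.5))
print(orientation_to_rgb(1.0, 0.0))     # zero magnitude is black
```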
- kwimage.im_draw.make_vector_field(dx, dy, stride=0.02, thresh=0.0, scale=1.0, alpha=1.0, color='red', thickness=1, tipLength=0.1, line_type='aa')[source]¶
Create an image representing a 2D vector field.
- Parameters
dx (ndarray) – grid of vector x components
dy (ndarray) – grid of vector y components
stride (int | float) – sparsity of vectors, int specifies stride step in pixels, a float specifies it as a percentage.
thresh (float) – only plot vectors with magnitude greater than thresh
scale (float) – multiply magnitude for easier visualization
alpha (float) – alpha value for vectors. Non-vector regions receive 0 alpha (if False, no alpha channel is used)
color (str | tuple | kwimage.Color) – RGB color of the vectors
thickness (int, default=1) – thickness of arrows
tipLength (float, default=0.1) – fraction of line length
line_type (int) – either cv2.LINE_4, cv2.LINE_8, or cv2.LINE_AA
- Returns
vec_img: an rgb/rgba image in 0-1 space
- Return type
ndarray[float32]
- SeeAlso:
kwimage.overlay_alpha_images
DEPRECATED: use draw_vector_field instead
Example
>>> x, y = np.meshgrid(np.arange(512), np.arange(512)) >>> dx, dy = x - 256.01, y - 256.01 >>> radians = np.arctan2(dx, dy) >>> mag = np.sqrt(dx ** 2 + dy ** 2) >>> dx, dy = dx / mag, dy / mag >>> img = make_vector_field(dx, dy, scale=10, alpha=False) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img) >>> kwplot.show_if_requested()
- kwimage.im_draw.draw_vector_field(image, dx, dy, stride=0.02, thresh=0.0, scale=1.0, alpha=1.0, color='red', thickness=1, tipLength=0.1, line_type='aa')[source]¶
Create an image representing a 2D vector field.
- Parameters
image (ndarray) – image to draw on
dx (ndarray) – grid of vector x components
dy (ndarray) – grid of vector y components
stride (int | float) – sparsity of vectors, int specifies stride step in pixels, a float specifies it as a percentage.
thresh (float) – only plot vectors with magnitude greater than thresh
scale (float) – multiply magnitude for easier visualization
alpha (float) – alpha value for vectors. Non-vector regions receive 0 alpha (if False, no alpha channel is used)
color (str | tuple | kwimage.Color) – RGB color of the vectors
thickness (int, default=1) – thickness of arrows
tipLength (float, default=0.1) – fraction of line length
line_type (int) – either cv2.LINE_4, cv2.LINE_8, or cv2.LINE_AA
- Returns
- The image with vectors overlaid. If image=None, then an rgb/a image is created and returned.
- Return type
ndarray[float32]
Example
>>> import kwimage >>> width, height = 512, 512 >>> image = kwimage.grab_test_image(dsize=(width, height)) >>> x, y = np.meshgrid(np.arange(height), np.arange(width)) >>> dx, dy = x - width / 2, y - height / 2 >>> radians = np.arctan2(dx, dy) >>> mag = np.sqrt(dx ** 2 + dy ** 2) + 1e-3 >>> dx, dy = dx / mag, dy / mag >>> img = kwimage.draw_vector_field(image, dx, dy, scale=10, alpha=False) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img) >>> kwplot.show_if_requested()
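The stride parameter accepts either an integer pixel step or a float. The sketch below shows one way such a value could be resolved to a pixel step. It assumes that a float stride is a fraction of the smaller grid side; the parameter description only says "a percentage", so that choice and the name `resolve_stride` are assumptions for illustration.

```python
import math


def resolve_stride(stride, height, width):
    """Convert an int pixel stride or a float fraction into a pixel step."""
    if isinstance(stride, float):
        if not 0 < stride <= 1:
            raise ValueError('a float stride must be in (0, 1]')
        # Assumption: the fraction is relative to the smaller grid side.
        stride = int(math.ceil(stride * min(height, width)))
    # Never return a step smaller than one pixel.
    return max(int(stride), 1)


print(resolve_stride(0.02, 512, 512))
print(resolve_stride(16, 512, 512))
```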
kwimage.im_filter
Module Contents¶
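Function | Summary |
---|---|
radial_fourier_mask(img_hwc, radius=11, axis=None, clip=None) | In [1] they use a radius of 11.0 on CIFAR-10. |
fourier_mask(img_hwc, mask, axis=None, clip=None) | Applies a mask to the fourier spectrum of an image |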
- kwimage.im_filter.radial_fourier_mask(img_hwc, radius=11, axis=None, clip=None)[source]¶
In [1] they use a radius of 11.0 on CIFAR-10.
- Parameters
img_hwc (ndarray) – assumed to be a float image with values in the range 0 to 1
References
[1] Jo and Bengio “Measuring the tendency of CNNs to Learn Surface Statistical Regularities” 2017.
https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_transforms/py_fourier_transform/py_fourier_transform.html
Example
>>> from kwimage.im_filter import * # NOQA >>> import kwimage >>> img_hwc = kwimage.grab_test_image() >>> img_hwc = kwimage.ensure_float01(img_hwc) >>> out_hwc = radial_fourier_mask(img_hwc, radius=11) >>> # xdoc: REQUIRES(--show) >>> import kwplot >>> plt = kwplot.autoplt() >>> def keepdim(func): >>> def _wrap(im): >>> needs_transpose = (im.shape[0] == 3) >>> if needs_transpose: >>> im = im.transpose(1, 2, 0) >>> out = func(im) >>> if needs_transpose: >>> out = out.transpose(2, 0, 1) >>> return out >>> return _wrap >>> @keepdim >>> def rgb_to_lab(im): >>> return kwimage.convert_colorspace(im, src_space='rgb', dst_space='lab') >>> @keepdim >>> def lab_to_rgb(im): >>> return kwimage.convert_colorspace(im, src_space='lab', dst_space='rgb') >>> @keepdim >>> def rgb_to_yuv(im): >>> return kwimage.convert_colorspace(im, src_space='rgb', dst_space='yuv') >>> @keepdim >>> def yuv_to_rgb(im): >>> return kwimage.convert_colorspace(im, src_space='yuv', dst_space='rgb') >>> def show_data(img_hwc): >>> # dpath = ub.ensuredir('./fouriertest') >>> kwplot.imshow(img_hwc, fnum=1) >>> pnum_ = kwplot.PlotNums(nRows=4, nCols=5) >>> for r in range(0, 17): >>> imgt = radial_fourier_mask(img_hwc, r, clip=(0, 1)) >>> kwplot.imshow(imgt, pnum=pnum_(), fnum=2) >>> plt.gca().set_title('r = {}'.format(r)) >>> kwplot.set_figtitle('RGB') >>> # plt.gcf().savefig(join(dpath, '{}_{:08d}.png'.format('rgb', x))) >>> pnum_ = kwplot.PlotNums(nRows=4, nCols=5) >>> for r in range(0, 17): >>> imgt = lab_to_rgb(radial_fourier_mask(rgb_to_lab(img_hwc), r)) >>> kwplot.imshow(imgt, pnum=pnum_(), fnum=3) >>> plt.gca().set_title('r = {}'.format(r)) >>> kwplot.set_figtitle('LAB') >>> # plt.gcf().savefig(join(dpath, '{}_{:08d}.png'.format('lab', x))) >>> pnum_ = kwplot.PlotNums(nRows=4, nCols=5) >>> for r in range(0, 17): >>> imgt = yuv_to_rgb(radial_fourier_mask(rgb_to_yuv(img_hwc), r)) >>> kwplot.imshow(imgt, pnum=pnum_(), fnum=4) >>> plt.gca().set_title('r = {}'.format(r)) >>> kwplot.set_figtitle('YUV') >>> 
# plt.gcf().savefig(join(dpath, '{}_{:08d}.png'.format('yuv', x))) >>> show_data(img_hwc) >>> kwplot.show_if_requested()
- kwimage.im_filter.fourier_mask(img_hwc, mask, axis=None, clip=None)[source]¶
Applies a mask to the fourier spectrum of an image
- Parameters
img_hwc (ndarray) – assumed to be a float image with values in the range 0 to 1
mask (ndarray) – mask used to modulate the image in the fourier domain. Usually these are boolean values (hence the name mask), but any numerical value is technically allowed.
- CommandLine:
xdoctest -m kwimage.im_filter fourier_mask --show
Example
>>> from kwimage.im_filter import * # NOQA >>> import kwimage >>> img_hwc = kwimage.grab_test_image(space='gray') >>> mask = np.random.rand(*img_hwc.shape[0:2]) >>> out_hwc = fourier_mask(img_hwc, mask) >>> # xdoc: REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img_hwc, pnum=(1, 2, 1), fnum=1) >>> kwplot.imshow(out_hwc, pnum=(1, 2, 2), fnum=1) >>> kwplot.show_if_requested()
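Masking in the Fourier domain can be sketched with NumPy's FFT routines for a single-channel image. This is an illustration of the general technique, not kwimage's code: `fourier_mask_sketch` and `disk_mask` are hypothetical names, and the centered disk is only one possible mask (it acts as a low-pass filter here; the documentation above does not state which region radial_fourier_mask keeps).

```python
import numpy as np


def fourier_mask_sketch(img_hw, mask):
    """Modulate the centered 2D spectrum of a single-channel image by mask."""
    # Shift the zero-frequency term to the center so the mask is intuitive.
    spectrum = np.fft.fftshift(np.fft.fft2(img_hw))
    filtered = spectrum * mask
    # Undo the shift and return to the spatial domain.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))


def disk_mask(shape, radius):
    """Boolean mask that keeps frequencies within radius of the center."""
    h, w = shape
    yy, xx = np.ogrid[:h, :w]
    return (yy - h // 2) ** 2 + (xx - w // 2) ** 2 <= radius ** 2


rng = np.random.default_rng(0)
img = rng.random((64, 64))
identity = fourier_mask_sketch(img, np.ones(img.shape))
lowpass = fourier_mask_sketch(img, disk_mask(img.shape, radius=11))
print(np.allclose(identity, img))   # an all-ones mask leaves the image intact
print(lowpass.std() < img.std())    # removing high frequencies smooths it
```

For a multi-channel image the same operation would be applied to each channel independently.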
kwimage.im_io
This module provides the functions imread and imwrite, which are wrappers around concrete readers/writers provided by other libraries. This allows us to support a wider array of formats than any individual backend.
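The wrapper idea can be pictured as a small dispatch step that chooses a backend from the file extension unless the caller names one. The sketch below is purely illustrative: the extension-to-backend table and the `resolve_backend` helper are hypothetical, and kwimage's actual selection rules may differ.

```python
from os.path import splitext

# Hypothetical extension-to-backend table; kwimage's real rules may differ.
_EXT_TO_BACKEND = {
    '.png': 'cv2',
    '.jpg': 'cv2',
    '.tif': 'gdal',
    '.mha': 'itk',
}


def resolve_backend(fpath, backend='auto'):
    """Pick a backend name from the file extension unless one is given."""
    if backend != 'auto':
        # An explicit choice always overrides the extension-based guess.
        return backend
    ext = splitext(fpath)[1].lower()
    return _EXT_TO_BACKEND.get(ext, 'skimage')


print(resolve_backend('photo.PNG'))
print(resolve_backend('scan.tif', backend='skimage'))
```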
Module Contents¶
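Function | Summary |
---|---|
imread(fpath, space='auto', backend='auto') | Reads image data in a specified format using some backend implementation. |
imwrite(fpath, image, space='auto', backend='auto', **kwargs) | Writes image data to disk. |
load_image_shape(fpath) | Determine the height/width/channels of an image without reading the entire file. |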
- kwimage.im_io.imread(fpath, space='auto', backend='auto')[source]¶
Reads image data in a specified format using some backend implementation.
- Parameters
fpath (str) – path to the file to be read
space (str, default=’auto’) – The desired colorspace of the image. Can be any colorspace accepted by convert_colorspace, or it can be ‘auto’, in which case the colorspace of the image is unmodified (except in the case where a color image is read by opencv, in which case we convert BGR to RGB by default). If None, then no modification is made to whatever backend is used to read the image.
New in version 0.7.10: when the backend does not resolve to “cv2” the “auto” space resolves to None, thus the image is read as-is.
backend (str, default=’auto’) – which backend reader to use. By default the file extension is used to determine this, but it can be manually overridden. Valid backends are ‘gdal’, ‘skimage’, ‘itk’, and ‘cv2’.
- Returns
the image data in the specified color space.
- Return type
ndarray
Note
if space is something non-standard like HSV or LAB, then the file must be a normal 8-bit color image, otherwise an error will occur.
- Raises
IOError – if the image cannot be read
ImportError – if trying to read a nitf without gdal
NotImplementedError – if trying to read a corner-case image
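OpenCV returns color images with channels in BGR order, which is why the ‘auto’ space converts to RGB when that backend is used. The conversion itself amounts to reversing the channel axis, as this small NumPy sketch shows (the array values are made up for illustration).

```python
import numpy as np

# A 1x2 "image" as OpenCV would return it: channels ordered blue, green, red.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps the blue and red channels.
rgb = bgr[..., ::-1]

print(rgb[0, 0].tolist())  # the first pixel is pure blue, now in RGB order
print(rgb[0, 1].tolist())  # the second pixel is pure red
```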
Example
>>> # xdoctest: +REQUIRES(--network) >>> from kwimage.im_io import * # NOQA >>> import tempfile >>> from os.path import splitext # NOQA >>> # Test a non-standard image, which encodes a depth map >>> fpath = ub.grabdata( >>> 'http://www.topcoder.com/contest/problem/UrbanMapper3D/JAX_Tile_043_DTM.tif', >>> hasher='sha256', hash_prefix='64522acba6f0fb7060cd4c202ed32c5163c34e63d386afdada4190cce51ff4d4') >>> img1 = imread(fpath) >>> # Check that write + read preserves data >>> tmp = tempfile.NamedTemporaryFile(suffix=splitext(fpath)[1]) >>> imwrite(tmp.name, img1) >>> img2 = imread(tmp.name) >>> assert np.all(img2 == img1) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img1, pnum=(1, 2, 1), fnum=1, norm=True) >>> kwplot.imshow(img2, pnum=(1, 2, 2), fnum=1, norm=True)
Example
>>> # xdoctest: +REQUIRES(--network) >>> import tempfile >>> img1 = imread(ub.grabdata( >>> 'http://i.imgur.com/iXNf4Me.png', fname='ada.png', hasher='sha256', >>> hash_prefix='898cf2588c40baf64d6e09b6a93b4c8dcc0db26140639a365b57619e17dd1c77')) >>> tmp_tif = tempfile.NamedTemporaryFile(suffix='.tif') >>> tmp_png = tempfile.NamedTemporaryFile(suffix='.png') >>> imwrite(tmp_tif.name, img1) >>> imwrite(tmp_png.name, img1) >>> tif_im = imread(tmp_tif.name) >>> png_im = imread(tmp_png.name) >>> assert np.all(tif_im == png_im) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(png_im, pnum=(1, 2, 1), fnum=1) >>> kwplot.imshow(tif_im, pnum=(1, 2, 2), fnum=1)
Example
>>> # xdoctest: +REQUIRES(--network) >>> import tempfile >>> tif_fpath = ub.grabdata( >>> 'https://ghostscript.com/doc/tiff/test/images/rgb-3c-16b.tiff', >>> fname='pepper.tif', hasher='sha256', >>> hash_prefix='31ff3a1f416cb7281acfbcbb4b56ee8bb94e9f91489602ff2806e5a49abc03c0') >>> img1 = imread(tif_fpath) >>> tmp_tif = tempfile.NamedTemporaryFile(suffix='.tif') >>> tmp_png = tempfile.NamedTemporaryFile(suffix='.png') >>> imwrite(tmp_tif.name, img1) >>> imwrite(tmp_png.name, img1) >>> tif_im = imread(tmp_tif.name) >>> png_im = imread(tmp_png.name) >>> assert np.all(tif_im == png_im) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(png_im / 2 ** 16, pnum=(1, 2, 1), fnum=1) >>> kwplot.imshow(tif_im / 2 ** 16, pnum=(1, 2, 2), fnum=1)
Example
>>> # xdoctest: +REQUIRES(module:itk, --network) >>> import kwimage >>> import ubelt as ub >>> # Grab an image that ITK can read >>> fpath = ub.grabdata( >>> url='https://data.kitware.com/api/v1/file/606754e32fa25629b9476f9e/download', >>> fname='brainweb1e5a10f17Rot20Tx20.mha', >>> hash_prefix='08f0812591691ae24a29788ba8cd1942e91', hasher='sha512') >>> # Read the image (this is actually a DxHxW stack of images) >>> img1_stack = kwimage.imread(fpath) >>> # Check that write + read preserves data >>> import tempfile >>> tmp_file = tempfile.NamedTemporaryFile(suffix='.mha') >>> kwimage.imwrite(tmp_file.name, img1_stack) >>> recon = kwimage.imread(tmp_file.name) >>> assert not np.may_share_memory(recon, img1_stack) >>> assert np.all(recon == img1_stack) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(kwimage.stack_images_grid(recon[0::20])) >>> kwplot.show_if_requested()
- Benchmark:
>>> from kwimage.im_io import * # NOQA >>> import timerit >>> import kwimage >>> import tempfile >>> # >>> dsize = (1920, 1080) >>> img1 = kwimage.grab_test_image('amazon', dsize=dsize) >>> ti = timerit.Timerit(10, bestof=3, verbose=1, unit='us') >>> formats = {} >>> dpath = ub.ensure_app_cache_dir('cache') >>> space = 'auto' >>> formats['png'] = kwimage.imwrite(join(dpath, '.png'), img1, space=space, backend='cv2') >>> formats['jpg'] = kwimage.imwrite(join(dpath, '.jpg'), img1, space=space, backend='cv2') >>> formats['tif_raw'] = kwimage.imwrite(join(dpath, '.raw.tif'), img1, space=space, backend='gdal', compress='RAW') >>> formats['tif_deflate'] = kwimage.imwrite(join(dpath, '.deflate.tif'), img1, space=space, backend='gdal', compress='DEFLATE') >>> formats['tif_lzw'] = kwimage.imwrite(join(dpath, '.lzw.tif'), img1, space=space, backend='gdal', compress='LZW') >>> grid = [ >>> ('cv2', 'png'), >>> ('cv2', 'jpg'), >>> ('gdal', 'jpg'), >>> ('turbojpeg', 'jpg'), >>> ('gdal', 'tif_raw'), >>> ('gdal', 'tif_lzw'), >>> ('gdal', 'tif_deflate'), >>> ('skimage', 'tif_raw'), >>> ] >>> backend, filefmt = 'cv2', 'png' >>> for backend, filefmt in grid: >>> for timer in ti.reset(f'imread-{filefmt}-{backend}'): >>> with timer: >>> kwimage.imread(formats[filefmt], space=space, backend=backend) >>> # Test all formats in auto mode >>> for filefmt in formats.keys(): >>> for timer in ti.reset(f'kwimage.imread-{filefmt}-auto'): >>> with timer: >>> kwimage.imread(formats[filefmt], space=space, backend='auto') >>> ti.measures = ub.map_vals(ub.sorted_vals, ti.measures) >>> import netharn as nh >>> print('ti.measures = {}'.format(nh.util.align(ub.repr2(ti.measures['min'], nl=2), ':'))) Timed best=42891.504 µs, mean=44008.439 ± 1409.2 µs for imread-png-cv2 Timed best=33146.808 µs, mean=34185.172 ± 656.3 µs for imread-jpg-cv2 Timed best=40120.306 µs, mean=41220.927 ± 1010.9 µs for imread-jpg-gdal Timed best=30798.162 µs, mean=31573.070 ± 737.0 µs for imread-jpg-turbojpeg Timed best=6223.170 
µs, mean=6370.462 ± 150.7 µs for imread-tif_raw-gdal Timed best=42459.404 µs, mean=46519.940 ± 5664.9 µs for imread-tif_lzw-gdal Timed best=36271.175 µs, mean=37301.108 ± 861.1 µs for imread-tif_deflate-gdal Timed best=5239.503 µs, mean=6566.574 ± 1086.2 µs for imread-tif_raw-skimage ti.measures = { 'imread-tif_raw-skimage' : 0.0052395030070329085, 'imread-tif_raw-gdal' : 0.006223169999429956, 'imread-jpg-turbojpeg' : 0.030798161998973228, 'imread-jpg-cv2' : 0.03314680799667258, 'imread-tif_deflate-gdal': 0.03627117499127053, 'imread-jpg-gdal' : 0.040120305988239124, 'imread-tif_lzw-gdal' : 0.042459404008695856, 'imread-png-cv2' : 0.042891503995633684, }
>>> print('ti.measures = {}'.format(nh.util.align(ub.repr2(ti.measures['mean'], nl=2), ':')))
- kwimage.im_io.imwrite(fpath, image, space='auto', backend='auto', **kwargs)[source]¶
Writes image data to disk.
- Parameters
fpath (PathLike) – location to save the image
image (ndarray) – image data
space (str | None, default=’auto’) – the colorspace of the image to save. Can be any colorspace accepted by convert_colorspace, or it can be ‘auto’, in which case we assume the input image is either RGB, RGBA or grayscale. If None, then absolutely no color modification is made and whatever backend is used writes the image as-is.
New in version 0.7.10: when the backend does not resolve to “cv2”, the “auto” space resolves to None, thus the image is saved as-is.
backend (str, default=’auto’) – which backend writer to use. By default the file extension is used to determine this. Valid backends are ‘gdal’, ‘skimage’, ‘itk’, and ‘cv2’.
**kwargs – args passed to the backend writer
- Returns
path to the written file
- Return type
Notes
The image may be modified to preserve its colorspace depending on which backend is used to write the image.
When saving as a jpeg or png, the image must be encoded with the uint8 data type. When saving as a tiff, any data type is allowed.
- Raises
Exception – if the image cannot be written
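Because jpeg and png output requires uint8 data, a float image in the 0 to 1 range has to be converted before writing. A minimal NumPy sketch of that conversion follows; `to_uint8` is a hypothetical helper written for this example, not a kwimage function.

```python
import numpy as np


def to_uint8(img01):
    """Convert a float image with values in [0, 1] to uint8 in [0, 255]."""
    # Clip first so out-of-range values do not wrap around after the cast.
    img01 = np.clip(np.asarray(img01, dtype=np.float64), 0.0, 1.0)
    return np.round(img01 * 255).astype(np.uint8)


data = np.array([[0.0, 0.5, 1.0, 1.2]])
converted = to_uint8(data)
print(converted.dtype, converted.tolist())
```

The converted array can then be passed to a png or jpeg writer, while the original float data could be saved losslessly as a tiff.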
- Doctest:
>>> # xdoctest: +REQUIRES(--network) >>> # This should be moved to a unit test >>> import tempfile >>> test_image_paths = [ >>> ub.grabdata('https://ghostscript.com/doc/tiff/test/images/rgb-3c-16b.tiff', fname='pepper.tif'), >>> ub.grabdata('http://i.imgur.com/iXNf4Me.png', fname='ada.png'), >>> #ub.grabdata('http://www.topcoder.com/contest/problem/UrbanMapper3D/JAX_Tile_043_DTM.tif'), >>> ub.grabdata('https://upload.wikimedia.org/wikipedia/commons/f/fa/Grayscale_8bits_palette_sample_image.png', fname='parrot.png') >>> ] >>> for fpath in test_image_paths: >>> for space in ['auto', 'rgb', 'bgr', 'gray', 'rgba']: >>> img1 = imread(fpath, space=space) >>> print('Test im-io consistency of fpath = {!r} in {} space, shape={}'.format(fpath, space, img1.shape)) >>> # Write the image in TIF and PNG format >>> tmp_tif = tempfile.NamedTemporaryFile(suffix='.tif') >>> tmp_png = tempfile.NamedTemporaryFile(suffix='.png') >>> imwrite(tmp_tif.name, img1, space=space, backend='skimage') >>> imwrite(tmp_png.name, img1, space=space) >>> tif_im = imread(tmp_tif.name, space=space) >>> png_im = imread(tmp_png.name, space=space) >>> assert np.all(tif_im == png_im), 'im-read/write inconsistency' >>> if _have_gdal: >>> tmp_tif2 = tempfile.NamedTemporaryFile(suffix='.tif') >>> imwrite(tmp_tif2.name, img1, space=space, backend='gdal') >>> tif_im2 = imread(tmp_tif2.name, space=space) >>> assert np.all(tif_im == tif_im2), 'im-read/write inconsistency' >>> if space == 'gray': >>> assert tif_im.ndim == 2 >>> assert png_im.ndim == 2 >>> elif space in ['rgb', 'bgr']: >>> assert tif_im.shape[2] == 3 >>> assert png_im.shape[2] == 3 >>> elif space in ['rgba', 'bgra']: >>> assert tif_im.shape[2] == 4 >>> assert png_im.shape[2] == 4
- Benchmark:
>>> import timerit >>> import os >>> import kwimage >>> import tempfile >>> # >>> img1 = kwimage.grab_test_image('astro', dsize=(1920, 1080)) >>> space = 'auto' >>> # >>> file_sizes = {} >>> # >>> ti = timerit.Timerit(10, bestof=3, verbose=2) >>> # >>> for timer in ti.reset('imwrite-skimage-tif'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.tif') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='skimage') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size >>> # >>> for timer in ti.reset('imwrite-cv2-png'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.png') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='cv2') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size >>> # >>> for timer in ti.reset('imwrite-cv2-jpg'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.jpg') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='cv2') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size >>> # >>> for timer in ti.reset('imwrite-gdal-raw'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.tif') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='RAW') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size >>> # >>> for timer in ti.reset('imwrite-gdal-lzw'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.tif') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='LZW') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size >>> # >>> for timer in ti.reset('imwrite-gdal-zstd'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.tif') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='ZSTD') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size >>> # >>> for timer in ti.reset('imwrite-gdal-deflate'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.tif') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='DEFLATE') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size 
>>> # >>> for timer in ti.reset('imwrite-gdal-jpeg'): >>> with timer: >>> tmp = tempfile.NamedTemporaryFile(suffix='.tif') >>> kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='JPEG') >>> file_sizes[ti.label] = os.stat(tmp.name).st_size >>> # >>> file_sizes = ub.sorted_vals(file_sizes) >>> import xdev >>> file_sizes_human = ub.map_vals(lambda x: xdev.byte_str(x, 'MB'), file_sizes) >>> print('ti.rankings = {}'.format(ub.repr2(ti.rankings, nl=2))) >>> print('file_sizes = {}'.format(ub.repr2(file_sizes_human, nl=1)))
Example
>>> # Test saving a multi-band file >>> import kwimage >>> import tempfile >>> # In this case the backend will not resolve to cv2, so >>> # we should not need to specify space. >>> data = np.random.rand(32, 32, 13).astype(np.float32) >>> temp = tempfile.NamedTemporaryFile(suffix='.tif') >>> fpath = temp.name >>> kwimage.imwrite(fpath, data) >>> recon = kwimage.imread(fpath) >>> assert np.all(recon == data)
>>> kwimage.imwrite(fpath, data, backend='skimage') >>> recon = kwimage.imread(fpath) >>> assert np.all(recon == data)
>>> import pytest >>> # In this case the backend will resolve to cv2, and thus we expect >>> # a failure >>> temp = tempfile.NamedTemporaryFile(suffix='.png') >>> fpath = temp.name >>> with pytest.raises(NotImplementedError): >>> kwimage.imwrite(fpath, data)
- kwimage.im_io.load_image_shape(fpath)[source]¶
Determine the height/width/channels of an image without reading the entire file.
- Parameters
fpath (str) – path to an image
- Returns
- Tuple - shape of the dataset.
Recall this library uses the convention that “shape” refers to height,width,channels ordering and “size” refers to width,height ordering.
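The shape versus size convention is easy to mix up, so here is a small sketch of converting between the two. The helper names are hypothetical and exist only for this illustration.

```python
def shape_to_dsize(shape):
    """Convert a (height, width[, channels]) shape to a (width, height) size."""
    height, width = shape[0:2]
    return (width, height)


def dsize_to_shape(dsize, channels=None):
    """Convert a (width, height) size to a (height, width[, channels]) shape."""
    width, height = dsize
    if channels is None:
        return (height, width)
    return (height, width, channels)


print(shape_to_dsize((1080, 1920, 3)))
print(dsize_to_shape((1920, 1080), channels=3))
```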
- Benchmark:
>>> # For large files, PIL is much faster >>> import gdal >>> from PIL import Image >>> # >>> import kwimage >>> fpath = kwimage.grab_test_image_fpath() >>> # >>> ti = ub.Timerit(100, bestof=10, verbose=2) >>> for timer in ti.reset('gdal'): >>> with timer: >>> gdal_dset = gdal.Open(fpath, gdal.GA_ReadOnly) >>> width = gdal_dset.RasterXSize >>> height = gdal_dset.RasterYSize >>> gdal_dset = None >>> # >>> for timer in ti.reset('PIL'): >>> with timer: >>> pil_img = Image.open(fpath) >>> width, height = pil_img.size >>> pil_img.close() Timed gdal for: 100 loops, best of 10 time per loop: best=62.967 µs, mean=63.991 ± 0.8 µs Timed PIL for: 100 loops, best of 10 time per loop: best=46.640 µs, mean=47.314 ± 0.4 µs
kwimage.im_runlen
Logic pertaining to run-length encodings
- SeeAlso:
- kwimage.structs.mask - stores binary segmentation masks, using RLEs as a backend representation. Also contains cython logic for handling the coco-rle format.
Module Contents¶
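Function | Summary |
---|---|
encode_run_length(img, binary=False, order='C') | Construct the run length encoding (RLE) of an image. |
decode_run_length(counts, shape, binary=False, dtype=np.uint8, order='C') | Decode run length encoding back into an image. |
rle_translate(rle, offset, output_shape=None) | Translates a run-length encoded image in RLE-space. |
_rle_bytes_to_array(s, impl='auto') | Uncompresses a coco-bytes RLE into an array representation. |
_rle_array_to_bytes(counts, impl='auto') | Compresses an array RLE into a coco-bytes RLE. |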
- kwimage.im_runlen.encode_run_length(img, binary=False, order='C')[source]¶
Construct the run length encoding (RLE) of an image.
- Parameters
img (ndarray) – 2D image
binary (bool, default=False) – If true, assume that the input image only contains 0’s and 1’s. Set to True for compatibility with COCO (which does not support multi-value RLE encodings).
order ({‘C’, ‘F’}, default=’C’) – row-major (C) or column-major (F)
- Returns
encoding: dictionary items are:
counts (ndarray): the run length encoding
shape (Tuple): the original image shape. This should be in standard shape row-major (e.g. h/w) order
binary (bool): if True, the counts are assumed to encode only 0’s and 1’s, otherwise the counts encoding specifies any numeric values.
order ({‘C’, ‘F’}, default=’C’): encoding order
- Return type
- SeeAlso:
kwimage.Mask - a cython-backed data structure to handle coco-style RLEs
Example
>>> import ubelt as ub >>> lines = ub.codeblock( >>> ''' >>> .......... >>> ......111. >>> ..2...111. >>> .222..111. >>> 22222..... >>> .222...... >>> ..2....... >>> ''').replace('.', '0').splitlines() >>> img = np.array([list(map(int, line)) for line in lines]) >>> encoding = encode_run_length(img) >>> target = np.array([0,16,1,3,0,3,2,1,0,3,1,3,0,2,2,3,0,2,1,3,0,1,2,5,0,6,2,3,0,8,2,1,0,7]) >>> assert np.all(target == encoding['counts'])
Example
>>> binary = True >>> img = np.array([[1, 0, 1, 1, 1, 0, 0, 1, 0]]) >>> encoding = encode_run_length(img, binary=True) >>> assert encoding['counts'].tolist() == [0, 1, 1, 3, 2, 1, 1]
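The two count layouts seen in the examples above can be reproduced with a short pure-Python sketch. It is inferred from those example outputs rather than taken from the kwimage source: binary counts alternate runs of zeros and ones starting with zeros, and non-binary counts interleave each value with its run length. `encode_rle_sketch` is a hypothetical name and only handles row-major order.

```python
from itertools import groupby


def encode_rle_sketch(rows, binary=False):
    """Run-length encode a 2D list of ints in row-major (C) order."""
    flat = [v for row in rows for v in row]
    runs = [(value, len(list(group))) for value, group in groupby(flat)]
    if binary:
        # Binary counts alternate zeros, ones, zeros, ... and always begin
        # with the number of leading zeros (which may be 0).
        counts = [n for _, n in runs]
        if runs and runs[0][0] != 0:
            counts = [0] + counts
    else:
        # Multi-value counts interleave each value with its run length.
        counts = [x for run in runs for x in run]
    shape = (len(rows), len(rows[0]) if rows else 0)
    return {'counts': counts, 'shape': shape, 'binary': binary, 'order': 'C'}


print(encode_rle_sketch([[1, 0, 1, 1, 1, 0, 0, 1, 0]], binary=True)['counts'])
print(encode_rle_sketch([[0, 0, 2], [2, 1, 1]])['counts'])
```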
- kwimage.im_runlen.decode_run_length(counts, shape, binary=False, dtype=np.uint8, order='C')[source]¶
Decode run length encoding back into an image.
- Parameters
counts (ndarray) – the run-length encoding
shape (Tuple[int, int])
binary (bool) – if the RLE is binary or non-binary. Set to True for compatibility with COCO.
dtype (dtype, default=np.uint8) – data type for decoded image
order ({‘C’, ‘F’}, default=’C’) – row-major (C) or column-major (F)
- Returns
the reconstructed image
- Return type
ndarray
Example
>>> from kwimage.im_runlen import * # NOQA >>> img = np.array([[1, 0, 1, 1, 1, 0, 0, 1, 0]]) >>> encoded = encode_run_length(img, binary=True) >>> recon = decode_run_length(**encoded) >>> assert np.all(recon == img)
>>> import ubelt as ub >>> lines = ub.codeblock( >>> ''' >>> .......... >>> ......111. >>> ..2...111. >>> .222..111. >>> 22222..... >>> .222...... >>> ..2....... >>> ''').replace('.', '0').splitlines() >>> img = np.array([list(map(int, line)) for line in lines]) >>> encoded = encode_run_length(img) >>> recon = decode_run_length(**encoded) >>> assert np.all(recon == img)
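Decoding reverses the process: each run is expanded and the flat sequence is cut back into rows. The sketch below assumes the same count layouts as the examples above (alternating zero/one runs for binary, value/length pairs otherwise) and row-major order; `decode_rle_sketch` is a hypothetical name, not the library function.

```python
def decode_rle_sketch(counts, shape, binary=False):
    """Expand run-length counts into a 2D list of ints (row-major order)."""
    flat = []
    if binary:
        # Even positions are runs of zeros, odd positions are runs of ones.
        for index, n in enumerate(counts):
            flat.extend([index % 2] * n)
    else:
        # Counts are interleaved as value, length, value, length, ...
        for value, n in zip(counts[0::2], counts[1::2]):
            flat.extend([value] * n)
    height, width = shape
    if len(flat) != height * width:
        raise ValueError('counts do not match the requested shape')
    return [flat[r * width:(r + 1) * width] for r in range(height)]


print(decode_rle_sketch([0, 1, 1, 3, 2, 1, 1], (1, 9), binary=True))
print(decode_rle_sketch([0, 2, 2, 2, 1, 2], (2, 3)))
```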
- kwimage.im_runlen.rle_translate(rle, offset, output_shape=None)[source]¶
Translates a run-length encoded image in RLE-space.
- Parameters
rle (dict) – an encoding dict returned by encode_run_length
offset (Tuple) – x,y offset, CAREFUL, this can only accept integers
output_shape (Tuple, optional) – h,w of transformed mask. If unspecified the input rle shape is used.
- SeeAlso:
# ITK has some RLE code that looks like it can perform translations https://github.com/KitwareMedical/ITKRLEImage/blob/master/include/itkRLERegionOfInterestImageFilter.h
- Doctest:
>>> # test that translate works on all zero images >>> img = np.zeros((7, 8), dtype=np.uint8) >>> rle = encode_run_length(img, binary=True, order='F') >>> new_rle = rle_translate(rle, (1, 2), (6, 9)) >>> assert np.all(new_rle['counts'] == [54])
Example
>>> from kwimage.im_runlen import * # NOQA >>> img = np.array([ >>> [1, 1, 1, 1], >>> [0, 1, 0, 0], >>> [0, 1, 0, 1], >>> [1, 1, 1, 1],], dtype=np.uint8) >>> rle = encode_run_length(img, binary=True, order='C') >>> offset = (1, -1) >>> output_shape = (3, 5) >>> new_rle = rle_translate(rle, offset, output_shape) >>> decoded = decode_run_length(**new_rle) >>> print(decoded) [[0 0 1 0 0] [0 0 1 0 1] [0 1 1 1 1]]
Example
>>> from kwimage.im_runlen import * # NOQA >>> img = np.array([ >>> [0, 0, 0], >>> [0, 1, 0], >>> [0, 0, 0]], dtype=np.uint8) >>> rle = encode_run_length(img, binary=True, order='C') >>> new_rle = rle_translate(rle, (1, 0)) >>> decoded = decode_run_length(**new_rle) >>> print(decoded) [[0 0 0] [0 0 1] [0 0 0]] >>> new_rle = rle_translate(rle, (0, 1)) >>> decoded = decode_run_length(**new_rle) >>> print(decoded) [[0 0 0] [0 0 0] [0 1 0]]
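To make the offset and output_shape semantics concrete, the sketch below performs the same translation on a decoded mask. It deliberately does not work in RLE space as rle_translate does; it is a plain pixel-wise reference that produces the same decoded results as the examples above. `translate_mask_sketch` is a hypothetical name.

```python
def translate_mask_sketch(img, offset, output_shape=None):
    """Shift a 2D list-of-lists mask by an integer (x, y) offset, zero filling."""
    dx, dy = offset
    in_h, in_w = len(img), len(img[0])
    out_h, out_w = output_shape if output_shape is not None else (in_h, in_w)
    out = [[0] * out_w for _ in range(out_h)]
    for y in range(in_h):
        for x in range(in_w):
            ny, nx = y + dy, x + dx
            # Pixels that land outside the output canvas are dropped.
            if 0 <= ny < out_h and 0 <= nx < out_w:
                out[ny][nx] = img[y][x]
    return out


img = [
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
]
for row in translate_mask_sketch(img, offset=(1, -1), output_shape=(3, 5)):
    print(row)
```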
- kwimage.im_runlen._rle_bytes_to_array(s, impl='auto')[source]¶
Uncompresses a coco-bytes RLE into an array representation.
- Parameters
s (bytes) – compressed coco bytes rle
impl (str) – which implementation to use (defaults to cython if possible)
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/im_runlen.py _rle_bytes_to_array
- Benchmark:
>>> import ubelt as ub >>> from kwimage.im_runlen import _rle_bytes_to_array >>> s = b';?1B10O30O4' >>> ti = ub.Timerit(1000, bestof=50, verbose=2) >>> # --- time python impl --- >>> for timer in ti.reset('python'): >>> with timer: >>> _rle_bytes_to_array(s, impl='python') >>> # --- time cython impl --- >>> # xdoctest: +REQUIRES(--mask) >>> for timer in ti.reset('cython'): >>> with timer: >>> _rle_bytes_to_array(s, impl='cython')
- kwimage.im_runlen._rle_array_to_bytes(counts, impl='auto')[source]¶
Compresses an array RLE into a coco-bytes RLE.
- Parameters
counts (ndarray) – uncompressed array rle
impl (str) – which implementation to use (defaults to cython if possible)
Example
>>> # xdoctest: +REQUIRES(--mask) >>> from kwimage.im_runlen import _rle_array_to_bytes >>> from kwimage.im_runlen import _rle_bytes_to_array >>> arr_counts = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]) >>> str_counts = _rle_array_to_bytes(arr_counts) >>> arr_counts2 = _rle_bytes_to_array(str_counts) >>> assert np.all(arr_counts2 == arr_counts)
- Benchmark:
>>> # xdoctest: +REQUIRES(--mask) >>> import ubelt as ub >>> from kwimage.im_runlen import _rle_array_to_bytes >>> from kwimage.im_runlen import _rle_bytes_to_array >>> counts = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]) >>> ti = ub.Timerit(1000, bestof=50, verbose=2) >>> # --- time python impl --- >>> #for timer in ti.reset('python'): >>> # with timer: >>> # _rle_array_to_bytes(counts, impl='python') >>> # --- time cython impl --- >>> for timer in ti.reset('cython'): >>> with timer: >>> _rle_array_to_bytes(counts, impl='cython')
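For readers curious about what the coco-bytes format looks like, the following pure-Python sketch follows the compressed-count scheme used by the COCO mask API as I understand it: each count is written in 5-bit chunks with a continuation bit, offset into printable ASCII, and counts after the third are stored as a difference from the count two positions earlier. The function names are hypothetical and this is not the kwimage or cython implementation.

```python
def rle_counts_to_bytes(counts):
    """Compress a list of integer run lengths into a COCO-style byte string."""
    out = bytearray()
    for i, x in enumerate(counts):
        x = int(x)
        if i > 2:
            # Later counts are delta-encoded against the count two back.
            x -= int(counts[i - 2])
        more = True
        while more:
            c = x & 0x1F          # take the low 5 bits
            x >>= 5               # arithmetic shift keeps the sign
            # Stop once the remaining bits are pure sign extension.
            more = (x != -1) if (c & 0x10) else (x != 0)
            if more:
                c |= 0x20         # continuation flag
            out.append(c + 48)    # shift into printable ASCII
    return bytes(out)


def rle_bytes_to_counts(s):
    """Uncompress a COCO-style byte string into a list of run lengths."""
    counts = []
    p = 0
    while p < len(s):
        x, k, more = 0, 0, True
        while more:
            c = s[p] - 48
            x |= (c & 0x1F) << (5 * k)
            more = bool(c & 0x20)
            p += 1
            k += 1
            if not more and (c & 0x10):
                x |= -1 << (5 * k)  # sign-extend negative deltas
        if len(counts) > 2:
            x += counts[-2]
        counts.append(x)
    return counts


decoded = rle_bytes_to_counts(b';?1B10O30O4')
print(decoded)
print(rle_counts_to_bytes(decoded))
```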
kwimage.im_stack
Stack images
Module Contents¶
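Function | Summary |
---|---|
stack_images(images, axis=0, resize=None, interpolation=None, overlap=0, return_info=False, bg_value=None) | Make a new image with the input images side-by-side |
stack_images_grid(images, chunksize=None, axis=0, overlap=0, return_info=False, bg_value=None) | Stacks images in a grid. Optionally return transforms of original image positions in the output image. |
_stack_two_images(img1, img2, axis=0, resize=None, interpolation=None, overlap=0, bg_value=None) | |
_efficient_rectangle_packing() | References |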
- kwimage.im_stack.stack_images(images, axis=0, resize=None, interpolation=None, overlap=0, return_info=False, bg_value=None)[source]¶
Make a new image with the input images side-by-side
- Parameters
images (Iterable[ndarray[ndim=2]]) – image data
axis (int) – axis to stack on (either 0 or 1)
resize (int, str, or None) – if None, image sizes are not modified; otherwise resize can be either 0 or 1. We resize the resize-th image to match the (1 - resize)-th image. Can also be the strings “larger” or “smaller”.
interpolation (int or str) – string or cv2-style interpolation type. only used if resize or overlap > 0
overlap (int) – number of pixels to overlap. Using a negative number results in a border.
return_info (bool) – if True, returns transforms (scales and translations) to map from original image to its new location.
- Returns
an image of stacked images side by side
OR
- Tuple[ndarray, List]: where the first item is the aforementioned stacked image and the second item is a list of transformations for each input image mapping it to its location in the returned image.
- Return type
ndarray
Example
>>> import kwimage >>> img1 = kwimage.grab_test_image('carl', space='rgb') >>> img2 = kwimage.grab_test_image('astro', space='rgb') >>> images = [img1, img2] >>> imgB, transforms = stack_images(images, axis=0, resize='larger', >>> overlap=-10, return_info=True) >>> print('imgB.shape = {}'.format(imgB.shape)) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> import kwimage >>> kwplot.autompl() >>> kwplot.imshow(imgB, colorspace='rgb') >>> wh1 = np.multiply(img1.shape[0:2][::-1], transforms[0].scale) >>> wh2 = np.multiply(img2.shape[0:2][::-1], transforms[1].scale) >>> xoff1, yoff1 = transforms[0].translation >>> xoff2, yoff2 = transforms[1].translation >>> xywh1 = (xoff1, yoff1, wh1[0], wh1[1]) >>> xywh2 = (xoff2, yoff2, wh2[0], wh2[1]) >>> kwplot.draw_boxes(kwimage.Boxes([xywh1], 'xywh'), color=(1.0, 0, 0)) >>> kwplot.draw_boxes(kwimage.Boxes([xywh2], 'xywh'), color=(1.0, 0, 0)) >>> kwplot.show_if_requested() ((662, 512, 3), (0.0, 0.0), (0, 150))
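The effect of axis and overlap can be seen in a stripped-down NumPy sketch that stacks two images without any resizing. It is a simplification for illustration only: `stack_two_sketch` is a hypothetical helper, it assumes both images have the same number of channels, and it returns plain (x, y) offsets in place of transform objects.

```python
import numpy as np


def stack_two_sketch(img1, img2, axis=0, overlap=0, bg_value=0):
    """Stack two HxWxC images along axis, padding the other axis."""
    h1, w1 = img1.shape[0:2]
    h2, w2 = img2.shape[0:2]
    if axis == 0:
        # Vertical stack: the second image starts below the first.
        offset2 = (0, h1 - overlap)  # (x, y)
        out_h, out_w = h1 + h2 - overlap, max(w1, w2)
    else:
        # Horizontal stack: the second image starts right of the first.
        offset2 = (w1 - overlap, 0)
        out_h, out_w = max(h1, h2), w1 + w2 - overlap
    canvas = np.full((out_h, out_w) + img1.shape[2:], bg_value, dtype=img1.dtype)
    canvas[0:h1, 0:w1] = img1
    x2, y2 = offset2
    canvas[y2:y2 + h2, x2:x2 + w2] = img2
    return canvas, [(0, 0), offset2]


img1 = np.zeros((4, 6, 3), dtype=np.uint8)
img2 = np.ones((5, 3, 3), dtype=np.uint8)
# A negative overlap leaves a border between the two images.
canvas, offsets = stack_two_sketch(img1, img2, axis=0, overlap=-2)
print(canvas.shape, offsets)
```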
- kwimage.im_stack.stack_images_grid(images, chunksize=None, axis=0, overlap=0, return_info=False, bg_value=None)[source]¶
Stacks images in a grid. Optionally return transforms of original image positions in the output image.
- Parameters
images (Iterable[ndarray[ndim=2]]) – image data
chunksize (int, default=None) – number of rows per column or columns per row depending on the value of axis. If unspecified, computes this as int(sqrt(len(images))).
axis (int, default=0) – If 0, chunksize is columns per row. If 1, chunksize is rows per column.
overlap (int) – number of pixels to overlap. Using a negative number results in a border.
return_info (bool) – if True, returns transforms (scales and translations) to map from original image to its new location.
- Returns
an image of stacked images in a grid pattern
OR
- Tuple[ndarray, List]: where the first item is the aforementioned stacked
image and the second item is a list of transformations for each input image mapping it to its location in the returned image.
- Return type
ndarray
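The default chunksize rule described above can be illustrated in isolation. This is a sketch under assumptions: grid_chunks is a hypothetical helper that only groups items; the actual function would then stack each chunk along one axis and the chunk results along the other.

```python
import math


def grid_chunks(items, chunksize=None):
    """Hypothetical helper: split items into consecutive chunks.

    When chunksize is None it defaults to int(sqrt(len(items))), as the
    parameter description states. The max(1, ...) guard is an addition of
    this sketch so that an empty input does not produce a zero step.
    """
    items = list(items)
    if chunksize is None:
        chunksize = max(1, int(math.sqrt(len(items))))
    return [items[i:i + chunksize] for i in range(0, len(items), chunksize)]


rows = grid_chunks(range(10))
print(rows)  # [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
```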
- kwimage.im_stack._stack_two_images(img1, img2, axis=0, resize=None, interpolation=None, overlap=0, bg_value=None)[source]¶
- Returns
imgB, offset_tup, sf_tup
- Return type
Tuple[ndarray, Tuple, Tuple]
- Ignore:
import xinspect globals().update(xinspect.get_func_kwargs(_stack_two_images)) resize = 1 overlap = -10
- kwimage.im_stack._efficient_rectangle_packing()[source]¶
References
https://en.wikipedia.org/wiki/Packing_problems
https://github.com/Penlect/rectangle-packer
https://github.com/secnot/rectpack
https://stackoverflow.com/questions/1213394/what-algorithm-can-be-used-for-packing-rectangles-of-different-sizes-into-the-sm
https://www.codeproject.com/Articles/210979/Fast-optimizing-rectangle-packing-algorithm-for-bu
- Requires:
pip install rectangle-packer
- Ignore:
>>> import kwimage
>>> anchors = [[1, 1], [3 / 4, 1], [1, 3 / 4]]
>>> boxes = kwimage.Boxes.random(num=100, anchors=anchors).scale((100, 100)).to_xywh()
>>> # Create a bunch of rectangles (width, height)
>>> sizes = boxes.data[:, 2:4].astype(int).tolist()
>>> import rpack
>>> positions = rpack.pack(sizes)
>>> boxes.data[:, 0:2] = positions
>>> boxes = boxes.scale(0.95, about='center')
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> boxes.draw()
>>> # The result will be a list of (x, y) positions:
>>> positions

images = [kwimage.grab_test_image(key) for key in kwimage.grab_test_image.keys()]
images = [kwimage.imresize(g, max_dim=256) for g in images]

sizes = [g.shape[0:2][::-1] for g in images]

import rpack
positions = rpack.pack(sizes)

# Requires: pip install rectpack
import rectpack

bin_width = 512

packer = rectpack.newPacker(rotation=False)
for rid, (w, h) in enumerate(sizes):
    packer.add_rect(w, h, rid=rid)

max_w, max_h = np.array(sizes).sum(axis=0)
# f = max_w / bin_width
avail_height = max_h
packer.add_bin(bin_width, avail_height)

packer.pack()

packer[0]

all_rects = packer.rect_list()
all_rects = np.array(all_rects)

rids = all_rects[:, 5]
tl_x = all_rects[:, 1]
tl_y = all_rects[:, 2]
w = all_rects[:, 3]
h = all_rects[:, 4]

ltrb = kwimage.Boxes(all_rects[:, 1:5], 'xywh').to_ltrb()
canvas_w, canvas_h = ltrb.data[:, 2:4].max(axis=0)

canvas = np.zeros((canvas_h, canvas_w), dtype=np.float32)

for b, x, y, w, h, rid in all_rects:
    img = images[rid]
    img = kwimage.ensure_float01(img)
    canvas, img = kwimage.make_channels_comparable(canvas, img)
    canvas[y: y + h, x: x + w] = img

kwplot.imshow(canvas)
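For readers without rpack or rectpack installed, the idea of packing rectangles into a fixed-width bin can be sketched with a naive shelf heuristic. This is a swapped-in illustration, not the algorithm used by either library and not part of kwimage; shelf_pack is a hypothetical name, and the result is generally less compact than a dedicated packer.

```python
def shelf_pack(sizes, bin_width):
    """Hypothetical helper: place (width, height) rectangles on shelves.

    Rectangles are visited from tallest to shortest and laid left to right.
    When the next rectangle does not fit in the remaining width, a new shelf
    is started below the current one. Returns (x, y) positions in the same
    order as the input sizes.
    """
    order = sorted(range(len(sizes)), key=lambda i: -sizes[i][1])
    positions = [None] * len(sizes)
    x = y = shelf_h = 0
    for i in order:
        w, h = sizes[i]
        if x > 0 and x + w > bin_width:
            # Start a new shelf below the tallest item of the current one
            y += shelf_h
            x = shelf_h = 0
        positions[i] = (x, y)
        x += w
        shelf_h = max(shelf_h, h)
    return positions


positions = shelf_pack([(4, 3), (3, 3), (5, 2)], bin_width=8)
print(positions)  # [(0, 0), (4, 0), (0, 3)]
```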
kwimage.transform
¶
Objects for representing and manipulating image transforms.
Module Contents¶
Transform | Inherit from this class and define __nice__ to “nicely” print your objects. |
Matrix | Base class for matrix-based transform. |
Linear | Base class for matrix-based transform. |
Projective | Currently just a stub class that may be used to implement projective / homography transforms in the future. |
Affine | Helper for making affine transform matrices. |
- class kwimage.transform.Transform[source]¶
Bases:
ubelt.NiceRepr
Inherit from this class and define __nice__ to “nicely” print your objects.
Defines __str__ and __repr__ in terms of the __nice__ function. Classes that inherit from NiceRepr should redefine __nice__. If the inheriting class has a __len__ method, then the default __nice__ method will return its length.
Example
>>> import ubelt as ub >>> class Foo(ub.NiceRepr): ... def __nice__(self): ... return 'info' >>> foo = Foo() >>> assert str(foo) == '<Foo(info)>' >>> assert repr(foo).startswith('<Foo(info) at ')
Example
>>> import ubelt as ub >>> class Bar(ub.NiceRepr): ... pass >>> bar = Bar() >>> import pytest >>> with pytest.warns(RuntimeWarning) as record: >>> assert 'object at' in str(bar) >>> assert 'object at' in repr(bar)
Example
>>> import ubelt as ub >>> class Baz(ub.NiceRepr): ... def __len__(self): ... return 5 >>> baz = Baz() >>> assert str(baz) == '<Baz(5)>'
Example
>>> import ubelt as ub >>> # If your nice message has a bug, it shouldn't bring down the house >>> class Foo(ub.NiceRepr): ... def __nice__(self): ... assert False >>> foo = Foo() >>> import pytest >>> with pytest.warns(RuntimeWarning) as record: >>> print('foo = {!r}'.format(foo)) foo = <...Foo ...>
- class kwimage.transform.Matrix(matrix)[source]¶
Bases:
Transform
Base class for matrix-based transform.
Example
>>> from kwimage.transform import * # NOQA >>> ms = {} >>> ms['random()'] = Matrix.random() >>> ms['eye()'] = Matrix.eye() >>> ms['random(3)'] = Matrix.random(3) >>> ms['random(4, 4)'] = Matrix.random(4, 4) >>> ms['eye(3)'] = Matrix.eye(3) >>> ms['explicit'] = Matrix(np.array([[1.618]])) >>> for k, m in ms.items(): >>> print('----') >>> print(f'{k} = {m}') >>> print(f'{k}.inv() = {m.inv()}') >>> print(f'{k}.T = {m.T}') >>> print(f'{k}.det() = {m.det()}')
- classmethod coerce(cls, data=None, **kwargs)[source]¶
Example
>>> Matrix.coerce({'type': 'matrix', 'matrix': [[1, 0, 0], [0, 1, 0]]}) >>> Matrix.coerce(np.eye(3)) >>> Matrix.coerce(None)
- __matmul__(self, other)[source]¶
Example
>>> m = {} >>> # Works, and returns a Matrix >>> m[len(m)] = x = Matrix.random() @ np.eye(2) >>> assert isinstance(x, Matrix) >>> m[len(m)] = x = Matrix.random() @ None >>> assert isinstance(x, Matrix) >>> # Works, and returns an ndarray >>> m[len(m)] = x = np.eye(3) @ Matrix.random(3) >>> assert isinstance(x, np.ndarray) >>> # These do not work >>> # m[len(m)] = None @ Matrix.random() >>> # m[len(m)] = np.eye(3) @ None >>> print('m = {}'.format(ub.repr2(m)))
- class kwimage.transform.Linear(matrix)[source]¶
Bases:
Matrix
Base class for matrix-based transform.
Example
>>> from kwimage.transform import * # NOQA >>> ms = {} >>> ms['random()'] = Matrix.random() >>> ms['eye()'] = Matrix.eye() >>> ms['random(3)'] = Matrix.random(3) >>> ms['random(4, 4)'] = Matrix.random(4, 4) >>> ms['eye(3)'] = Matrix.eye(3) >>> ms['explicit'] = Matrix(np.array([[1.618]])) >>> for k, m in ms.items(): >>> print('----') >>> print(f'{k} = {m}') >>> print(f'{k}.inv() = {m.inv()}') >>> print(f'{k}.T = {m.T}') >>> print(f'{k}.det() = {m.det()}')
- class kwimage.transform.Projective(matrix)[source]¶
Bases:
Linear
Currently just a stub class that may be used to implement projective / homography transforms in the future.
- class kwimage.transform.Affine(matrix)[source]¶
Bases:
Projective
Helper for making affine transform matrices.
Example
>>> self = Affine(np.eye(3)) >>> m1 = np.eye(3) @ self >>> m2 = self @ np.eye(3)
Example
>>> from kwimage.transform import *  # NOQA
>>> m = {}
>>> # Works, and returns an Affine
>>> m[len(m)] = x = Affine.random() @ np.eye(3)
>>> assert isinstance(x, Affine)
>>> m[len(m)] = x = Affine.random() @ None
>>> assert isinstance(x, Affine)
>>> # Works, and returns an ndarray
>>> m[len(m)] = x = np.eye(3) @ Affine.random(3)
>>> assert isinstance(x, np.ndarray)
>>> # Works, and returns a Matrix
>>> m[len(m)] = x = Affine.random() @ Matrix.random(3)
>>> assert isinstance(x, Matrix)
>>> m[len(m)] = x = Matrix.random(3) @ Affine.random()
>>> assert isinstance(x, Matrix)
>>> print('m = {}'.format(ub.repr2(m)))
- concise(self)[source]¶
Return a concise coercible dictionary representation of this matrix.
- Returns
a small serializable dict of concise parameters that can be passed to Affine.coerce() to reconstruct this object.
- Return type
Dict
Example
>>> self = Affine.random(rng=0, scale=1) >>> params = self.concise() >>> assert np.allclose(Affine.coerce(params).matrix, self.matrix) >>> print('params = {}'.format(ub.repr2(params, nl=1, precision=2))) params = { 'offset': (0.08, 0.38), 'theta': 0.08, 'type': 'affine', }
Example
>>> self = Affine.random(rng=0, scale=2, offset=0) >>> params = self.concise() >>> assert np.allclose(Affine.coerce(params).matrix, self.matrix) >>> print('params = {}'.format(ub.repr2(params, nl=1, precision=2))) params = { 'scale': 2.00, 'theta': 0.04, 'type': 'affine', }
- classmethod coerce(cls, data=None, **kwargs)[source]¶
Attempt to coerce the data into an affine object
- Parameters
data – some data we attempt to coerce to an Affine matrix
**kwargs – some data we attempt to coerce to an Affine matrix, mutually exclusive with data.
- Returns
Affine
Example
>>> import kwimage >>> kwimage.Affine.coerce({'type': 'affine', 'matrix': [[1, 0, 0], [0, 1, 0]]}) >>> kwimage.Affine.coerce({'scale': 2}) >>> kwimage.Affine.coerce({'offset': 3}) >>> kwimage.Affine.coerce(np.eye(3)) >>> kwimage.Affine.coerce(None) >>> kwimage.Affine.coerce(skimage.transform.AffineTransform(scale=30))
- decompose(self)[source]¶
Decompose the affine matrix into its individual scale, translation, rotation, and skew parameters.
- Returns
decomposed offset, scale, theta, and shear params
- Return type
Dict
References
https://math.stackexchange.com/questions/612006/decompose-affine
Example
>>> self = Affine.random() >>> params = self.decompose() >>> recon = Affine.coerce(**params) >>> params2 = recon.decompose() >>> pt = np.vstack([np.random.rand(2, 1), [1]]) >>> result1 = self.matrix[0:2] @ pt >>> result2 = recon.matrix[0:2] @ pt >>> assert np.allclose(result1, result2)
>>> self = Affine.scale(0.001) @ Affine.random() >>> params = self.decompose() >>> self.det()
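The decomposition can be worked out in closed form if one assumes the T @ R @ H @ S factorization that appears in the Sympy block of Affine.affine, with shear matrix [[1, -sin(shear)], [0, cos(shear)]]. Under that assumption the upper-left 2x2 block is [[sx cos(theta), -sy sin(theta + shear)], [sx sin(theta), sy cos(theta + shear)]]. The sketch below uses hypothetical helper names, assumes positive scale factors, and is not necessarily how Affine.decompose is implemented.

```python
import math


def compose_affine(scale=(1.0, 1.0), offset=(0.0, 0.0), theta=0.0, shear=0.0):
    """Hypothetical helper: 3x3 matrix for T @ R @ H @ S (no 'about' point)."""
    sx, sy = scale
    tx, ty = offset
    return [
        [sx * math.cos(theta), -sy * math.sin(theta + shear), tx],
        [sx * math.sin(theta), sy * math.cos(theta + shear), ty],
        [0.0, 0.0, 1.0],
    ]


def decompose_affine(mat):
    """Hypothetical helper: recover the parameters, assuming positive scales."""
    a11, a12, tx = mat[0]
    a21, a22, ty = mat[1]
    sx = math.hypot(a11, a21)            # length of the first column
    theta = math.atan2(a21, a11)         # direction of the first column
    sy = math.hypot(a12, a22)            # length of the second column
    shear = math.atan2(-a12, a22) - theta
    return {'offset': (tx, ty), 'scale': (sx, sy), 'theta': theta, 'shear': shear}


mat = compose_affine(scale=(2.0, 3.0), offset=(5.0, -7.0), theta=0.3, shear=0.1)
params = decompose_affine(mat)
print({k: params[k] for k in ('theta', 'shear')})
```

Feeding the recovered parameters back into compose_affine reproduces the original matrix up to floating point error, which mirrors the round trip in the example above.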
- classmethod scale(cls, scale)[source]¶
Create a scale Affine object
- Parameters
scale (float | Tuple[float, float]) – x, y scale factor
- Returns
Affine
- classmethod translate(cls, offset)[source]¶
Create a translation Affine object
- Parameters
offset (float | Tuple[float, float]) – x, y translation factor
- Returns
Affine
- classmethod rotate(cls, theta)[source]¶
Create a rotation Affine object
- Parameters
theta (float) – counter-clockwise rotation angle in radians
- Returns
Affine
- classmethod random(cls, rng=None, **kw)[source]¶
Create a random Affine object
- Parameters
rng – random number generator
**kw – passed to Affine.random_params(). Can contain coercible random distributions for scale, offset, about, theta, and shear.
- Returns
Affine
- classmethod random_params(cls, rng=None, **kw)[source]¶
- Parameters
rng – random number generator
**kw – can contain coercible random distributions for scale, offset, about, theta, and shear.
- Returns
affine parameters suitable to be passed to Affine.affine
- Return type
Dict
Todo
[ ] improve kwargs parameterization
- classmethod affine(cls, scale=None, offset=None, theta=None, shear=None, about=None)[source]¶
Create an affine matrix from high-level parameters
- Parameters
scale (float | Tuple[float, float]) – x, y scale factor
offset (float | Tuple[float, float]) – x, y translation factor
theta (float) – counter-clockwise rotation angle in radians
shear (float) – counter-clockwise shear angle in radians
about (float | Tuple[float, float]) – x, y location of the origin
- Returns
the constructed Affine object
- Return type
Affine
Example
>>> rng = kwarray.ensure_rng(None) >>> scale = rng.randn(2) * 10 >>> offset = rng.randn(2) * 10 >>> about = rng.randn(2) * 10 >>> theta = rng.randn() * 10 >>> shear = rng.randn() * 10 >>> # Create combined matrix from all params >>> F = Affine.affine( >>> scale=scale, offset=offset, theta=theta, shear=shear, >>> about=about) >>> # Test that combining components matches >>> S = Affine.affine(scale=scale) >>> T = Affine.affine(offset=offset) >>> R = Affine.affine(theta=theta) >>> H = Affine.affine(shear=shear) >>> O = Affine.affine(offset=about) >>> # combine (note shear must be on the RHS of rotation) >>> alt = O @ T @ R @ H @ S @ O.inv() >>> print('F = {}'.format(ub.repr2(F.matrix.tolist(), nl=1))) >>> print('alt = {}'.format(ub.repr2(alt.matrix.tolist(), nl=1))) >>> assert np.all(np.isclose(alt.matrix, F.matrix)) >>> pt = np.vstack([np.random.rand(2, 1), [[1]]]) >>> warp_pt1 = (F.matrix @ pt) >>> warp_pt2 = (alt.matrix @ pt) >>> assert np.allclose(warp_pt2, warp_pt1)
- Sympy:
>>> # xdoctest: +SKIP
>>> import sympy
>>> # Shows the symbolic construction of the code
>>> # https://groups.google.com/forum/#!topic/sympy/k1HnZK_bNNA
>>> from sympy.abc import theta
>>> x0, y0, sx, sy, theta, shear, tx, ty = sympy.symbols(
>>>     'x0, y0, sx, sy, theta, shear, tx, ty')
>>> # move the center to 0, 0
>>> tr1_ = np.array([[1, 0, -x0],
>>>                  [0, 1, -y0],
>>>                  [0, 0,   1]])
>>> # Define core components of the affine transform
>>> S = np.array([  # scale
>>>     [sx,  0, 0],
>>>     [ 0, sy, 0],
>>>     [ 0,  0, 1]])
>>> H = np.array([  # shear
>>>     [1, -sympy.sin(shear), 0],
>>>     [0,  sympy.cos(shear), 0],
>>>     [0,                 0, 1]])
>>> R = np.array([  # rotation
>>>     [sympy.cos(theta), -sympy.sin(theta), 0],
>>>     [sympy.sin(theta),  sympy.cos(theta), 0],
>>>     [               0,                 0, 1]])
>>> T = np.array([  # translation
>>>     [ 1,  0, tx],
>>>     [ 0,  1, ty],
>>>     [ 0,  0,  1]])
>>> # Construct the affine 3x3 about the origin
>>> aff0 = np.array(sympy.simplify(T @ R @ H @ S))
>>> # move 0, 0 back to the specified origin
>>> tr2_ = np.array([[1, 0, x0],
>>>                  [0, 1, y0],
>>>                  [0, 0,  1]])
>>> # combine transformations
>>> aff = tr2_ @ aff0 @ tr1_
>>> print('aff = {}'.format(ub.repr2(aff.tolist(), nl=1)))
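The role of the about parameter can be checked numerically: conjugating a transform with a translation to and from the chosen origin leaves that origin fixed. The following sketch uses plain nested lists and hypothetical helper names (matmul3, translation, rotation, rotate_about, apply_point), and covers only the rotation component.

```python
import math


def matmul3(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rotate_about(theta, about):
    """Move `about` to the origin, rotate, then move the origin back."""
    x0, y0 = about
    return matmul3(translation(x0, y0),
                   matmul3(rotation(theta), translation(-x0, -y0)))


def apply_point(mat, pt):
    """Apply an affine 3x3 matrix to a 2D point."""
    x, y = pt
    return (mat[0][0] * x + mat[0][1] * y + mat[0][2],
            mat[1][0] * x + mat[1][1] * y + mat[1][2])


M = rotate_about(math.pi / 2, about=(2.0, 3.0))
print(apply_point(M, (2.0, 3.0)))  # the pivot stays (approximately) at (2, 3)
print(apply_point(M, (3.0, 3.0)))  # approximately (2, 4)
```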
kwimage.util_warp
¶
[ ] Replace internal padded slice with kwarray.padded_slice
Module Contents¶
|
Creates a homogeneous coordinate system. |
|
|
|
A pytorch implementation of warp affine that works similarly to |
|
Returns an aligned version of the source tensor and destination index. |
|
Add the source values array into the destination array at a particular |
|
Add the source values array into the destination array at a particular |
|
Take the max of the source values array into and the destination array at a |
|
Take the min of the source values array into and the destination array at a |
|
Take a subpixel slice from a larger image. The returned output is |
|
Translates an image by a subpixel shift value using bilinear interpolation |
|
Allows slices with out-of-bound coordinates. Any out of bounds coordinate |
|
|
|
Given image dimensions, bounding box dimensions, and a padding get the |
|
implementation with cv2.warpAffine for speed / correctness comparison |
|
Warp ND points / coordinates using a transformation matrix. |
|
Remove homogeneous coordinate from a point array. |
|
Add a homogeneous coordinate to a point array |
|
Get values at subpixel locations |
|
Set values at subpixel locations |
|
- kwimage.util_warp._coordinate_grid(dims, align_corners=False)[source]¶
Creates a homogeneous coordinate system.
- Parameters
dims (Tuple[int]*) – height / width or depth / height / width
align_corners (bool) – if True, returns a grid where the left and right corners are assigned to the extreme values and intermediate values are interpolated.
- Returns
Tensor[shape=(3, *DIMS)]
References
https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/inverse_warp.py
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> # xdoctest: +REQUIRES(module:torch) >>> _coordinate_grid((2, 2)) tensor([[[0., 1.], [0., 1.]], [[0., 0.], [1., 1.]], [[1., 1.], [1., 1.]]]) >>> _coordinate_grid((2, 2, 2)) >>> _coordinate_grid((2, 2), align_corners=True) tensor([[[0., 2.], [0., 2.]], [[0., 0.], [2., 2.]], [[1., 1.], [1., 1.]]])
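The layout of the 2D grid in the example (x coordinates, then y coordinates, then a plane of ones) can be reproduced without torch. This is an illustrative sketch with a hypothetical function name; it covers only the 2D case with align_corners=False.

```python
def coordinate_grid_2d(height, width):
    """Hypothetical helper: homogeneous pixel grid of shape (3, height, width).

    Channel 0 holds x (column) coordinates, channel 1 holds y (row)
    coordinates, and channel 2 is the homogeneous coordinate, always 1.
    """
    xs = [[float(x) for x in range(width)] for _ in range(height)]
    ys = [[float(y) for _ in range(width)] for y in range(height)]
    ones = [[1.0] * width for _ in range(height)]
    return [xs, ys, ones]


grid = coordinate_grid_2d(2, 2)
print(grid)
# [[[0.0, 1.0], [0.0, 1.0]], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]]]
```

Flattening each channel gives a 3 x N array of homogeneous points that a 3x3 transform can be applied to in a single matrix product.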
- kwimage.util_warp.warp_tensor(inputs, mat, output_dims, mode='bilinear', padding_mode='zeros', isinv=False, ishomog=None, align_corners=False, new_mode=False)[source]¶
A pytorch implementation of warp affine that works similarly to cv2.warpAffine / cv2.warpPerspective.
It is possible to use 3x3 transforms to warp 2D image data. It is also possible to use 4x4 transforms to warp 3D volumetric data.
- Parameters
inputs (Tensor[…, *DIMS]) – tensor to warp. Up to 3 (determined by output_dims) of the trailing space-time dimensions are warped. Best practice is to use inputs with the shape in [B, C, *DIMS].
mat (Tensor) – either a 3x3 / 4x4 single transformation matrix to apply to all inputs or Bx3x3 or Bx4x4 tensor that specifies a transformation matrix for each batch item.
output_dims (Tuple[int]*) – The output space-time dimensions. This can be in the form (W,), (H, W), or (D, H, W).
mode (str) – Can be bilinear or nearest. See torch.nn.functional.grid_sample
padding_mode (str) – Can be zeros, border, or reflection. See torch.nn.functional.grid_sample.
isinv (bool, default=False) – Set to true if mat is the inverse transform
ishomog (bool, default=None) – Set to True if the matrix is non-affine
align_corners (bool, default=False) – Note the default of False does not work correctly with grid_sample in torch <= 1.2, but using align_corners=True isn't typically what you want either. We will be stuck with buggy functionality until torch 1.3 is released.
However, using align_corners=0 does seem to reasonably correspond with opencv behavior.
Notes
Also, it may be possible to speed up the code with F.affine_grid
- KNOWN ISSUE: There appears to be some difference with cv2.warpAffine when
rotation or shear are non-zero. I'm not sure what the cause is. It may just be floating point issues, but I'm not sure.
Todo
[ ] FIXME: see example in Mask.scale where this algo breaks when the matrix is 2x3
[ ] Make this algo work when matrix is 2x2
References
https://discuss.pytorch.org/t/affine-transformation-matrix-paramters-conversion/19522
https://github.com/pytorch/pytorch/issues/15386
Example
>>> # Create a relatively simple affine matrix >>> # xdoctest: +REQUIRES(module:torch) >>> import skimage >>> mat = torch.FloatTensor(skimage.transform.AffineTransform( >>> translation=[1, -1], scale=[.532, 2], >>> rotation=0, shear=0, >>> ).params) >>> # Create inputs and an output dimension >>> input_shape = [1, 1, 4, 5] >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float() >>> output_dims = (11, 7) >>> # Warp with our code >>> result1 = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0) >>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2))) >>> # Warp with opencv >>> import cv2 >>> cv2_M = mat.cpu().numpy()[0:2] >>> src = inputs[0, 0].cpu().numpy() >>> dsize = tuple(output_dims[::-1]) >>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR) >>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2))) >>> # Ensure the results are the same (up to floating point errors) >>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1e-2, rtol=1e-2))
Example
>>> # Create a relatively simple affine matrix >>> # xdoctest: +REQUIRES(module:torch) >>> import skimage >>> mat = torch.FloatTensor(skimage.transform.AffineTransform( >>> rotation=0.01, shear=0.1).params) >>> # Create inputs and an output dimension >>> input_shape = [1, 1, 4, 5] >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float() >>> output_dims = (11, 7) >>> # Warp with our code >>> result1 = warp_tensor(inputs, mat, output_dims=output_dims) >>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2, supress_small=True))) >>> print('result1.shape = {}'.format(result1.shape)) >>> # Warp with opencv >>> import cv2 >>> cv2_M = mat.cpu().numpy()[0:2] >>> src = inputs[0, 0].cpu().numpy() >>> dsize = tuple(output_dims[::-1]) >>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR) >>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2))) >>> print('result2.shape = {}'.format(result2.shape)) >>> # Ensure the results are the same (up to floating point errors) >>> # NOTE: The floating point errors seem to be significant for rotation / shear >>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1, rtol=1e-2))
Example
>>> # Create a random affine matrix >>> # xdoctest: +REQUIRES(module:torch) >>> import skimage >>> rng = np.random.RandomState(0) >>> mat = torch.FloatTensor(skimage.transform.AffineTransform( >>> translation=rng.randn(2), scale=1 + rng.randn(2), >>> rotation=rng.randn() / 10., shear=rng.randn() / 10., >>> ).params) >>> # Create inputs and an output dimension >>> input_shape = [1, 1, 5, 7] >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float() >>> output_dims = (3, 11) >>> # Warp with our code >>> result1 = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0) >>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2))) >>> # Warp with opencv >>> import cv2 >>> cv2_M = mat.cpu().numpy()[0:2] >>> src = inputs[0, 0].cpu().numpy() >>> dsize = tuple(output_dims[::-1]) >>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR) >>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2))) >>> # Ensure the results are the same (up to floating point errors) >>> # NOTE: The errors seem to be significant for rotation / shear >>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1, rtol=1e-2))
Example
>>> # Test 3D warping with identity >>> # xdoctest: +REQUIRES(module:torch) >>> mat = torch.eye(4) >>> input_dims = [2, 3, 3] >>> output_dims = (2, 3, 3) >>> input_shape = [1, 1] + input_dims >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float() >>> result = warp_tensor(inputs, mat, output_dims=output_dims) >>> print('result =\n{}'.format(ub.repr2(result.cpu().numpy()[0, 0], precision=2))) >>> assert torch.all(inputs == result)
Example
>>> # Test 3D warping with scaling >>> # xdoctest: +REQUIRES(module:torch) >>> mat = torch.FloatTensor([ >>> [0.8, 0, 0, 0], >>> [ 0, 1.0, 0, 0], >>> [ 0, 0, 1.2, 0], >>> [ 0, 0, 0, 1], >>> ]) >>> input_dims = [2, 3, 3] >>> output_dims = (2, 3, 3) >>> input_shape = [1, 1] + input_dims >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float() >>> result = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0) >>> print('result =\n{}'.format(ub.repr2(result.cpu().numpy()[0, 0], precision=2))) result = np.array([[[ 0. , 1.25, 1. ], [ 3. , 4.25, 2.5 ], [ 6. , 7.25, 4. ]], ... [[ 7.5 , 8.75, 4.75], [10.5 , 11.75, 6.25], [13.5 , 14.75, 7.75]]], dtype=np.float32)
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> mat = torch.eye(3) >>> input_dims = [5, 7] >>> output_dims = (11, 7) >>> for n_prefix_dims in [0, 1, 2, 3, 4, 5]: >>> input_shape = [2] * n_prefix_dims + input_dims >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float() >>> result = warp_tensor(inputs, mat, output_dims=output_dims) >>> #print('result =\n{}'.format(ub.repr2(result.cpu().numpy(), precision=2))) >>> print(result.shape)
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> mat = torch.eye(4) >>> input_dims = [5, 5, 5] >>> output_dims = (6, 6, 6) >>> for n_prefix_dims in [0, 1, 2, 3, 4, 5]: >>> input_shape = [2] * n_prefix_dims + input_dims >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float() >>> result = warp_tensor(inputs, mat, output_dims=output_dims) >>> #print('result =\n{}'.format(ub.repr2(result.cpu().numpy(), precision=2))) >>> print(result.shape)
- Ignore:
import xdev
globals().update(xdev.get_func_kwargs(warp_tensor))
>>> # xdoctest: +REQUIRES(module:torch)
>>> import cv2
>>> inputs = torch.arange(9).view(1, 1, 3, 3).float() + 2
>>> input_dims = inputs.shape[2:]
>>> #output_dims = (6, 6)
>>> def fmt(a):
>>>     return ub.repr2(a.numpy(), precision=2)
>>> s = 2.5
>>> output_dims = tuple(np.round((np.array(input_dims) * s)).astype(int).tolist())
>>> mat = torch.FloatTensor([[s, 0, 0], [0, s, 0], [0, 0, 1]])
>>> inv = mat.inverse()
>>> warp_tensor(inputs, mat, output_dims)
>>> print('## INPUTS')
>>> print(fmt(inputs))
>>> print('\nalign_corners=True')
>>> print('----')
>>> print('## warp_tensor, align_corners=True')
>>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=True)))
>>> print('## interpolate, align_corners=True')
>>> print(fmt(F.interpolate(inputs, output_dims, mode='bilinear', align_corners=True)))
>>> print('\nalign_corners=False')
>>> print('----')
>>> print('## warp_tensor, align_corners=False, new_mode=False')
>>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=False)))
>>> print('## warp_tensor, align_corners=False, new_mode=True')
>>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=False, new_mode=True)))
>>> print('## interpolate, align_corners=False')
>>> print(fmt(F.interpolate(inputs, output_dims, mode='bilinear', align_corners=False)))
>>> print('## interpolate (scale), align_corners=False')
>>> print(ub.repr2(F.interpolate(inputs, scale_factor=s, mode='bilinear', align_corners=False).numpy(), precision=2))
>>> cv2_M = mat.cpu().numpy()[0:2]
>>> src = inputs[0, 0].cpu().numpy()
>>> dsize = tuple(output_dims[::-1])
>>> print('\nOpen CV warp Result')
>>> result2 = (cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR))
>>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
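The general idea behind an affine warp with bilinear sampling can be shown on a tiny image without torch or OpenCV. The sketch below is not warp_tensor's implementation: warp_affine_bilinear is a hypothetical function that assumes a forward (source to destination) 3x3 affine matrix, pixel centers at integer coordinates, and zero padding outside the input.

```python
import math


def warp_affine_bilinear(src, mat, out_shape):
    """Hypothetical helper: warp a 2D image (list of rows) by a 3x3 affine.

    Every output pixel is mapped back into the input with the inverse
    transform and sampled with bilinear interpolation.
    """
    (a, b, tx), (c, d, ty) = mat[0], mat[1]
    det = a * d - b * c
    # Inverse of the affine part, mapping output coords to input coords
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    itx = -(ia * tx + ib * ty)
    ity = -(ic * tx + id_ * ty)

    in_h, in_w = len(src), len(src[0])

    def pixel(y, x):
        # Zero padding for out-of-bounds reads
        if 0 <= y < in_h and 0 <= x < in_w:
            return src[y][x]
        return 0.0

    out_h, out_w = out_shape
    out = [[0.0] * out_w for _ in range(out_h)]
    for oy in range(out_h):
        for ox in range(out_w):
            sx = ia * ox + ib * oy + itx
            sy = ic * ox + id_ * oy + ity
            x0, y0 = math.floor(sx), math.floor(sy)
            fx, fy = sx - x0, sy - y0
            top = (1 - fx) * pixel(y0, x0) + fx * pixel(y0, x0 + 1)
            bot = (1 - fx) * pixel(y0 + 1, x0) + fx * pixel(y0 + 1, x0 + 1)
            out[oy][ox] = (1 - fy) * top + fy * bot
    return out


src = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0]]
shift = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(warp_affine_bilinear(src, shift, (2, 3)))
# [[0.5, 1.5, 2.5], [2.0, 4.5, 5.5]]
```

Shifting by half a pixel in x blends each pixel with its left neighbor, with the zero padding entering at the left border.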
- kwimage.util_warp.subpixel_align(dst, src, index, interp_axes=None)[source]¶
Returns an aligned version of the source tensor and destination index.
Used as the backend to implement other subpixel functions like subpixel_accum and subpixel_maximum.
- kwimage.util_warp.subpixel_set(dst, src, index, interp_axes=None)[source]¶
Set the values of the destination array to the source values array at a particular subpixel index.
- Parameters
dst (ArrayLike) – destination accumulation array
src (ArrayLike) – source array containing values to add
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
Todo
[ ]: allow index to be a sequence of indices
Example
>>> import kwimage >>> dst = np.zeros(5) + .1 >>> src = np.ones(2) >>> index = [slice(1.5, 3.5)] >>> kwimage.util_warp.subpixel_set(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0.1, 0.5, 1. , 0.5, 0.1])
- kwimage.util_warp.subpixel_accum(dst, src, index, interp_axes=None)[source]¶
Add the source values array into the destination array at a particular subpixel index.
- Parameters
dst (ArrayLike) – destination accumulation array
src (ArrayLike) – source array containing values to add
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
Notes
Inputs:
    +---+---+---+---+---+  dst.shape = (5,)
          +---+---+        src.shape = (2,)
          |=======|        index = 1.5:3.5

Subpixel shift the source by -0.5. When the index is non-integral, pad the aligned src with an extra value to ensure all dst pixels that would be influenced by the smaller subpixel shape are influenced by the aligned src. Note that we are not scaling.

        +---+---+---+      aligned_src.shape = (3,)
        |===========|      aligned_index = 1:4
Example
>>> dst = np.zeros(5) >>> src = np.ones(2) >>> index = [slice(1.5, 3.5)] >>> subpixel_accum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0. , 0.5, 1. , 0.5, 0. ])
Example
>>> dst = np.zeros((6, 6)) >>> src = np.ones((3, 3)) >>> index = (slice(1.5, 4.5), slice(1, 4)) >>> subpixel_accum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([[0. , 0. , 0. , 0. , 0. , 0. ], [0. , 0.5, 0.5, 0.5, 0. , 0. ], [0. , 1. , 1. , 1. , 0. , 0. ], [0. , 1. , 1. , 1. , 0. , 0. ], [0. , 0.5, 0.5, 0.5, 0. , 0. ], [0. , 0. , 0. , 0. , 0. , 0. ]]) >>> # xdoctest: +REQUIRES(module:torch) >>> dst = torch.zeros((1, 3, 6, 6)) >>> src = torch.ones((1, 3, 3, 3)) >>> index = (slice(None), slice(None), slice(1.5, 4.5), slice(1.25, 4.25)) >>> subpixel_accum(dst, src, index) >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0)) np.array([[0. , 0. , 0. , 0. , 0. , 0. ], [0. , 0.38, 0.5 , 0.5 , 0.12, 0. ], [0. , 0.75, 1. , 1. , 0.25, 0. ], [0. , 0.75, 1. , 1. , 0.25, 0. ], [0. , 0.38, 0.5 , 0.5 , 0.12, 0. ], [0. , 0. , 0. , 0. , 0. , 0. ]])
- Doctest:
>>> # TODO: move to a unit test file >>> subpixel_accum(np.zeros(5), np.ones(2), [slice(1.5, 3.5)]).tolist() [0.0, 0.5, 1.0, 0.5, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(2), [slice(0, 2)]).tolist() [1.0, 1.0, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(.5, 3.5)]).tolist() [0.5, 1.0, 1.0, 0.5, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(-1, 2)]).tolist() [1.0, 1.0, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(-1.5, 1.5)]).tolist() [1.0, 0.5, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(10, 13)]).tolist() [0.0, 0.0, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(3.25, 6.25)]).tolist() [0.0, 0.0, 0.0, 0.75, 1.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(4.9, 7.9)]).tolist() [0.0, 0.0, 0.0, 0.0, 0.099...] >>> subpixel_accum(np.zeros(5), np.ones(9), [slice(-1.5, 7.5)]).tolist() [1.0, 1.0, 1.0, 1.0, 1.0] >>> subpixel_accum(np.zeros(5), np.ones(9), [slice(2.625, 11.625)]).tolist() [0.0, 0.0, 0.375, 1.0, 1.0] >>> subpixel_accum(np.zeros(5), 1, [slice(2.625, 11.625)]).tolist() [0.0, 0.0, 0.375, 1.0, 1.0]
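The alignment step described in the notes can be written out for the 1D case. This is a minimal sketch with a hypothetical function name, not the library's implementation: the source is zero padded by one value, linearly blended according to the fractional part of the start index, and the aligned values are added into the integer positions of dst that fall in bounds.

```python
import math


def subpixel_accum_1d(dst, src, start):
    """Hypothetical helper: add src into dst starting at a fractional index."""
    base = math.floor(start)   # integer position of the aligned source
    frac = start - base        # subpixel shift in [0, 1)
    padded = [0.0] + list(src) + [0.0]
    # aligned[i] blends src[i] with its left neighbor src[i - 1]
    aligned = [(1 - frac) * padded[i + 1] + frac * padded[i]
               for i in range(len(src) + 1)]
    for offset, value in enumerate(aligned):
        pos = base + offset
        if 0 <= pos < len(dst):  # silently clip out-of-bounds positions
            dst[pos] += value
    return dst


print(subpixel_accum_1d([0.0] * 5, [1.0, 1.0], 1.5))
# [0.0, 0.5, 1.0, 0.5, 0.0]
print(subpixel_accum_1d([0.0] * 5, [1.0, 1.0, 1.0], 3.25))
# [0.0, 0.0, 0.0, 0.75, 1.0]
```

Both printed results agree with the corresponding doctest cases above.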
- kwimage.util_warp.subpixel_maximum(dst, src, index, interp_axes=None)[source]¶
Take the maximum of the source values array and the destination array at a particular subpixel index. Modifies the destination array.
- Parameters
dst (ArrayLike) – destination array to index into
src (ArrayLike) – source array that agrees with the index
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
Example
>>> dst = np.array([0, 1.0, 1.0, 1.0, 0]) >>> src = np.array([2.0, 2.0]) >>> index = [slice(1.6, 3.6)] >>> subpixel_maximum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0. , 1. , 2. , 1.2, 0. ])
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> dst = torch.zeros((1, 3, 5, 5)) + .5 >>> src = torch.ones((1, 3, 3, 3)) >>> index = (slice(None), slice(None), slice(1.4, 4.4), slice(1.25, 4.25)) >>> subpixel_maximum(dst, src, index) >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0)) np.array([[0.5 , 0.5 , 0.5 , 0.5 , 0.5 ], [0.5 , 0.5 , 0.6 , 0.6 , 0.5 ], [0.5 , 0.75, 1. , 1. , 0.5 ], [0.5 , 0.75, 1. , 1. , 0.5 ], [0.5 , 0.5 , 0.5 , 0.5 , 0.5 ]])
- kwimage.util_warp.subpixel_minimum(dst, src, index, interp_axes=None)[source]¶
Take the minimum of the source values array and the destination array at a particular subpixel index. Modifies the destination array.
- Parameters
dst (ArrayLike) – destination array to index into
src (ArrayLike) – source array that agrees with the index
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
Example
>>> dst = np.array([0, 1.0, 1.0, 1.0, 0]) >>> src = np.array([2.0, 2.0]) >>> index = [slice(1.6, 3.6)] >>> subpixel_minimum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0. , 0.8, 1. , 1. , 0. ])
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> dst = torch.zeros((1, 3, 5, 5)) + .5 >>> src = torch.ones((1, 3, 3, 3)) >>> index = (slice(None), slice(None), slice(1.4, 4.4), slice(1.25, 4.25)) >>> subpixel_minimum(dst, src, index) >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0)) np.array([[0.5 , 0.5 , 0.5 , 0.5 , 0.5 ], [0.5 , 0.45, 0.5 , 0.5 , 0.15], [0.5 , 0.5 , 0.5 , 0.5 , 0.25], [0.5 , 0.5 , 0.5 , 0.5 , 0.25], [0.5 , 0.3 , 0.4 , 0.4 , 0.1 ]])
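The 1D maximum and minimum examples can be reproduced with the same alignment idea: shift the source by the fractional part of the start index, then combine it elementwise with the destination. The helper names below are hypothetical and the sketch handles only one dimension.

```python
import math


def align_1d(src, start):
    """Hypothetical helper: integer base index and the shifted source."""
    base = math.floor(start)
    frac = start - base
    padded = [0.0] + list(src) + [0.0]
    aligned = [(1 - frac) * padded[i + 1] + frac * padded[i]
               for i in range(len(src) + 1)]
    return base, aligned


def subpixel_combine_1d(dst, src, start, op):
    """Combine the aligned source into dst in place with op (max or min)."""
    base, aligned = align_1d(src, start)
    for offset, value in enumerate(aligned):
        pos = base + offset
        if 0 <= pos < len(dst):
            dst[pos] = op(dst[pos], value)
    return dst


hi = subpixel_combine_1d([0, 1.0, 1.0, 1.0, 0], [2.0, 2.0], 1.6, max)
lo = subpixel_combine_1d([0, 1.0, 1.0, 1.0, 0], [2.0, 2.0], 1.6, min)
print([round(v, 2) for v in hi])  # [0, 1.0, 2.0, 1.2, 0]
print([round(v, 2) for v in lo])  # [0, 0.8, 1.0, 1.0, 0]
```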
- kwimage.util_warp.subpixel_slice(inputs, index)[source]¶
Take a subpixel slice from a larger image. The returned output is left-aligned with the requested slice.
- Parameters
inputs (ArrayLike) – data
index (Tuple[slice]) – a slice to subpixel accuracy
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> import numpy as np >>> import torch >>> # say we have a (576, 576) input space >>> # and a (9, 9) output space downsampled by 64x >>> ospc_feats = np.tile(np.arange(9 * 9).reshape(1, 9, 9), (1024, 1, 1)) >>> inputs = torch.from_numpy(ospc_feats) >>> # We detected a box in the input space >>> ispc_bbox = kwimage.Boxes([[64, 65, 100, 120]], 'ltrb') >>> # Get coordinates in the output space >>> ospc_bbox = ispc_bbox.scale(1 / 64) >>> tl_x, tl_y, br_x, br_y = ospc_bbox.data[0] >>> # Convert the box to a slice >>> index = [slice(None), slice(tl_y, br_y), slice(tl_x, br_x)] >>> # Note: I'm not 100% sure this works right with non-integral slices >>> outputs = kwimage.subpixel_slice(inputs, index)
Example
>>> inputs = np.arange(5 * 5 * 3).reshape(5, 5, 3) >>> index = [slice(0, 3), slice(0, 3)] >>> outputs = subpixel_slice(inputs, index) >>> index = [slice(0.5, 3.5), slice(-0.5, 2.5)] >>> outputs = subpixel_slice(inputs, index)
>>> inputs = np.arange(5 * 5).reshape(1, 5, 5).astype(float) >>> index = [slice(None), slice(3, 6), slice(3, 6)] >>> outputs = subpixel_slice(inputs, index) >>> print(outputs) [[[18. 19. 0.] [23. 24. 0.] [ 0. 0. 0.]]] >>> index = [slice(None), slice(3.5, 6.5), slice(2.5, 5.5)] >>> outputs = subpixel_slice(inputs, index) >>> print(outputs) [[[20. 21. 10.75] [11.25 11.75 6. ] [ 0. 0. 0. ]]]
- kwimage.util_warp.subpixel_translate(inputs, shift, interp_axes=None, output_shape=None)[source]¶
Translates an image by a subpixel shift value using bilinear interpolation
- Parameters
inputs (ArrayLike) – data to translate
shift (Sequence) – amount to translate each dimension specified by interp_axes. Note: if inputs contains more than one “image” then all “images” are translated by the same amount. This function contains no mechanism for translating each image differently. Note that by default this is a y,x shift for 2 dimensions.
interp_axes (Sequence, default=None) – axes to perform interpolation on, if not specified the final n axes are interpolated, where n=len(shift)
output_shape (tuple, default=None) – if specified the output is returned with this shape, otherwise
Notes
This function powers most other functions in this file. Speedups here can go a long way.
Example
>>> inputs = np.arange(5) + 1 >>> print(inputs.tolist()) [1, 2, 3, 4, 5] >>> outputs = subpixel_translate(inputs, 1.5) >>> print(outputs.tolist()) [0.0, 0.5, 1.5, 2.5, 3.5]
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> inputs = torch.arange(9).view(1, 1, 3, 3).float() >>> print(inputs.long()) tensor([[[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]]) >>> outputs = subpixel_translate(inputs, (-.4, .5), output_shape=(1, 1, 2, 5)) >>> print(outputs) tensor([[[[0.6000, 1.7000, 2.7000, 1.6000, 0.0000], [2.1000, 4.7000, 5.7000, 3.1000, 0.0000]]]])
- Ignore:
>>> inputs = np.arange(5) >>> shift = -.6 >>> interp_axes = None >>> subpixel_translate(inputs, -.6) >>> subpixel_translate(inputs[None, None, None, :], -.6) >>> inputs = np.arange(25).reshape(5, 5) >>> shift = (-1.6, 2.3) >>> interp_axes = (0, 1) >>> subpixel_translate(inputs, shift, interp_axes, output_shape=(9, 9)) >>> subpixel_translate(inputs, shift, interp_axes, output_shape=(3, 4))
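The interpolation behind a subpixel shift can be sketched in one dimension. This is a minimal illustration and not kwimage's implementation: the helper name `shift_1d_linear` is hypothetical, and filling with zeros outside the input is an assumption chosen to agree with the first example above.

```python
import numpy as np


def shift_1d_linear(values, shift):
    """Shift a 1D signal by a (possibly fractional) amount.

    Hypothetical helper: linear interpolation with zero fill
    outside the input (an assumption, not kwimage's code).
    """
    values = np.asarray(values, dtype=float)
    n = len(values)
    # Output pixel i samples the input at position i - shift.
    pos = np.arange(n) - shift
    lo = np.floor(pos).astype(int)
    frac = pos - lo

    def take(idx):
        # Gather values[idx], using 0 for out-of-bounds indices.
        out = np.zeros(n)
        valid = (idx >= 0) & (idx < n)
        out[valid] = values[idx[valid]]
        return out

    # Blend the two neighbouring input samples.
    return (1 - frac) * take(lo) + frac * take(lo + 1)


shifted = shift_1d_linear([1, 2, 3, 4, 5], 1.5)
```

An integer shift reduces to a plain roll with zero fill, because the fractional weight is zero and only one neighbour contributes.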
- kwimage.util_warp._padded_slice(data, in_slice, ndim=None, pad_slice=None, pad_mode='constant', **padkw)[source]¶
Allows slices with out-of-bound coordinates. Any out of bounds coordinate will be sampled via padding.
Note
Negative slices have a different meaning here than they usually do. Normally, they indicate a wrap-around or a reversed stride, but here they index into out-of-bounds space (which depends on the pad mode). For example, a slice of -2:1 literally samples two pixels to the left of the data and one pixel from the data, so you get two padded values and one data value.
- Parameters
data (Sliceable[T]) – data to slice into. Any channels must be the last dimension.
in_slice (Tuple[slice, …]) – slice for each dimension
ndim (int) – number of spatial dimensions
pad_slice (List[int|Tuple]) – additional padding of the slice
- Returns
- data_sliced: subregion of the input data (possibly with padding,
depending on if the original slice went out of bounds)
- st_dims: a list indicating the low and high space-time coordinate
values of the returned data slice.
- Return type
Tuple[Sliceable, List]
Example
>>> data = np.arange(5) >>> in_slice = [slice(-2, 7)]
>>> data_sliced, st_dims = _padded_slice(data, in_slice) >>> print(ub.repr2(data_sliced, with_dtype=False)) >>> print(st_dims) np.array([0, 0, 0, 1, 2, 3, 4, 0, 0]) [(-2, 7)]
>>> data_sliced, st_dims = _padded_slice(data, in_slice, pad_slice=(3, 3)) >>> print(ub.repr2(data_sliced, with_dtype=False)) >>> print(st_dims) np.array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0]) [(-5, 10)]
>>> data_sliced, st_dims = _padded_slice(data, slice(3, 4), pad_slice=[(1, 0)]) >>> print(ub.repr2(data_sliced, with_dtype=False)) >>> print(st_dims) np.array([2, 3]) [(2, 4)]
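The out-of-bounds semantics described in the note above can be sketched for the 1D, constant-padding case. This is an illustrative sketch only: `padded_slice_1d` is a hypothetical name, and it handles neither `pad_slice` nor other pad modes.

```python
def padded_slice_1d(data, low, high, fill=0):
    """Return data[low:high], where out-of-bounds positions yield `fill`.

    Hypothetical helper for illustration. Negative indices do NOT
    wrap around; they refer to padded space left of the data.
    """
    n = len(data)
    return [data[i] if 0 <= i < n else fill for i in range(low, high)]


data = [10, 11, 12, 13, 14]
# A slice of -2:1 gives two padded values and one data value.
left_overhang = padded_slice_1d(data, -2, 1)
# A slice of -2:7 pads two values on each side of the data.
both_sides = padded_slice_1d(data, -2, 7)
```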
- kwimage.util_warp._rectify_slice(data_dims, low_dims, high_dims, pad_slice=None)[source]¶
Given image dimensions, bounding box dimensions, and a padding get the corresponding slice from the image and any extra padding needed to achieve the requested window size.
- Parameters
data_dims (tuple) – n-dimension data sizes (e.g. 2d height, width)
low_dims (tuple) – bounding box low values (e.g. 2d ymin, xmin)
high_dims (tuple) – bounding box high values (e.g. 2d ymax, xmax)
pad_slice (List[int|Tuple]) – pad applied to (left and right) / (both) sides of each slice dim
- Returns
- data_slice - low and high values of a fancy slice corresponding to
the image with shape data_dims. This slice may not correspond to the full window size if the requested bounding box goes out of bounds.
- extra_padding - extra padding needed after slicing to achieve
the requested window size.
- Return type
Tuple
Example
>>> # Case where 2D-bbox is inside the data dims on left edge >>> # Comprehensive 1D-cases are in the unit-test file >>> data_dims = [300, 300] >>> low_dims = [0, 0] >>> high_dims = [10, 10] >>> pad_slice = [(10, 10), (5, 5)] >>> a, b = _rectify_slice(data_dims, low_dims, high_dims, pad_slice) >>> print('data_slice = {!r}'.format(a)) >>> print('extra_padding = {!r}'.format(b)) data_slice = [(0, 20), (0, 15)] extra_padding = [(10, 0), (5, 0)]
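The split between an in-bounds slice and leftover padding can be sketched per dimension. This is a minimal sketch under the assumption that the padded window overlaps the data; `rectify_slice_1d` is a hypothetical name and not the library's code.

```python
def rectify_slice_1d(size, low, high, pad=(0, 0)):
    """Clip a padded window to [0, size] and report the padding left over.

    Hypothetical helper for illustration. Assumes the padded
    window overlaps the data.
    """
    # The window that was actually requested, including padding.
    want_low = low - pad[0]
    want_high = high + pad[1]
    # The part of that window that lies inside the data.
    clip_low = min(max(want_low, 0), size)
    clip_high = min(max(want_high, 0), size)
    # Whatever fell outside must be added back after slicing.
    extra = (max(0 - want_low, 0), max(want_high - size, 0))
    return (clip_low, clip_high), extra


# Mirrors the 2D example above, one dimension at a time.
rows = rectify_slice_1d(300, 0, 10, pad=(10, 10))
cols = rectify_slice_1d(300, 0, 10, pad=(5, 5))
```

Here `rows` is `((0, 20), (10, 0))` and `cols` is `((0, 15), (5, 0))`, matching the `data_slice` and `extra_padding` values printed in the example above.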
- kwimage.util_warp._warp_tensor_cv2(inputs, mat, output_dims, mode='linear', ishomog=None)[source]¶
Implementation using cv2.warpAffine, for speed / correctness comparison.
On GPU: torch is faster in both modes.
On CPU: torch is faster for homog, but cv2 is faster for affine.
- Benchmark:
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.util.util_warp import * >>> from kwimage.util.util_warp import _warp_tensor_cv2 >>> from kwimage.util.util_warp import warp_tensor >>> import numpy as np >>> ti = ub.Timerit(10, bestof=3, verbose=2, unit='ms') >>> mode = 'linear' >>> rng = np.random.RandomState(0) >>> inputs = torch.Tensor(rng.rand(16, 10, 32, 32)).to('cpu') >>> mat = torch.FloatTensor([[2.5, 0, 10.5], [0, 3, 0], [0, 0, 1]]) >>> mat[2, 0] = .009 >>> mat[2, 2] = 2 >>> output_dims = (64, 64) >>> results = ub.odict() >>> # ------------- >>> for timer in ti.reset('warp_tensor(torch)'): >>> with timer: >>> outputs = warp_tensor(inputs, mat, output_dims, mode=mode) >>> torch.cuda.synchronize() >>> results[ti.label] = outputs >>> # ------------- >>> inputs = inputs.cpu().numpy() >>> mat = mat.cpu().numpy() >>> for timer in ti.reset('warp_tensor(cv2)'): >>> with timer: >>> outputs = _warp_tensor_cv2(inputs, mat, output_dims, mode=mode) >>> results[ti.label] = outputs >>> import itertools as it >>> for k1, k2 in it.combinations(results, 2): >>> a = kwarray.ArrayAPI.numpy(results[k1]) >>> b = kwarray.ArrayAPI.numpy(results[k2]) >>> diff = np.abs(a - b) >>> diff_stats = kwarray.stats_dict(diff, n_extreme=1, extreme=1) >>> print('{} - {}: {}'.format(k1, k2, ub.repr2(diff_stats, nl=0, precision=4))) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(results['warp_tensor(torch)'][0, 0], fnum=1, pnum=(1, 2, 1), title='torch') >>> kwplot.imshow(results['warp_tensor(cv2)'][0, 0], fnum=1, pnum=(1, 2, 2), title='cv2')
- kwimage.util_warp.warp_points(matrix, pts, homog_mode='divide')[source]¶
Warp ND points / coordinates using a transformation matrix.
Homogeneous coordinates are added on the fly if needed. Works with both numpy and torch.
- Parameters
matrix (ArrayLike) – [D1 x D2] transformation matrix. If using homogeneous coordinates, D2=D + 1; otherwise D2=D. If using homogeneous coordinates and the matrix represents an affine transformation, then either D1=D or D1=D2, i.e. the last row of zeros and a one is optional.
pts (ArrayLike) – [N1 x … x D] points (usually x, y). If points are already in homogeneous space, then the output will be returned in homogeneous space. D is the dimensionality of the points. The leading axes may take any shape, but usually the shape will be [N x D], where N is the number of points.
homog_mode (str, default=’divide’) – what to do for homogeneous coordinates. Can be either divide, keep, or drop.
- Returns
new_pts (ArrayLike): the points after being transformed by the matrix
Example
>>> from kwimage.util_warp import * # NOQA >>> # --- with numpy >>> rng = np.random.RandomState(0) >>> pts = rng.rand(10, 2) >>> matrix = rng.rand(2, 2) >>> warp_points(matrix, pts) >>> # --- with torch >>> # xdoctest: +REQUIRES(module:torch) >>> pts = torch.Tensor(pts) >>> matrix = torch.Tensor(matrix) >>> warp_points(matrix, pts)
Example
>>> from kwimage.util_warp import * # NOQA >>> # --- with numpy >>> pts = np.ones((10, 2)) >>> matrix = np.diag([2, 3, 1]) >>> ra = warp_points(matrix, pts) >>> # xdoctest: +REQUIRES(module:torch) >>> rb = warp_points(torch.Tensor(matrix), torch.Tensor(pts)) >>> assert np.allclose(ra, rb.numpy())
Example
>>> from kwimage.util_warp import * # NOQA >>> # test different cases >>> rng = np.random.RandomState(0) >>> # Test 3x3 style projective matrices >>> pts = rng.rand(1000, 2) >>> matrix = rng.rand(3, 3) >>> ra33 = warp_points(matrix, pts) >>> # xdoctest: +REQUIRES(module:torch) >>> rb33 = warp_points(torch.Tensor(matrix), torch.Tensor(pts)) >>> assert np.allclose(ra33, rb33.numpy()) >>> # Test opencv style affine matrices >>> pts = rng.rand(10, 2) >>> matrix = rng.rand(2, 3) >>> ra23 = warp_points(matrix, pts) >>> rb23 = warp_points(torch.Tensor(matrix), torch.Tensor(pts)) >>> assert np.allclose(ra23, rb23.numpy())
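The three matrix shapes described in the parameters (D x D, D x (D + 1), and (D + 1) x (D + 1)) can be sketched with plain numpy. This is a sketch under stated assumptions and not the library's code: `warp_points_sketch` is a hypothetical name, the input points are assumed to be non-homogeneous, and only the 'divide' behaviour is shown.

```python
import numpy as np


def warp_points_sketch(matrix, pts):
    """Apply a linear, affine, or projective matrix to [..., D] points.

    Hypothetical helper: assumes `pts` are NOT already homogeneous
    and always divides out the homogeneous coordinate.
    """
    matrix = np.asarray(matrix, dtype=float)
    pts = np.asarray(pts, dtype=float)
    D = pts.shape[-1]
    if matrix.shape[1] == D:
        # Plain D x D linear map: no homogeneous coordinate needed.
        return pts @ matrix.T
    # Append a column of ones so translation becomes a matrix product.
    ones = np.ones(pts.shape[:-1] + (1,))
    homog = np.concatenate([pts, ones], axis=-1)
    new = homog @ matrix.T
    if matrix.shape[0] == D:
        # Affine matrix given without its last row (opencv style).
        return new
    # Full projective matrix: divide by the homogeneous coordinate.
    return new[..., :-1] / new[..., -1:]


scaled = warp_points_sketch(np.diag([2, 3, 1]), np.ones((10, 2)))
```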
- kwimage.util_warp.remove_homog(pts, mode='divide')[source]¶
Remove the homogeneous coordinate from a point array.
This is a convenience function; it is not particularly efficient.
- SeeAlso:
cv2.convertPointsFromHomogeneous
Example
>>> homog_pts = np.random.rand(10, 3) >>> remove_homog(homog_pts, 'divide') >>> remove_homog(homog_pts, 'drop')
- kwimage.util_warp.add_homog(pts)[source]¶
Add a homogeneous coordinate to a point array.
This is a convenience function; it is not particularly efficient.
- SeeAlso:
cv2.convertPointsToHomogeneous
Example
>>> pts = np.random.rand(10, 2) >>> add_homog(pts)
- Benchmark:
>>> import timerit >>> import numpy as np >>> import cv2 >>> import kwimage >>> ti = timerit.Timerit(1000, bestof=10, verbose=2) >>> pts = np.random.rand(1000, 2) >>> for timer in ti.reset('kwimage'): >>> with timer: >>> kwimage.add_homog(pts) >>> for timer in ti.reset('cv2'): >>> with timer: >>> cv2.convertPointsToHomogeneous(pts) >>> # cv2 is 4x faster, but has more restrictive inputs
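The relationship between adding and removing the homogeneous coordinate can be sketched as follows. The helper names are hypothetical and the code illustrates the idea, not the library's implementation. The 'divide' and 'drop' modes follow the mode names used in the remove_homog example above.

```python
import numpy as np


def add_homog_sketch(pts):
    """Append a trailing coordinate of 1 to [..., D] points (hypothetical helper)."""
    pts = np.asarray(pts, dtype=float)
    ones = np.ones(pts.shape[:-1] + (1,))
    return np.concatenate([pts, ones], axis=-1)


def remove_homog_sketch(pts, mode='divide'):
    """Strip the trailing coordinate from [..., D + 1] points (hypothetical helper)."""
    pts = np.asarray(pts, dtype=float)
    if mode == 'divide':
        # Normalize by the homogeneous coordinate before dropping it.
        return pts[..., :-1] / pts[..., -1:]
    if mode == 'drop':
        # Discard the homogeneous coordinate without normalizing.
        return pts[..., :-1]
    raise KeyError(mode)


homog = np.array([[2.0, 4.0, 2.0]])
divided = remove_homog_sketch(homog, 'divide')  # normalized point
dropped = remove_homog_sketch(homog, 'drop')    # raw leading coordinates
```

The two modes agree only when the homogeneous coordinate is 1, which is always the case directly after adding it.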
- kwimage.util_warp.subpixel_getvalue(img, pts, coord_axes=None, interp='bilinear', bordermode='edge')[source]¶
Get values at subpixel locations
- Parameters
img (ArrayLike) – image to sample from
pts (ArrayLike) – subpixel rc-coordinates to sample
coord_axes (Sequence, default=None) – axes to perform interpolation on. If not specified, the first d axes are interpolated, where d=pts.shape[-1]. That is, this indicates which axis each coordinate dimension corresponds to.
interp (str) – interpolation mode
bordermode (str) – how locations outside the image are handled
Example
>>> from kwimage.util_warp import * # NOQA >>> img = np.arange(3 * 3).reshape(3, 3) >>> pts = np.array([[1, 1], [1.5, 1.5], [1.9, 1.1]]) >>> subpixel_getvalue(img, pts) array([4. , 6. , 6.8]) >>> subpixel_getvalue(img, pts, coord_axes=(1, 0)) array([4. , 6. , 5.2]) >>> # xdoctest: +REQUIRES(module:torch) >>> img = torch.Tensor(img) >>> pts = torch.Tensor(pts) >>> subpixel_getvalue(img, pts) tensor([4.0000, 6.0000, 6.8000]) >>> subpixel_getvalue(img.numpy(), pts.numpy(), interp='nearest') array([4., 8., 7.], dtype=float32) >>> subpixel_getvalue(img.numpy(), pts.numpy(), interp='nearest', coord_axes=[1, 0]) array([4., 8., 5.], dtype=float32) >>> subpixel_getvalue(img, pts, interp='nearest') tensor([4., 8., 7.])
References
stackoverflow.com/questions/12729228/simple-binlin-interp-images-numpy
- SeeAlso:
cv2.getRectSubPix(image, patchSize, center[, patch[, patchType]])
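Bilinear sampling at a single row/column location can be sketched in pure Python. This is a minimal sketch and not the library's implementation: `bilinear_getvalue` is a hypothetical name, and clamping coordinates to the image border is an assumption about what the 'edge' border mode means.

```python
import math


def bilinear_getvalue(img, r, c):
    """Sample a 2D list-of-lists image at fractional (row, col).

    Hypothetical helper; out-of-bounds coordinates are clamped
    to the border (an assumed reading of bordermode='edge').
    """
    h, w = len(img), len(img[0])
    r = min(max(r, 0), h - 1)
    c = min(max(c, 0), w - 1)
    r0, c0 = int(math.floor(r)), int(math.floor(c))
    r1, c1 = min(r0 + 1, h - 1), min(c0 + 1, w - 1)
    fr, fc = r - r0, c - c0
    # Interpolate along columns on the two bracketing rows ...
    top = img[r0][c0] * (1 - fc) + img[r0][c1] * fc
    bot = img[r1][c0] * (1 - fc) + img[r1][c1] * fc
    # ... then interpolate between those rows.
    return top * (1 - fr) + bot * fr


img = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
values = [bilinear_getvalue(img, r, c) for r, c in [(1, 1), (1.5, 1.5), (1.9, 1.1)]]
```

Up to floating point rounding, these three samples are 4, 6, and 6.8, the values shown for the bilinear case in the example above.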
- kwimage.util_warp.subpixel_setvalue(img, pts, value, coord_axes=None, interp='bilinear', bordermode='edge')[source]¶
Set values at subpixel locations
- Parameters
img (ArrayLike) – image to set values in
pts (ArrayLike) – subpixel rc-coordinates to set
value (ArrayLike) – value to place in the image
coord_axes (Sequence, default=None) – axes to perform interpolation on. If not specified, the first d axes are interpolated, where d=pts.shape[-1]. That is, this indicates which axis each coordinate dimension corresponds to.
interp (str) – interpolation mode
bordermode (str) – how locations outside the image are handled
Example
>>> from kwimage.util_warp import * # NOQA >>> img = np.arange(3 * 3).reshape(3, 3).astype(float) >>> pts = np.array([[1, 1], [1.5, 1.5], [1.9, 1.1]]) >>> interp = 'bilinear' >>> value = 0 >>> print('img = {!r}'.format(img)) >>> pts = np.array([[1.5, 1.5]]) >>> img2 = subpixel_setvalue(img.copy(), pts, value) >>> print('img2 = {!r}'.format(img2)) >>> pts = np.array([[1.0, 1.0]]) >>> img2 = subpixel_setvalue(img.copy(), pts, value) >>> print('img2 = {!r}'.format(img2)) >>> pts = np.array([[1.1, 1.9]]) >>> img2 = subpixel_setvalue(img.copy(), pts, value) >>> print('img2 = {!r}'.format(img2)) >>> img2 = subpixel_setvalue(img.copy(), pts, value, coord_axes=[1, 0]) >>> print('img2 = {!r}'.format(img2))
Package Contents¶
Classes¶
Used for converting a single color between spaces and encodings. |
|
Converts boxes between different formats as long as the last dimension |
|
A data structure to store n-dimensional coordinate geometry. |
|
Container for holding and manipulating multiple detections. |
|
Keeps track of a downscaled heatmap and how to transform it to overlay the |
|
Manages a single segmentation mask and can convert to and from |
|
Store and manipulate multiple masks, usually within the same image |
|
Data structure for storing multiple polygons (typically related to the same |
|
Stores multiple keypoints for a single object. |
|
Stores a list of Points, each item usually corresponds to a different object. |
|
Represents a single polygon as set of exterior boundary points and a list |
|
Stores and allows manipulation of multiple polygons, usually within the |
|
Either holds a MultiPolygon, Polygon, or Mask |
|
Store and manipulate multiple segmentations (masks or polygons), usually |
|
Helper for making affine transform matrices. |
|
Base class for matrix-based transform. |
|
Base class for matrix-based transform. |
|
Currently just a stub class that may be used to implement projective / |
|
Inherit from this class and define |
Functions¶
List available values for the impl kwarg of non_max_supression |
|
|
Divide-and-conquer speedup of the non-max-suppression algorithm for when bboxes |
|
Non-Maximum Suppression - remove redundant bounding boxes |
|
Returns the input image with 4 channels. |
|
Places img1 on top of img2 respecting alpha channels. |
|
Stacks a sequences of layers on top of one another. The first item is the |
|
Ensures that there are 3 channels in the image |
|
Ensure that an image is properly encoded using a floating point type |
|
Ensure that an image is properly encoded as uint8. |
|
Broadcasts image arrays so they can have elementwise operations applied |
|
Rebalance pixel intensities via contrast stretching. |
|
Normalize data intensities using heuristics to help put sensor data with |
|
Returns the number of color channels in an image. |
|
Allows slices with out-of-bound coordinates. Any out of bounds coordinate |
|
Converts colorspace of img. |
|
Creates a 2D gaussian patch with a specific size and sigma |
|
Crop an image about a specified point, padding if necessary. |
|
Resize an image based on a scale factor, final size, or size and aspect |
|
DEPRECATED and removed: use imresize instead |
|
Applies an affine transformation to an image with optional antialiasing. |
|
Creates a checkerboard image |
|
Ensures that the test image exists (this might use the network), reads it |
|
Ensures that the test image exists (this might use the network) and returns |
|
Draws boxes on an image. |
|
Draws classification label on an image. |
|
Draw line segments between pts1 and pts2 on an image. |
|
Draws multiline text on an image using opencv |
|
Create an image representing a 2D vector field. |
|
Colorizes a single-channel intensity mask (with an alpha channel) |
|
Makes a colormap in HSV space where the orientation changes color and mag |
|
Create an image representing a 2D vector field. |
|
Applies a mask to the fourier spectrum of an image |
|
In [1] they use a radius of 11.0 on CIFAR-10. |
|
Reads image data in a specified format using some backend implementation. |
|
Writes image data to disk. |
|
Determine the height/width/channels of an image without reading the entire |
|
Decode run length encoding back into an image. |
|
Construct the run length encoding (RLE) of an image. |
|
Translates a run-length encoded image in RLE-space. |
|
Make a new image with the input images side-by-side |
|
Stacks images in a grid. Optionally return transforms of original image |
|
Smooths the probability map, but preserves the magnitude of the peaks. |
|
Add a homogeneous coordinate to a point array |
|
Remove the homogeneous coordinate from a point array. |
|
Add the source values array into the destination array at a particular |
|
Returns an aligned version of the source tensor and destination index. |
|
Get values at subpixel locations |
|
Take the max of the source values array and the destination array at a |
|
Take the min of the source values array and the destination array at a |
|
Add the source values array into the destination array at a particular |
|
Set values at subpixel locations |
|
Take a subpixel slice from a larger image. The returned output is |
|
Translates an image by a subpixel shift value using bilinear interpolation |
|
|
|
Warp ND points / coordinates using a transformation matrix. |
|
A pytorch implementation of warp affine that works similarly to |
- kwimage.available_nms_impls()¶
List available values for the impl kwarg of non_max_supression
- CommandLine:
xdoctest -m kwimage.algo.algo_nms available_nms_impls
Example
>>> impls = available_nms_impls() >>> assert 'numpy' in impls >>> print('impls = {!r}'.format(impls))
- kwimage.daq_spatial_nms(ltrb, scores, diameter, thresh, max_depth=6, stop_size=2048, recsize=2048, impl='auto', device_id=None)¶
Divide-and-conquer speedup of the non-max-suppression algorithm for when bboxes have a known max size
- Parameters
ltrb (ndarray) – boxes in (tlx, tly, brx, bry) format
scores (ndarray) – scores of each box
diameter (int or Tuple[int, int]) – Distance from split point to consider rectification. If specified as an integer, then number is used for both height and width. If specified as a tuple, then dims are assumed to be in [height, width] format.
thresh (float) – iou threshold. Boxes are removed if they overlap greater than this threshold. 0 is the most strict, resulting in the fewest boxes, and 1 is the most permissive resulting in the most.
max_depth (int) – maximum number of times we can divide and conquer
stop_size (int) – number of boxes that triggers full NMS computation
recsize (int) – number of boxes that triggers full NMS recombination
impl (str) – algorithm to use
- LookInfo:
# Didn’t read yet but it seems similar http://www.cyberneum.de/fileadmin/user_upload/files/publications/CVPR2010-Lampert_[0].pdf
https://www.researchgate.net/publication/220929789_Efficient_Non-Maximum_Suppression
# This seems very similar https://projet.liris.cnrs.fr/m2disco/pub/Congres/2006-ICPR/DATA/C03_0406.PDF
Example
>>> import kwimage >>> # Make a bunch of boxes with the same width and height >>> #boxes = kwimage.Boxes.random(230397, scale=1000, format='cxywh') >>> boxes = kwimage.Boxes.random(237, scale=1000, format='cxywh') >>> boxes.data.T[2] = 10 >>> boxes.data.T[3] = 10 >>> # >>> ltrb = boxes.to_ltrb().data.astype(np.float32) >>> scores = np.arange(0, len(ltrb)).astype(np.float32) >>> # >>> n_megabytes = (ltrb.size * ltrb.dtype.itemsize) / (2 ** 20) >>> print('n_megabytes = {!r}'.format(n_megabytes)) >>> # >>> thresh = iou_thresh = 0.01 >>> impl = 'auto' >>> max_depth = 20 >>> diameter = 10 >>> stop_size = 2000 >>> recsize = 500 >>> # >>> import ubelt as ub >>> # >>> with ub.Timer(label='daq'): >>> keep1 = daq_spatial_nms(ltrb, scores, >>> diameter=diameter, thresh=thresh, max_depth=max_depth, >>> stop_size=stop_size, recsize=recsize, impl=impl) >>> # >>> with ub.Timer(label='full'): >>> keep2 = non_max_supression(ltrb, scores, >>> thresh=thresh, impl=impl) >>> # >>> # Due to the greedy nature of the algorithm, there will be slight >>> # differences in results, but they will be mostly similar. >>> similarity = len(set(keep1) & set(keep2)) / len(set(keep1) | set(keep2)) >>> print('similarity = {!r}'.format(similarity))
- kwimage.non_max_supression(ltrb, scores, thresh, bias=0.0, classes=None, impl='auto', device_id=None)¶
Non-Maximum Suppression - remove redundant bounding boxes
- Parameters
ltrb (ndarray[float32]) – Nx4 boxes in ltrb format
scores (ndarray[float32]) – score for each bbox
thresh (float) – iou threshold. Boxes are removed if they overlap greater than this threshold (i.e. Boxes are removed if iou > threshold). Thresh = 0 is the most strict, resulting in the fewest boxes, and 1 is the most permissive resulting in the most.
bias (float) – bias for iou computation either 0 or 1
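classes (ndarray[int64] or None) – integer classes. If specified, NMS is done on a per-class basis.
impl (str) – implementation to use. Can be “auto”, “python”, “cython_cpu”, “gpu”, “torch”, or “torchvision”; the examples below also pass “numpy” and “cython_gpu”. Use available_nms_impls() to list the values available in the current environment.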
device_id (int) – used if impl is gpu, device id to work on. If not specified torch.cuda.current_device() is used.
Notes
Using impl=’cython_gpu’ may result in a CUDA memory error that is not exposed to the Python process. In other words, your program will hard crash if impl=’cython_gpu’ and you feed it too many bounding boxes. Ideally this will be fixed in the future.
References
https://github.com/facebookresearch/Detectron/blob/master/detectron/utils/cython_nms.pyx https://www.pyimagesearch.com/2015/02/16/faster-non-maximum-suppression-python/ https://github.com/bharatsingh430/soft-nms/blob/master/lib/nms/cpu_nms.pyx <- TODO
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/algo/algo_nms.py non_max_supression
Example
>>> from kwimage.algo.algo_nms import * >>> from kwimage.algo.algo_nms import _impls >>> import numpy as np >>> import kwarray >>> ltrb = np.array([ >>> [0, 0, 100, 100], >>> [100, 100, 10, 10], >>> [10, 10, 100, 100], >>> [50, 50, 100, 100], >>> ], dtype=np.float32) >>> scores = np.array([.1, .5, .9, .1]) >>> keep = non_max_supression(ltrb, scores, thresh=0.5, impl='numpy') >>> print('keep = {!r}'.format(keep)) >>> assert list(keep) == [2, 1, 3] >>> thresh = 0.0 >>> non_max_supression(ltrb, scores, thresh, impl='numpy') >>> if 'numpy' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='numpy') >>> assert list(keep) == [2, 1] >>> if 'cython_cpu' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='cython_cpu') >>> assert list(keep) == [2, 1] >>> if 'cython_gpu' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='cython_gpu') >>> assert list(keep) == [2, 1] >>> if 'torch' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='torch') >>> assert set(keep.tolist()) == {2, 1} >>> if 'torchvision' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='torchvision') # note torchvision has no bias >>> assert list(keep) == [2] >>> thresh = 1.0 >>> if 'numpy' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='numpy') >>> assert list(keep) == [2, 1, 3, 0] >>> if 'cython_cpu' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='cython_cpu') >>> assert list(keep) == [2, 1, 3, 0] >>> if 'cython_gpu' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='cython_gpu') >>> assert list(keep) == [2, 1, 3, 0] >>> if 'torch' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='torch') >>> assert set(keep.tolist()) == {2, 1, 3, 0} >>> if 'torchvision' in available_nms_impls(): >>> keep = non_max_supression(ltrb, scores, thresh, impl='torchvision') # note torchvision has no bias >>> assert set(kwarray.ArrayAPI.tolist(keep)) == {2, 1, 3, 0}
Example
>>> import ubelt as ub >>> ltrb = np.array([ >>> [0, 0, 100, 100], >>> [100, 100, 10, 10], >>> [10, 10, 100, 100], >>> [50, 50, 100, 100], >>> [100, 100, 150, 101], >>> [120, 100, 180, 101], >>> [150, 100, 200, 101], >>> ], dtype=np.float32) >>> scores = np.linspace(0, 1, len(ltrb)) >>> thresh = .2 >>> solutions = {} >>> if not _impls._funcs: >>> _impls._lazy_init() >>> for impl in _impls._funcs: >>> keep = non_max_supression(ltrb, scores, thresh, impl=impl) >>> solutions[impl] = sorted(keep) >>> assert 'numpy' in solutions >>> print('solutions = {}'.format(ub.repr2(solutions, nl=1))) >>> assert ub.allsame(solutions.values())
Example
>>> import ubelt as ub >>> # Check that zero-area boxes are ok >>> ltrb = np.array([ >>> [0, 0, 0, 0], >>> [0, 0, 0, 0], >>> [10, 10, 10, 10], >>> ], dtype=np.float32) >>> scores = np.array([1, 2, 3], dtype=np.float32) >>> thresh = .2 >>> solutions = {} >>> if not _impls._funcs: >>> _impls._lazy_init() >>> for impl in _impls._funcs: >>> keep = non_max_supression(ltrb, scores, thresh, impl=impl) >>> solutions[impl] = sorted(keep) >>> assert 'numpy' in solutions >>> print('solutions = {}'.format(ub.repr2(solutions, nl=1))) >>> assert ub.allsame(solutions.values())
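The greedy procedure implied by the thresh parameter (a box is removed when its IoU with a higher-scoring kept box exceeds thresh) can be sketched in pure Python. This is an illustrative sketch and not one of kwimage's backends: `greedy_nms` is a hypothetical name, adding `bias` to widths and heights is an assumption about the convention, and ties between equal scores may be ordered differently than in the library.

```python
def greedy_nms(ltrb, scores, thresh, bias=0.0):
    """Return indices of kept boxes, highest score first (hypothetical helper)."""

    def area(b):
        return (b[2] - b[0] + bias) * (b[3] - b[1] + bias)

    def iou(a, b):
        # Width and height of the intersection, clipped at zero.
        iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]) + bias)
        ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]) + bias)
        inter = iw * ih
        union = area(a) + area(b) - inter
        # Guard against zero-area boxes.
        return inter / union if union > 0 else 0.0

    # Visit boxes from highest to lowest score.
    order = sorted(range(len(ltrb)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too much.
        order = [i for i in order if iou(ltrb[best], ltrb[i]) <= thresh]
    return keep


ltrb = [[0, 0, 100, 100], [100, 100, 10, 10], [10, 10, 100, 100], [50, 50, 100, 100]]
scores = [.1, .5, .9, .1]
keep = greedy_nms(ltrb, scores, thresh=0.5)
```

On these boxes the sketch keeps `[2, 1, 3]` at `thresh=0.5` and `[2, 1]` at `thresh=0.0`, which agrees with the assertions in the first example above.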
- kwimage.ensure_alpha_channel(img, alpha=1.0, dtype=np.float32, copy=False)[source]¶
Returns the input image with 4 channels.
- Parameters
img (ndarray) – an image with shape [H, W], [H, W, 1], [H, W, 3], or [H, W, 4].
alpha (float, default=1.0) – default value for missing alpha channel
dtype (type, default=np.float32) – a numpy floating type
copy (bool, default=False) – always copy if True, else copy if needed.
- Returns
an image with specified dtype with shape [H, W, 4].
- Raises
ValueError – if the input image does not have 1, 3, or 4 input channels, or if the image cannot be converted into a float01 representation
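The channel handling described above can be sketched with numpy. This is a minimal sketch and not the library's code: `with_alpha` is a hypothetical name, and the input is assumed to already hold float values in the range 0-1 (the real function also converts other encodings).

```python
import numpy as np


def with_alpha(img, alpha=1.0):
    """Return an [H, W, 4] float32 image (hypothetical helper).

    Assumes `img` already holds values in the range 0-1.
    """
    img = np.asarray(img, dtype=np.float32)
    if img.ndim == 2:
        # Treat [H, W] as a single-channel image.
        img = img[:, :, None]
    channels = img.shape[2]
    if channels == 4:
        # Alpha is already present; leave it untouched.
        return img
    if channels == 1:
        # Replicate gray into three color channels.
        img = np.repeat(img, 3, axis=2)
    elif channels != 3:
        raise ValueError('expected 1, 3, or 4 channels, got {}'.format(channels))
    alpha_plane = np.full(img.shape[:2] + (1,), alpha, dtype=img.dtype)
    return np.concatenate([img, alpha_plane], axis=2)


rgba = with_alpha(np.zeros((4, 5, 3)), alpha=0.5)
```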
- kwimage.overlay_alpha_images(img1, img2, keepalpha=True, dtype=np.float32, impl='inplace')[source]¶
Places img1 on top of img2 respecting alpha channels. Works like Photoshop layers with opacity.
- Parameters
img1 (ndarray) – top image to overlay over img2
img2 (ndarray) – base image to superimpose on
keepalpha (bool) – if False, the alpha channel is removed after blending
dtype (np.dtype) – format for blending computation (defaults to float32)
impl (str, default=inplace) – code specifying the backend implementation
- Returns
raster: the blended images
- Return type
ndarray
Todo
[ ] Make fast C++ version of this function
References
http://stackoverflow.com/questions/25182421/overlay-numpy-alpha https://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending
Example
>>> import kwimage >>> img1 = kwimage.grab_test_image('astro', dsize=(100, 100)) >>> img2 = kwimage.grab_test_image('carl', dsize=(100, 100)) >>> img1 = kwimage.ensure_alpha_channel(img1, alpha=.5) >>> img3 = overlay_alpha_images(img1, img2) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(img3) >>> kwplot.show_if_requested()
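The blending itself follows the standard "over" operator from the alpha compositing reference listed above. A minimal numpy sketch is given here. `over` is a hypothetical name, both inputs are assumed to be float RGBA arrays in the range 0-1 with matching shapes, and this is not a claim about how any particular `impl` backend is written.

```python
import numpy as np


def over(top, bottom):
    """Composite `top` over `bottom`; both are [..., 4] float RGBA in 0-1.

    Hypothetical helper implementing the standard "over" operator.
    """
    top = np.asarray(top, dtype=float)
    bottom = np.asarray(bottom, dtype=float)
    rgb1, a1 = top[..., :3], top[..., 3:4]
    rgb2, a2 = bottom[..., :3], bottom[..., 3:4]
    # The bottom layer only shows through where the top is transparent.
    a_out = a1 + a2 * (1 - a1)
    # Avoid dividing by zero where the result is fully transparent.
    safe = np.where(a_out > 0, a_out, 1.0)
    rgb_out = (rgb1 * a1 + rgb2 * a2 * (1 - a1)) / safe
    return np.concatenate([rgb_out, a_out], axis=-1)


half_red = np.array([[[1.0, 0.0, 0.0, 0.5]]])
solid_blue = np.array([[[0.0, 0.0, 1.0, 1.0]]])
blended = over(half_red, solid_blue)
```

Half-transparent red over opaque blue gives an opaque purple pixel with RGB (0.5, 0, 0.5).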
- kwimage.overlay_alpha_layers(layers, keepalpha=True, dtype=np.float32)[source]¶
Stacks a sequence of layers on top of one another. The first item is the topmost layer and the last item is the bottommost layer.
- Parameters
layers (Sequence[ndarray]) – stack of images
keepalpha (bool) – if False, the alpha channel is removed after blending
dtype (np.dtype) – format for blending computation (defaults to float32)
- Returns
raster: the blended images
- Return type
ndarray
References
http://stackoverflow.com/questions/25182421/overlay-numpy-alpha https://en.wikipedia.org/wiki/Alpha_compositing#Alpha_blending
Example
>>> import kwimage >>> keys = ['astro', 'carl', 'stars'] >>> layers = [kwimage.grab_test_image(k, dsize=(100, 100)) for k in keys] >>> layers = [kwimage.ensure_alpha_channel(g, alpha=.5) for g in layers] >>> stacked = overlay_alpha_layers(layers) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(stacked) >>> kwplot.show_if_requested()
- class kwimage.Color(color, alpha=None, space=None)[source]¶
Bases:
ubelt.NiceRepr
Used for converting a single color between spaces and encodings. This should only be used when handling small numbers of colors (e.g. 1); don’t use this to represent an image.
move to colorutil?
- Parameters
space (str) – colorspace of wrapped color. Assume RGB if not specified and it cannot be inferred
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/im_color.py Color
Example
>>> print(Color('g')) >>> print(Color('orangered')) >>> print(Color('#AAAAAA').as255()) >>> print(Color([0, 255, 0])) >>> print(Color([1, 1, 1.])) >>> print(Color([1, 1, 1])) >>> print(Color(Color([1, 1, 1])).as255()) >>> print(Color(Color([1., 0, 1, 0])).ashex()) >>> print(Color([1, 1, 1], alpha=255)) >>> print(Color([1, 1, 1], alpha=255, space='lab'))
- __nice__(self)¶
- _forimage(self, image, space='rgb')¶
Experimental function.
Create a numeric color tuple that agrees with the format of the input image (i.e. float or int, with 3 or 4 channels).
- Parameters
image (ndarray) – image to return color for
space (str, default=rgb) – colorspace of the input image.
Example
>>> img_f3 = np.zeros([8, 8, 3], dtype=np.float32) >>> img_u3 = np.zeros([8, 8, 3], dtype=np.uint8) >>> img_f4 = np.zeros([8, 8, 4], dtype=np.float32) >>> img_u4 = np.zeros([8, 8, 4], dtype=np.uint8) >>> Color('red')._forimage(img_f3) (1.0, 0.0, 0.0) >>> Color('red')._forimage(img_f4) (1.0, 0.0, 0.0, 1.0) >>> Color('red')._forimage(img_u3) (255, 0, 0) >>> Color('red')._forimage(img_u4) (255, 0, 0, 255) >>> Color('red', alpha=0.5)._forimage(img_f4) (1.0, 0.0, 0.0, 0.5) >>> Color('red', alpha=0.5)._forimage(img_u4) (255, 0, 0, 127)
- ashex(self, space=None)¶
- as255(self, space=None)¶
- as01(self, space=None)¶
self = mplutil.Color(‘red’) mplutil.Color(‘green’).as01(‘rgba’)
- classmethod _is_base01(channels)¶
check if a color is in base 01
- classmethod _is_base255(Color, channels)¶
there is one corner case where all pixels are 1 or less
- classmethod _hex_to_01(Color, hex_color)¶
hex_color = ‘#6A5AFFAF’
- _ensure_color01(Color, color)¶
Infer what type color is and normalize to 01
- classmethod _255_to_01(Color, color255)¶
converts base 255 color to base 01 color
- classmethod _string_to_01(Color, color)¶
mplutil.Color._string_to_01(‘green’) mplutil.Color._string_to_01(‘red’)
- classmethod named_colors(cls)¶
- Returns
names of colors that Color accepts
- Return type
List[str]
Example
>>> import kwimage >>> named_colors = kwimage.Color.named_colors() >>> color_lut = {name: kwimage.Color(name).as01() for name in named_colors} >>> # xdoctest: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.autompl() >>> canvas = kwplot.make_legend_img(color_lut) >>> kwplot.imshow(canvas)
- classmethod distinct(Color, num, space='rgb')¶
Make multiple distinct colors
- classmethod random(Color, pool='named')¶
- kwimage.atleast_3channels(arr, copy=True)[source]¶
Ensures that there are 3 channels in the image
- Parameters
arr (ndarray[N, M, …]) – the image
copy (bool) – Always copies if True. If False, copies only when the size of the array must change.
- Returns
with shape (N, M, C), where C in {3, 4}
- Return type
ndarray
- Doctest:
>>> assert atleast_3channels(np.zeros((10, 10))).shape[-1] == 3 >>> assert atleast_3channels(np.zeros((10, 10, 1))).shape[-1] == 3 >>> assert atleast_3channels(np.zeros((10, 10, 3))).shape[-1] == 3 >>> assert atleast_3channels(np.zeros((10, 10, 4))).shape[-1] == 4
- kwimage.ensure_float01(img, dtype=np.float32, copy=True)[source]¶
Ensure that an image is properly encoded as floating point values in the range 0-1 (float32 by default)
- Parameters
img (ndarray) – an image in uint255 or float01 format. Other formats will raise errors.
dtype (type, default=np.float32) – a numpy floating type
copy (bool, default=True) – always copy if True, else copy if needed.
- Returns
an array of floats in the range 0-1
- Return type
ndarray
- Raises
ValueError – if the image type is integer and not in [0-255]
Example
>>> ensure_float01(np.array([[0, .5, 1.0]])) array([[0. , 0.5, 1. ]], dtype=float32) >>> ensure_float01(np.array([[0, 1, 200]])) array([[0..., 0.0039..., 0.784...]], dtype=float32)
- kwimage.ensure_uint255(img, copy=True)[source]¶
Ensure that an image is properly encoded as uint8 values in the range 0-255.
- Parameters
img (ndarray) – an image in uint255 or float01 format. Other formats will raise errors.
copy (bool, default=True) – always copy if True, else copy if needed.
- Returns
an array of bytes in the range 0-255
- Return type
ndarray
- Raises
ValueError – if the image type is float and not in [0-1]
ValueError – if the image type is integer and not in [0-255]
Example
>>> ensure_uint255(np.array([[0, .5, 1.0]])) array([[ 0, 127, 255]], dtype=uint8) >>> ensure_uint255(np.array([[0, 1, 200]])) array([[ 0, 1, 200]], dtype=uint8)
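The value mapping behind ensure_float01 and ensure_uint255 can be sketched on plain Python lists. Both function names below are hypothetical, and truncation (rather than rounding) is an assumption made because it matches the documented result of 0.5 becoming 127.

```python
def float01_from_uint255(values):
    """Map integers in 0-255 to floats in 0-1."""
    if any(v < 0 or v > 255 for v in values):
        raise ValueError('integer data must be in [0, 255]')
    return [v / 255.0 for v in values]


def uint255_from_float01(values):
    """Map floats in 0-1 to integers in 0-255."""
    if any(v < 0.0 or v > 1.0 for v in values):
        raise ValueError('float data must be in [0, 1]')
    # Assumption: truncation, which matches 0.5 -> 127 in the example
    return [int(v * 255) for v in values]


print(uint255_from_float01([0, 0.5, 1.0]))
print([round(v, 4) for v in float01_from_uint255([0, 1, 200])])
```

The real functions additionally inspect the array dtype to decide which direction, if any, needs to be applied.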
- kwimage.make_channels_comparable(img1, img2, atleast3d=False)[source]¶
Broadcasts image arrays so they can have elementwise operations applied
- Parameters
img1 (ndarray) – first image
img2 (ndarray) – second image
atleast3d (bool, default=False) – if true we ensure that the channel dimension exists (only relevant for 1-channel images)
Example
>>> import itertools as it >>> wh_basis = [(5, 5), (3, 5), (5, 3), (1, 1), (1, 3), (3, 1)] >>> for w, h in wh_basis: >>> shape_basis = [(w, h), (w, h, 1), (w, h, 3)] >>> # Test all permutations of shape inputs >>> for shape1, shape2 in it.product(shape_basis, shape_basis): >>> print('* input shapes: %r, %r' % (shape1, shape2)) >>> img1 = np.empty(shape1) >>> img2 = np.empty(shape2) >>> img1, img2 = make_channels_comparable(img1, img2) >>> print('... output shapes: %r, %r' % (img1.shape, img2.shape)) >>> elem = (img1 + img2) >>> print('... elem(+) shape: %r' % (elem.shape,)) >>> assert elem.size == img1.size, 'outputs should have same size' >>> assert img1.size == img2.size, 'new imgs should have same size' >>> print('--------')
- kwimage.normalize(arr, mode='linear', alpha=None, beta=None, out=None)[source]¶
Rebalance pixel intensities via contrast stretching.
By default linearly stretches pixel intensities to minimum and maximum values.
Notes
DEPRECATED: this function has been MOVED to
kwarray.normalize
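For readers unfamiliar with the term, linear contrast stretching maps the smallest input value to the low end of the output range and the largest to the high end. The sketch below is a generic min-max stretch on a flat list. It is not the kwarray.normalize implementation, and the name `linear_stretch` and its arguments are hypothetical.

```python
def linear_stretch(values, lo=0.0, hi=1.0):
    """Generic min-max stretch of a flat list into the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # A constant input has no contrast to stretch
        return [lo for _ in values]
    span = vmax - vmin
    return [lo + (v - vmin) * (hi - lo) / span for v in values]


print(linear_stretch([10, 20, 30, 50]))
```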
- kwimage.normalize_intensity(imdata, return_info=False, nodata=None, axis=None, dtype=np.float32)[source]¶
Normalize data intensities using heuristics to help put sensor data with extremely high or low contrast into a visible range.
This function is designed with an emphasis on getting something that is reasonable for visualization.
- Parameters
imdata (ndarray) – raw intensity data
return_info (bool, default=False) – if True, return information about the chosen normalization heuristic.
nodata – A value representing nodata to leave unchanged during normalization, for example 0
dtype – can be float32 or float64
- Returns
a floating point array with values between 0 and 1.
- Return type
ndarray
Example
>>> from kwimage.im_core import * # NOQA >>> import ubelt as ub >>> import kwimage >>> import kwarray >>> s = 512 >>> bit_depth = 11 >>> dtype = np.uint16 >>> max_val = int(2 ** bit_depth) >>> min_val = int(0) >>> rng = kwarray.ensure_rng(0) >>> background = np.random.randint(min_val, max_val, size=(s, s), dtype=dtype) >>> poly1 = kwimage.Polygon.random(rng=rng).scale(s / 2) >>> poly2 = kwimage.Polygon.random(rng=rng).scale(s / 2).translate(s / 2) >>> forground = np.zeros_like(background, dtype=np.uint8) >>> forground = poly1.fill(forground, value=255) >>> forground = poly2.fill(forground, value=122) >>> forground = (kwimage.ensure_float01(forground) * max_val).astype(dtype) >>> imdata = background + forground >>> normed, info = normalize_intensity(imdata, return_info=True) >>> print('info = {}'.format(ub.repr2(info, nl=1))) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(imdata, pnum=(1, 2, 1), fnum=1) >>> kwplot.imshow(normed, pnum=(1, 2, 2), fnum=1)
Example
>>> from kwimage.im_core import * # NOQA >>> import ubelt as ub >>> import kwimage >>> # Test on an image that is already normalized to test how it >>> # degrades >>> imdata = kwimage.grab_test_image() >>> normed, info = normalize_intensity(imdata, return_info=True) >>> print('info = {}'.format(ub.repr2(info, nl=1))) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(imdata, pnum=(1, 2, 1), fnum=1) >>> kwplot.imshow(normed, pnum=(1, 2, 2), fnum=1)
- kwimage.num_channels(img)[source]¶
Returns the number of color channels in an image.
Assumes images are 2D and the channels are the trailing dimension. Returns 1 in the case with no trailing channel dimension, otherwise simply returns img.shape[2].
- Parameters
img (ndarray) – an image with 2 or 3 dimensions.
- Returns
the number of color channels (1, 3, or 4)
- Return type
int
Example
>>> H = W = 3 >>> assert num_channels(np.empty((W, H))) == 1 >>> assert num_channels(np.empty((W, H, 1))) == 1 >>> assert num_channels(np.empty((W, H, 3))) == 3 >>> assert num_channels(np.empty((W, H, 4))) == 4 >>> assert num_channels(np.empty((W, H, 2))) == 2
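The rule described above depends only on the array shape, so it can be sketched on shape tuples. The function `num_channels_from_shape` is a hypothetical illustration, not the library function.

```python
def num_channels_from_shape(shape):
    """Hypothetical shape-only version of the rule described above."""
    if len(shape) == 2:
        return 1  # no trailing channel dimension
    if len(shape) == 3:
        return shape[2]
    raise ValueError('expected a 2D or 3D image shape')


for shape in [(3, 3), (3, 3, 1), (3, 3, 3), (3, 3, 4)]:
    print(shape, num_channels_from_shape(shape))
```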
- kwimage.padded_slice(data, in_slice, pad=None, padkw=None, return_info=False)[source]¶
Allows slices with out-of-bound coordinates. Any out of bounds coordinate will be sampled via padding.
DEPRECATED FOR THE VERSION IN KWARRAY (slices are more array-ish than image-ish)
Note
Negative slices have a different meaning here than they usually do. Normally, they indicate a wrap-around or a reversed stride, but here they index into out-of-bounds space (which depends on the pad mode). For example a slice of -2:1 literally samples two pixels to the left of the data and one pixel from the data, so you get two padded values and one data value.
- Parameters
data (Sliceable[T]) – data to slice into. Any channels must be the last dimension.
in_slice (slice | Tuple[slice, …]) – slice for each dimensions
ndim (int) – number of spatial dimensions
pad (List[int|Tuple]) – additional padding of the slice
padkw (Dict) – if unspecified defaults to {'mode': 'constant'}
return_info (bool, default=False) – if True, return extra information about the transform.
- SeeAlso:
_padded_slice_embed - finds the embedded slice and padding _padded_slice_apply - applies padding to sliced data
- Returns
- data_sliced: subregion of the input data (possibly with padding, depending on if the original slice went out of bounds)
- Tuple[Sliceable, Dict] :
data_sliced : as above
transform : information on how to return to the original coordinates
- Currently a dict containing:
- st_dims: a list indicating the low and high space-time coordinate values of the returned data slice.
The structure of this dictionary may change in the future
- Return type
Sliceable
Example
>>> data = np.arange(5) >>> in_slice = [slice(-2, 7)]
>>> data_sliced = padded_slice(data, in_slice) >>> print(ub.repr2(data_sliced, with_dtype=False)) np.array([0, 0, 0, 1, 2, 3, 4, 0, 0])
>>> data_sliced = padded_slice(data, in_slice, pad=(3, 3)) >>> print(ub.repr2(data_sliced, with_dtype=False)) np.array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0])
>>> data_sliced = padded_slice(data, slice(3, 4), pad=[(1, 0)]) >>> print(ub.repr2(data_sliced, with_dtype=False)) np.array([2, 3])
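The out-of-bounds semantics in the examples above can be reproduced for the one-dimensional, constant-padding case with plain lists. The helper `padded_slice_1d` is hypothetical and assumes `stop >= start`. The real function also handles multiple dimensions and other pad modes.

```python
def padded_slice_1d(data, start, stop, pad=(0, 0), fill=0):
    """Hypothetical 1D, constant-mode analogue of padded_slice."""
    # Extra padding widens the requested window on each side
    start -= pad[0]
    stop += pad[1]
    # Clip the window to the part that overlaps the data
    lo = max(start, 0)
    hi = min(stop, len(data))
    inner = list(data[lo:hi]) if hi > lo else []
    # Everything requested outside the data is filled with `fill`
    n_left = min(max(-start, 0), stop - start)
    n_right = (stop - start) - n_left - len(inner)
    return [fill] * n_left + inner + [fill] * n_right


data = list(range(5))
print(padded_slice_1d(data, -2, 7))
print(padded_slice_1d(data, -2, 7, pad=(3, 3)))
print(padded_slice_1d(data, 3, 4, pad=(1, 0)))
```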
- kwimage.convert_colorspace(img, src_space, dst_space, copy=False, implicit=False, dst=None)[source]¶
Converts colorspace of img. Convenience function around cv2.cvtColor
- Parameters
img (ndarray) – image data with float32 or uint8 precision
src_space (str) – input image colorspace. (e.g. BGR, GRAY)
dst_space (str) – desired output colorspace. (e.g. RGB, HSV, LAB)
implicit (bool) –
- if False, the user must correctly specify if the input/output colorspaces contain alpha channels.
- If True and the input image has an alpha channel, we modify src_space and dst_space to ensure they both end with “A”.
dst (ndarray[uint8_t, ndim=2], optional) – inplace-output array.
- Returns
img - image data
- Return type
ndarray
Note
Note the LAB and HSV colorspaces in float do not go into the 0-1 range.
- For HSV the floating point range is:
0:360, 0:1, 0:1
- For LAB the floating point range is:
0:100, -86.1875:98.234375, -107.859375:94.46875 (Note, that some extreme combinations of a and b are not valid)
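To make the HSV range concrete, the sketch below uses the standard-library `colorsys` module, which returns hue in 0-1, and rescales hue to the 0-360 convention described in this note. This is only an illustration of the ranges: kwimage wraps cv2.cvtColor rather than colorsys, and the helper name `rgb01_to_hsv_float` is hypothetical.

```python
import colorsys


def rgb01_to_hsv_float(r, g, b):
    """HSV in the float convention noted above: H in 0-360, S and V in 0-1."""
    h, s, v = colorsys.rgb_to_hsv(r, g, b)  # the stdlib returns h in 0-1
    return (h * 360.0, s, v)


# Pure blue sits two thirds of the way around the hue circle
print(rgb01_to_hsv_float(0.0, 0.0, 1.0))
```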
Example
>>> import numpy as np >>> convert_colorspace(np.array([[[0, 0, 1]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[0, 1, 0]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[1, 0, 0]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[1, 1, 1]]], dtype=np.float32), 'RGB', 'LAB') >>> convert_colorspace(np.array([[[0, 0, 1]]], dtype=np.float32), 'RGB', 'HSV')
- Ignore:
# Check LAB output ranges import itertools as it s = 1 _iter = it.product(range(0, 256, s), range(0, 256, s), range(0, 256, s)) minvals = np.full(3, np.inf) maxvals = np.full(3, -np.inf) for r, g, b in ub.ProgIter(_iter, total=(256 // s) ** 3):
img255 = np.array([[[r, g, b]]], dtype=np.uint8) img01 = (img255 / 255.0).astype(np.float32) lab = convert_colorspace(img01, 'rgb', 'lab') np.minimum(lab[0, 0], minvals, out=minvals) np.maximum(lab[0, 0], maxvals, out=maxvals)
print('minvals = {}'.format(ub.repr2(minvals, nl=0))) print('maxvals = {}'.format(ub.repr2(maxvals, nl=0)))
- kwimage.gaussian_patch(shape=(7, 7), sigma=None)[source]¶
Creates a 2D gaussian patch with a specific size and sigma
- Parameters
shape (Tuple[int, int]) – patch height and width
sigma (float | Tuple[float, float]) – Gaussian standard deviation
References
http://docs.opencv.org/modules/imgproc/doc/filtering.html#getgaussiankernel
Todo
[ ] Look into this C-implementation
https://kwgitlab.kitware.com/computer-vision/heatmap/blob/master/heatmap/heatmap.c
- CommandLine:
xdoctest -m kwimage.im_cv2 gaussian_patch --show
Example
>>> import numpy as np >>> shape = (88, 24) >>> sigma = None # 1.0 >>> gausspatch = gaussian_patch(shape, sigma) >>> sum_ = gausspatch.sum() >>> assert np.all(np.isclose(sum_, 1.0)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> norm = (gausspatch - gausspatch.min()) / (gausspatch.max() - gausspatch.min()) >>> kwplot.imshow(norm) >>> kwplot.show_if_requested()
Example
>>> import numpy as np >>> shape = (24, 24) >>> sigma = 3.0 >>> gausspatch = gaussian_patch(shape, sigma) >>> sum_ = gausspatch.sum() >>> assert np.all(np.isclose(sum_, 1.0)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> norm = (gausspatch - gausspatch.min()) / (gausspatch.max() - gausspatch.min()) >>> kwplot.imshow(norm) >>> kwplot.show_if_requested()
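A normalized 2D Gaussian patch can be built as the outer product of two normalized 1D kernels, which is why the examples above can assert that the patch sums to 1. The sketch below is a pure-Python illustration with hypothetical names. The default sigma formula is the one OpenCV documents for getGaussianKernel, and both its use here and the (rows, columns) ordering of a sigma tuple are assumptions about the library's behavior.

```python
import math


def gaussian_kernel_1d(size, sigma=None):
    """Normalized 1D Gaussian weights of the given length."""
    if sigma is None:
        # Assumption: OpenCV's documented default for getGaussianKernel
        sigma = 0.3 * ((size - 1) * 0.5 - 1) + 0.8
    center = (size - 1) / 2.0
    weights = [math.exp(-((i - center) ** 2) / (2.0 * sigma ** 2))
               for i in range(size)]
    total = sum(weights)
    return [w / total for w in weights]


def gaussian_patch_sketch(shape=(7, 7), sigma=None):
    """Hypothetical analogue: outer product of two 1D kernels."""
    sigma_y, sigma_x = sigma if isinstance(sigma, tuple) else (sigma, sigma)
    col = gaussian_kernel_1d(shape[0], sigma_y)
    row = gaussian_kernel_1d(shape[1], sigma_x)
    return [[cy * rx for rx in row] for cy in col]


patch = gaussian_patch_sketch((5, 7), sigma=1.5)
print(len(patch), len(patch[0]), round(sum(map(sum, patch)), 6))
```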
- kwimage.imcrop(img, dsize, about=None, origin=None, border_value=None, interpolation='nearest')[source]¶
Crop an image about a specified point, padding if necessary.
This is like PIL.Image.Image.crop with more convenient arguments, or cv2.getRectSubPix without the baked-in bilinear interpolation.
- Parameters
img (ndarray) – image to crop
dsize (Tuple[None | int, None | int]) – the desired width and height of the new image. If a dimension is None, then it is automatically computed to preserve aspect ratio. This can be larger than the original dims; if so, the cropped image is padded with border_value.
about (Tuple[str | int, str | int]) – the location to crop about. Mutually exclusive with origin. Defaults to top left. If ints (w,h) are provided, that will be the center of the cropped image. There are also string codes available:
- ‘lt’: make the top left point of the image the top left point of the cropped image. This is equivalent to img[:dsize[1], :dsize[0]], plus padding.
- ‘rb’: make the bottom right point of the image the bottom right point of the cropped image. This is equivalent to img[-dsize[1]:, -dsize[0]:], plus padding.
- ‘cc’: make the center of the image the center of the cropped image.
Any combination of these codes can be used, ex. ‘lb’, ‘ct’, (‘r’, 200), …
origin (Tuple[int, int] | None) – the origin of the crop in (x,y) order (same order as dsize/about). Mutually exclusive with about. Defaults to top left.
border_value (Numeric | Tuple | str, default=0) – any border value accepted by cv2.copyMakeBorder, ex. [255, 0, 0] (blue). Default is 0.
interpolation (str, default=’nearest’) – Can be ‘nearest’, in which case integral cropping is used. Can also be ‘linear’, in which case cv2.getRectSubPix is used.
- Returns
the cropped image
- Return type
ndarray
- SeeAlso:
kwarray.padded_slice() - a similar function for working with “negative slices”.
Example
>>> import kwimage >>> import numpy as np >>> # >>> img = kwimage.grab_test_image('astro', dsize=(32, 32)) >>> # >>> # regular crop >>> new_img1 = kwimage.imcrop(img, dsize=(5,6)) >>> assert new_img1.shape == (6, 5, 3) >>> # >>> # padding for coords outside the image bounds >>> new_img2 = kwimage.imcrop(img, dsize=(5,6), >>> origin=(-1,0), border_value=[1, 0, 0]) >>> assert np.all(new_img2[:, 0] == [1, 0, 0]) >>> # >>> # codes for corner- and edge-centered cropping >>> new_img3 = kwimage.imcrop(img, dsize=(5,6), >>> about='cb') >>> # >>> # special code for bilinear interpolation >>> # with floating-point coordinates >>> new_img4 = kwimage.imcrop(img, dsize=(5,6), >>> about=(5.5, 8.5), interpolation='linear') >>> # >>> # use with bounding boxes >>> bbox = kwimage.Boxes.random(scale=5, rng=132).to_xywh().quantize() >>> origin, dsize = np.split(bbox.data[0], 2) >>> new_img5 = kwimage.imcrop(img, dsize=dsize, >>> origin=origin) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nSubplots=6) >>> kwplot.imshow(img, pnum=pnum_()) >>> kwplot.imshow(new_img1, pnum=pnum_()) >>> kwplot.imshow(new_img2, pnum=pnum_()) >>> kwplot.imshow(new_img3, pnum=pnum_()) >>> kwplot.imshow(new_img4, pnum=pnum_()) >>> kwplot.imshow(new_img5, pnum=pnum_()) >>> kwplot.show_if_requested()
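One way to read the `about` codes is as a rule for choosing the top-left corner of the crop window along each axis. The sketch below follows the documented equivalences for ‘lt’ and ‘rb’. The function `crop_origin` is hypothetical, and how it floors the centered and numeric cases is an assumption, so the library may differ by a pixel there.

```python
import math


def crop_origin(img_dsize, dsize, about='lt'):
    """Hypothetical helper: top-left (x, y) of the crop window."""
    img_w, img_h = img_dsize
    crop_w, crop_h = dsize

    def along(full, crop, code, low, high):
        if code == low:
            return 0
        if code == high:
            return full - crop
        if code == 'c':
            return (full - crop) // 2  # assumption: floor on odd remainders
        # Otherwise the entry is a coordinate for the center of the crop
        return math.floor(code - crop / 2)

    return (along(img_w, crop_w, about[0], 'l', 'r'),
            along(img_h, crop_h, about[1], 't', 'b'))


print(crop_origin((32, 32), (5, 6), 'lt'))
print(crop_origin((32, 32), (5, 6), 'rb'))
print(crop_origin((32, 32), (5, 6), 'cb'))
```

An origin that is negative, or a window that extends past the image, is where padding with border_value comes in.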
- kwimage.imresize(img, scale=None, dsize=None, max_dim=None, min_dim=None, interpolation=None, grow_interpolation=None, letterbox=False, return_info=False, antialias=False)[source]¶
Resize an image based on a scale factor, final size, or size and aspect ratio.
Slightly more general than cv2.resize, allows for specification of either a scale factor, a final size, or the final size for a particular dimension.
- Parameters
img (ndarray) – image to resize
scale (float or Tuple[float, float]) – Desired floating point scale factor. If a tuple, the dimension ordering is x,y. Mutually exclusive with dsize, max_dim, and min_dim.
dsize (Tuple[int] | None) – The desired width and height of the new image. If a dimension is None, then it is automatically computed to preserve aspect ratio. Mutually exclusive with scale, max_dim, and min_dim.
max_dim (int) – New size of the maximum dimension, the other dimension is scaled to maintain aspect ratio. Mutually exclusive with scale, dsize, and min_dim.
min_dim (int) – New size of the minimum dimension, the other dimension is scaled to maintain aspect ratio. Mutually exclusive with scale, dsize, and max_dim.
interpolation (str | int) – The interpolation key or code (e.g. linear, lanczos). By default “area” is used if the image is shrinking and “lanczos” is used if the image is growing. Note, if this is explicitly set, then it will be used regardless of if the image is growing or shrinking. Set grow_interpolation to change the default for an enlarging interpolation.
grow_interpolation (str | int, default=”lanczos”) – The interpolation key or code to use when the image is being enlarged. Does nothing if “interpolation” is explicitly given. If “interpolation” is not specified “area” is used when shrinking.
letterbox (bool, default=False) – If used in conjunction with dsize, then the image is scaled and translated to fit in the center of the new image while maintaining aspect ratio. Zero padding is added if necessary.
return_info (bool, default=False) – if True returns information about the final transformation in a dictionary. If there is an offset, the scale is applied before the offset when transforming to the new resized space.
antialias (bool, default=False) – if True blurs to anti-alias before downsampling.
- Returns
the new image and optionally an info dictionary if return_info=True
- Return type
ndarray | Tuple[ndarray, Dict]
Example
>>> import kwimage >>> import numpy as np >>> # Test scale >>> img = np.zeros((16, 10, 3), dtype=np.uint8) >>> new_img, info = kwimage.imresize(img, scale=.85, >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [.8, 0.875] >>> # Test dsize without None >>> new_img, info = kwimage.imresize(img, dsize=(5, 12), >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.5 , 0.75] >>> # Test dsize with None >>> new_img, info = kwimage.imresize(img, dsize=(6, None), >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.6, 0.625] >>> # Test max_dim >>> new_img, info = kwimage.imresize(img, max_dim=6, >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.4 , 0.375] >>> # Test min_dim >>> new_img, info = kwimage.imresize(img, min_dim=6, >>> interpolation='area', >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['scale'].tolist() == [0.6 , 0.625]
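The scale values asserted in the example above follow from simple arithmetic on the old and new sizes. The sketch below is a hypothetical helper, `resolve_dsize`, not part of kwimage. Rounding to the nearest integer size is an assumption, chosen because it reproduces the documented scales for the 10 wide by 16 high image.

```python
def resolve_dsize(old_dsize, scale=None, dsize=None, max_dim=None, min_dim=None):
    """Hypothetical helper: new (width, height) and the effective (sx, sy)."""
    old_w, old_h = old_dsize
    if sum(arg is not None for arg in (scale, dsize, max_dim, min_dim)) != 1:
        raise ValueError('specify exactly one of scale, dsize, max_dim, min_dim')
    if max_dim is not None:
        scale = max_dim / max(old_w, old_h)
    elif min_dim is not None:
        scale = min_dim / min(old_w, old_h)
    if scale is not None:
        sx, sy = scale if isinstance(scale, tuple) else (scale, scale)
        # Assumption: sizes are rounded to the nearest integer
        new_w, new_h = round(old_w * sx), round(old_h * sy)
    else:
        new_w, new_h = dsize
        # A None entry is filled in so that the aspect ratio is preserved
        if new_w is None:
            new_w = round(old_w * new_h / old_h)
        elif new_h is None:
            new_h = round(old_h * new_w / old_w)
    return (new_w, new_h), (new_w / old_w, new_h / old_h)


old = (10, 16)  # (width, height) of the np.zeros((16, 10, 3)) image
print(resolve_dsize(old, scale=0.85))
print(resolve_dsize(old, dsize=(6, None)))
print(resolve_dsize(old, max_dim=6))
```

Because the final size must be integral, the effective scale generally differs from the requested one, which is why the returned info reports the scale actually applied.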
Example
>>> import kwimage >>> import numpy as np >>> # Test letterbox resize >>> img = np.ones((5, 10, 3), dtype=np.float32) >>> new_img, info = kwimage.imresize(img, dsize=(19, 19), >>> letterbox=True, >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['offset'].tolist() == [0, 4] >>> img = np.ones((10, 5, 3), dtype=np.float32) >>> new_img, info = kwimage.imresize(img, dsize=(19, 19), >>> letterbox=True, >>> return_info=True) >>> print('info = {!r}'.format(info)) >>> assert info['offset'].tolist() == [4, 0]
>>> import kwimage >>> import numpy as np >>> # Test letterbox resize >>> img = np.random.rand(100, 200) >>> new_img, info = kwimage.imresize(img, dsize=(300, 300), letterbox=True, return_info=True)
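The letterbox offsets asserted above can be derived by hand: pick the largest uniform scale that fits the canvas, then center the scaled image. The helper `letterbox_params` below is hypothetical, and the rounding of the embedded size and flooring of the offset are assumptions that happen to reproduce the documented offsets.

```python
def letterbox_params(old_dsize, dsize):
    """Hypothetical helper: uniform scale, embedded size, and (x, y) offset."""
    old_w, old_h = old_dsize
    new_w, new_h = dsize
    # Use the largest uniform scale that still fits inside the canvas
    scale = min(new_w / old_w, new_h / old_h)
    embed_w, embed_h = round(old_w * scale), round(old_h * scale)
    # Assumption: the image is centered and the offset is floored
    offset = ((new_w - embed_w) // 2, (new_h - embed_h) // 2)
    return scale, (embed_w, embed_h), offset


# A 10 wide by 5 high image placed on a 19x19 canvas
print(letterbox_params((10, 5), (19, 19)))
# A 200 wide by 100 high image placed on a 300x300 canvas
print(letterbox_params((200, 100), (300, 300)))
```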
Example
>>> # Check aliasing >>> import kwimage >>> img = kwimage.grab_test_image('checkerboard') >>> img = kwimage.grab_test_image('astro') >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dsize = (14, 14) >>> dsize = (64, 64) >>> # When we set "grow_interpolation" for a "shrinking" resize it should >>> # still do the "area" interpolation to antialias the results. But if we >>> # use explicit interpolation it should alias. >>> pnum_ = kwplot.PlotNums(nSubplots=12, nCols=4) >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='area'), pnum=pnum_(), title='resize aa area') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='linear'), pnum=pnum_(), title='resize aa linear') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='nearest'), pnum=pnum_(), title='resize aa nearest') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, interpolation='cubic'), pnum=pnum_(), title='resize aa cubic')
>>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='area'), pnum=pnum_(), title='resize aa grow area') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='linear'), pnum=pnum_(), title='resize aa grow linear') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='nearest'), pnum=pnum_(), title='resize aa grow nearest') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=True, grow_interpolation='cubic'), pnum=pnum_(), title='resize aa grow cubic')
>>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='area'), pnum=pnum_(), title='resize no-aa area') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='linear'), pnum=pnum_(), title='resize no-aa linear') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='nearest'), pnum=pnum_(), title='resize no-aa nearest') >>> kwplot.imshow(kwimage.imresize(img, dsize=dsize, antialias=False, interpolation='cubic'), pnum=pnum_(), title='resize no-aa cubic')
Todo
- [X] When interpolation is area and the number of channels > 4 cv2.resize will error but it is fine for linear interpolation
[ ] TODO: add padding options when letterbox=True
- kwimage.imscale(img, scale, interpolation=None, return_scale=False)[source]¶
DEPRECATED and removed: use imresize instead
- kwimage.warp_affine(image, transform, dsize=None, antialias=False, interpolation='linear', border_mode=None, border_value=0, large_warp_dim=None, return_info=False)[source]¶
Applies an affine transformation to an image with optional antialiasing.
- Parameters
image (ndarray) – the input image as a numpy array. Note: this is passed directly to cv2, so it is best to ensure that it is contiguous and using a dtype that cv2 can handle.
transform (ndarray | Affine) – a coercible affine matrix. See kwimage.Affine for details on what can be coerced.
dsize (Tuple[int, int] | None | str, default=None) – An integer width and height tuple of the resulting “canvas” image. If None, then the input image size is used.
If specified as a string, dsize is computed based on the given heuristic.
If ‘positive’ (or ‘auto’), dsize is computed such that the positive coordinates of the warped image will fit in the new canvas. In this case, any pixel that maps to a negative coordinate will be clipped. This has the property that the input transformation is not modified.
If ‘content’ (or ‘max’), the transform is modified with an extra translation such that both the positive and negative coordinates of the warped image will fit in the new canvas.
antialias (bool, default=False) – if True determines if the transform is downsampling and applies antialiasing via a gaussian blur.
interpolation (str, default=”linear”) – interpolation code or cv2 integer. Interpolation codes are linear, nearest, cubic, lanczos, and area.
border_mode (str) – Border code or cv2 integer. Border codes are constant, replicate, reflect, wrap, reflect101, and transparent.
border_value (int | float) – Used as the fill value if border_mode is constant. Otherwise this is ignored.
large_warp_dim (int | None | str, default=None) – If specified, perform the warp piecewise in chunks of the specified size. If “auto”, it is set to the maximum “short” value in numpy. This works around a limitation of cv2.warpAffine, which must have image dimensions < SHRT_MAX (=32767 in version 4.5.3)
return_info (bool, default=False) – if True, returns information about the operation. In the case where dsize=”content”, this includes the modified transformation.
- Returns
the warped image, or if return_info is True, the warped image and the info dictionary.
- Return type
ndarray | Tuple[ndarray, Dict]
Example
>>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro') >>> #image = kwimage.grab_test_image('checkerboard') >>> transform = Affine.random() @ Affine.scale(0.05) >>> transform = Affine.scale(0.02) >>> warped1 = warp_affine(image, transform, dsize='positive', antialias=1, interpolation='nearest') >>> warped2 = warp_affine(image, transform, dsize='positive', antialias=0) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nRows=1, nCols=2) >>> kwplot.imshow(warped1, pnum=pnum_(), title='antialias=True') >>> kwplot.imshow(warped2, pnum=pnum_(), title='antialias=False') >>> kwplot.show_if_requested()
Example
>>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro') >>> image = kwimage.grab_test_image('checkerboard') >>> transform = Affine.random() @ Affine.scale((.1, 1.2)) >>> warped1 = warp_affine(image, transform, dsize='positive', antialias=1) >>> warped2 = warp_affine(image, transform, dsize='positive', antialias=0) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nRows=1, nCols=2) >>> kwplot.imshow(warped1, pnum=pnum_(), title='antialias=True') >>> kwplot.imshow(warped2, pnum=pnum_(), title='antialias=False') >>> kwplot.show_if_requested()
Example
>>> # Test the case where the input data is empty or the target canvas >>> # is empty, this should be handled like boundary effects >>> import kwimage >>> image = np.random.rand(1, 1, 3) >>> transform = kwimage.Affine.random() >>> result = kwimage.warp_affine(image, transform, dsize=(0, 0)) >>> assert result.shape == (0, 0, 3) >>> # >>> empty_image = np.random.rand(0, 1, 3) >>> result = kwimage.warp_affine(empty_image, transform, dsize=(10, 10)) >>> assert result.shape == (10, 10, 3) >>> # >>> empty_image = np.random.rand(0, 1, 3) >>> result = kwimage.warp_affine(empty_image, transform, dsize=(10, 0)) >>> assert result.shape == (0, 10, 3)
Example
>>> # Demo difference between positive and content dsize >>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro', dsize=(512, 512)) >>> transform = Affine.coerce(offset=(-100, -50), scale=2, theta=0.1) >>> # When warping other images or geometry along with this image >>> # it is important to account for the modified transform when >>> # setting dsize='content'. If dsize='positive', the transform >>> # will remain unchanged wrt other aligned images / geometries. >>> poly = kwimage.Boxes([[350, 5, 130, 290]], 'xywh').to_polygons()[0] >>> # Apply the warping to the images >>> warped_pos, info_pos = warp_affine(image, transform, dsize='positive', return_info=True) >>> warped_con, info_con = warp_affine(image, transform, dsize='content', return_info=True) >>> assert info_pos['dsize'] == (919, 1072) >>> assert info_con['dsize'] == (1122, 1122) >>> assert info_pos['transform'] == transform >>> # Demo the correct and incorrect way to apply transforms >>> poly_pos = poly.warp(transform) >>> poly_con = poly.warp(info_con['transform']) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> # show original >>> kwplot.imshow(image, pnum=(1, 3, 1), title='original') >>> poly.draw(color='green', alpha=0.5, border=True) >>> # show positive warped >>> kwplot.imshow(warped_pos, pnum=(1, 3, 2), title='dsize=positive') >>> poly_pos.draw(color='purple', alpha=0.5, border=True) >>> # show content warped >>> ax = kwplot.imshow(warped_con, pnum=(1, 3, 3), title='dsize=content')[1] >>> poly_con.draw(color='dodgerblue', alpha=0.5, border=True) # correct >>> poly_pos.draw(color='orangered', alpha=0.5, border=True) # incorrect >>> cc = poly_con.to_shapely().centroid >>> cp = poly_pos.to_shapely().centroid >>> ax.text(cc.x, cc.y + 250, 'correctly transformed', color='dodgerblue', >>> backgroundcolor=(0, 0, 0, 0.7), horizontalalignment='center') >>> ax.text(cp.x, cp.y - 250, 
'incorrectly transformed', color='orangered', >>> backgroundcolor=(0, 0, 0, 0.7), horizontalalignment='center') >>> kwplot.show_if_requested()
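The canvas sizes asserted in the example above can be reproduced by warping the four image corners and taking a bounding box. The sketch below is a hypothetical illustration, not the library code. Using the corners (0, 0) to (w, h), taking the ceiling, and composing the matrix as scale, then rotation, then offset are all assumptions, though together they yield the documented (919, 1072) and (1122, 1122).

```python
import math


def warped_canvas(dsize, matrix, mode='positive'):
    """Hypothetical helper: canvas size for a 2x3 affine matrix.

    Returns (new_dsize, extra_translation).
    """
    w, h = dsize
    (a, b, tx), (c, d, ty) = matrix
    corners = [(0, 0), (w, 0), (w, h), (0, h)]
    xs = [a * x + b * y + tx for x, y in corners]
    ys = [c * x + d * y + ty for x, y in corners]
    if mode == 'positive':
        # Keep the transform as-is; content at negative coordinates is clipped
        return (math.ceil(max(xs)), math.ceil(max(ys))), (0.0, 0.0)
    if mode == 'content':
        # Shift so the minimum corner lands on the origin, then fit everything
        shift = (-min(xs), -min(ys))
        size = (math.ceil(max(xs) + shift[0]), math.ceil(max(ys) + shift[1]))
        return size, shift
    raise KeyError(mode)


theta, scale, offset = 0.1, 2.0, (-100.0, -50.0)
matrix = [
    [scale * math.cos(theta), -scale * math.sin(theta), offset[0]],
    [scale * math.sin(theta), scale * math.cos(theta), offset[1]],
]
print(warped_canvas((512, 512), matrix, 'positive'))
print(warped_canvas((512, 512), matrix, 'content'))
```

The nonzero shift returned in the content case is the extra translation that must also be applied to any geometry warped alongside the image.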
Example
>>> # Demo piecewise transform >>> from kwimage.im_cv2 import * # NOQA >>> import kwimage >>> from kwimage.transform import Affine >>> image = kwimage.grab_test_image('astro', dsize=(512, 512)) >>> transform = Affine.coerce(offset=(-100, -50), scale=2, theta=0.1) >>> warped_piecewise, info = warp_affine(image, transform, dsize='positive', return_info=True, large_warp_dim=32) >>> warped_normal, info = warp_affine(image, transform, dsize='positive', return_info=True, large_warp_dim=None) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(image, pnum=(1, 3, 1), title='original') >>> kwplot.imshow(warped_normal, pnum=(1, 3, 2), title='normal warp') >>> kwplot.imshow(warped_piecewise, pnum=(1, 3, 3), title='piecewise warp')
- kwimage.checkerboard(num_squares=8, dsize=(512, 512))[source]¶
Creates a checkerboard image
- Parameters
num_squares (int) – number of squares in a row
dsize (Tuple[int, int]) – width and height
References
https://stackoverflow.com/questions/2169478/how-to-make-a-checkerboard-in-numpy
Example
>>> from kwimage.im_demodata import * # NOQA >>> img = checkerboard()
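A checkerboard pattern reduces to the parity of the square index along each axis. The sketch below is a pure-Python analogue with a hypothetical name. It returns rows of 0/1 values, and which color lands in the top-left square is an assumption.

```python
def checkerboard_sketch(num_squares=8, dsize=(512, 512)):
    """Hypothetical pure-Python analogue returning rows of 0/1 values."""
    w, h = dsize
    return [
        # The square index along each axis is the pixel index scaled down
        [((x * num_squares // w) + (y * num_squares // h)) % 2
         for x in range(w)]
        for y in range(h)
    ]


board = checkerboard_sketch(num_squares=4, dsize=(8, 8))
for row in board:
    print(row)
```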
- kwimage.grab_test_image(key='astro', space='rgb', dsize=None, interpolation='lanczos')[source]¶
Ensures that the test image exists (this might use the network), reads it and returns the image pixels.
- Parameters
key (str) – which test image to grab. Valid choices are:
astro - an astronaut
carl - Carl Sagan
paraview - ParaView logo
stars - picture of stars in the sky
airport - SkySat image of Beijing Capital International Airport on 18 February 2018
See kwimage.grab_test_image.keys for a full list.
space (str, default=’rgb’) – which colorspace to return in
dsize (Tuple[int, int], default=None) – if specified resizes image to this size
- Returns
the requested image
- Return type
ndarray
- CommandLine:
xdoctest -m kwimage.im_demodata grab_test_image
Example
>>> # xdoctest: +REQUIRES(--network) >>> import kwimage >>> for key in kwimage.grab_test_image.keys(): >>> print('attempt to grab key = {!r}'.format(key)) >>> kwimage.grab_test_image(key) >>> print('grabbed key = {!r}'.format(key)) >>> kwimage.grab_test_image('astro', dsize=(255, 255)).shape (255, 255, 3)
- kwimage.grab_test_image_fpath(key='astro')[source]¶
Ensures that the test image exists (this might use the network) and returns the cached filepath to the requested image.
- Parameters
key (str) – which test image to grab. Valid choices are:
astro - an astronaut
carl - Carl Sagan
paraview - ParaView logo
stars - picture of stars in the sky
- Returns
path to the requested image
- Return type
str
- CommandLine:
python -c "import kwimage; print(kwimage.grab_test_image_fpath('airport'))"
Example
>>> # xdoctest: +REQUIRES(--network) >>> import kwimage >>> for key in kwimage.grab_test_image.keys(): ... print('attempt to grab key = {!r}'.format(key)) ... kwimage.grab_test_image_fpath(key) ... print('grabbed grab key = {!r}'.format(key))
- kwimage.draw_boxes_on_image(img, boxes, color='blue', thickness=1, box_format=None, colorspace='rgb')[source]¶
Draws boxes on an image.
- Parameters
img (ndarray) – image to copy and draw on
boxes (kwimage.Boxes) – boxes to draw
colorspace (str) – string code of the input image colorspace
Example
>>> import kwimage >>> import numpy as np >>> img = np.zeros((10, 10, 3), dtype=np.uint8) >>> color = 'dodgerblue' >>> thickness = 1 >>> boxes = kwimage.Boxes([[1, 1, 8, 8]], 'ltrb') >>> img2 = draw_boxes_on_image(img, boxes, color, thickness) >>> assert tuple(img2[1, 1]) == (30, 144, 255) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() # xdoc: +SKIP >>> kwplot.figure(doclf=True, fnum=1) >>> kwplot.imshow(img2)
- kwimage.draw_clf_on_image(im, classes, tcx=None, probs=None, pcx=None, border=1)[source]¶
Draws classification label on an image.
Works best with image chips sized between 200x200 and 500x500
- Parameters
im (ndarray) – the image
classes (Sequence | CategoryTree) – list of class names
tcx (int, default=None) – true class index if known
probs (ndarray) – predicted class probs for each class
pcx (int, default=None) – predicted class index. (if None but probs is specified uses argmax of probs)
Example
>>> # xdoctest: +REQUIRES(module:torch)
>>> import numpy as np
>>> import torch
>>> import kwarray
>>> import kwimage
>>> rng = kwarray.ensure_rng(0)
>>> im = (rng.rand(300, 300) * 255).astype(np.uint8)
>>> classes = ['cls_a', 'cls_b', 'cls_c']
>>> tcx = 1
>>> probs = rng.rand(len(classes))
>>> probs[tcx] = 0
>>> probs = torch.FloatTensor(probs).softmax(dim=0).numpy()
>>> im1_ = kwimage.draw_clf_on_image(im, classes, tcx, probs)
>>> probs[tcx] = .9
>>> probs = torch.FloatTensor(probs).softmax(dim=0).numpy()
>>> im2_ = kwimage.draw_clf_on_image(im, classes, tcx, probs)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(im1_, colorspace='rgb', pnum=(1, 2, 1), fnum=1, doclf=True)
>>> kwplot.imshow(im2_, colorspace='rgb', pnum=(1, 2, 2), fnum=1)
>>> kwplot.show_if_requested()
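The fallback described for pcx ("if None but probs is specified uses argmax of probs") can be sketched in plain Python. The helper name `resolve_pcx` is hypothetical and is not part of kwimage; it only illustrates the documented rule.

```python
def resolve_pcx(probs=None, pcx=None):
    """Hypothetical helper: choose the predicted class index."""
    if pcx is not None:
        # An explicit index always wins
        return pcx
    if probs is None:
        # Nothing to infer a prediction from
        return None
    probs = list(probs)
    # argmax: index of the largest probability
    return max(range(len(probs)), key=probs.__getitem__)


classes = ['cls_a', 'cls_b', 'cls_c']
probs = [0.2, 0.1, 0.7]
print(classes[resolve_pcx(probs)])
```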
- kwimage.draw_line_segments_on_image(img, pts1, pts2, color='blue', colorspace='rgb', thickness=1, **kwargs)[source]¶
Draw line segments between pts1 and pts2 on an image.
- Parameters
pts1 (ndarray) – xy coordinates of starting points
pts2 (ndarray) – corresponding xy coordinates of ending points
color (str | List) – color code or a list of colors for each line segment
colorspace (str, default=’rgb’) – colorspace of image
thickness (int, default=1)
lineType (int, default=cv2.LINE_AA)
- Returns
the modified image (inplace if possible)
- Return type
ndarray
Example
>>> from kwimage.im_draw import *  # NOQA
>>> pts1 = np.array([[2, 0], [2, 20], [2.5, 30]])
>>> pts2 = np.array([[10, 5], [30, 28], [100, 50]])
>>> img = np.ones((100, 100, 3), dtype=np.uint8) * 255
>>> color = 'blue'
>>> colorspace = 'rgb'
>>> img2 = draw_line_segments_on_image(img, pts1, pts2, thickness=2)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()  # xdoc: +SKIP
>>> kwplot.figure(doclf=True, fnum=1)
>>> kwplot.imshow(img2)
Example
>>> import kwimage
>>> import numpy as np
>>> pts1 = kwimage.Points.random(10).scale(512).xy
>>> pts2 = kwimage.Points.random(10).scale(512).xy
>>> img = np.ones((512, 512, 3), dtype=np.uint8) * 255
>>> color = kwimage.Color.distinct(10)
>>> img2 = kwimage.draw_line_segments_on_image(img, pts1, pts2, color=color)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()  # xdoc: +SKIP
>>> kwplot.figure(doclf=True, fnum=1)
>>> kwplot.imshow(img2)
- kwimage.draw_text_on_image(img, text, org, return_info=False, **kwargs)[source]¶
Draws multiline text on an image using opencv
- Parameters
img (ndarray | None | dict) – Generally a numpy image to draw on (inplace). Otherwise a canvas will be constructed such that the text will fit. The user may specify a dictionary with keys width and height to have more control over the constructed canvas.
text (str) – text to draw
org (Tuple[int, int]) – The x, y location of the text string “anchor” in the image as specified by halign and valign. For instance, If valign=’bottom’, halign=’left’, this is the bottom left corner.
return_info (bool, default=False) – if True, also returns information about the positions the text was drawn on.
**kwargs –
color (tuple): default blue
thickness (int): defaults to 2
fontFace (int): defaults to cv2.FONT_HERSHEY_SIMPLEX
fontScale (float): defaults to 1.0
valign (str, default=’bottom’): either top, center, or bottom. NOTE: this default may change to “top” in the future.
halign (str, default=’left’): either left, center, or right
border (dict | int): If specified as an integer, draws a black border with that given thickness. If specified as a dictionary, draws a border with color specified parameters. “color”: border color, defaults to “black”. “thickness”: border thickness, defaults to 1.
- Returns
the image that was drawn on
- Return type
ndarray
Note
The image is modified inplace. If the image is non-contiguous, then this returns a UMat instead of an ndarray, so be careful with that.
References
https://stackoverflow.com/questions/27647424/
https://stackoverflow.com/questions/51285616/opencvs-gettextsize-and-puttext-return-wrong-size-and-chop-letters-with-low
Example
>>> import kwimage
>>> import numpy as np
>>> img = kwimage.grab_test_image(space='rgb')
>>> img2 = kwimage.draw_text_on_image(img.copy(), 'FOOBAR', org=(0, 0), valign='top')
>>> assert img2.shape == img.shape
>>> assert np.any(img2 != img)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img2)
>>> kwplot.show_if_requested()
Example
>>> import kwimage
>>> # Test valign
>>> img = kwimage.grab_test_image(space='rgb', dsize=(500, 500))
>>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(0, 0), valign='top', border=2)
>>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(150, 0), valign='center', border=2)
>>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(300, 0), valign='bottom', border=2)
>>> # Test halign
>>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(250, 100), halign='right', border=2)
>>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(250, 250), halign='center', border=2)
>>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(250, 400), halign='left', border=2)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img2)
>>> kwplot.show_if_requested()
Example
>>> # Ensure the function works with float01 or uint255 images
>>> import kwimage
>>> img = kwimage.grab_test_image(space='rgb')
>>> img = kwimage.ensure_float01(img)
>>> img2 = kwimage.draw_text_on_image(img, 'FOOBAR\nbazbiz\nspam', org=(0, 0), valign='top', border=2)
Example
>>> # Test dictionary border
>>> import kwimage
>>> img = kwimage.draw_text_on_image(None, 'hello\neveryone', org=(100, 100), valign='top', halign='center', border={'color': 'green', 'thickness': 9})
>>> #img = kwimage.draw_text_on_image(None, 'hello\neveryone', org=(0, 0), valign='top')
>>> #img = kwimage.draw_text_on_image(None, 'hello', org=(0, 60), valign='top', halign='center', border=0)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img)
>>> kwplot.show_if_requested()
Example
>>> # Test dictionary image
>>> import kwimage
>>> img = kwimage.draw_text_on_image({'width': 300}, 'good\nPropogate', org=(150, 0), valign='top', halign='center', border={'color': 'green', 'thickness': 0})
>>> print('img.shape = {!r}'.format(img.shape))
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img)
>>> kwplot.show_if_requested()
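The way org, halign, and valign interact in draw_text_on_image can be illustrated with a small geometric sketch. The helper `text_box_top_left` below is hypothetical (it is not a kwimage function) and assumes the text occupies an axis-aligned box of known width and height; it only shows how an anchor point plus alignment determines where that box sits.

```python
def text_box_top_left(org, box_w, box_h, halign='left', valign='bottom'):
    """Hypothetical helper: top-left corner of a text box anchored at org."""
    x, y = org
    # How far the anchor sits from the left edge of the box
    x_shift = {'left': 0, 'center': box_w / 2, 'right': box_w}[halign]
    # How far the anchor sits from the top edge of the box
    y_shift = {'top': 0, 'center': box_h / 2, 'bottom': box_h}[valign]
    return (x - x_shift, y - y_shift)


# With the documented defaults, org is the bottom-left corner of the text
print(text_box_top_left((100, 50), box_w=40, box_h=20))
# With valign='top' and halign='center', org is the middle of the top edge
print(text_box_top_left((100, 50), 40, 20, halign='center', valign='top'))
```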
- kwimage.draw_vector_field(image, dx, dy, stride=0.02, thresh=0.0, scale=1.0, alpha=1.0, color='red', thickness=1, tipLength=0.1, line_type='aa')[source]¶
Create an image representing a 2D vector field.
- Parameters
image (ndarray) – image to draw on
dx (ndarray) – grid of vector x components
dy (ndarray) – grid of vector y components
stride (int | float) – sparsity of vectors, int specifies stride step in pixels, a float specifies it as a percentage.
thresh (float) – only plot vectors with magnitude greater than thresh
scale (float) – multiply magnitude for easier visualization
alpha (float) – alpha value for vectors. Non-vector regions receive 0 alpha (if False, no alpha channel is used)
color (str | tuple | kwimage.Color) – RGB color of the vectors
thickness (int, default=1) – thickness of arrows
tipLength (float, default=0.1) – fraction of line length
line_type (int) – either cv2.LINE_4, cv2.LINE_8, or cv2.LINE_AA
- Returns
The image with vectors overlaid. If image=None, then an rgb/a image is created and returned.
- Return type
ndarray[float32]
Example
>>> import kwimage
>>> import numpy as np
>>> width, height = 512, 512
>>> image = kwimage.grab_test_image(dsize=(width, height))
>>> x, y = np.meshgrid(np.arange(height), np.arange(width))
>>> dx, dy = x - width / 2, y - height / 2
>>> radians = np.arctan2(dx, dy)
>>> mag = np.sqrt(dx ** 2 + dy ** 2) + 1e-3
>>> dx, dy = dx / mag, dy / mag
>>> img = kwimage.draw_vector_field(image, dx, dy, scale=10, alpha=False)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img)
>>> kwplot.show_if_requested()
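The stride parameter accepts either a pixel step (int) or a fraction (float). The following is a minimal sketch of how such an argument could be turned into a pixel step. The helper `resolve_stride` is hypothetical, and the choice to take the fraction of the smaller image side is an assumption; the documentation above does not say what the percentage is relative to.

```python
import math


def resolve_stride(stride, height, width):
    """Hypothetical helper: turn a stride argument into a pixel step."""
    if isinstance(stride, float):
        if not 0.0 < stride <= 1.0:
            raise ValueError('fractional stride must be in (0, 1]')
        # ASSUMPTION: the fraction is relative to the smaller image side
        return max(1, math.ceil(stride * min(height, width)))
    # Integers are already a step in pixels
    return int(stride)


step = resolve_stride(0.02, 512, 512)
# Grid locations at which a vector would be sampled
ys = range(0, 512, step)
xs = range(0, 512, step)
print(step, len(ys) * len(xs))
```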
- kwimage.make_heatmask(probs, cmap='plasma', with_alpha=1.0, space='rgb', dsize=None)[source]¶
Colorizes a single-channel intensity mask (with an alpha channel)
- Parameters
probs (ndarray) – 2D probability map with values between 0 and 1
cmap (str) – mpl colormap
with_alpha (float) – between 0 and 1, uses probs as the alpha multiplied by this number.
space (str) – output colorspace
dsize (tuple) – if not None, then output is resized to W,H=dsize
- SeeAlso:
kwimage.overlay_alpha_images
Example
>>> # xdoc: +REQUIRES(module:matplotlib)
>>> import kwimage
>>> import numpy as np
>>> probs = np.tile(np.linspace(0, 1, 10), (10, 1))
>>> heatmask = kwimage.make_heatmask(probs, with_alpha=0.8, dsize=(100, 100))
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.imshow(heatmask, fnum=1, doclf=True, colorspace='rgb')
>>> kwplot.show_if_requested()
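The relationship between probs and the alpha channel can be sketched without matplotlib. The function `toy_heatmask` below is hypothetical and uses a simple blue-to-red ramp as a stand-in for a real colormap; the only part taken from the documentation above is that alpha is probs multiplied by with_alpha.

```python
def toy_heatmask(probs, with_alpha=1.0):
    """Hypothetical sketch: rows of (r, g, b, a) tuples in 0-1 space."""
    out = []
    for row in probs:
        out_row = []
        for p in row:
            # Stand-in colormap (an assumption): blue at p=0, red at p=1
            r, g, b = p, 0.0, 1.0 - p
            # Alpha is the probability scaled by with_alpha
            a = p * with_alpha
            out_row.append((r, g, b, a))
        out.append(out_row)
    return out


mask = toy_heatmask([[0.0, 0.5, 1.0]], with_alpha=0.8)
print(mask[0])
```

Low-probability pixels end up nearly transparent, which is what makes the result suitable for overlaying on another image.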
- kwimage.make_orimask(radians, mag=None, alpha=1.0)[source]¶
Makes a colormap in HSV space where the orientation changes color and mag changes the saturation/value.
- Parameters
radians (ndarray) – orientation in radians
mag (ndarray) – magnitude (must be normalized between 0 and 1)
alpha (float | ndarray) – if False or None, then the image is returned without alpha; if a float, then mag is scaled by this and used as the alpha channel; if an ndarray, then this is explicitly set as the alpha channel
- Returns
an rgb / rgba image in 01 space
- Return type
ndarray[float32]
- SeeAlso:
kwimage.overlay_alpha_images
Example
>>> # xdoc: +REQUIRES(module:matplotlib)
>>> import kwimage
>>> import numpy as np
>>> x, y = np.meshgrid(np.arange(64), np.arange(64))
>>> dx, dy = x - 32, y - 32
>>> radians = np.arctan2(dx, dy)
>>> mag = np.sqrt(dx ** 2 + dy ** 2)
>>> orimask = kwimage.make_orimask(radians, mag)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.imshow(orimask, fnum=1, doclf=True, colorspace='rgb')
>>> kwplot.show_if_requested()
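The HSV idea behind make_orimask can be illustrated for a single pixel with the standard library. The function `orientation_to_rgb` is hypothetical; mapping the angle linearly onto hue and using mag for both saturation and value is an assumption based on the one-line description above, not a copy of the kwimage implementation.

```python
import colorsys
import math


def orientation_to_rgb(radians, mag=1.0):
    """Hypothetical sketch: one orientation/magnitude pair to an RGB triple."""
    # Wrap the angle into [0, 2*pi) and rescale to a hue in [0, 1)
    hue = (radians % (2 * math.pi)) / (2 * math.pi)
    # ASSUMPTION: magnitude drives both saturation and value
    return colorsys.hsv_to_rgb(hue, mag, mag)


print(orientation_to_rgb(0.0, 1.0))          # fully saturated red
print(orientation_to_rgb(math.pi, 1.0))      # the opposite hue
print(orientation_to_rgb(math.pi / 2, 0.0))  # zero magnitude is black
```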
- kwimage.make_vector_field(dx, dy, stride=0.02, thresh=0.0, scale=1.0, alpha=1.0, color='red', thickness=1, tipLength=0.1, line_type='aa')[source]¶
Create an image representing a 2D vector field.
- Parameters
dx (ndarray) – grid of vector x components
dy (ndarray) – grid of vector y components
stride (int | float) – sparsity of vectors, int specifies stride step in pixels, a float specifies it as a percentage.
thresh (float) – only plot vectors with magnitude greater than thresh
scale (float) – multiply magnitude for easier visualization
alpha (float) – alpha value for vectors. Non-vector regions receive 0 alpha (if False, no alpha channel is used)
color (str | tuple | kwimage.Color) – RGB color of the vectors
thickness (int, default=1) – thickness of arrows
tipLength (float, default=0.1) – fraction of line length
line_type (int) – either cv2.LINE_4, cv2.LINE_8, or cv2.LINE_AA
- Returns
vec_img: an rgb/rgba image in 0-1 space
- Return type
ndarray[float32]
- SeeAlso:
kwimage.overlay_alpha_images
Deprecated: use draw_vector_field instead.
Example
>>> import kwimage
>>> import numpy as np
>>> x, y = np.meshgrid(np.arange(512), np.arange(512))
>>> dx, dy = x - 256.01, y - 256.01
>>> radians = np.arctan2(dx, dy)
>>> mag = np.sqrt(dx ** 2 + dy ** 2)
>>> dx, dy = dx / mag, dy / mag
>>> img = kwimage.make_vector_field(dx, dy, scale=10, alpha=False)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img)
>>> kwplot.show_if_requested()
- kwimage.fourier_mask(img_hwc, mask, axis=None, clip=None)[source]¶
Applies a mask to the fourier spectrum of an image
- Parameters
img_hwc (ndarray) – assumed to be float 01
mask (ndarray) – mask used to modulate the image in the fourier domain. Usually these are boolean values (hence the name mask), but any numerical value is technically allowed.
- CommandLine:
xdoctest -m kwimage.im_filter fourier_mask --show
Example
>>> from kwimage.im_filter import *  # NOQA
>>> import kwimage
>>> img_hwc = kwimage.grab_test_image(space='gray')
>>> mask = np.random.rand(*img_hwc.shape[0:2])
>>> out_hwc = fourier_mask(img_hwc, mask)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img_hwc, pnum=(1, 2, 1), fnum=1)
>>> kwplot.imshow(out_hwc, pnum=(1, 2, 2), fnum=1)
>>> kwplot.show_if_requested()
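Masking in the Fourier domain can be sketched for a single-channel image with NumPy alone. The function `fourier_mask_sketch` is hypothetical and is not the kwimage implementation; in particular, centering the spectrum with fftshift before applying the mask is an assumption about the mask layout.

```python
import numpy as np


def fourier_mask_sketch(img_hw, mask):
    """Hypothetical sketch: modulate the spectrum of a 2D image by a mask."""
    # Forward transform, with the zero frequency moved to the center
    spectrum = np.fft.fftshift(np.fft.fft2(img_hw))
    # Elementwise modulation; a boolean mask keeps or drops each frequency
    masked = spectrum * mask
    # Undo the shift and transform back to the spatial domain
    out = np.fft.ifft2(np.fft.ifftshift(masked))
    # The imaginary part is only numerical noise when the mask is symmetric
    return np.real(out)


rng = np.random.default_rng(0)
img = rng.random((8, 8))
# Keeping only the centered zero-frequency term leaves the image mean
dc_only = np.zeros((8, 8))
dc_only[4, 4] = 1
out = fourier_mask_sketch(img, dc_only)
print(np.allclose(out, img.mean()))
```

An all-ones mask leaves the image unchanged, which is a convenient sanity check for any variant of this routine.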
- kwimage.radial_fourier_mask(img_hwc, radius=11, axis=None, clip=None)[source]¶
In [1] they use a radius of 11.0 on CIFAR-10.
- Parameters
img_hwc (ndarray) – assumed to be float 01
References
[1] Jo and Bengio “Measuring the tendency of CNNs to Learn Surface Statistical Regularities” 2017.
https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_transforms/py_fourier_transform/py_fourier_transform.html
Example
>>> from kwimage.im_filter import *  # NOQA
>>> import kwimage
>>> img_hwc = kwimage.grab_test_image()
>>> img_hwc = kwimage.ensure_float01(img_hwc)
>>> out_hwc = radial_fourier_mask(img_hwc, radius=11)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> plt = kwplot.autoplt()
>>> def keepdim(func):
>>>     def _wrap(im):
>>>         needs_transpose = (im.shape[0] == 3)
>>>         if needs_transpose:
>>>             im = im.transpose(1, 2, 0)
>>>         out = func(im)
>>>         if needs_transpose:
>>>             out = out.transpose(2, 0, 1)
>>>         return out
>>>     return _wrap
>>> @keepdim
>>> def rgb_to_lab(im):
>>>     return kwimage.convert_colorspace(im, src_space='rgb', dst_space='lab')
>>> @keepdim
>>> def lab_to_rgb(im):
>>>     return kwimage.convert_colorspace(im, src_space='lab', dst_space='rgb')
>>> @keepdim
>>> def rgb_to_yuv(im):
>>>     return kwimage.convert_colorspace(im, src_space='rgb', dst_space='yuv')
>>> @keepdim
>>> def yuv_to_rgb(im):
>>>     return kwimage.convert_colorspace(im, src_space='yuv', dst_space='rgb')
>>> def show_data(img_hwc):
>>>     # dpath = ub.ensuredir('./fouriertest')
>>>     kwplot.imshow(img_hwc, fnum=1)
>>>     pnum_ = kwplot.PlotNums(nRows=4, nCols=5)
>>>     for r in range(0, 17):
>>>         imgt = radial_fourier_mask(img_hwc, r, clip=(0, 1))
>>>         kwplot.imshow(imgt, pnum=pnum_(), fnum=2)
>>>         plt.gca().set_title('r = {}'.format(r))
>>>     kwplot.set_figtitle('RGB')
>>>     # plt.gcf().savefig(join(dpath, '{}_{:08d}.png'.format('rgb', x)))
>>>     pnum_ = kwplot.PlotNums(nRows=4, nCols=5)
>>>     for r in range(0, 17):
>>>         imgt = lab_to_rgb(radial_fourier_mask(rgb_to_lab(img_hwc), r))
>>>         kwplot.imshow(imgt, pnum=pnum_(), fnum=3)
>>>         plt.gca().set_title('r = {}'.format(r))
>>>     kwplot.set_figtitle('LAB')
>>>     # plt.gcf().savefig(join(dpath, '{}_{:08d}.png'.format('lab', x)))
>>>     pnum_ = kwplot.PlotNums(nRows=4, nCols=5)
>>>     for r in range(0, 17):
>>>         imgt = yuv_to_rgb(radial_fourier_mask(rgb_to_yuv(img_hwc), r))
>>>         kwplot.imshow(imgt, pnum=pnum_(), fnum=4)
>>>         plt.gca().set_title('r = {}'.format(r))
>>>     kwplot.set_figtitle('YUV')
>>>     # plt.gcf().savefig(join(dpath, '{}_{:08d}.png'.format('yuv', x)))
>>> show_data(img_hwc)
>>> kwplot.show_if_requested()
- kwimage.imread(fpath, space='auto', backend='auto')[source]¶
Reads image data in a specified format using some backend implementation.
- Parameters
fpath (str) – path to the file to be read
space (str, default=’auto’) – The desired colorspace of the image. Can be any colorspace accepted by convert_colorspace, or it can be ‘auto’, in which case the colorspace of the image is unmodified (except in the case where a color image is read by opencv, in which case we convert BGR to RGB by default). If None, then no modification is made to whatever backend is used to read the image.
New in version 0.7.10: when the backend does not resolve to “cv2” the “auto” space resolves to None, thus the image is read as-is.
backend (str, default=’auto’) – which backend reader to use. By default the file extension is used to determine this, but it can be manually overridden. Valid backends are ‘gdal’, ‘skimage’, ‘itk’, and ‘cv2’.
- Returns
the image data in the specified color space.
- Return type
ndarray
Note
if space is something non-standard like HSV or LAB, then the file must be a normal 8-bit color image, otherwise an error will occur.
- Raises
IOError – if the image cannot be read
ImportError – if trying to read a nitf without gdal
NotImplementedError – if trying to read a corner-case image
Example
>>> # xdoctest: +REQUIRES(--network)
>>> from kwimage.im_io import *  # NOQA
>>> import tempfile
>>> from os.path import splitext  # NOQA
>>> # Test a non-standard image, which encodes a depth map
>>> fpath = ub.grabdata(
>>>     'http://www.topcoder.com/contest/problem/UrbanMapper3D/JAX_Tile_043_DTM.tif',
>>>     hasher='sha256', hash_prefix='64522acba6f0fb7060cd4c202ed32c5163c34e63d386afdada4190cce51ff4d4')
>>> img1 = imread(fpath)
>>> # Check that write + read preserves data
>>> tmp = tempfile.NamedTemporaryFile(suffix=splitext(fpath)[1])
>>> imwrite(tmp.name, img1)
>>> img2 = imread(tmp.name)
>>> assert np.all(img2 == img1)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(img1, pnum=(1, 2, 1), fnum=1, norm=True)
>>> kwplot.imshow(img2, pnum=(1, 2, 2), fnum=1, norm=True)
Example
>>> # xdoctest: +REQUIRES(--network)
>>> from kwimage.im_io import *  # NOQA
>>> import tempfile
>>> img1 = imread(ub.grabdata(
>>>     'http://i.imgur.com/iXNf4Me.png', fname='ada.png', hasher='sha256',
>>>     hash_prefix='898cf2588c40baf64d6e09b6a93b4c8dcc0db26140639a365b57619e17dd1c77'))
>>> tmp_tif = tempfile.NamedTemporaryFile(suffix='.tif')
>>> tmp_png = tempfile.NamedTemporaryFile(suffix='.png')
>>> imwrite(tmp_tif.name, img1)
>>> imwrite(tmp_png.name, img1)
>>> tif_im = imread(tmp_tif.name)
>>> png_im = imread(tmp_png.name)
>>> assert np.all(tif_im == png_im)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(png_im, pnum=(1, 2, 1), fnum=1)
>>> kwplot.imshow(tif_im, pnum=(1, 2, 2), fnum=1)
Example
>>> # xdoctest: +REQUIRES(--network)
>>> from kwimage.im_io import *  # NOQA
>>> import tempfile
>>> tif_fpath = ub.grabdata(
>>>     'https://ghostscript.com/doc/tiff/test/images/rgb-3c-16b.tiff',
>>>     fname='pepper.tif', hasher='sha256',
>>>     hash_prefix='31ff3a1f416cb7281acfbcbb4b56ee8bb94e9f91489602ff2806e5a49abc03c0')
>>> img1 = imread(tif_fpath)
>>> tmp_tif = tempfile.NamedTemporaryFile(suffix='.tif')
>>> tmp_png = tempfile.NamedTemporaryFile(suffix='.png')
>>> imwrite(tmp_tif.name, img1)
>>> imwrite(tmp_png.name, img1)
>>> tif_im = imread(tmp_tif.name)
>>> png_im = imread(tmp_png.name)
>>> assert np.all(tif_im == png_im)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(png_im / 2 ** 16, pnum=(1, 2, 1), fnum=1)
>>> kwplot.imshow(tif_im / 2 ** 16, pnum=(1, 2, 2), fnum=1)
Example
>>> # xdoctest: +REQUIRES(module:itk, --network)
>>> import kwimage
>>> import numpy as np
>>> import ubelt as ub
>>> # Grab an image that ITK can read
>>> fpath = ub.grabdata(
>>>     url='https://data.kitware.com/api/v1/file/606754e32fa25629b9476f9e/download',
>>>     fname='brainweb1e5a10f17Rot20Tx20.mha',
>>>     hash_prefix='08f0812591691ae24a29788ba8cd1942e91', hasher='sha512')
>>> # Read the image (this is actually a DxHxW stack of images)
>>> img1_stack = kwimage.imread(fpath)
>>> # Check that write + read preserves data
>>> import tempfile
>>> tmp_file = tempfile.NamedTemporaryFile(suffix='.mha')
>>> kwimage.imwrite(tmp_file.name, img1_stack)
>>> recon = kwimage.imread(tmp_file.name)
>>> assert not np.may_share_memory(recon, img1_stack)
>>> assert np.all(recon == img1_stack)
>>> # xdoctest: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(kwimage.stack_images_grid(recon[0::20]))
>>> kwplot.show_if_requested()
- Benchmark:
>>> from kwimage.im_io import *  # NOQA
>>> import timerit
>>> import kwimage
>>> import tempfile
>>> #
>>> dsize = (1920, 1080)
>>> img1 = kwimage.grab_test_image('amazon', dsize=dsize)
>>> ti = timerit.Timerit(10, bestof=3, verbose=1, unit='us')
>>> formats = {}
>>> dpath = ub.ensure_app_cache_dir('cache')
>>> space = 'auto'
>>> formats['png'] = kwimage.imwrite(join(dpath, '.png'), img1, space=space, backend='cv2')
>>> formats['jpg'] = kwimage.imwrite(join(dpath, '.jpg'), img1, space=space, backend='cv2')
>>> formats['tif_raw'] = kwimage.imwrite(join(dpath, '.raw.tif'), img1, space=space, backend='gdal', compress='RAW')
>>> formats['tif_deflate'] = kwimage.imwrite(join(dpath, '.deflate.tif'), img1, space=space, backend='gdal', compress='DEFLATE')
>>> formats['tif_lzw'] = kwimage.imwrite(join(dpath, '.lzw.tif'), img1, space=space, backend='gdal', compress='LZW')
>>> grid = [
>>>     ('cv2', 'png'),
>>>     ('cv2', 'jpg'),
>>>     ('gdal', 'jpg'),
>>>     ('turbojpeg', 'jpg'),
>>>     ('gdal', 'tif_raw'),
>>>     ('gdal', 'tif_lzw'),
>>>     ('gdal', 'tif_deflate'),
>>>     ('skimage', 'tif_raw'),
>>> ]
>>> backend, filefmt = 'cv2', 'png'
>>> for backend, filefmt in grid:
>>>     for timer in ti.reset(f'imread-{filefmt}-{backend}'):
>>>         with timer:
>>>             kwimage.imread(formats[filefmt], space=space, backend=backend)
>>> # Test all formats in auto mode
>>> for filefmt in formats.keys():
>>>     for timer in ti.reset(f'kwimage.imread-{filefmt}-auto'):
>>>         with timer:
>>>             kwimage.imread(formats[filefmt], space=space, backend='auto')
>>> ti.measures = ub.map_vals(ub.sorted_vals, ti.measures)
>>> import netharn as nh
>>> print('ti.measures = {}'.format(nh.util.align(ub.repr2(ti.measures['min'], nl=2), ':')))
Timed best=42891.504 µs, mean=44008.439 ± 1409.2 µs for imread-png-cv2
Timed best=33146.808 µs, mean=34185.172 ± 656.3 µs for imread-jpg-cv2
Timed best=40120.306 µs, mean=41220.927 ± 1010.9 µs for imread-jpg-gdal
Timed best=30798.162 µs, mean=31573.070 ± 737.0 µs for imread-jpg-turbojpeg
Timed best=6223.170 µs, mean=6370.462 ± 150.7 µs for imread-tif_raw-gdal
Timed best=42459.404 µs, mean=46519.940 ± 5664.9 µs for imread-tif_lzw-gdal
Timed best=36271.175 µs, mean=37301.108 ± 861.1 µs for imread-tif_deflate-gdal
Timed best=5239.503 µs, mean=6566.574 ± 1086.2 µs for imread-tif_raw-skimage
ti.measures = {
    'imread-tif_raw-skimage' : 0.0052395030070329085,
    'imread-tif_raw-gdal' : 0.006223169999429956,
    'imread-jpg-turbojpeg' : 0.030798161998973228,
    'imread-jpg-cv2' : 0.03314680799667258,
    'imread-tif_deflate-gdal': 0.03627117499127053,
    'imread-jpg-gdal' : 0.040120305988239124,
    'imread-tif_lzw-gdal' : 0.042459404008695856,
    'imread-png-cv2' : 0.042891503995633684,
}
>>> print('ti.measures = {}'.format(nh.util.align(ub.repr2(ti.measures['mean'], nl=2), ':')))
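The interaction between backend='auto' and space='auto' described in the imread parameters can be sketched as a small resolution step. Everything named here is hypothetical: `resolve_backend_and_space` is not a kwimage function, and the extension table is an illustrative assumption (the real mapping may differ and depends on which optional libraries are installed). The only rules taken from the text above are that the file extension selects the backend by default, and that 'auto' space resolves to None when the backend is not cv2.

```python
from os.path import splitext

# ASSUMPTION: illustrative extension table, not the real kwimage mapping
_EXT_TO_BACKEND = {'.tif': 'gdal', '.tiff': 'gdal', '.mha': 'itk'}


def resolve_backend_and_space(fpath, space='auto', backend='auto'):
    """Hypothetical sketch of how the two 'auto' settings could resolve."""
    if backend == 'auto':
        ext = splitext(fpath)[1].lower()
        # Fall back to cv2 for extensions not in the table
        backend = _EXT_TO_BACKEND.get(ext, 'cv2')
    if space == 'auto' and backend != 'cv2':
        # Non-cv2 backends return the data as-is
        space = None
    # For cv2, 'auto' is kept: color images are converted from BGR to RGB
    return backend, space


print(resolve_backend_and_space('photo.png'))
print(resolve_backend_and_space('depth.tif'))
print(resolve_backend_and_space('depth.tif', space='rgb', backend='cv2'))
```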
- kwimage.imwrite(fpath, image, space='auto', backend='auto', **kwargs)[source]¶
Writes image data to disk.
- Parameters
fpath (PathLike) – location to save the image
image (ndarray) – image data
space (str | None, default=’auto’) – the colorspace of the image to save. Can be any colorspace accepted by convert_colorspace, or it can be ‘auto’, in which case we assume the input image is either RGB, RGBA or grayscale. If None, then absolutely no color modification is made and whatever backend is used writes the image as-is.
New in version 0.7.10: when the backend does not resolve to “cv2”, the “auto” space resolves to None, thus the image is saved as-is.
backend (str, default=’auto’) – which backend writer to use. By default the file extension is used to determine this. Valid backends are ‘gdal’, ‘skimage’, ‘itk’, and ‘cv2’.
**kwargs – args passed to the backend writer
- Returns
path to the written file
- Return type
Notes
The image may be modified to preserve its colorspace depending on which backend is used to write the image.
When saving as a jpeg or png, the image must be encoded with the uint8 data type. When saving as a tiff, any data type is allowed.
- Raises
Exception – if the image cannot be written
- Doctest:
>>> # xdoctest: +REQUIRES(--network)
>>> # This should be moved to a unit test
>>> import tempfile
>>> test_image_paths = [
>>>     ub.grabdata('https://ghostscript.com/doc/tiff/test/images/rgb-3c-16b.tiff', fname='pepper.tif'),
>>>     ub.grabdata('http://i.imgur.com/iXNf4Me.png', fname='ada.png'),
>>>     #ub.grabdata('http://www.topcoder.com/contest/problem/UrbanMapper3D/JAX_Tile_043_DTM.tif'),
>>>     ub.grabdata('https://upload.wikimedia.org/wikipedia/commons/f/fa/Grayscale_8bits_palette_sample_image.png', fname='parrot.png')
>>> ]
>>> for fpath in test_image_paths:
>>>     for space in ['auto', 'rgb', 'bgr', 'gray', 'rgba']:
>>>         img1 = imread(fpath, space=space)
>>>         print('Test im-io consistency of fpath = {!r} in {} space, shape={}'.format(fpath, space, img1.shape))
>>>         # Write the image in TIF and PNG format
>>>         tmp_tif = tempfile.NamedTemporaryFile(suffix='.tif')
>>>         tmp_png = tempfile.NamedTemporaryFile(suffix='.png')
>>>         imwrite(tmp_tif.name, img1, space=space, backend='skimage')
>>>         imwrite(tmp_png.name, img1, space=space)
>>>         tif_im = imread(tmp_tif.name, space=space)
>>>         png_im = imread(tmp_png.name, space=space)
>>>         assert np.all(tif_im == png_im), 'im-read/write inconsistency'
>>>         if _have_gdal:
>>>             tmp_tif2 = tempfile.NamedTemporaryFile(suffix='.tif')
>>>             imwrite(tmp_tif2.name, img1, space=space, backend='gdal')
>>>             tif_im2 = imread(tmp_tif2.name, space=space)
>>>             assert np.all(tif_im == tif_im2), 'im-read/write inconsistency'
>>>         if space == 'gray':
>>>             assert tif_im.ndim == 2
>>>             assert png_im.ndim == 2
>>>         elif space in ['rgb', 'bgr']:
>>>             assert tif_im.shape[2] == 3
>>>             assert png_im.shape[2] == 3
>>>         elif space in ['rgba', 'bgra']:
>>>             assert tif_im.shape[2] == 4
>>>             assert png_im.shape[2] == 4
- Benchmark:
>>> import timerit
>>> import os
>>> import kwimage
>>> import tempfile
>>> import ubelt as ub
>>> #
>>> img1 = kwimage.grab_test_image('astro', dsize=(1920, 1080))
>>> space = 'auto'
>>> #
>>> file_sizes = {}
>>> #
>>> ti = timerit.Timerit(10, bestof=3, verbose=2)
>>> #
>>> for timer in ti.reset('imwrite-skimage-tif'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.tif')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='skimage')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> for timer in ti.reset('imwrite-cv2-png'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.png')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='cv2')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> for timer in ti.reset('imwrite-cv2-jpg'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.jpg')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='cv2')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> for timer in ti.reset('imwrite-gdal-raw'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.tif')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='RAW')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> for timer in ti.reset('imwrite-gdal-lzw'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.tif')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='LZW')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> for timer in ti.reset('imwrite-gdal-zstd'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.tif')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='ZSTD')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> for timer in ti.reset('imwrite-gdal-deflate'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.tif')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='DEFLATE')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> for timer in ti.reset('imwrite-gdal-jpeg'):
>>>     with timer:
>>>         tmp = tempfile.NamedTemporaryFile(suffix='.tif')
>>>         kwimage.imwrite(tmp.name, img1, space=space, backend='gdal', compress='JPEG')
>>>     file_sizes[ti.label] = os.stat(tmp.name).st_size
>>> #
>>> file_sizes = ub.sorted_vals(file_sizes)
>>> import xdev
>>> file_sizes_human = ub.map_vals(lambda x: xdev.byte_str(x, 'MB'), file_sizes)
>>> print('ti.rankings = {}'.format(ub.repr2(ti.rankings, nl=2)))
>>> print('file_sizes = {}'.format(ub.repr2(file_sizes_human, nl=1)))
Example
>>> # Test saving a multi-band file
>>> import kwimage
>>> import numpy as np
>>> import tempfile
>>> # In this case the backend will not resolve to cv2, so
>>> # we should not need to specify space.
>>> data = np.random.rand(32, 32, 13).astype(np.float32)
>>> temp = tempfile.NamedTemporaryFile(suffix='.tif')
>>> fpath = temp.name
>>> kwimage.imwrite(fpath, data)
>>> recon = kwimage.imread(fpath)
>>> assert np.all(recon == data)
>>> kwimage.imwrite(fpath, data, backend='skimage')
>>> recon = kwimage.imread(fpath)
>>> assert np.all(recon == data)
>>> import pytest
>>> # In this case the backend will resolve to cv2, and thus we expect
>>> # a failure
>>> temp = tempfile.NamedTemporaryFile(suffix='.png')
>>> fpath = temp.name
>>> with pytest.raises(NotImplementedError):
>>>     kwimage.imwrite(fpath, data)
- kwimage.load_image_shape(fpath)[source]¶
Determine the height/width/channels of an image without reading the entire file.
- Parameters
fpath (str) – path to an image
- Returns
Tuple – shape of the dataset. Recall that this library uses the convention that “shape” refers to height,width,channels ordering and “size” refers to width,height ordering.
- Benchmark:
>>> # For large files, PIL is much faster
>>> from osgeo import gdal
>>> from PIL import Image
>>> import ubelt as ub
>>> #
>>> import kwimage
>>> fpath = kwimage.grab_test_image_fpath()
>>> #
>>> ti = ub.Timerit(100, bestof=10, verbose=2)
>>> for timer in ti.reset('gdal'):
>>>     with timer:
>>>         gdal_dset = gdal.Open(fpath, gdal.GA_ReadOnly)
>>>         width = gdal_dset.RasterXSize
>>>         height = gdal_dset.RasterYSize
>>>         gdal_dset = None
>>> #
>>> for timer in ti.reset('PIL'):
>>>     with timer:
>>>         pil_img = Image.open(fpath)
>>>         width, height = pil_img.size
>>>         pil_img.close()
Timed gdal for: 100 loops, best of 10
    time per loop: best=62.967 µs, mean=63.991 ± 0.8 µs
Timed PIL for: 100 loops, best of 10
    time per loop: best=46.640 µs, mean=47.314 ± 0.4 µs
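The shape versus size convention mentioned above is easy to get backwards, so here is a small sketch of converting between the two. The helper names `shape_to_dsize` and `dsize_to_shape` are hypothetical and are not part of kwimage.

```python
def shape_to_dsize(shape):
    """Hypothetical helper: (height, width[, channels]) -> (width, height)."""
    h, w = shape[0:2]
    return (w, h)


def dsize_to_shape(dsize, channels=None):
    """Hypothetical helper: (width, height) -> (height, width[, channels])."""
    w, h = dsize
    return (h, w) if channels is None else (h, w, channels)


# A 1920x1080 (width x height) RGB image has shape (1080, 1920, 3)
print(dsize_to_shape((1920, 1080), channels=3))
print(shape_to_dsize((1080, 1920, 3)))
```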
- kwimage.decode_run_length(counts, shape, binary=False, dtype=np.uint8, order='C')[source]¶
Decode run length encoding back into an image.
- Parameters
counts (ndarray) – the run-length encoding
shape (Tuple[int, int])
binary (bool) – if the RLE is binary or non-binary. Set to True for compatibility with COCO.
dtype (dtype, default=np.uint8) – data type for decoded image
order ({‘C’, ‘F’}, default=’C’) – row-major (C) or column-major (F)
- Returns
the reconstructed image
- Return type
ndarray
Example
>>> from kwimage.im_runlen import *  # NOQA
>>> img = np.array([[1, 0, 1, 1, 1, 0, 0, 1, 0]])
>>> encoded = encode_run_length(img, binary=True)
>>> recon = decode_run_length(**encoded)
>>> assert np.all(recon == img)
>>> import ubelt as ub
>>> lines = ub.codeblock(
>>>     '''
>>>     ..........
>>>     ......111.
>>>     ..2...111.
>>>     .222..111.
>>>     22222.....
>>>     .222......
>>>     ..2.......
>>>     ''').replace('.', '0').splitlines()
>>> img = np.array([list(map(int, line)) for line in lines])
>>> encoded = encode_run_length(img)
>>> recon = decode_run_length(**encoded)
>>> assert np.all(recon == img)
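Decoding a binary run-length encoding can be sketched in pure Python. The function `decode_binary_rle` is hypothetical and only handles the row-major binary case; it assumes, as the binary encode_run_length example in this reference suggests, that the counts alternate between runs of 0 and runs of 1, starting with a (possibly empty) run of 0.

```python
def decode_binary_rle(counts, shape):
    """Hypothetical sketch: binary RLE counts -> nested list of 0/1 rows."""
    h, w = shape
    flat = []
    value = 0  # runs alternate between 0 and 1, starting with 0
    for n in counts:
        flat.extend([value] * n)
        value = 1 - value
    if len(flat) != h * w:
        raise ValueError('counts do not cover the requested shape')
    # Cut the flat pixel list into rows (row-major, i.e. order='C')
    return [flat[r * w:(r + 1) * w] for r in range(h)]


print(decode_binary_rle([0, 1, 1, 3, 2, 1, 1], shape=(1, 9)))
```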
- kwimage.encode_run_length(img, binary=False, order='C')[source]¶
Construct the run length encoding (RLE) of an image.
- Parameters
img (ndarray) – 2D image
binary (bool, default=False) – If true, assume that the input image only contains 0’s and 1’s. Set to True for compatibility with COCO (which does not support multi-value RLE encodings).
order ({‘C’, ‘F’}, default=’C’) – row-major (C) or column-major (F)
- Returns
encoding: dictionary items are:
counts (ndarray): the run length encoding
shape (Tuple): the original image shape. This should be in standard shape row-major (e.g. h/w) order
binary (bool): if True, the counts are assumed to encode only 0’s and 1’s, otherwise the counts encoding specifies any numeric values.
order ({‘C’, ‘F’}, default=’C’): encoding order
- Return type
- SeeAlso:
kwimage.Mask - a cython-backed data structure to handle coco-style RLEs
Example
>>> from kwimage.im_runlen import *  # NOQA
>>> import ubelt as ub
>>> lines = ub.codeblock(
>>>     '''
>>>     ..........
>>>     ......111.
>>>     ..2...111.
>>>     .222..111.
>>>     22222.....
>>>     .222......
>>>     ..2.......
>>>     ''').replace('.', '0').splitlines()
>>> img = np.array([list(map(int, line)) for line in lines])
>>> encoding = encode_run_length(img)
>>> target = np.array([0,16,1,3,0,3,2,1,0,3,1,3,0,2,2,3,0,2,1,3,0,1,2,5,0,6,2,3,0,8,2,1,0,7])
>>> assert np.all(target == encoding['counts'])
Example
>>> binary = True >>> img = np.array([[1, 0, 1, 1, 1, 0, 0, 1, 0]]) >>> encoding = encode_run_length(img, binary=True) >>> assert encoding['counts'].tolist() == [0, 1, 1, 3, 2, 1, 1]
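The two count layouts used in the examples above can be sketched in pure Python. This is a minimal illustration, not kwimage's implementation: `rle_encode` and `rle_decode` are hypothetical helpers that work on an already-flattened sequence, so the `shape` and `order` entries of the real encoding dict are left out. The layouts (run counts that begin with the zeros run in binary mode, value/count pairs otherwise) are inferred from the doctest outputs above.

```python
from itertools import groupby


def rle_encode(flat, binary=False):
    """Hypothetical helper: run-length encode a flat sequence of ints."""
    # Collapse consecutive equal values into (value, run_length) pairs.
    runs = [(value, len(list(group))) for value, group in groupby(flat)]
    if binary:
        # Binary layout: only the run lengths are stored, alternating
        # zeros-run, ones-run, ... and always starting with the zeros run.
        counts = [length for _, length in runs]
        if runs and runs[0][0] != 0:
            counts.insert(0, 0)  # the data starts with a 1: empty zeros run
        return counts
    # Multi-value layout: interleaved value, count, value, count, ...
    return [item for run in runs for item in run]


def rle_decode(counts, binary=False):
    """Hypothetical helper: invert rle_encode back to a flat list."""
    flat = []
    if binary:
        for position, length in enumerate(counts):
            flat.extend([position % 2] * length)
    else:
        for value, length in zip(counts[0::2], counts[1::2]):
            flat.extend([value] * length)
    return flat


print(rle_encode([1, 0, 1, 1, 1, 0, 0, 1, 0], binary=True))
print(rle_encode([0, 0, 2, 2, 2, 1]))
```

The binary layout drops the values entirely, which is why it is the form that is compatible with COCO-style encodings.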
- kwimage.rle_translate(rle, offset, output_shape=None)[source]¶
Translates a run-length encoded image in RLE-space.
- Parameters
rle (dict) – an encoding dict returned by encode_run_length
offset (Tuple) – x,y offset, CAREFUL, this can only accept integers
output_shape (Tuple, optional) – h,w of transformed mask. If unspecified the input rle shape is used.
- SeeAlso:
# ITK has some RLE code that looks like it can perform translations https://github.com/KitwareMedical/ITKRLEImage/blob/master/include/itkRLERegionOfInterestImageFilter.h
- Doctest:
>>> # test that translate works on all zero images
>>> img = np.zeros((7, 8), dtype=np.uint8)
>>> rle = encode_run_length(img, binary=True, order='F')
>>> new_rle = rle_translate(rle, (1, 2), (6, 9))
>>> assert np.all(new_rle['counts'] == [54])
Example
>>> from kwimage.im_runlen import *  # NOQA
>>> img = np.array([
>>>     [1, 1, 1, 1],
>>>     [0, 1, 0, 0],
>>>     [0, 1, 0, 1],
>>>     [1, 1, 1, 1],], dtype=np.uint8)
>>> rle = encode_run_length(img, binary=True, order='C')
>>> offset = (1, -1)
>>> output_shape = (3, 5)
>>> new_rle = rle_translate(rle, offset, output_shape)
>>> decoded = decode_run_length(**new_rle)
>>> print(decoded)
[[0 0 1 0 0]
 [0 0 1 0 1]
 [0 1 1 1 1]]
Example
>>> from kwimage.im_runlen import * # NOQA >>> img = np.array([ >>> [0, 0, 0], >>> [0, 1, 0], >>> [0, 0, 0]], dtype=np.uint8) >>> rle = encode_run_length(img, binary=True, order='C') >>> new_rle = rle_translate(rle, (1, 0)) >>> decoded = decode_run_length(**new_rle) >>> print(decoded) [[0 0 0] [0 0 1] [0 0 0]] >>> new_rle = rle_translate(rle, (0, 1)) >>> decoded = decode_run_length(**new_rle) >>> print(decoded) [[0 0 0] [0 0 0] [0 1 0]]
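When checking a translated RLE, a dense reference is a convenient comparison. The sketch below is a hypothetical helper (`translate_dense` is not part of kwimage) that applies the same conventions the examples above display: `offset` is (x, y), `output_shape` is (h, w), and pixels that fall outside the output are dropped. It works on nested lists rather than on an RLE.

```python
def translate_dense(img, offset, output_shape=None):
    """Hypothetical dense reference for an integer (x, y) translation."""
    dx, dy = offset
    in_h = len(img)
    in_w = len(img[0]) if img else 0
    # Fall back to the input shape, as the documented parameter does.
    out_h, out_w = output_shape if output_shape is not None else (in_h, in_w)
    out = [[0] * out_w for _ in range(out_h)]
    for r in range(in_h):
        for c in range(in_w):
            new_r, new_c = r + dy, c + dx
            # Anything shifted outside the output canvas is discarded.
            if 0 <= new_r < out_h and 0 <= new_c < out_w:
                out[new_r][new_c] = img[r][c]
    return out


img = [
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
]
for row in translate_dense(img, (1, -1), (3, 5)):
    print(row)
```

Decoding the output of `rle_translate` and comparing it against such a dense result is one way to test offsets and output shapes.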
- kwimage.stack_images(images, axis=0, resize=None, interpolation=None, overlap=0, return_info=False, bg_value=None)[source]¶
Make a new image with the input images side-by-side
- Parameters
images (Iterable[ndarray[ndim=2]]) – image data
axis (int) – axis to stack on (either 0 or 1)
resize (int, str, or None) – if None, image sizes are not modified; otherwise resize can be either 0 or 1, in which case the resize-th image is resized to match the (1 - resize)-th image. Can also be the strings “larger” or “smaller”.
interpolation (int or str) – string or cv2-style interpolation type. only used if resize or overlap > 0
overlap (int) – number of pixels to overlap. Using a negative number results in a border.
return_info (bool) – if True, returns transforms (scales and translations) to map from original image to its new location.
- Returns
an image of stacked images side by side
OR
- Tuple[ndarray, List]: where the first item is the aforementioned stacked image and the second item is a list of transformations for each input image mapping it to its location in the returned image.
- Return type
ndarray
Example
>>> import kwimage >>> img1 = kwimage.grab_test_image('carl', space='rgb') >>> img2 = kwimage.grab_test_image('astro', space='rgb') >>> images = [img1, img2] >>> imgB, transforms = stack_images(images, axis=0, resize='larger', >>> overlap=-10, return_info=True) >>> print('imgB.shape = {}'.format(imgB.shape)) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> import kwimage >>> kwplot.autompl() >>> kwplot.imshow(imgB, colorspace='rgb') >>> wh1 = np.multiply(img1.shape[0:2][::-1], transforms[0].scale) >>> wh2 = np.multiply(img2.shape[0:2][::-1], transforms[1].scale) >>> xoff1, yoff1 = transforms[0].translation >>> xoff2, yoff2 = transforms[1].translation >>> xywh1 = (xoff1, yoff1, wh1[0], wh1[1]) >>> xywh2 = (xoff2, yoff2, wh2[0], wh2[1]) >>> kwplot.draw_boxes(kwimage.Boxes([xywh1], 'xywh'), color=(1.0, 0, 0)) >>> kwplot.draw_boxes(kwimage.Boxes([xywh2], 'xywh'), color=(1.0, 0, 0)) >>> kwplot.show_if_requested() ((662, 512, 3), (0.0, 0.0), (0, 150))
- kwimage.stack_images_grid(images, chunksize=None, axis=0, overlap=0, return_info=False, bg_value=None)[source]¶
Stacks images in a grid. Optionally return transforms of original image positions in the output image.
- Parameters
images (Iterable[ndarray[ndim=2]]) – image data
chunksize (int, default=None) – number of rows per column or columns per row depending on the value of axis. If unspecified, computes this as int(sqrt(len(images))).
axis (int, default=0) – If 0, chunksize is columns per row. If 1, chunksize is rows per column.
overlap (int) – number of pixels to overlap. Using a negative number results in a border.
return_info (bool) – if True, returns transforms (scales and translations) to map from original image to its new location.
- Returns
an image of stacked images in a grid pattern
OR
- Tuple[ndarray, List]: where the first item is the aforementioned stacked image and the second item is a list of transformations for each input image mapping it to its location in the returned image.
- Return type
ndarray
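The default chunking rule described for `chunksize` can be sketched as follows. `grid_chunks` is a hypothetical helper that only groups image indices; whether each chunk becomes a row or a column depends on `axis`, and how the real function arranges a ragged final chunk is not covered here.

```python
import math


def grid_chunks(num_images, chunksize=None):
    """Hypothetical helper: group image indices the way a grid would."""
    if chunksize is None:
        # Documented default: int(sqrt(len(images))).
        chunksize = int(math.sqrt(num_images))
    # Assumption: guard against a zero chunksize for empty input.
    chunksize = max(chunksize, 1)
    indices = list(range(num_images))
    return [indices[i:i + chunksize]
            for i in range(0, num_images, chunksize)]


print(grid_chunks(10))
```

With ten images the default chunksize is 3, so the last chunk holds a single leftover image.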
- class kwimage.Boxes(data, format=None, check=True)¶
Bases: _BoxConversionMixins, _BoxPropertyMixins, _BoxTransformMixins, _BoxDrawMixins, ubelt.NiceRepr
Converts boxes between different formats as long as the last dimension contains 4 coordinates and the format is specified.
This is a convenience class and should not store the data for very long. The general idiom is to create the class, convert the data, then get the raw data and let the class be garbage collected. This will help ensure that your code is portable and understandable if this class is not available.
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> import kwimage >>> import numpy as np >>> # Given an array / tensor that represents one or more boxes >>> data = np.array([[ 0, 0, 10, 10], >>> [ 5, 5, 50, 50], >>> [20, 0, 30, 10]]) >>> # The kwimage.Boxes data structure is a thin fast wrapper >>> # that provides methods for operating on the boxes. >>> # It requires that the user explicitly provide a code that denotes >>> # the format of the boxes (i.e. what each column represents) >>> boxes = kwimage.Boxes(data, 'ltrb') >>> # This means that there is no ambiguity about box format >>> # The representation string of the Boxes object demonstrates this >>> print('boxes = {!r}'.format(boxes)) boxes = <Boxes(ltrb, array([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]]))> >>> # if you pass this data around. You can convert to other formats >>> # For docs on available format codes see :class:`BoxFormat`. >>> # In this example we will convert (left, top, right, bottom) >>> # to (left-x, top-y, width, height). >>> boxes.toformat('xywh') <Boxes(xywh, array([[ 0, 0, 10, 10], [ 5, 5, 45, 45], [20, 0, 10, 10]]))> >>> # In addition to format conversion there are other operations >>> # We can quickly (using a C-backend) find IoUs >>> ious = boxes.ious(boxes) >>> print('{}'.format(ub.repr2(ious, nl=1, precision=2, with_dtype=False))) np.array([[1. , 0.01, 0. ], [0.01, 1. , 0.02], [0. , 0.02, 1. ]]) >>> # We can ask for the area of each box >>> print('boxes.area = {}'.format(ub.repr2(boxes.area, nl=0, with_dtype=False))) boxes.area = np.array([[ 100],[2025],[ 100]]) >>> # We can ask for the center of each box >>> print('boxes.center = {}'.format(ub.repr2(boxes.center, nl=1, with_dtype=False))) boxes.center = ( np.array([[ 5. ],[27.5],[25. ]]), np.array([[ 5. ],[27.5],[ 5. 
]]), ) >>> # We can translate / scale the boxes >>> boxes.translate((10, 10)).scale(100) <Boxes(ltrb, array([[1000., 1000., 2000., 2000.], [1500., 1500., 6000., 6000.], [3000., 1000., 4000., 2000.]]))> >>> # We can clip the bounding boxes >>> boxes.translate((10, 10)).scale(100).clip(1200, 1200, 1700, 1800) <Boxes(ltrb, array([[1200., 1200., 1700., 1800.], [1500., 1500., 1700., 1800.], [1700., 1200., 1700., 1800.]]))> >>> # We can perform arbitrary warping of the boxes >>> # (note that if the transform is not axis aligned, the axis aligned >>> # bounding box of the transform result will be returned) >>> transform = np.array([[-0.83907153, 0.54402111, 0. ], >>> [-0.54402111, -0.83907153, 0. ], >>> [ 0. , 0. , 1. ]]) >>> boxes.warp(transform) <Boxes(ltrb, array([[ -8.3907153 , -13.8309264 , 5.4402111 , 0. ], [-39.23347095, -69.154632 , 23.00569785, -6.9154632 ], [-25.1721459 , -24.7113486 , -11.3412195 , -10.8804222 ]]))> >>> # Note, that we can transform the box to a Polygon for more >>> # accurate warping. >>> transform = np.array([[-0.83907153, 0.54402111, 0. ], >>> [-0.54402111, -0.83907153, 0. ], >>> [ 0. , 0. , 1. ]]) >>> warped_polys = boxes.to_polygons().warp(transform) >>> print(ub.repr2(warped_polys.data, sv=1)) [ <Polygon({ 'exterior': <Coords(data= array([[ 0. , 0. ], [ 5.4402111, -8.3907153], [ -2.9505042, -13.8309264], [ -8.3907153, -5.4402111], [ 0. , 0. ]]))>, 'interiors': [], })>, <Polygon({ 'exterior': <Coords(data= array([[ -1.4752521 , -6.9154632 ], [ 23.00569785, -44.67368205], [-14.752521 , -69.154632 ], [-39.23347095, -31.39641315], [ -1.4752521 , -6.9154632 ]]))>, 'interiors': [], })>, <Polygon({ 'exterior': <Coords(data= array([[-16.7814306, -10.8804222], [-11.3412195, -19.2711375], [-19.7319348, -24.7113486], [-25.1721459, -16.3206333], [-16.7814306, -10.8804222]]))>, 'interiors': [], })>, ] >>> # The kwimage.Boxes data structure is also convertable to >>> # several alternative data structures, like shapely, coco, and imgaug. 
>>> print(ub.repr2(boxes.to_shapely(), sv=1)) [ POLYGON ((0 0, 0 10, 10 10, 10 0, 0 0)), POLYGON ((5 5, 5 50, 50 50, 50 5, 5 5)), POLYGON ((20 0, 20 10, 30 10, 30 0, 20 0)), ] >>> # xdoctest: +REQUIRES(module:imgaug) >>> print(ub.repr2(boxes[0:1].to_imgaug(shape=(100, 100)), sv=1)) BoundingBoxesOnImage([BoundingBox(x1=0.0000, y1=0.0000, x2=10.0000, y2=10.0000, label=None)], shape=(100, 100)) >>> # xdoctest: -REQUIRES(module:imgaug) >>> print(ub.repr2(list(boxes.to_coco()), sv=1)) [ [0, 0, 10, 10], [5, 5, 45, 45], [20, 0, 10, 10], ] >>> # Finally, when you are done with your boxes object, you can >>> # unwrap the raw data by using the ``.data`` attribute >>> # all operations are done on this data, which gives the >>> # kwiamge.Boxes data structure almost no overhead when >>> # inserted into existing code. >>> print('boxes.data =\n{}'.format(ub.repr2(boxes.data, nl=1))) boxes.data = np.array([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]], dtype=np.int64) >>> # xdoctest: +REQUIRES(module:torch) >>> # This data structure was designed for use with both torch >>> # and numpy, the underlying data can be either an array or tensor. >>> boxes.tensor() <Boxes(ltrb, tensor([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]]))> >>> boxes.numpy() <Boxes(ltrb, array([[ 0, 0, 10, 10], [ 5, 5, 50, 50], [20, 0, 30, 10]]))>
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> # Demo of conversion methods >>> import kwimage >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh') <Boxes(xywh, array([[25, 30, 15, 10]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_xywh() <Boxes(xywh, array([[25, 30, 15, 10]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_cxywh() <Boxes(cxywh, array([[32.5, 35. , 15. , 10. ]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').to_ltrb() <Boxes(ltrb, array([[25, 30, 40, 40]]))> >>> kwimage.Boxes([[25, 30, 15, 10]], 'xywh').scale(2).to_ltrb() <Boxes(ltrb, array([[50., 60., 80., 80.]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> kwimage.Boxes(torch.FloatTensor([[25, 30, 15, 20]]), 'xywh').scale(.1).to_ltrb() <Boxes(ltrb, tensor([[ 2.5000, 3.0000, 4.0000, 5.0000]]))>
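The arithmetic behind the conversions shown above is small enough to write out. The functions below are hypothetical stand-alone helpers for a single box given as a list. They are not the methods of `Boxes`, which operate on whole arrays or tensors.

```python
def xywh_to_ltrb(box):
    """(left-x, top-y, width, height) -> (left, top, right, bottom)."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def xywh_to_cxywh(box):
    """(left-x, top-y, width, height) -> (center-x, center-y, width, height)."""
    x, y, w, h = box
    return [x + w / 2, y + h / 2, w, h]


def ltrb_to_xywh(box):
    """(left, top, right, bottom) -> (left-x, top-y, width, height)."""
    left, top, right, bottom = box
    return [left, top, right - left, bottom - top]


print(xywh_to_ltrb([25, 30, 15, 10]))
print(xywh_to_cxywh([25, 30, 15, 10]))
```

Because every format stores four numbers, the format code is the only thing that distinguishes them, which is why `Boxes` requires it explicitly.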
Notes
In the following examples we show cases where Boxes can hold a single 1-dimensional box array. This is a holdover from an older codebase, and some functions may assume that the input is at least 2-D. Thus, when representing a single bounding box, it is best practice to view it as a list of 1 box. While many functions will work in the 1-D case, not all functions have been tested, and thus we cannot guarantee correctness.
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes([25, 30, 15, 10], 'xywh') <Boxes(xywh, array([25, 30, 15, 10]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_xywh() <Boxes(xywh, array([25, 30, 15, 10]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_cxywh() <Boxes(cxywh, array([32.5, 35. , 15. , 10. ]))> >>> Boxes([25, 30, 15, 10], 'xywh').to_ltrb() <Boxes(ltrb, array([25, 30, 40, 40]))> >>> Boxes([25, 30, 15, 10], 'xywh').scale(2).to_ltrb() <Boxes(ltrb, array([50., 60., 80., 80.]))> >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes(torch.FloatTensor([[25, 30, 15, 20]]), 'xywh').scale(.1).to_ltrb() <Boxes(ltrb, tensor([[ 2.5000, 3.0000, 4.0000, 5.0000]]))>
Example
>>> datas = [
>>>     [1, 2, 3, 4],
>>>     [[1, 2, 3, 4], [4, 5, 6, 7]],
>>>     [[[1, 2, 3, 4], [4, 5, 6, 7]]],
>>> ]
>>> formats = BoxFormat.cannonical
>>> for format1 in formats:
>>>     for data in datas:
>>>         self = box1 = Boxes(data, format1)
>>>         for format2 in formats:
>>>             box2 = box1.toformat(format2)
>>>             back = box2.toformat(format1)
>>>             assert box1 == back
- __getitem__(self, index)¶
- __eq__(self, other)¶
Tests equality of two Boxes objects
Example
>>> box0 = box1 = Boxes([[1, 2, 3, 4]], 'xywh')
>>> box2 = Boxes(box0.data, 'ltrb')
>>> box3 = Boxes([[0, 2, 3, 4]], box0.format)
>>> box4 = Boxes(box0.data, box2.format)
>>> assert box0 == box1
>>> assert not box0 == box2
>>> assert not box2 == box3
>>> assert box2 == box4
- __len__(self)¶
- __nice__(self)¶
- __repr__(self)¶
Return repr(self).
- classmethod random(Boxes, num=1, scale=1.0, format=BoxFormat.XYWH, anchors=None, anchor_std=1.0 / 6, tensor=False, rng=None)¶
Makes random boxes; typically for testing purposes
- Parameters
num (int) – number of boxes to generate
scale (float | Tuple[float, float]) – size of imgdims
format (str) – format of boxes to be created (e.g. ltrb, xywh)
anchors (ndarray) – normalized width / heights of anchor boxes to perturb and randomly place. (must be in range 0-1)
anchor_std (float) – magnitude of noise applied to anchor shapes
tensor (bool) – if True, returns boxes in tensor format
rng (None | int | RandomState) – initial random seed
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes.random(3, rng=0, scale=100) <Boxes(xywh, array([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes.random(3, rng=0, scale=100).tensor() <Boxes(xywh, tensor([[ 54, 54, 6, 17], [ 42, 64, 1, 25], [ 79, 38, 17, 14]]))> >>> anchors = np.array([[.5, .5], [.3, .3]]) >>> Boxes.random(3, rng=0, scale=100, anchors=anchors) <Boxes(xywh, array([[ 2, 13, 51, 51], [32, 51, 32, 36], [36, 28, 23, 26]]))>
Example
>>> # Boxes position/shape within 0-1 space should be uniform. >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> fig = kwplot.figure(fnum=1, doclf=True) >>> fig.gca().set_xlim(0, 128) >>> fig.gca().set_ylim(0, 128) >>> import kwimage >>> kwimage.Boxes.random(num=10).scale(128).draw()
- copy(self)¶
- classmethod concatenate(cls, boxes, axis=0)¶
Concatenates multiple boxes together
- Parameters
boxes (Sequence[Boxes]) – list of boxes to concatenate
axis (int, default=0) – axis to stack on
- Returns
stacked boxes
- Return type
Example
>>> boxes = [Boxes.random(3) for _ in range(3)]
>>> new = Boxes.concatenate(boxes)
>>> assert len(new) == 9
>>> assert np.all(new.data[3:6] == boxes[1].data)
Example
>>> boxes = [Boxes.random(3) for _ in range(3)]
>>> boxes[0].data = boxes[0].data[0]
>>> boxes[1].data = boxes[0].data[0:0]
>>> new = Boxes.concatenate(boxes)
>>> assert len(new) == 4
>>> # xdoctest: +REQUIRES(module:torch)
>>> new = Boxes.concatenate([b.tensor() for b in boxes])
>>> assert len(new) == 4
- compress(self, flags, axis=0, inplace=False)¶
Filters boxes based on a boolean criterion
- Parameters
flags (ArrayLike[bool]) – true for items to be kept
axis (int) – you usually want this to be 0
inplace (bool) – if True, modifies this object
Example
>>> self = Boxes([[25, 30, 15, 10]], 'ltrb')
>>> self.compress([True])
<Boxes(ltrb, array([[25, 30, 15, 10]]))>
>>> self.compress([False])
<Boxes(ltrb, array([], shape=(0, 4), dtype=int64))>
- take(self, idxs, axis=0, inplace=False)¶
Takes a subset of items at specific indices
- Parameters
idxs (ArrayLike[int]) – indexes of items to take
axis (int) – you usually want this to be 0
inplace (bool) – if True, modifies this object
Example
>>> self = Boxes([[25, 30, 15, 10]], 'ltrb')
>>> self.take([0])
<Boxes(ltrb, array([[25, 30, 15, 10]]))>
>>> self.take([])
<Boxes(ltrb, array([], shape=(0, 4), dtype=int64))>
- is_tensor(self)¶
is the backend fueled by torch?
- is_numpy(self)¶
is the backend fueled by numpy?
- _impl(self)¶
returns the kwarray.ArrayAPI implementation for the data
Example
>>> assert Boxes.random().numpy()._impl.is_numpy >>> # xdoctest: +REQUIRES(module:torch) >>> assert Boxes.random().tensor()._impl.is_tensor
- property device(self)¶
If the backend is torch returns the data device, otherwise None
- astype(self, dtype)¶
Changes the type of the internal array used to represent the boxes
Notes
this operation is not inplace
Example
>>> # xdoctest: +IGNORE_WHITESPACE >>> # xdoctest: +REQUIRES(module:torch) >>> Boxes.random(3, 100, rng=0).tensor().astype('int32') <Boxes(xywh, tensor([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]], dtype=torch.int32))> >>> Boxes.random(3, 100, rng=0).numpy().astype('int32') <Boxes(xywh, array([[54, 54, 6, 17], [42, 64, 1, 25], [79, 38, 17, 14]], dtype=int32))> >>> Boxes.random(3, 100, rng=0).tensor().astype('float32') >>> Boxes.random(3, 100, rng=0).numpy().astype('float32')
- round(self, inplace=False)¶
Rounds data coordinates to the nearest integer.
This operation is applied directly to the box coordinates, so its output will depend on the format the boxes are stored in.
- Parameters
inplace (bool, default=False) – if True, modifies this object
- SeeAlso:
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0).scale(10) >>> new = self.round() >>> print('self = {!r}'.format(self)) >>> print('new = {!r}'.format(new)) self = <Boxes(xywh, array([[5.48813522, 5.44883192, 0.53949833, 1.70306146], [4.23654795, 6.4589411 , 0.13932407, 2.45878875], [7.91725039, 3.83441508, 1.71937704, 1.45453393]]))> new = <Boxes(xywh, array([[5., 5., 1., 2.], [4., 6., 0., 2.], [8., 4., 2., 1.]]))>
- quantize(self, inplace=False, dtype=np.int32)¶
Converts the box to integer coordinates.
This operation takes the floor of the left side and the ceil of the right side. Thus the area of the box will never decrease.
- Parameters
inplace (bool, default=False) – if True, modifies this object
dtype (type) – type to cast as
- SeeAlso:
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0).scale(10) >>> new = self.quantize() >>> print('self = {!r}'.format(self)) >>> print('new = {!r}'.format(new)) self = <Boxes(xywh, array([[5.48813522, 5.44883192, 0.53949833, 1.70306146], [4.23654795, 6.4589411 , 0.13932407, 2.45878875], [7.91725039, 3.83441508, 1.71937704, 1.45453393]]))> new = <Boxes(xywh, array([[5, 5, 2, 3], [4, 6, 1, 3], [7, 3, 3, 3]], dtype=int32))>
Example
>>> import kwimage >>> self = kwimage.Boxes.random(3, rng=0) >>> orig = self.copy() >>> self.quantize(inplace=True) >>> assert np.any(self.data != orig.data)
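The difference between rounding and quantizing can be reproduced for a single xywh box in a few lines. Both functions below are hypothetical stand-alone helpers, not the `Boxes` methods. `quantize_xywh` assumes, consistent with the example output above, that the top side is floored and the bottom side is ceiled in the same way as left and right.

```python
import math


def round_xywh(box):
    """Round every stored coordinate independently (format dependent)."""
    return [round(value) for value in box]


def quantize_xywh(box):
    """Floor the top-left corner and ceil the bottom-right corner."""
    x, y, w, h = box
    left, top = math.floor(x), math.floor(y)
    right, bottom = math.ceil(x + w), math.ceil(y + h)
    # Convert the expanded corners back to width / height.
    return [left, top, right - left, bottom - top]


box = [5.48813522, 5.44883192, 0.53949833, 1.70306146]
print(round_xywh(box))
print(quantize_xywh(box))
```

Rounding can shrink a box (a width of 0.14 rounds to 0), whereas quantizing always returns an integer box that covers the original.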
- numpy(self)¶
Converts tensors to numpy. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(3).tensor() >>> newself = self.numpy() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- tensor(self, device=ub.NoParam)¶
Converts numpy to tensors. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(3) >>> # xdoctest: +REQUIRES(module:torch) >>> newself = self.tensor() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- ious(self, other, bias=0, impl='auto', mode=None)¶
Intersection over union.
Compute IOUs (intersection area over union area) between these boxes and another set of boxes. This is a symmetric measure of similarity between boxes.
Todo
- [ ] Add pairwise flag to toggle between one-vs-one and all-vs-all computation, i.e. add an option for componentwise calculation.
- Parameters
other (Boxes) – boxes to compare IoUs against
bias (int, default=0) – either 0 or 1, does TL=BR have area of 0 or 1?
impl (str, default=’auto’) – code to specify the implementation used to compute IoUs. Can be either torch, py, c, or auto. Efficiency and the exact result will vary by implementation, but they will always be close. Some implementations only accept certain data types (e.g. impl=’c’ only accepts float32 numpy arrays). See ~/code/kwimage/dev/bench_bbox.py for benchmark details. On my system the torch impl was fastest (when the data was on the GPU).
mode – deprecated, use impl
- SeeAlso:
iooas - for a measure of coverage between boxes
Examples
>>> import kwimage >>> self = kwimage.Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = kwimage.Boxes(np.array([6, 2, 20, 10]), 'ltrb') >>> overlaps = self.ious(other, bias=1).round(2) >>> assert np.all(np.isclose(overlaps, [0.21, 0.63, 0.04])), repr(overlaps)
Examples
>>> import kwimage >>> boxes1 = kwimage.Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = kwimage.Boxes(np.array([[6, 2, 20, 10], >>> [100, 200, 300, 300]]), 'ltrb') >>> overlaps = boxes1.ious(other) >>> print('{}'.format(ub.repr2(overlaps, precision=2, nl=1))) np.array([[0.18, 0. ], [0.61, 0. ], [0. , 0. ]]...)
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> Boxes(np.empty(0), 'xywh').ious(Boxes(np.empty(4), 'xywh')).shape (0,) >>> #Boxes(np.empty(4), 'xywh').ious(Boxes(np.empty(0), 'xywh')).shape >>> Boxes(np.empty((0, 4)), 'xywh').ious(Boxes(np.empty((0, 4)), 'xywh')).shape (0, 0) >>> Boxes(np.empty((1, 4)), 'xywh').ious(Boxes(np.empty((0, 4)), 'xywh')).shape (1, 0) >>> Boxes(np.empty((0, 4)), 'xywh').ious(Boxes(np.empty((1, 4)), 'xywh')).shape (0, 1)
Examples
>>> # xdoctest: +REQUIRES(module:torch) >>> formats = BoxFormat.cannonical >>> istensors = [False, True] >>> results = {} >>> for format in formats: >>> for tensor in istensors: >>> boxes1 = Boxes.random(5, scale=10.0, rng=0, format=format, tensor=tensor) >>> boxes2 = Boxes.random(7, scale=10.0, rng=1, format=format, tensor=tensor) >>> ious = boxes1.ious(boxes2) >>> results[(format, tensor)] = ious >>> results = {k: v.numpy() if torch.is_tensor(v) else v for k, v in results.items() } >>> results = {k: v.tolist() for k, v in results.items()} >>> print(ub.repr2(results, sk=True, precision=3, nl=2)) >>> from functools import partial >>> assert ub.allsame(results.values(), partial(np.allclose, atol=1e-07))
- Ignore:
>>> # does this work with backprop? >>> # xdoctest: +REQUIRES(module:torch) >>> import torch >>> import kwimage >>> num = 1000 >>> true_boxes = kwimage.Boxes.random(num).tensor() >>> inputs = torch.rand(num, 10) >>> regress = torch.nn.Linear(10, 4) >>> energy = regress(inputs) >>> energy.retain_grad() >>> outputs = energy.sigmoid() >>> outputs.retain_grad() >>> out_boxes = kwimage.Boxes(outputs, 'cxywh') >>> ious = out_boxes.ious(true_boxes) >>> loss = ious.sum() >>> loss.backward()
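For reference, the IoU of one pair of ltrb boxes, including the `bias` term, can be written out in plain Python. `iou_ltrb` is a hypothetical scalar helper, not one of the kwimage backends. Returning 0.0 when the union is empty is an assumption made for this sketch.

```python
def iou_ltrb(box1, box2, bias=0):
    """Hypothetical scalar IoU for two (left, top, right, bottom) boxes."""
    l1, t1, r1, b1 = box1
    l2, t2, r2, b2 = box2
    # Overlap extent along each axis, clamped at zero when disjoint.
    isect_w = max(0, min(r1, r2) - max(l1, l2) + bias)
    isect_h = max(0, min(b1, b2) - max(t1, t2) + bias)
    isect = isect_w * isect_h
    # With bias=1 a box whose corners coincide has an area of 1.
    area1 = (r1 - l1 + bias) * (b1 - t1 + bias)
    area2 = (r2 - l2 + bias) * (b2 - t2 + bias)
    union = area1 + area2 - isect
    return isect / union if union > 0 else 0.0


boxes = [[0, 0, 10, 10], [10, 0, 20, 10], [20, 0, 30, 10]]
other = [6, 2, 20, 10]
print([round(iou_ltrb(box, other, bias=1), 2) for box in boxes])
print([round(iou_ltrb(box, other, bias=0), 2) for box in boxes])
```

The same three boxes give different values for `bias=0` and `bias=1`, which matches the first two examples above.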
- iooas(self, other, bias=0)¶
Intersection over other area.
This is an asymmetric measure of coverage: how much of the “other” boxes is covered by these boxes. It is the area of intersection between each pair of boxes divided by the area of the “other” box.
- SeeAlso:
ious - for a measure of similarity between boxes
- Parameters
other (Boxes) – boxes to compare IoOA against
bias (int, default=0) – either 0 or 1, does TL=BR have area of 0 or 1?
Examples
>>> self = Boxes(np.array([[ 0, 0, 10, 10], >>> [10, 0, 20, 10], >>> [20, 0, 30, 10]]), 'ltrb') >>> other = Boxes(np.array([[6, 2, 20, 10], [0, 0, 0, 3]]), 'xywh') >>> coverage = self.iooas(other, bias=0).round(2) >>> print('coverage = {!r}'.format(coverage))
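The asymmetry is easiest to see with one small box nested inside a larger one. `iooa_ltrb` below is a hypothetical scalar helper for a single pair of ltrb boxes. Returning 0.0 when the other box has no area is an assumption of this sketch, not documented behavior.

```python
def iooa_ltrb(box, other, bias=0):
    """Hypothetical scalar intersection-over-other-area for ltrb boxes."""
    l1, t1, r1, b1 = box
    l2, t2, r2, b2 = other
    isect_w = max(0, min(r1, r2) - max(l1, l2) + bias)
    isect_h = max(0, min(b1, b2) - max(t1, t2) + bias)
    other_area = (r2 - l2 + bias) * (b2 - t2 + bias)
    # Only the area of `other` appears in the denominator.
    return (isect_w * isect_h) / other_area if other_area > 0 else 0.0


big = [0, 0, 10, 10]
small = [0, 0, 5, 5]
print(iooa_ltrb(big, small))   # how much of `small` is covered by `big`
print(iooa_ltrb(small, big))   # how much of `big` is covered by `small`
```

Swapping the arguments changes the result, unlike IoU, which is symmetric.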
- isect_area(self, other, bias=0)¶
Intersection part of intersection over union computation
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> self = Boxes.random(5, scale=10.0, rng=0, format='ltrb') >>> other = Boxes.random(3, scale=10.0, rng=1, format='ltrb') >>> isect = self.isect_area(other, bias=0) >>> ious_v1 = isect / ((self.area + other.area.T) - isect) >>> ious_v2 = self.ious(other, bias=0) >>> assert np.allclose(ious_v1, ious_v2)
- intersection(self, other)¶
Componentwise intersection between two sets of Boxes
intersections of boxes are always boxes, so this works
- Returns
intersected boxes
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> other = self.translate(1) >>> new = self.intersection(other) >>> new_area = np.nan_to_num(new.area).ravel() >>> alt_area = np.diag(self.isect_area(other)) >>> close = np.isclose(new_area, alt_area) >>> assert np.all(close)
- union_hull(self, other)¶
Componentwise hull union between two sets of Boxes
NOTE: convert to polygon to do a real union.
- Returns
unioned boxes
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> other = self.translate(1) >>> new = self.union_hull(other) >>> new_area = np.nan_to_num(new.area).ravel()
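For one pair of ltrb boxes, the componentwise hull and the componentwise intersection documented just before it reduce to min/max operations on the corners. The helpers below are hypothetical. Returning None for a disjoint pair is an assumption of this sketch and may differ from how `Boxes.intersection` represents an empty result.

```python
def union_hull_ltrb(box1, box2):
    """Smallest axis-aligned box that covers both inputs."""
    return [min(box1[0], box2[0]), min(box1[1], box2[1]),
            max(box1[2], box2[2]), max(box1[3], box2[3])]


def intersection_ltrb(box1, box2):
    """Overlapping region of the two inputs, or None when disjoint."""
    left, top = max(box1[0], box2[0]), max(box1[1], box2[1])
    right, bottom = min(box1[2], box2[2]), min(box1[3], box2[3])
    if right < left or bottom < top:
        return None  # assumption: signal "no overlap" with None
    return [left, top, right, bottom]


a = [0, 0, 10, 10]
b = [5, 5, 50, 50]
print(union_hull_ltrb(a, b))
print(intersection_ltrb(a, b))
```

The hull generally covers area that belongs to neither box, which is why the note above recommends converting to polygons for a true union.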
- bounding_box(self)¶
Returns the box that bounds all of the contained boxes
- Returns
a single box
- Return type
Examples
>>> # xdoctest: +IGNORE_WHITESPACE >>> from kwimage.structs.boxes import * # NOQA >>> self = Boxes.random(5, rng=0).scale(10.) >>> other = self.translate(1) >>> new = self.union_hull(other) >>> new_area = np.nan_to_num(new.area).ravel()
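Conceptually, the bounding box of a collection is the hull of all of its members. A minimal sketch for a list of ltrb boxes follows; `bounding_box_ltrb` is a hypothetical helper, and raising on empty input is an assumption of this sketch.

```python
def bounding_box_ltrb(boxes):
    """Hypothetical helper: the single ltrb box enclosing every input box."""
    if not boxes:
        raise ValueError('need at least one box')  # assumption for the sketch
    lefts, tops, rights, bottoms = zip(*boxes)
    return [min(lefts), min(tops), max(rights), max(bottoms)]


boxes = [[0, 0, 10, 10], [5, 5, 50, 50], [20, 0, 30, 10]]
print(bounding_box_ltrb(boxes))
```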
- contains(self, other)¶
Determine if points are completely contained by these boxes
- Parameters
other (Points) – points to test for containment. TODO: support generic data types
- Returns
- N x M boolean matrix indicating which box contains which points, where N is the number of boxes and M is the number of points.
- Return type
flags (ArrayLike)
Examples
>>> import kwimage >>> self = kwimage.Boxes.random(10).scale(10).round() >>> other = kwimage.Points.random(10).scale(10).round() >>> flags = self.contains(other) >>> flags = self.contains(self.xy_center) >>> assert np.all(np.diag(flags))
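The shape of the returned matrix can be illustrated with nested lists. `contains_matrix` is a hypothetical helper for ltrb boxes and (x, y) points. Treating points on the boundary as contained is an assumption of this sketch, since the documentation above does not state how edges are handled.

```python
def contains_matrix(boxes_ltrb, points_xy):
    """Hypothetical helper: N x M flags, one row per box, one column per point."""
    return [
        [(left <= x <= right) and (top <= y <= bottom)  # inclusive: assumption
         for (x, y) in points_xy]
        for (left, top, right, bottom) in boxes_ltrb
    ]


boxes = [[0, 0, 10, 10], [20, 0, 30, 10]]
points = [(5, 5), (25, 5), (15, 5)]
for row in contains_matrix(boxes, points):
    print(row)
```

Row i, column j answers whether box i contains point j, so the diagonal check in the example above tests each box against its own center.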
- view(self, *shape)¶
Passthrough method to view or reshape
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Boxes.random(6, scale=10.0, rng=0, format='xywh').tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4] >>> self = Boxes.random(6, scale=10.0, rng=0, format='ltrb').tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4]
- class kwimage.Coords(data=None, meta=None)¶
Bases: kwimage.structs._generic.Spatial, ubelt.NiceRepr
A data structure to store n-dimensional coordinate geometry.
Currently it is up to the user to maintain what coordinate system this geometry belongs to.
Note
This class was designed to hold coordinates in r/c format, but in general this class is agnostic to dimension ordering as long as you are consistent. However, there are two places where this matters: (1) drawing and (2) gdal/imgaug-warping. In these places we will assume x/y for legacy reasons. This may change in the future.
The term axes with respect to Coords always refers to the final numpy axis. In other words the final numpy-axis represents ALL of the coordinate-axes.
- CommandLine:
xdoctest -m kwimage.structs.coords Coords
Example
>>> from kwimage.structs.coords import * # NOQA >>> import kwarray >>> rng = kwarray.ensure_rng(0) >>> self = Coords.random(num=4, dim=3, rng=rng) >>> print('self = {}'.format(self)) self = <Coords(data= array([[0.5488135 , 0.71518937, 0.60276338], [0.54488318, 0.4236548 , 0.64589411], [0.43758721, 0.891773 , 0.96366276], [0.38344152, 0.79172504, 0.52889492]]))> >>> matrix = rng.rand(4, 4) >>> self.warp(matrix) <Coords(data= array([[0.71037426, 1.25229659, 1.39498435], [0.60799503, 1.26483447, 1.42073131], [0.72106004, 1.39057144, 1.38757508], [0.68384299, 1.23914654, 1.29258196]]))> >>> self.translate(3, inplace=True) <Coords(data= array([[3.5488135 , 3.71518937, 3.60276338], [3.54488318, 3.4236548 , 3.64589411], [3.43758721, 3.891773 , 3.96366276], [3.38344152, 3.79172504, 3.52889492]]))> >>> self.translate(3, inplace=True) <Coords(data= array([[6.5488135 , 6.71518937, 6.60276338], [6.54488318, 6.4236548 , 6.64589411], [6.43758721, 6.891773 , 6.96366276], [6.38344152, 6.79172504, 6.52889492]]))> >>> self.scale(2) <Coords(data= array([[13.09762701, 13.43037873, 13.20552675], [13.08976637, 12.8473096 , 13.29178823], [12.87517442, 13.783546 , 13.92732552], [12.76688304, 13.58345008, 13.05778984]]))> >>> # xdoctest: +REQUIRES(module:torch) >>> self.tensor() >>> self.tensor().tensor().numpy().numpy() >>> self.numpy() >>> #self.draw_on()
- __repr__¶
- __nice__(self)¶
- __len__(self)¶
- property dtype(self)¶
- property dim(self)¶
- property shape(self)¶
- copy(self)¶
- classmethod random(Coords, num=1, dim=2, rng=None, meta=None)¶
Makes random coordinates; typically for testing purposes
- is_numpy(self)¶
- is_tensor(self)¶
- compress(self, flags, axis=0, inplace=False)¶
Filters items based on a boolean criterion
- Parameters
flags (ArrayLike[bool]) – true for items to be kept
axis (int) – you usually want this to be 0
inplace (bool, default=False) – if True, modifies this object
- Returns
filtered coords
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> self.compress([True] * len(self)) >>> self.compress([False] * len(self)) <Coords(data=array([], shape=(0, 2), dtype=float64))> >>> # xdoctest: +REQUIRES(module:torch) >>> self = self.tensor() >>> self.compress([True] * len(self)) >>> self.compress([False] * len(self))
- take(self, indices, axis=0, inplace=False)¶
Takes a subset of items at specific indices
- Parameters
indices (ArrayLike[int]) – indexes of items to take
axis (int) – you usually want this to be 0
inplace (bool, default=False) – if True, modifies this object
- Returns
filtered coords
- Return type
Example
>>> self = Coords(np.array([[25, 30, 15, 10]])) >>> self.take([0]) <Coords(data=array([[25, 30, 15, 10]]))> >>> self.take([]) <Coords(data=array([], shape=(0, 4), dtype=int64))>
- astype(self, dtype, inplace=False)¶
Changes the data type
- Parameters
dtype – new type
inplace (bool, default=False) – if True, modifies this object
- Returns
modified coordinates
- Return type
- round(self, inplace=False)¶
Rounds data to the nearest integer
- Parameters
inplace (bool, default=False) – if True, modifies this object
Example
>>> import kwimage >>> self = kwimage.Coords.random(3).scale(10) >>> self.round()
- view(self, *shape)¶
Passthrough method to view or reshape
- Parameters
*shape – new shape of the data
- Returns
modified coordinates
- Return type
Example
>>> self = Coords.random(6, dim=4).numpy() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4] >>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(6, dim=4).tensor() >>> assert list(self.view(3, 2, 4).data.shape) == [3, 2, 4]
- classmethod concatenate(cls, coords, axis=0)¶
Concatenates lists of coordinates together
- Parameters
coords (Sequence[Coords]) – list of coords to concatenate
axis (int, default=0) – axis to stack on
- Returns
stacked coords
- Return type
- CommandLine:
xdoctest -m kwimage.structs.coords Coords.concatenate
Example
>>> coords = [Coords.random(3) for _ in range(3)] >>> new = Coords.concatenate(coords) >>> assert len(new) == 9 >>> assert np.all(new.data[3:6] == coords[1].data)
- property device(self)¶
If the backend is torch returns the data device, otherwise None
- property _impl(self)¶
Returns the internal tensor/numpy ArrayAPI implementation
- tensor(self, device=ub.NoParam)¶
Converts numpy to tensors. Does not change memory if possible.
- Returns
modified coordinates
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(3).numpy() >>> newself = self.tensor() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- numpy(self)¶
Converts tensors to numpy. Does not change memory if possible.
- Returns
modified coordinates
- Return type
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Coords.random(3).tensor() >>> newself = self.numpy() >>> self.data[0, 0] = 0 >>> assert newself.data[0, 0] == 0 >>> self.data[0, 0] = 1 >>> assert self.data[0, 0] == 1
- reorder_axes(self, new_order, inplace=False)¶
Change the ordering of the coordinate axes.
- Parameters
new_order (Tuple[int]) – new_order[i] should specify which axes in the original coordinates should be mapped to the i-th position in the returned axes.
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Note
This is the ordering of the “columns” in final numpy axis, not the numpy axes themselves.
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords(data=np.array([ >>> [7, 11], >>> [13, 17], >>> [21, 23], >>> ])) >>> new = self.reorder_axes((1, 0)) >>> print('new = {!r}'.format(new)) new = <Coords(data= array([[11, 7], [17, 13], [23, 21]]))>
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> new = self.reorder_axes((1, 0)) >>> # Remapping using 1, 0 reverses the axes >>> assert np.all(new.data[:, 0] == self.data[:, 1]) >>> assert np.all(new.data[:, 1] == self.data[:, 0]) >>> # Remapping using 0, 1 does nothing >>> eye = self.reorder_axes((0, 1)) >>> assert np.all(eye.data == self.data) >>> # Remapping using 0, 0, destroys the 1-th column >>> bad = self.reorder_axes((0, 0)) >>> assert np.all(bad.data[:, 0] == self.data[:, 0]) >>> assert np.all(bad.data[:, 1] == self.data[:, 0])
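The column semantics of new_order can be restated without kwimage. The following is a plain-Python sketch; reorder_columns is a hypothetical helper, not part of the library. Column i of the result is taken from column new_order[i] of the input.

```python
def reorder_columns(points, new_order):
    """Return a copy of `points` where column i comes from column new_order[i]."""
    return [[row[j] for j in new_order] for row in points]


points = [[7, 11], [13, 17], [21, 23]]
swapped = reorder_columns(points, (1, 0))  # reverses the two columns
same = reorder_columns(points, (0, 1))     # identity ordering
dup = reorder_columns(points, (0, 0))      # column 0 repeated, column 1 lost
print(swapped)
```

Repeating an index, as in (0, 0), is allowed but discards information, which mirrors the last case in the doctest above.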
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)¶
Generalized coordinate transform.
- Parameters
transform (GeometricTransform | ArrayLike | Augmenter | callable) – scikit-image transform, a 3x3 transformation matrix, an imgaug Augmenter, or generic callable which transforms an NxD ndarray.
input_dims (Tuple) – shape of the image these objects correspond to (only needed / used when transform is an imgaug augmenter)
output_dims (Tuple) – unused in non-raster structures, only exists for compatibility.
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Notes
Let D = self.dims
- transformation matrices can be either:
(D + 1) x (D + 1) # for homog
D x D # for scale / rotate
D x (D + 1) # for affine
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> transform = skimage.transform.AffineTransform(scale=(2, 2)) >>> new = self.warp(transform) >>> assert np.all(new.data == self.scale(2).data)
- Doctest:
>>> self = Coords.random(10, rng=0) >>> assert np.all(self.warp(np.eye(3)).data == self.data) >>> assert np.all(self.warp(np.eye(2)).data == self.data)
- Doctest:
>>> # xdoctest: +REQUIRES(module:osgeo) >>> from osgeo import osr >>> wgs84_crs = osr.SpatialReference() >>> wgs84_crs.ImportFromEPSG(4326) >>> dst_crs = osr.SpatialReference() >>> dst_crs.ImportFromEPSG(2927) >>> transform = osr.CoordinateTransformation(wgs84_crs, dst_crs) >>> self = Coords.random(10, rng=0) >>> new = self.warp(transform) >>> assert np.all(new.data != self.data)
>>> # Alternative using generic func >>> def _gdal_coord_tranform(pts): ... return np.array([transform.TransformPoint(x, y, 0)[0:2] ... for x, y in pts]) >>> alt = self.warp(_gdal_coord_tranform) >>> assert np.all(alt.data != self.data) >>> assert np.all(alt.data == new.data)
- Doctest:
>>> # can use a generic function >>> def func(xy): ... return np.zeros_like(xy) >>> self = Coords.random(10, rng=0) >>> assert np.all(self.warp(func).data == 0)
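The three matrix shapes listed in the notes can be illustrated with a small plain-Python sketch. It assumes the usual conventions (a D x (D + 1) matrix carries a translation column, and a (D + 1) x (D + 1) matrix is homogeneous and needs a final division). apply_matrix is a hypothetical helper and not the code kwimage runs internally.

```python
def apply_matrix(points, matrix):
    """Apply a D x D, D x (D + 1), or (D + 1) x (D + 1) matrix to D-dim points."""
    dim = len(points[0])
    n_rows, n_cols = len(matrix), len(matrix[0])
    out = []
    for pt in points:
        # Append a homogeneous 1 only when the matrix has a translation column.
        vec = list(pt) + [1.0] if n_cols == dim + 1 else list(pt)
        res = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        if n_rows == dim + 1:
            # Homogeneous output: divide by the last component and drop it.
            res = [r / res[-1] for r in res[:-1]]
        out.append(res)
    return out


pts = [[1.0, 2.0], [3.0, 4.0]]
scale_2x2 = [[2.0, 0.0], [0.0, 2.0]]               # D x D
affine_2x3 = [[2.0, 0.0, 5.0], [0.0, 2.0, -1.0]]   # D x (D + 1)
homog_3x3 = affine_2x3 + [[0.0, 0.0, 1.0]]         # (D + 1) x (D + 1)

scaled = apply_matrix(pts, scale_2x2)
affine = apply_matrix(pts, affine_2x3)
homog = apply_matrix(pts, homog_3x3)
```

With an affine bottom row of (0, 0, 1), the homogeneous form and the D x (D + 1) form give the same result.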
- _warp_imgaug(self, augmenter, input_dims, inplace=False)¶
Warps by applying an augmenter from the imgaug library
Note
We are assuming you are using X/Y coordinates here.
- Parameters
augmenter (imgaug.augmenters.Augmenter)
input_dims (Tuple) – h/w of the input image
inplace (bool, default=False) – if True, modifies data inplace
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/coords.py Coords._warp_imgaug
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> import imgaug >>> input_dims = (10, 10) >>> self = Coords.random(10).scale(input_dims) >>> augmenter = imgaug.augmenters.Fliplr(p=1) >>> new = self._warp_imgaug(augmenter, input_dims) >>> # y coordinate should not change >>> assert np.allclose(self.data[:, 1], new.data[:, 1]) >>> assert np.allclose(input_dims[0] - self.data[:, 0], new.data[:, 0])
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.figure(fnum=1, doclf=True)
>>> from matplotlib import pyplot as plt
>>> ax = plt.gca()
>>> ax.set_xlim(0, input_dims[0])
>>> ax.set_ylim(0, input_dims[1])
>>> self.draw(color='red', alpha=.4, radius=0.1)
>>> new.draw(color='blue', alpha=.4, radius=0.1)
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> import imgaug >>> input_dims = (32, 32) >>> inplace = 0 >>> self = Coords.random(1000, rng=142).scale(input_dims).scale(.8) >>> self.data = self.data.astype(np.int32).astype(np.float32) >>> augmenter = imgaug.augmenters.CropAndPad(px=(-4, 4), keep_size=1).to_deterministic() >>> new = self._warp_imgaug(augmenter, input_dims) >>> # Change should be linear >>> norm1 = (self.data - self.data.min(axis=0)) / (self.data.max(axis=0) - self.data.min(axis=0)) >>> norm2 = (new.data - new.data.min(axis=0)) / (new.data.max(axis=0) - new.data.min(axis=0)) >>> diff = norm1 - norm2 >>> assert np.allclose(diff, 0, atol=1e-6, rtol=1e-4) >>> #assert np.allclose(self.data[:, 1], new.data[:, 1]) >>> #assert np.allclose(input_dims[0] - self.data[:, 0], new.data[:, 0]) >>> # xdoc: +REQUIRES(--show) >>> import kwimage >>> im = kwimage.imresize(kwimage.grab_test_image(), dsize=input_dims[::-1]) >>> new_im = augmenter.augment_image(im) >>> import kwplot >>> plt = kwplot.autoplt() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(im, pnum=(1, 2, 1), fnum=1) >>> self.draw(color='red', alpha=.8, radius=0.5) >>> kwplot.imshow(new_im, pnum=(1, 2, 2), fnum=1) >>> new.draw(color='blue', alpha=.8, radius=0.5, coord_axes=[1, 0])
- to_imgaug(self, input_dims)¶
Translate to an imgaug object
- Returns
imgaug data structure
- Return type
imgaug.KeypointsOnImage
Example
>>> # xdoctest: +REQUIRES(module:imgaug) >>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10) >>> input_dims = (10, 10) >>> kpoi = self.to_imgaug(input_dims) >>> new = Coords.from_imgaug(kpoi) >>> assert np.allclose(new.data, self.data)
- classmethod from_imgaug(cls, kpoi)¶
- scale(self, factor, about=None, output_dims=None, inplace=False)¶
Scale coordinates by a factor
- Parameters
factor (float or Tuple[float, float]) – scale factor as either a scalar or per-dimension tuple.
about (Tuple | None) – if unspecified, scales about the origin (0, 0); otherwise the scaling is about this point.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, rng=0) >>> new = self.scale(10) >>> assert new.data.max() <= 10
>>> self = Coords.random(10, rng=0) >>> self.data = (self.data * 10).astype(int) >>> new = self.scale(10) >>> assert new.data.dtype.kind == 'i' >>> new = self.scale(10.0) >>> assert new.data.dtype.kind == 'f'
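Scaling about a point is conventionally done by shifting to that point, scaling, and shifting back. The plain-Python sketch below shows that arithmetic; it assumes kwimage follows this convention, and scale_about is a hypothetical helper rather than library code.

```python
def scale_about(points, factor, about=None):
    """Scale D-dim points by a scalar or per-dimension factor about a point."""
    dim = len(points[0])
    factors = [factor] * dim if isinstance(factor, (int, float)) else list(factor)
    origin = [0.0] * dim if about is None else list(about)
    # Shift so `about` is the origin, scale, then shift back.
    return [[(p - o) * f + o for p, f, o in zip(pt, factors, origin)]
            for pt in points]


pts = [[1.0, 2.0], [3.0, 4.0]]
about_origin = scale_about(pts, 2)
about_point = scale_about(pts, (2, 3), about=(1.0, 2.0))
```

The point used as about is a fixed point of the operation, so [1.0, 2.0] is unchanged in the second call.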
- translate(self, offset, output_dims=None, inplace=False)¶
Shift the coordinates
- Parameters
offset (float or Tuple[float]) – translation offset as either a scalar or a per-dimension tuple.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=3, rng=0) >>> new = self.translate(10) >>> assert new.data.min() >= 10 >>> assert new.data.max() <= 11 >>> Coords.random(3, dim=3, rng=0) >>> Coords.random(3, dim=3, rng=0).translate((1, 2, 3))
- rotate(self, theta, about=None, output_dims=None, inplace=False)¶
Rotate the coordinates about a point.
- Parameters
theta (float) – rotation angle in radians
about (Tuple | None) – if unspecified rotates about the origin (0, 0), otherwise the rotation is about this point.
output_dims (Tuple) – unused in non-raster spatial structures
inplace (bool, default=False) – if True, modifies data inplace
- Returns
modified coordinates
- Return type
Todo
[ ] Generalized ND Rotations?
References
https://math.stackexchange.com/questions/197772/gen-rot-matrix
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=2, rng=0) >>> theta = np.pi / 2 >>> new = self.rotate(theta)
>>> # Test rotate agrees with warp >>> sin_ = np.sin(theta) >>> cos_ = np.cos(theta) >>> rot_ = np.array([[cos_, -sin_], [sin_, cos_]]) >>> new2 = self.warp(rot_) >>> assert np.allclose(new.data, new2.data)
>>> # >>> # Rotate about a custom point >>> theta = np.pi / 2 >>> new3 = self.rotate(theta, about=(0.5, 0.5)) >>> # >>> # Rotate about the center of mass >>> about = self.data.mean(axis=0) >>> new4 = self.rotate(theta, about=about) >>> # xdoc: +REQUIRES(--show) >>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> plt = kwplot.autoplt() >>> self.draw(radius=0.01, color='blue', alpha=.5, coord_axes=[1, 0], setlim='grow') >>> plt.gca().set_aspect('equal') >>> new3.draw(radius=0.01, color='red', alpha=.5, coord_axes=[1, 0], setlim='grow')
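The rotation about a custom point can be written out by hand using the same 2x2 matrix as the doctest above ([[cos, -sin], [sin, cos]]). This is a plain-Python sketch; rotate_about is a hypothetical helper, shown only to make the arithmetic explicit.

```python
import math


def rotate_about(points, theta, about=(0.0, 0.0)):
    """Rotate 2D points counter-clockwise by theta radians about a point."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    ax, ay = about
    out = []
    for x, y in points:
        # Express the point relative to `about`, rotate, then shift back.
        dx, dy = x - ax, y - ay
        out.append([cos_t * dx - sin_t * dy + ax,
                    sin_t * dx + cos_t * dy + ay])
    return out


quarter = math.pi / 2
about_origin = rotate_about([(1.0, 0.0)], quarter)
about_point = rotate_about([(2.0, 1.0)], quarter, about=(1.0, 1.0))
```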
- _rectify_about(self, about)¶
Ensures that about returns a specified point. Allows for special keys like center to be used.
Example
>>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10, dim=2, rng=0)
- fill(self, image, value, coord_axes=None, interp='bilinear')¶
Sets sub-coordinate locations in a grid to a particular value
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
- Returns
image with coordinates rasterized on it
- Return type
ndarray
- soft_fill(self, image, coord_axes=None, radius=5)¶
Used for drawing keypoint truth in heatmaps
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
In other words the i-th entry in coord_axes specifies which row-major spatial dimension the i-th column of a coordinate corresponds to. The index is the coordinate dimension and the value is the axes dimension.
- Returns
image with coordinates rasterized on it
- Return type
ndarray
References
https://stackoverflow.com/questions/54726703/generating-keypoint-heatmaps-in-tensorflow
Example
>>> from kwimage.structs.coords import * # NOQA
>>> s = 64
>>> self = Coords.random(10, meta={'shape': (s, s)}).scale(s)
>>> # Put points on edges to verify "edge cases"
>>> self.data[1] = [0, 0] # top left
>>> self.data[2] = [s, s] # bottom right
>>> self.data[3] = [0, s + 10] # bottom left
>>> self.data[4] = [-3, s // 2] # middle left
>>> self.data[5] = [s + 1, -1] # top right
>>> # Put points in the middle to verify overlap blending
>>> self.data[6] = [32.5, 32.5] # middle
>>> self.data[7] = [34.5, 34.5] # middle
>>> fill_value = 1
>>> coord_axes = [1, 0]
>>> radius = 10
>>> image1 = np.zeros((s, s))
>>> self.soft_fill(image1, coord_axes=coord_axes, radius=radius)
>>> radius = 3.0
>>> image2 = np.zeros((s, s))
>>> self.soft_fill(image2, coord_axes=coord_axes, radius=radius)
>>> # xdoc: +REQUIRES(--show)
>>> # xdoc: +REQUIRES(module:kwplot)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.imshow(image1, pnum=(1, 2, 1))
>>> kwplot.imshow(image2, pnum=(1, 2, 2))
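For intuition, a keypoint heatmap of this kind can be sketched in plain Python. The sketch assumes a Gaussian falloff with radius used as the standard deviation, and it blends overlapping blobs with a maximum; the kernel and blending that soft_fill actually uses are not stated here, so treat both as assumptions. soft_fill_sketch is a hypothetical name.

```python
import math


def soft_fill_sketch(shape, points_xy, radius):
    """Rasterize x/y points as Gaussian blobs on a (height, width) grid."""
    height, width = shape
    image = [[0.0] * width for _ in range(height)]
    for x, y in points_xy:
        for r in range(height):
            for c in range(width):
                dist2 = (c - x) ** 2 + (r - y) ** 2
                value = math.exp(-dist2 / (2.0 * radius ** 2))
                # Overlapping blobs keep the strongest response.
                image[r][c] = max(image[r][c], value)
    return image


# One point inside the grid and one far outside it (an "edge case").
heat = soft_fill_sketch((8, 8), [(2, 3), (20, 20)], radius=1.5)
```

Points outside the image still contribute their tails, which is why the doctest above deliberately places several points on or beyond the borders.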
- draw_on(self, image=None, fill_value=1, coord_axes=[1, 0], interp='bilinear')¶
Note
unlike other methods, the defaults assume x/y internal data
- Parameters
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
In other words the i-th entry in coord_axes specifies which row-major spatial dimension the i-th column of a coordinate corresponds to. The index is the coordinate dimension and the value is the axes dimension.
- Returns
image with coordinates drawn on it
- Return type
ndarray
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.coords import * # NOQA >>> s = 256 >>> self = Coords.random(10, meta={'shape': (s, s)}).scale(s) >>> self.data[0] = [10, 10] >>> self.data[1] = [20, 40] >>> image = np.zeros((s, s)) >>> fill_value = 1 >>> image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='bilinear') >>> # image = self.draw_on(image, fill_value, coord_axes=[0, 1], interp='nearest') >>> # image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='bilinear') >>> # image = self.draw_on(image, fill_value, coord_axes=[1, 0], interp='nearest') >>> # xdoc: +REQUIRES(--show) >>> # xdoc: +REQUIRES(module:kwplot) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5, coord_axes=[1, 0])
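The coord_axes convention is easy to get backwards, so here is a minimal plain-Python sketch of it. It uses nearest-neighbour rounding only (no bilinear interpolation), and rasterize is a hypothetical helper, not the draw_on implementation.

```python
def rasterize(shape, coords, coord_axes, fill_value=1):
    """Mark 2D coordinates on a (rows, cols) grid, honoring coord_axes."""
    image = [[0] * shape[1] for _ in range(shape[0])]
    for coord in coords:
        index = [0, 0]
        for col, axis in enumerate(coord_axes):
            # Column `col` of the coordinate indexes image axis `axis`.
            index[axis] = int(round(coord[col]))
        row, column = index
        if 0 <= row < shape[0] and 0 <= column < shape[1]:
            image[row][column] = fill_value
    return image


xy_points = [(3, 1)]  # stored as x=3, y=1
as_xy = rasterize((2, 4), xy_points, coord_axes=[1, 0])
rc_points = [(1, 3)]  # stored as row=1, col=3
as_rc = rasterize((2, 4), rc_points, coord_axes=[0, 1])
```

Both calls mark the same pixel (row 1, column 3), because coord_axes tells the rasterizer how to interpret each stored column.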
- draw(self, color='blue', ax=None, alpha=None, coord_axes=[1, 0], radius=1, setlim=False)¶
Note
unlike other methods, the defaults assume x/y internal data
- Parameters
setlim (bool) – if True, ensures the limits of the axes contain the drawn coordinates
coord_axes (Tuple) – specify which image axes each coordinate dim corresponds to. For 2D images, if you are storing r/c data, set to [0,1], if you are storing x/y data, set to [1,0].
- Returns
drawn matplotlib objects
- Return type
List[mpl.collections.PatchCollection]
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.coords import * # NOQA >>> self = Coords.random(10) >>> # xdoc: +REQUIRES(--show) >>> self.draw(radius=3.0, setlim=True) >>> import kwplot >>> kwplot.autompl() >>> self.draw(radius=3.0)
- class kwimage.Detections(data=None, meta=None, datakeys=None, metakeys=None, checks=True, **kwargs)¶
Bases:
ubelt.NiceRepr
,_DetAlgoMixin
,_DetDrawMixin
Container for holding and manipulating multiple detections.
- Variables
data (Dict) –
dictionary containing corresponding lists. The length of each list is the number of detections. This contains the bounding boxes, confidence scores, and class indices. Details of the most common keys and types are as follows:
- boxes (kwimage.Boxes[ArrayLike]): multiple bounding boxes
- scores (ArrayLike): associated scores
- class_idxs (ArrayLike): associated class indices
- segmentations (ArrayLike): segmentation masks for each box; members can be Mask or MultiPolygon.
- keypoints (ArrayLike): keypoints for each box. Members should be Points.
Additional custom keys may be specified as long as (a) the values are array-like and the first axis corresponds to the standard data values and (b) the custom keys are listed in the datakeys kwargs when constructing the Detections.
meta (Dict) – This contains contextual information about the detections. This includes the class names, which can be indexed into via the class indexes.
Example
>>> import kwimage >>> dets = kwimage.Detections( >>> # there are expected keys that do not need registration >>> boxes=kwimage.Boxes.random(3), >>> class_idxs=[0, 1, 1], >>> classes=['a', 'b'], >>> # custom data attrs must align with boxes >>> myattr1=np.random.rand(3), >>> myattr2=np.random.rand(3, 2, 8), >>> # there are no restrictions on metadata >>> mymeta='a custom metadata string', >>> # Note that any key not in kwimage.Detections.__datakeys__ or >>> # kwimage.Detections.__metakeys__ must be registered at the >>> # time of construction. >>> datakeys=['myattr1', 'myattr2'], >>> metakeys=['mymeta'], >>> checks=True, >>> ) >>> print('dets = {}'.format(dets)) dets = <Detections(3)>
- __datakeys__ = ['boxes', 'scores', 'class_idxs', 'probs', 'weights', 'keypoints', 'segmentations']¶
- __metakeys__ = ['classes']¶
- __nice__(self)¶
- __len__(self)¶
- copy(self)¶
Returns a deep copy of this Detections object
- classmethod coerce(cls, data=None, **kwargs)¶
The “try-anything to get what I want” constructor
- Parameters
data
**kwargs – currently boxes and cnames
Example
>>> from kwimage.structs.detections import * # NOQA >>> import kwimage >>> kwargs = dict( >>> boxes=kwimage.Boxes.random(4), >>> cnames=['a', 'b', 'c', 'c'], >>> ) >>> data = {} >>> self = kwimage.Detections.coerce(data, **kwargs)
- classmethod from_coco_annots(cls, anns, cats=None, classes=None, kp_classes=None, shape=None, dset=None)¶
Create a Detections object from a list of coco-like annotations.
- Parameters
anns (List[Dict]) – list of coco-like annotation objects
dset (CocoDataset) – if specified, cats, classes, and kp_classes are ignored.
cats (List[Dict]) – coco-format category information. Used only if dset is not specified.
classes (ndsampler.CategoryTree) – category tree with coco class info. Used only if dset is not specified.
kp_classes (ndsampler.CategoryTree) – keypoint category tree with coco keypoint class info. Used only if dset is not specified.
shape (tuple) – shape of parent image
- Returns
a detections object
- Return type
Example
>>> from kwimage.structs.detections import * # NOQA >>> # xdoctest: +REQUIRES(--module:ndsampler) >>> anns = [{ >>> 'id': 0, >>> 'image_id': 1, >>> 'category_id': 2, >>> 'bbox': [2, 3, 10, 10], >>> 'keypoints': [4.5, 4.5, 2], >>> 'segmentation': { >>> 'counts': '_11a04M2O0O20N101N3L_5', >>> 'size': [20, 20], >>> }, >>> }] >>> dataset = { >>> 'images': [], >>> 'annotations': [], >>> 'categories': [ >>> {'id': 0, 'name': 'background'}, >>> {'id': 2, 'name': 'class1', 'keypoints': ['spot']} >>> ] >>> } >>> #import ndsampler >>> #dset = ndsampler.CocoDataset(dataset) >>> cats = dataset['categories'] >>> dets = Detections.from_coco_annots(anns, cats)
Example
>>> # xdoctest: +REQUIRES(--module:ndsampler) >>> # Test case with no category information >>> from kwimage.structs.detections import * # NOQA >>> anns = [{ >>> 'id': 0, >>> 'image_id': 1, >>> 'category_id': None, >>> 'bbox': [2, 3, 10, 10], >>> 'prob': [.1, .9], >>> }] >>> cats = [ >>> {'id': 0, 'name': 'background'}, >>> {'id': 2, 'name': 'class1'} >>> ] >>> dets = Detections.from_coco_annots(anns, cats)
Example
>>> import kwimage >>> # xdoctest: +REQUIRES(--module:ndsampler) >>> import ndsampler >>> sampler = ndsampler.CocoSampler.demo('photos') >>> iminfo, anns = sampler.load_image_with_annots(1) >>> shape = iminfo['imdata'].shape[0:2] >>> kp_classes = sampler.dset.keypoint_categories() >>> dets = kwimage.Detections.from_coco_annots( >>> anns, sampler.dset.dataset['categories'], sampler.catgraph, >>> kp_classes, shape=shape)
- to_coco(self, cname_to_cat=None, style='orig', image_id=None, dset=None)¶
Converts this set of detections into coco-like annotation dictionaries.
Notes
Not all aspects of the MS-COCO format can be accurately represented, so some liberties are taken. The MS-COCO standard defines that annotations should specify a category_id field, but in some cases this information is not available, so we will populate a ‘category_name’ field if possible and in the worst case fall back to ‘category_index’.
Additionally, detections may contain additional information beyond the MS-COCO standard, and this information (e.g. weight, prob, score) is added as foreign fields.
- Parameters
cname_to_cat – currently ignored.
style (str, default=’orig’) – either ‘orig’ (for the original coco format) or ‘new’ for the more general kwcoco-style coco format.
image_id (int, default=None) – if specified, populates the image_id field of each annotation
dset (CocoDataset, default=None) – if specified, attempts to populate the category_id field to be compatible with this coco dataset.
- Yields
dict – coco-like annotation structures
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> from kwimage.structs.detections import * >>> self = Detections.demo()[0] >>> cname_to_cat = None >>> list(self.to_coco())
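The category fallback described in the notes can be sketched as a small decision function. This is an illustration of the stated order of preference only (category_id, then category_name, then category_index). The helper name category_fields and its name_to_id argument are hypothetical and do not reflect how to_coco is implemented.

```python
def category_fields(class_idx, classes=None, name_to_id=None):
    """Pick the most specific category field that the available info allows."""
    if classes is None:
        # No class names at all: worst case, keep only the raw index.
        return {'category_index': class_idx}
    name = classes[class_idx]
    if name_to_id is not None and name in name_to_id:
        # A dataset mapping is available: emit a standard category_id.
        return {'category_id': name_to_id[name]}
    # Names are known but ids are not: fall back to the name.
    return {'category_name': name}


classes = ['background', 'class1']
with_ids = category_fields(1, classes, name_to_id={'background': 0, 'class1': 2})
with_names = category_fields(1, classes)
index_only = category_fields(1)
```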
- property boxes(self)¶
- property class_idxs(self)¶
- property scores(self)¶
typically only populated for predicted detections
- property probs(self)¶
typically only populated for predicted detections
- property weights(self)¶
typically only populated for groundtruth detections
- property classes(self)¶
- num_boxes(self)¶
- warp(self, transform, input_dims=None, output_dims=None, inplace=False)¶
Spatially warp the detections.
Example
>>> import skimage >>> transform = skimage.transform.AffineTransform(scale=(2, 3), translation=(4, 5)) >>> self = Detections.random(2) >>> new = self.warp(transform) >>> assert new.boxes == self.boxes.warp(transform) >>> assert new != self
- scale(self, factor, output_dims=None, inplace=False)¶
Spatially scale the detections.
Example
>>> self = Detections.random(2)
>>> new = self.scale(2)
>>> assert new.num_boxes() == self.num_boxes()
- translate(self, offset, output_dims=None, inplace=False)¶
Spatially translate the detections.
Example
>>> import skimage >>> self = Detections.random(2) >>> new = self.translate(10)
- classmethod concatenate(cls, dets)¶
- Parameters
dets (Sequence[Detections]) – list of detections to concatenate
- Returns
stacked detections
- Return type
Example
>>> self = Detections.random(2) >>> other = Detections.random(3) >>> dets = [self, other] >>> new = Detections.concatenate(dets) >>> assert new.num_boxes() == 5
>>> self = Detections.random(2, segmentations=True) >>> other = Detections.random(3, segmentations=True) >>> dets = [self, other] >>> new = Detections.concatenate(dets) >>> assert new.num_boxes() == 5
- argsort(self, reverse=True)¶
Sorts detection indices by descending (or ascending) scores
- Returns
sorted indices
- Return type
ndarray[int]
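The ordering semantics can be shown with a plain-Python sketch; argsort_scores is a hypothetical stand-in that works on a list of scores rather than on a Detections object.

```python
def argsort_scores(scores, reverse=True):
    """Return indices that sort `scores`, descending when reverse is True."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=reverse)


scores = [0.36, 0.44, 0.70, 0.06]
descending = argsort_scores(scores)
ascending = argsort_scores(scores, reverse=False)
```

Indexing the detections with the returned indices gives the same ordering that sort produces.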
- sort(self, reverse=True)¶
Sorts detections by descending (or ascending) scores
- Returns
sorted copy of self
- Return type
- compress(self, flags, axis=0)¶
Returns a subset where corresponding locations are True.
- Parameters
flags (ndarray[bool]) – mask marking selected items
- Returns
subset of self
- Return type
- CommandLine:
xdoctest -m kwimage.structs.detections Detections.compress
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> dets = kwimage.Detections.random(keypoints='dense') >>> flags = np.random.rand(len(dets)) > 0.5 >>> subset = dets.compress(flags) >>> assert len(subset) == flags.sum() >>> subset = dets.tensor().compress(flags) >>> assert len(subset) == flags.sum()
- take(self, indices, axis=0)¶
Returns a subset specified by indices
- Parameters
indices (ndarray[int]) – indices to select
- Returns
subset of self
- Return type
Example
>>> import kwimage >>> dets = kwimage.Detections(boxes=kwimage.Boxes.random(10)) >>> subset = dets.take([2, 3, 5, 7]) >>> assert len(subset) == 4 >>> # xdoctest: +REQUIRES(module:torch) >>> subset = dets.tensor().take([2, 3, 5, 7]) >>> assert len(subset) == 4
- __getitem__(self, index)¶
Fancy slicing / subset / indexing.
Note: scalar indices are always coerced into index lists of length 1.
Example
>>> import kwimage >>> import kwarray >>> dets = kwimage.Detections(boxes=kwimage.Boxes.random(10)) >>> indices = [2, 3, 5, 7] >>> flags = kwarray.boolmask(indices, len(dets)) >>> assert dets[flags].data == dets[indices].data
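The equivalence between boolean flags and integer indices shown in the doctest can be sketched with plain lists. take, compress, and boolmask below are simplified hypothetical stand-ins for the corresponding array operations.

```python
def take(items, indices):
    """Select items by integer position."""
    return [items[i] for i in indices]


def compress(items, flags):
    """Select items where the aligned flag is True."""
    return [item for item, flag in zip(items, flags) if flag]


def boolmask(indices, size):
    """Build a boolean mask of length `size` that is True at `indices`."""
    flags = [False] * size
    for i in indices:
        flags[i] = True
    return flags


items = list('abcdefghij')
indices = [2, 3, 5, 7]
flags = boolmask(indices, len(items))
by_index = take(items, indices)
by_flags = compress(items, flags)
```

The two selections agree only when the indices are sorted and unique; a mask cannot express repetition or reordering.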
- property device(self)¶
If the backend is torch returns the data device, otherwise None
- is_tensor(self)¶
is the backend fueled by torch?
- is_numpy(self)¶
is the backend fueled by numpy?
- numpy(self)¶
Converts tensors to numpy. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> self = Detections.random(3).tensor() >>> newself = self.numpy() >>> self.scores[0] = 0 >>> assert newself.scores[0] == 0 >>> self.scores[0] = 1 >>> assert self.scores[0] == 1 >>> self.numpy().numpy()
- property dtype(self)¶
- tensor(self, device=ub.NoParam)¶
Converts numpy to tensors. Does not change memory if possible.
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.detections import * >>> self = Detections.random(3) >>> newself = self.tensor() >>> self.scores[0] = 0 >>> assert newself.scores[0] == 0 >>> self.scores[0] = 1 >>> assert self.scores[0] == 1 >>> self.tensor().tensor()
- classmethod demo(Detections)¶
- classmethod random(cls, num=10, scale=1.0, classes=3, keypoints=False, segmentations=False, tensor=False, rng=None)¶
Creates dummy data, suitable for use in tests and benchmarks
- Parameters
num (int) – number of boxes
scale (float | tuple, default=1.0) – bounding image size
classes (int | Sequence) – list of class labels or number of classes
keypoints (bool, default=False) – if True include random keypoints for each box.
segmentations (bool, default=False) – if True include random segmentations for each box.
tensor (bool, default=False) – determines backend. DEPRECATED. Call tensor on resulting object instead.
rng (np.random.RandomState) – random state
Example
>>> import kwimage >>> dets = kwimage.Detections.random(keypoints='jagged') >>> dets.data['keypoints'].data[0].data >>> dets.data['keypoints'].meta >>> dets = kwimage.Detections.random(keypoints='dense') >>> dets = kwimage.Detections.random(keypoints='dense', segmentations=True).scale(1000) >>> # xdoctest:+REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets.draw(setlim=True)
Example
>>> import kwimage >>> dets = kwimage.Detections.random( >>> keypoints='jagged', segmentations=True, rng=0).scale(1000) >>> print('dets = {}'.format(dets)) dets = <Detections(10)> >>> dets.data['boxes'].quantize(inplace=True) >>> print('dets.data = {}'.format(ub.repr2( >>> dets.data, nl=1, with_dtype=False, strvals=True))) dets.data = { 'boxes': <Boxes(xywh, array([[548, 544, 55, 172], [423, 645, 15, 247], [791, 383, 173, 146], [ 71, 87, 498, 839], [ 20, 832, 759, 39], [461, 780, 518, 20], [118, 639, 26, 306], [264, 414, 258, 361], [ 18, 568, 439, 50], [612, 616, 332, 66]], dtype=int32))>, 'class_idxs': [1, 2, 0, 0, 2, 0, 0, 0, 0, 0], 'keypoints': <PointsList(n=10)>, 'scores': [0.3595079 , 0.43703195, 0.6976312 , 0.06022547, 0.66676672, 0.67063787,0.21038256, 0.1289263 , 0.31542835, 0.36371077], 'segmentations': <SegmentationList(n=10)>, } >>> # xdoctest:+REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> dets.draw(setlim=True)
Example
>>> # Boxes position/shape within 0-1 space should be uniform. >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> fig = kwplot.figure(fnum=1, doclf=True) >>> fig.gca().set_xlim(0, 128) >>> fig.gca().set_ylim(0, 128) >>> import kwimage >>> kwimage.Detections.random(num=10, segmentations=True).scale(128).draw()
- class kwimage.Heatmap(data=None, meta=None, **kwargs)¶
Bases:
kwimage.structs._generic.Spatial
,_HeatmapDrawMixin
,_HeatmapWarpMixin
,_HeatmapAlgoMixin
Keeps track of a downscaled heatmap and how to transform it to overlay the original input image. Heatmaps generally are used to estimate class probabilities at each pixel. This data structure additionally contains logic to augment each pixel with offset (dydx) and scale (diameter) information.
- Variables
data (Dict[str, ArrayLike]) –
dictionary containing spatially aligned heatmap data. Valid keys are as follows.
- class_probs (ArrayLike[C, H, W] | ArrayLike[C, D, H, W]):
A probability map for each class. C is the number of classes.
- offset (ArrayLike[2, H, W] | ArrayLike[3, D, H, W], optional):
object center position offset in y,x / t,y,x coordinates
- diameter (ArrayLike[2, H, W] | ArrayLike[3, D, H, W], optional):
object bounding box sizes in h,w / d,h,w coordinates
- keypoints (ArrayLike[2, K, H, W] | ArrayLike[3, K, D, H, W], optional):
y/x offsets for K different keypoint classes
meta (Dict) –
dictionary containing miscellaneous metadata about the heatmap data. Valid keys are as follows.
- img_dims (Tuple[H, W] | Tuple[D, H, W]):
original image dimension
- tf_data_to_img (skimage.transform._geometric.GeometricTransform):
transformation matrix (typically similarity or affine) that projects the given heatmap onto the image dimensions such that the image and heatmap are spatially aligned.
- classes (List[str] | ndsampler.CategoryTree):
information about which index in data[‘class_probs’] corresponds to which semantic class.
dims (Tuple) – dimensions of the heatmap (see img_dims for the original image dimensions).
**kwargs – any key that is accepted by the data or meta dictionaries can be specified as a keyword argument to this class and it will be properly placed in the appropriate internal dictionary.
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/heatmap.py Heatmap --show
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.heatmap import * # NOQA >>> import kwimage >>> class_probs = kwimage.grab_test_image(dsize=(32, 32), space='gray')[None, ] / 255.0 >>> img_dims = (220, 220) >>> tf_data_to_img = skimage.transform.AffineTransform(translation=(-18, -18), scale=(8, 8)) >>> self = Heatmap(class_probs=class_probs, img_dims=img_dims, >>> tf_data_to_img=tf_data_to_img) >>> aligned = self.upscale() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(aligned[0]) >>> kwplot.show_if_requested()
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> self = Heatmap.random() >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self.draw()
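To make the role of tf_data_to_img concrete, the sketch below applies the same scale (8, 8) and translation (-18, -18) used in the first example to a few heatmap coordinates by hand. It assumes the usual affine form x_img = x * scale + translation; data_to_img is a hypothetical helper.

```python
def data_to_img(points_xy, scale=(8.0, 8.0), translation=(-18.0, -18.0)):
    """Map heatmap-space x/y points into image space with a scale and shift."""
    return [(x * scale[0] + translation[0], y * scale[1] + translation[1])
            for x, y in points_xy]


# Heatmap corner and heatmap center of a 32 x 32 map.
mapped = data_to_img([(0.0, 0.0), (16.0, 16.0)])
```

Under this transform the center of the 32 x 32 heatmap lands at (110, 110), the center of the 220 x 220 image, while the heatmap corner falls slightly outside the image.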
- __datakeys__ = ['class_probs', 'offset', 'diameter', 'keypoints', 'class_idx', 'class_energy']¶
- __metakeys__ = ['img_dims', 'tf_data_to_img', 'classes', 'kp_classes']¶
- __spatialkeys__ = ['offset', 'diameter', 'keypoints']¶
- __nice__(self)¶
- __getitem__(self, index)¶
- __len__(self)¶
- property shape(self)¶
- property bounds(self)¶
- property dims(self)¶
space-time dimensions of this heatmap
- is_numpy(self)¶
- is_tensor(self)¶
- property _impl(self)¶
Returns the internal tensor/numpy ArrayAPI implementation
- Returns
kwarray.ArrayAPI
- classmethod random(cls, dims=(10, 10), classes=3, diameter=True, offset=True, keypoints=False, img_dims=None, dets=None, nblips=10, noise=0.0, rng=None)¶
Creates dummy data, suitable for use in tests and benchmarks
- Parameters
dims (Tuple) – dimensions of the heatmap
img_dims (Tuple) – dimensions of the image the heatmap corresponds to
Example
>>> from kwimage.structs.heatmap import * # NOQA >>> self = Heatmap.random((128, 128), img_dims=(200, 200), >>> classes=3, nblips=10, rng=0, noise=0.1) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(self.colorize(0, imgspace=0), fnum=1, pnum=(1, 4, 1), doclf=1) >>> kwplot.imshow(self.colorize(1, imgspace=0), fnum=1, pnum=(1, 4, 2)) >>> kwplot.imshow(self.colorize(2, imgspace=0), fnum=1, pnum=(1, 4, 3)) >>> kwplot.imshow(self.colorize(3, imgspace=0), fnum=1, pnum=(1, 4, 4))
- Ignore:
self.detect(0).sort().non_max_supress()[-np.arange(1, 4)].draw() from kwimage.structs.heatmap import * # NOQA import xdev globals().update(xdev.get_func_kwargs(Heatmap.random))
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> import kwimage >>> self = kwimage.Heatmap.random(dims=(50, 200), dets='coco', >>> keypoints=True) >>> image = np.zeros(self.img_dims) >>> # xdoctest: +REQUIRES(module:kwplot) >>> toshow = self.draw_on(image, 1, vecs=True, kpts=0, with_alpha=0.85) >>> # xdoctest: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(toshow)
- Ignore:
>>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.imshow(image) >>> dets.draw() >>> dets.data['keypoints'].draw(radius=6) >>> dets.data['segmentations'].draw()
>>> self.draw()
- property class_probs(self)¶
- property offset(self)¶
- property diameter(self)¶
- property img_dims(self)¶
- property tf_data_to_img(self)¶
- property classes(self)¶
- numpy(self)¶
Converts underlying data to numpy arrays
- tensor(self, device=ub.NoParam)¶
Converts underlying data to torch tensors
- class kwimage.Mask(data=None, format=None)¶
Bases:
ubelt.NiceRepr
,_MaskConversionMixin
,_MaskConstructorMixin
,_MaskTransformMixin
,_MaskDrawMixin
Manages a single segmentation mask and can convert to and from multiple formats including:
bytes_rle - byte encoded run length encoding
array_rle - raw run length encoding
c_mask - c-style binary mask
f_mask - fortran-style binary mask
Example
>>> # xdoc: +REQUIRES(--mask)
>>> # a ms-coco style compressed bytes rle segmentation
>>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'}
>>> mask = Mask(segmentation, 'bytes_rle')
>>> # convert to binary numpy representation
>>> binary_mask = mask.to_c_mask().data
>>> print(ub.repr2(binary_mask.tolist(), nl=1, nobr=1))
[0, 0, 0, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0, 0, 0],
[0, 0, 1, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 1, 1, 0],
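To make the array_rle format listed above more concrete, the following is a minimal, dependency-free sketch of run-length encoding a binary mask. The helper name encode_rle_f_order is hypothetical and is not part of kwimage; the sketch assumes the COCO convention of visiting pixels in column-major order and starting with a (possibly empty) run of zeros.

```python
def encode_rle_f_order(mask):
    """Run-length encode a binary mask given as a list of rows.

    Hypothetical helper for illustration only (not the kwimage backend).
    Pixels are visited in column-major (Fortran) order and the counts
    alternate between runs of 0s and runs of 1s, starting with 0s.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    counts = []
    current = 0  # value whose run is currently being counted
    run = 0
    for col in range(width):
        for row in range(height):
            value = 1 if mask[row][col] else 0
            if value == current:
                run += 1
            else:
                counts.append(run)
                current = value
                run = 1
    counts.append(run)
    # 'size' is (height, width), matching the COCO convention
    return {'counts': counts, 'size': [height, width]}


mask = [
    [0, 1, 1],
    [0, 1, 0],
]
rle = encode_rle_f_order(mask)
print(rle)
```

A mask whose first pixel is set produces a leading count of zero, which keeps the "zeros first" alternation intact.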
- property dtype(self)¶
- __nice__(self)¶
- classmethod random(Mask, rng=None, shape=(32, 32))¶
Create a random binary mask object
- Parameters
rng (int | RandomState | None) – the random seed
shape (Tuple[int, int]) – the height / width of the returned mask
- Returns
the random mask
- Return type
Mask
Example
>>> import kwimage >>> mask = kwimage.Mask.random() >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> mask.draw() >>> kwplot.show_if_requested()
- classmethod demo(cls)¶
Demo mask with holes and disjoint shapes
- Returns
the demo mask
- Return type
Mask
- copy(self)¶
Performs a deep copy of the mask data
- Returns
the copied mask
- Return type
Mask
Example
>>> self = Mask.random(shape=(8, 8), rng=0) >>> other = self.copy() >>> assert other.data is not self.data
- union(self, *others)¶
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to union
- Returns
the unioned mask
- Return type
Mask
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(2)] >>> mask = Mask.union(*masks) >>> print(mask.area) >>> masks = [m.to_c_mask() for m in masks] >>> mask = Mask.union(*masks) >>> print(mask.area)
>>> masks = [m.to_bytes_rle() for m in masks] >>> mask = Mask.union(*masks) >>> print(mask.area)
- Benchmark:
import ubelt as ub
ti = ub.Timerit(100, bestof=10, verbose=2)
masks = [Mask.random(shape=(172, 172), rng=i) for i in range(2)]
for timer in ti.reset('native rle union'):
    masks = [m.to_bytes_rle() for m in masks]
    with timer:
        mask = Mask.union(*masks)
for timer in ti.reset('native cmask union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*masks)
for timer in ti.reset('cmask->rle union'):
    masks = [m.to_c_mask() for m in masks]
    with timer:
        mask = Mask.union(*[m.to_bytes_rle() for m in masks])
- intersection(self, *others)¶
This can be used as a staticmethod or an instancemethod
- Parameters
*others – multiple input masks to intersect
- Returns
the intersection of the masks
- Return type
Mask
Example
>>> n = 3 >>> masks = [Mask.random(shape=(8, 8), rng=i) for i in range(n)] >>> items = masks >>> mask = Mask.intersection(*masks) >>> areas = [item.area for item in items] >>> print('areas = {!r}'.format(areas)) >>> print(mask.area) >>> print(Mask.intersection(*masks).area / Mask.union(*masks).area)
- property shape(self)¶
- property area(self)¶
Returns the number of non-zero pixels
- Returns
the number of non-zero pixels
- Return type
int
Example
>>> self = Mask.demo()
>>> self.area
150
- get_patch(self)¶
Extract the patch with non-zero data
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.random(shape=(8, 8), rng=0) >>> self.get_patch()
- get_xywh(self)¶
Gets the bounding xywh box coordinates of this mask
- Returns
- x, y, w, h: Note we don't use a Boxes object because
a general singular version does not yet exist.
- Return type
ndarray
Example
>>> # xdoc: +REQUIRES(--mask) >>> self = Mask.random(shape=(8, 8), rng=0) >>> self.get_xywh().tolist() >>> self = Mask.random(rng=0).translate((10, 10)) >>> self.get_xywh().tolist()
Example
>>> # test empty case >>> import kwimage >>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask') >>> assert self.get_xywh().tolist() == [0, 0, 0, 0]
- Ignore:
>>> import kwimage >>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask') >>> x_coords = np.array([621, 752]) >>> y_coords = np.array([366, 292]) >>> self.data[y_coords, x_coords] = 1 >>> self.get_xywh()
>>> # References: >>> # https://stackoverflow.com/questions/33281957/faster-alternative-to-numpy-where >>> # https://answers.opencv.org/question/4183/what-is-the-best-way-to-find-bounding-box-for-binary-mask/ >>> import timerit >>> ti = timerit.Timerit(100, bestof=10, verbose=2) >>> for timer in ti.reset('time'): >>> with timer: >>> y_coords, x_coords = np.where(self.data) >>> # >>> for timer in ti.reset('time'): >>> with timer: >>> cv2.findNonZero(data)
self.data = np.random.rand(800, 700) > 0.5
import timerit
ti = timerit.Timerit(100, bestof=10, verbose=2)
for timer in ti.reset('time'):
    with timer:
        y_coords, x_coords = np.where(self.data)
#
for timer in ti.reset('time'):
    with timer:
        data = np.ascontiguousarray(self.data).astype(np.uint8)
        cv2_coords = cv2.findNonZero(data)
>>> poly = self.to_multi_polygon()
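As an illustration of what get_xywh computes, here is a small pure-Python sketch that finds the bounding xywh box of the non-zero pixels in a mask stored as a list of rows. The function name mask_xywh is hypothetical. The sketch counts the last non-zero pixel as part of the width and height, which is an assumption; the reference above does not state whether the library uses an inclusive or exclusive extent. The empty case returns [0, 0, 0, 0], matching the empty-mask example.

```python
def mask_xywh(mask):
    """Return [x, y, w, h] bounding the non-zero pixels of a list-of-rows mask.

    Hypothetical helper for illustration; w and h are inclusive pixel counts.
    """
    # Row indices that contain at least one non-zero pixel
    ys = [r for r, row in enumerate(mask) if any(row)]
    # Column indices of every non-zero pixel
    xs = [c for row in mask for c, v in enumerate(row) if v]
    if not ys:
        # Nothing is set, so there is no meaningful box
        return [0, 0, 0, 0]
    x, y = min(xs), min(ys)
    return [x, y, max(xs) - x + 1, max(ys) - y + 1]


mask = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
]
box = mask_xywh(mask)
print(box)
```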
- get_polygon(self)¶
DEPRECATED: USE to_multi_polygon
Returns a list of (x,y)-coordinate lists. The length of the list is equal to the number of disjoint regions in the mask.
- Returns
- polygon around each connected component of the
mask. Each ndarray is an Nx2 array of xy points.
- Return type
List[ndarray]
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.random(shape=(8, 8), rng=0) >>> polygons = self.get_polygon() >>> print('polygons = ' + ub.repr2(polygons)) >>> polygons = self.get_polygon() >>> self = self.to_bytes_rle() >>> other = Mask.from_polygons(polygons, self.shape) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> image = np.ones(self.shape) >>> image = self.draw_on(image, color='blue') >>> image = other.draw_on(image, color='red') >>> kwplot.imshow(image)
- polygons = [
np.array([[6, 4],[7, 4]], dtype=np.int32), np.array([[0, 1],[0, 3],[2, 3],[2, 1]], dtype=np.int32),
]
- to_mask(self, dims=None)¶
Converts to a mask object (which does nothing because this is already a mask object).
- Returns
kwimage.Mask
- to_boxes(self)¶
Returns the bounding box of the mask.
- Returns
kwimage.Boxes
- to_multi_polygon(self)¶
Returns a MultiPolygon object fit around this raster including disjoint pieces and holes.
- Returns
vectorized representation
- Return type
MultiPolygon
Example
>>> # xdoc: +REQUIRES(--mask) >>> from kwimage.structs.mask import * # NOQA >>> self = Mask.demo() >>> self = self.scale(5) >>> multi_poly = self.to_multi_polygon() >>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(--show) >>> self.draw(color='red') >>> multi_poly.scale(1.1).draw(color='blue')
>>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> image = np.ones(self.shape) >>> image = self.draw_on(image, color='blue') >>> #image = other.draw_on(image, color='red') >>> kwplot.imshow(image) >>> multi_poly.draw()
Example
>>> import kwimage >>> self = kwimage.Mask(np.empty((0, 0), dtype=np.uint8), format='c_mask') >>> poly = self.to_multi_polygon() >>> poly.to_multi_polygon()
Example
>>> # Corner case, only two pixels are on
>>> import kwimage
>>> self = kwimage.Mask(np.zeros((768, 768), dtype=np.uint8), format='c_mask')
>>> x_coords = np.array([621, 752])
>>> y_coords = np.array([366, 292])
>>> self.data[y_coords, x_coords] = 1
>>> poly = self.to_multi_polygon()
poly.to_mask(self.shape).data.sum()
self.to_array_rle().to_c_mask().data.sum() temp.to_c_mask().data.sum()
Example
>>> # TODO: how do we correctly handle the 1 or 2 point to a poly >>> # case? >>> import kwimage >>> data = np.zeros((8, 8), dtype=np.uint8) >>> data[0, 3:5] = 1 >>> data[7, 3:5] = 1 >>> data[3:5, 0:2] = 1 >>> self = kwimage.Mask.coerce(data) >>> polys = self.to_multi_polygon() >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(data) >>> polys.draw(border=True, linewidth=5, alpha=0.5, radius=0.2)
- get_convex_hull(self)¶
Returns a list of xy points around the convex hull of this mask
Note
The returned polygon may not surround points that are only one pixel thick.
Example
>>> # xdoc: +REQUIRES(--mask) >>> self = Mask.random(shape=(8, 8), rng=0) >>> polygons = self.get_convex_hull() >>> print('polygons = ' + ub.repr2(polygons)) >>> other = Mask.from_polygons(polygons, self.shape)
- iou(self, other)¶
The area of intersection over the area of union
Todo
- [ ] Write plural Masks version of this class, which should
be able to perform this operation more efficiently.
- CommandLine:
xdoctest -m kwimage.structs.mask Mask.iou
Example
>>> # xdoc: +REQUIRES(--mask)
>>> self = Mask.demo()
>>> other = self.translate(1)
>>> iou = self.iou(other)
>>> print('iou = {:.4f}'.format(iou))
iou = 0.0830
>>> iou2 = self.intersection(other).area / self.union(other).area
>>> print('iou2 = {:.4f}'.format(iou2))
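The intersection-over-union relationship shown in the example can be sketched without any dependencies. The helper mask_iou below is hypothetical and works on two equally sized binary masks stored as lists of rows; returning 0.0 when both masks are empty is an assumption made for this sketch, not documented kwimage behaviour.

```python
def mask_iou(a, b):
    """Intersection over union of two same-shaped list-of-rows binary masks.

    Hypothetical helper for illustration only.
    """
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for va, vb in zip(row_a, row_b):
            inter += bool(va and vb)  # pixel set in both masks
            union += bool(va or vb)   # pixel set in at least one mask
    # Assumption: two empty masks have an IoU of 0.0
    return inter / union if union else 0.0


a = [
    [1, 1, 0],
    [1, 1, 0],
]
b = [
    [0, 1, 1],
    [0, 1, 1],
]
iou = mask_iou(a, b)
print('iou = {:.4f}'.format(iou))
```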
- classmethod coerce(Mask, data, dims=None)¶
Attempts to auto-inspect the format of the data and convert it to a Mask
- Parameters
data – the data to coerce
dims (Tuple) – the height / width of the source image; required for certain formats like polygons
- Returns
the constructed mask object
- Return type
Mask
Example
>>> # xdoc: +REQUIRES(--mask) >>> segmentation = {'size': [5, 9], 'counts': ';?1B10O30O4'} >>> polygon = [ >>> [np.array([[3, 0],[2, 1],[2, 4],[4, 4],[4, 3],[7, 0]])], >>> [np.array([[2, 1],[2, 2],[4, 2],[4, 1]])], >>> ] >>> dims = (9, 5) >>> mask = (np.random.rand(32, 32) > .5).astype(np.uint8) >>> Mask.coerce(polygon, dims).to_bytes_rle() >>> Mask.coerce(segmentation).to_bytes_rle() >>> Mask.coerce(mask).to_bytes_rle()
- _to_coco(self)¶
use to_coco instead
- to_coco(self, style='orig')¶
Convert the Mask to a COCO json representation based on the current format.
A COCO mask is formatted as a run-length-encoding (RLE), of which there are two variants: (1) an array RLE, which is slightly more readable and extensible, and (2) a bytes RLE, which is slightly more concise. The returned format will depend on the current format of the Mask object. If it is in “bytes_rle” format, it will be returned in that format, otherwise it will be converted to the “array_rle” format and returned as such.
- Parameters
style (str) – Does nothing for this particular method, exists for API compatibility and if alternate encoding styles are implemented in the future.
- Returns
either a bytes-rle or array-rle encoding, depending on the current mask format. The keys in this dictionary are as follows:
- counts (List[int] | str): the array or bytes rle encoding
- size (Tuple[int]): the height and width of the encoded mask; see note.
- shape (Tuple[int]): only present in array-rle mode. This is also the height/width of the underlying encoded array. This exists for semantic consistency with other kwimage conventions, and is not part of the original coco spec.
- order (str): only present in array-rle mode. Either C or F, indicating if counts is arranged in row-major or column-major order. For COCO-compatibility this is always returned in F (column-major) order.
- binary (bool): only present in array-rle mode. For COCO-compatibility this is always returned as True, indicating the mask only contains binary 0 or 1 values.
- Return type
dict
Note
The output dictionary will contain a key named “size”. This is the only location in kwimage where “size” refers to a tuple in (height, width) order, which keeps it backwards compatible with the original coco spec. In all other locations in kwimage a “size” refers to a (width, height) ordered tuple.
- SeeAlso:
- func
kwimage.im_runlen.encode_run_length - backend function that does array-style run length encoding.
Example
>>> # xdoc: +REQUIRES(--mask)
>>> from kwimage.structs.mask import *  # NOQA
>>> self = Mask.demo()
>>> coco_data1 = self.toformat('array_rle').to_coco()
>>> coco_data2 = self.toformat('bytes_rle').to_coco()
>>> print('coco_data1 = {}'.format(ub.repr2(coco_data1, nl=1)))
>>> print('coco_data2 = {}'.format(ub.repr2(coco_data2, nl=1)))
coco_data1 = {
    'binary': True,
    'counts': [47, 5, 3, 1, 14, ... 1, 4, 19, 141],
    'order': 'F',
    'shape': (23, 32),
    'size': (23, 32),
}
coco_data2 = {
    'counts': '_153L;4EL...ON3060L0N060L0Nb0Y4',
    'size': [23, 32],
}
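To show how the keys described above fit together, the following sketch decodes an array-rle dictionary back into a list-of-rows mask. The function decode_array_rle is hypothetical and is not the kwimage decoder; it assumes binary counts that alternate starting with zeros, reads 'size' in (height, width) order, and defaults to column-major ('F') order when no 'order' key is present.

```python
def decode_array_rle(rle):
    """Decode an array-style RLE dict into a list-of-rows binary mask.

    Hypothetical helper for illustration only.
    """
    height, width = rle['size']      # COCO order: (height, width)
    order = rle.get('order', 'F')    # assume column-major if unspecified
    flat = []
    value = 0                        # runs alternate 0s, 1s, 0s, ...
    for count in rle['counts']:
        flat.extend([value] * count)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError('counts do not match the declared size')
    mask = [[0] * width for _ in range(height)]
    for index, v in enumerate(flat):
        if order == 'F':
            col, row = divmod(index, height)  # column-major layout
        else:
            row, col = divmod(index, width)   # row-major layout
        mask[row][col] = v
    return mask


rle = {'counts': [2, 3, 1], 'size': [2, 3], 'order': 'F'}
decoded = decode_array_rle(rle)
print(decoded)
```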
- class kwimage.MaskList¶
Bases:
kwimage.structs._generic.ObjectList
Store and manipulate multiple masks, usually within the same image
- to_polygon_list(self)¶
Converts all mask objects to multi-polygon objects
- Returns
kwimage.PolygonList
- to_segmentation_list(self)¶
Converts all items to segmentation objects
- Returns
kwimage.SegmentationList
- to_mask_list(self)¶
returns this object
- Returns
kwimage.MaskList
- class kwimage.MultiPolygon¶
Bases:
kwimage.structs._generic.ObjectList
Data structure for storing multiple polygons (typically related to the same underlying but potentially disjoint object)
- Variables
data (List[Polygon]) –
- classmethod random(self, n=3, n_holes=0, rng=None, tight=False)¶
Create a random MultiPolygon
- Returns
MultiPolygon
- fill(self, image, value=1)¶
Inplace fill in an image based on this multi-polygon.
- Parameters
image (ndarray) – image to draw on (inplace)
value (int | Tuple[int], default=1) – value to fill in with
- Returns
the image that has been modified in place
- Return type
ndarray
- to_multi_polygon(self)¶
- to_boxes(self)¶
Deprecated: lossy conversion; use ‘bounding_box’ instead
- bounding_box(self)¶
Return the bounding box of the multi polygon
- Returns
- a Boxes object with one box that encloses all
polygons
- Return type
kwimage.Boxes
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(rng=0, n=10) >>> boxes = self.to_boxes() >>> sub_boxes = [d.to_boxes() for d in self.data] >>> areas1 = np.array([s.intersection(boxes).area[0] for s in sub_boxes]) >>> areas2 = np.array([s.area[0] for s in sub_boxes]) >>> assert np.allclose(areas1, areas2)
- to_mask(self, dims=None)¶
Returns a mask object indicating regions occupied by this multipolygon
Example
>>> from kwimage.structs.polygon import * # NOQA >>> s = 100 >>> self = MultiPolygon.random(rng=0).scale(s) >>> dims = (s, s) >>> mask = self.to_mask(dims)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> kwplot.figure(fnum=1, doclf=True)
>>> from matplotlib import pyplot as plt
>>> ax = plt.gca()
>>> ax.set_xlim(0, s)
>>> ax.set_ylim(0, s)
>>> self.draw(color='red', alpha=.4)
>>> mask.draw(color='blue', alpha=.4)
- to_relative_mask(self)¶
Returns a translated mask such that the mask dimensions are minimal.
In other words, we move the polygon all the way to the top-left and return a mask just big enough to fit the polygon.
- Returns
Mask
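The "move to the top-left" step described above can be sketched on plain xy points. The helper to_relative is hypothetical; flooring the offset and taking the ceiling of the far corner is an assumption of this sketch, and the actual rounding used by the library is not stated here.

```python
import math


def to_relative(points):
    """Shift xy points so that their bounding box starts at the origin.

    Hypothetical helper for illustration only. Returns the shifted points,
    the (dx, dy) offset that was subtracted, and (height, width) dims just
    large enough to contain the shifted points.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    # Assumption: snap the offset down to a whole pixel
    dx, dy = math.floor(min(xs)), math.floor(min(ys))
    shifted = [(x - dx, y - dy) for x, y in points]
    width = math.ceil(max(xs)) - dx
    height = math.ceil(max(ys)) - dy
    return shifted, (dx, dy), (height, width)


points = [(100.5, 102.0), (107.25, 103.0), (104.0, 108.0)]
shifted, offset, dims = to_relative(points)
print(shifted, offset, dims)
```

Keeping the subtracted offset around makes it possible to place the small mask back into the original image coordinates later.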
- classmethod coerce(cls, data, dims=None)¶
Attempts to construct a MultiPolygon instance from the input data
See Mask.coerce
- to_shapely(self)¶
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(module:shapely) >>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(rng=0) >>> geom = self.to_shapely() >>> print('geom = {!r}'.format(geom))
- classmethod from_shapely(MultiPolygon, geom)¶
Convert a shapely polygon or multipolygon to a kwimage.MultiPolygon
- classmethod from_geojson(MultiPolygon, data_geojson)¶
Convert a geojson polygon or multipolygon to a kwimage.MultiPolygon
Example
>>> import kwimage >>> orig = kwimage.MultiPolygon.random() >>> data_geojson = orig.to_geojson() >>> self = kwimage.MultiPolygon.from_geojson(data_geojson)
- to_geojson(self)¶
Converts polygon to a geojson structure
- classmethod from_coco(cls, data, dims=None)¶
Accepts either new-style or old-style coco multi-polygons
- _to_coco(self, style='orig')¶
- to_coco(self, style='orig')¶
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = MultiPolygon.random(1, rng=0) >>> self.to_coco()
- swap_axes(self, inplace=False)¶
- class kwimage.Points(data=None, meta=None, datakeys=None, metakeys=None, **kwargs)¶
Bases:
kwimage.structs._generic.Spatial
,_PointsWarpMixin
Stores multiple keypoints for a single object.
This stores both the geometry and the class metadata if available
- Ignore:
meta = {
    "names": ['head', 'nose', 'tail'],
    "skeleton": [(0, 1), (0, 2)],
}
Example
>>> from kwimage.structs.points import * # NOQA >>> xy = np.random.rand(10, 2) >>> pts = Points(xy=xy) >>> print('pts = {!r}'.format(pts))
- __datakeys__ = ['xy', 'class_idxs', 'visible']¶
- __metakeys__ = ['classes']¶
- __repr__¶
- __nice__(self)¶
- __len__(self)¶
- property shape(self)¶
- property xy(self)¶
- classmethod random(Points, num=1, classes=None, rng=None)¶
Makes random points; typically for testing purposes
Example
>>> import kwimage >>> self = kwimage.Points.random(classes=[1, 2, 3]) >>> self.data >>> print('self.data = {!r}'.format(self.data))
- is_numpy(self)¶
- is_tensor(self)¶
- _impl(self)¶
- tensor(self, device=ub.NoParam)¶
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10) >>> self.tensor()
- round(self, inplace=False)¶
Rounds data to the nearest integer
- Parameters
inplace (bool, default=False) – if True, modifies this object
Example
>>> import kwimage >>> self = kwimage.Points.random(3).scale(10) >>> self.round()
- numpy(self)¶
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10) >>> self.tensor().numpy().tensor().numpy()
- draw_on(self, image, color='white', radius=None, copy=False)¶
- CommandLine:
xdoctest -m ~/code/kwimage/kwimage/structs/points.py Points.draw_on --show
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> s = 128 >>> image = np.zeros((s, s)) >>> self = Points.random(10).scale(s) >>> image = self.draw_on(image) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5) >>> kwplot.show_if_requested()
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> s = 128 >>> image = np.zeros((s, s)) >>> self = Points.random(10).scale(s) >>> image = self.draw_on(image, radius=3, color='distinct') >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> kwplot.imshow(image) >>> self.draw(radius=3, alpha=.5, color='classes') >>> kwplot.show_if_requested()
Example
>>> import kwimage
>>> s = 32
>>> self = kwimage.Points.random(10).scale(s)
>>> color = 'blue'
>>> # Test drawing on all channel + dtype combinations
>>> im3 = np.zeros((s, s, 3), dtype=np.float32)
>>> im_chans = {
>>>     'im3': im3,
>>>     'im1': kwimage.convert_colorspace(im3, 'rgb', 'gray'),
>>>     'im4': kwimage.convert_colorspace(im3, 'rgb', 'rgba'),
>>> }
>>> inputs = {}
>>> for k, im in im_chans.items():
>>>     inputs[k + '_01'] = (kwimage.ensure_float01(im.copy()), {'radius': None})
>>>     inputs[k + '_255'] = (kwimage.ensure_uint255(im.copy()), {'radius': None})
>>> outputs = {}
>>> for k, v in inputs.items():
>>>     im, kw = v
>>>     outputs[k] = self.draw_on(im, color=color, **kw)
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.figure(fnum=2, doclf=True)
>>> kwplot.autompl()
>>> pnum_ = kwplot.PlotNums(nCols=2, nRows=len(inputs))
>>> for k in inputs.keys():
>>>     kwplot.imshow(inputs[k][0], fnum=2, pnum=pnum_(), title=k)
>>>     kwplot.imshow(outputs[k], fnum=2, pnum=pnum_(), title=k)
>>> kwplot.show_if_requested()
- draw(self, color='blue', ax=None, alpha=None, radius=1, **kwargs)¶
TODO: can use kwplot.draw_points
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.points import * # NOQA >>> pts = Points.random(10) >>> # xdoc: +REQUIRES(--show) >>> pts.draw(radius=0.01)
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(10, classes=['a', 'b', 'c']) >>> self.draw(radius=0.01, color='classes')
- compress(self, flags, axis=0, inplace=False)¶
Filters items based on a boolean criterion
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4) >>> flags = [1, 0, 1, 1] >>> other = self.compress(flags) >>> assert len(self) == 4 >>> assert len(other) == 3
>>> # xdoctest: +REQUIRES(module:torch) >>> other = self.tensor().compress(flags) >>> assert len(other) == 3
- take(self, indices, axis=0, inplace=False)¶
Takes a subset of items at specific indices
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4) >>> indices = [1, 3] >>> other = self.take(indices) >>> assert len(self) == 4 >>> assert len(other) == 2
>>> # xdoctest: +REQUIRES(module:torch) >>> other = self.tensor().take(indices) >>> assert len(other) == 2
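The difference between compress (boolean flags) and take (integer indices) can be illustrated on a plain Python list. This is only an analogy using the standard library, with placeholder item names; it reuses the flags and indices from the examples above and says nothing about how Points implements these methods.

```python
from itertools import compress

# Placeholder items standing in for four points
items = ['p0', 'p1', 'p2', 'p3']

# compress: keep the items whose flag is truthy
flags = [1, 0, 1, 1]
kept = list(compress(items, flags))

# take: pick the items at the given positions
indices = [1, 3]
taken = [items[i] for i in indices]

print(kept)
print(taken)
```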
- classmethod concatenate(cls, points, axis=0)¶
- to_coco(self, style='orig')¶
Converts to an mscoco-like representation
Note
items that are usually id-references to other objects may need to be rectified.
- Parameters
style (str) – either orig, new, new-id, or new-name
- Returns
mscoco-like representation
- Return type
Dict
Example
>>> from kwimage.structs.points import * # NOQA >>> self = Points.random(4, classes=['a', 'b']) >>> orig = self._to_coco(style='orig') >>> print('orig = {!r}'.format(orig)) >>> new_name = self._to_coco(style='new-name') >>> print('new_name = {}'.format(ub.repr2(new_name, nl=-1))) >>> # xdoctest: +REQUIRES(module:ndsampler) >>> import ndsampler >>> self.meta['classes'] = ndsampler.CategoryTree.coerce(self.meta['classes']) >>> new_id = self._to_coco(style='new-id') >>> print('new_id = {}'.format(ub.repr2(new_id, nl=-1)))
- _to_coco(self, style='orig')¶
See to_coco
- classmethod coerce(cls, data)¶
Attempt to coerce data into a Points object
- classmethod _from_coco(cls, coco_kpts, class_idxs=None, classes=None)¶
- classmethod from_coco(cls, coco_kpts, class_idxs=None, classes=None, warn=False)¶
- Parameters
coco_kpts (list | dict) – either the original list keypoint encoding or the new dict keypoint encoding.
class_idxs (list) – only needed if using old style
classes (list | CategoryTree) – list of all keypoint category names
warn (bool, default=False) – if True raise warnings
Example
>>> ##
>>> classes = ['mouth', 'left-hand', 'right-hand']
>>> coco_kpts = [
>>>     {'xy': (0, 0), 'visible': 2, 'keypoint_category': 'left-hand'},
>>>     {'xy': (1, 2), 'visible': 2, 'keypoint_category': 'mouth'},
>>> ]
>>> Points.from_coco(coco_kpts, classes=classes)
>>> # Test without classes
>>> Points.from_coco(coco_kpts)
>>> # Test without any category info
>>> coco_kpts2 = [ub.dict_diff(d, {'keypoint_category'}) for d in coco_kpts]
>>> Points.from_coco(coco_kpts2)
>>> # Test with category instead of keypoint_category
>>> coco_kpts3 = [ub.map_keys(lambda x: x.replace('keypoint_', ''), d) for d in coco_kpts]
>>> Points.from_coco(coco_kpts3)
>>> #
>>> # Old style
>>> coco_kpts = [0, 0, 2, 0, 1, 2]
>>> Points.from_coco(coco_kpts)
>>> # Fail case
>>> coco_kpts4 = [{'xy': [4686.5, 1341.5], 'category': 'dot'}]
>>> Points.from_coco(coco_kpts4, classes=[])
Example
>>> # xdoctest: +REQUIRES(module:ndsampler) >>> import ndsampler >>> classes = ndsampler.CategoryTree.from_coco([ >>> {'name': 'mouth', 'id': 2}, {'name': 'left-hand', 'id': 3}, {'name': 'right-hand', 'id': 5} >>> ]) >>> coco_kpts = [ >>> {'xy': (0, 0), 'visible': 2, 'keypoint_category_id': 5}, >>> {'xy': (1, 2), 'visible': 2, 'keypoint_category_id': 2}, >>> ] >>> pts = Points.from_coco(coco_kpts, classes=classes) >>> assert pts.data['class_idxs'].tolist() == [2, 0]
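The relationship between the original list encoding and the dict encoding accepted by from_coco can be sketched as follows. The helper flat_keypoints_to_dicts is hypothetical; it assumes the original COCO layout of repeating (x, y, visibility) triples and borrows the 'xy', 'visible' and 'keypoint_category' keys from the example above.

```python
def flat_keypoints_to_dicts(flat, names=None):
    """Convert [x1, y1, v1, x2, y2, v2, ...] into a list of keypoint dicts.

    Hypothetical helper for illustration only. ``names`` optionally gives
    one category name per keypoint, in order.
    """
    if len(flat) % 3 != 0:
        raise ValueError('expected a multiple of three values')
    kpts = []
    for i in range(0, len(flat), 3):
        x, y, v = flat[i:i + 3]
        kp = {'xy': (x, y), 'visible': v}
        if names is not None:
            kp['keypoint_category'] = names[i // 3]
        kpts.append(kp)
    return kpts


old_style = [0, 0, 2, 0, 1, 2]
new_style = flat_keypoints_to_dicts(old_style, names=['left-hand', 'mouth'])
print(new_style)
```

The flat list carries no category information on its own, which is why the old style needs class indices supplied separately.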
- class kwimage.PointsList¶
Bases:
kwimage.structs._generic.ObjectList
Stores a list of Points, each item usually corresponds to a different object.
Notes
TODO: when the data is homogeneous we can use a more efficient representation; otherwise we have to use heterogeneous storage.
- class kwimage.Polygon(data=None, meta=None, datakeys=None, metakeys=None, **kwargs)¶
Bases:
kwimage.structs._generic.Spatial
,_PolyArrayBackend
,_PolyWarpMixin
,ubelt.NiceRepr
Represents a single polygon as a set of exterior boundary points and a list of internal polygons representing holes.
By convention exterior boundaries should be counterclockwise and interior holes should be clockwise.
Example
>>> import kwimage >>> data = { >>> 'exterior': np.array([[13, 1], [13, 19], [25, 19], [25, 1]]), >>> 'interiors': [ >>> np.array([[13, 13], [14, 12], [24, 12], [25, 13], [25, 18], >>> [24, 19], [14, 19], [13, 18]]), >>> np.array([[13, 2], [14, 1], [24, 1], [25, 2], [25, 11], >>> [24, 12], [14, 12], [13, 11]])] >>> } >>> self = kwimage.Polygon(**data) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self.draw(setlim=True)
Example
>>> import kwimage
>>> self = kwimage.Polygon.random(
>>>     n=5, n_holes=1, convex=False, rng=0)
>>> print('self = {}'.format(self))
self = <Polygon({
    'exterior': <Coords(data=
        array([[0.30371392, 0.97195856],
               [0.24372304, 0.60568445],
               [0.21408694, 0.34884262],
               [0.5799477 , 0.44020379],
               [0.83720288, 0.78367234]]))>,
    'interiors': [<Coords(data=
        array([[0.50164209, 0.83520279],
               [0.25835064, 0.40313428],
               [0.28778562, 0.74758761],
               [0.30341266, 0.93748088]]))>],
})>
>>> # xdoc: +REQUIRES(--show)
>>> import kwplot
>>> kwplot.autompl()
>>> self.draw(setlim=True)
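The winding convention mentioned in the class description can be checked with the shoelace formula. The helpers signed_area and is_counterclockwise below are hypothetical and use the mathematical convention with the y axis pointing up; the reference does not state which axis direction the library assumes, and in image coordinates (y pointing down) the visual sense of rotation is reversed.

```python
def signed_area(ring):
    """Signed area of a ring of (x, y) points via the shoelace formula.

    Hypothetical helper for illustration only. With the y axis pointing up,
    a positive result means the points are in counterclockwise order.
    """
    total = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]  # wrap around to close the ring
        total += x0 * y1 - x1 * y0
    return total / 2.0


def is_counterclockwise(ring):
    return signed_area(ring) > 0


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(signed_area(square), is_counterclockwise(square))
print(is_counterclockwise(square[::-1]))
```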
- __datakeys__ = ['exterior', 'interiors']¶
- __metakeys__ = ['classes']¶
- property exterior(self)¶
- property interiors(self)¶
- __nice__(self)¶
- classmethod circle(cls, xy, r, resolution=64)¶
Create a circular polygon
Example
>>> xy = (0.5, 0.5) >>> r = .3 >>> poly = Polygon.circle(xy, r)
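As a rough picture of what a circular polygon with a given resolution looks like, the sketch below samples evenly spaced points on a circle. The function circle_points is hypothetical; whether the library repeats the first point at the end or starts at a different angle is not stated here, so this sketch simply emits resolution distinct points starting at angle zero.

```python
import math


def circle_points(xy, r, resolution=64):
    """Sample ``resolution`` evenly spaced points on a circle.

    Hypothetical helper for illustration only.
    """
    cx, cy = xy
    step = 2 * math.pi / resolution
    return [
        (cx + r * math.cos(i * step), cy + r * math.sin(i * step))
        for i in range(resolution)
    ]


pts = circle_points((0.5, 0.5), 0.3)
print(len(pts), pts[0])
```

A higher resolution gives a closer approximation of the circle at the cost of more vertices.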
- classmethod random(cls, n=6, n_holes=0, convex=True, tight=False, rng=None)¶
- Parameters
n (int) – number of points in the polygon (must be 3 or more)
n_holes (int) – number of holes
tight (bool, default=False) – fits the minimum and maximum points between 0 and 1
convex (bool, default=True) – force the resulting polygon to be convex (may remove exterior points)
- CommandLine:
xdoctest -m kwimage.structs.polygon Polygon.random
Example
>>> rng = None >>> n = 4 >>> n_holes = 1 >>> cls = Polygon >>> self = Polygon.random(n=n, rng=rng, n_holes=n_holes, convex=1) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=1, doclf=True) >>> kwplot.autompl() >>> self.draw()
References
https://gis.stackexchange.com/questions/207731/random-multipolygon https://stackoverflow.com/questions/8997099/random-polygon https://stackoverflow.com/questions/27548363/from-voronoi-tessellation-to-shapely-polygons https://stackoverflow.com/questions/8997099/algorithm-to-generate-random-2d-polygon
- _impl(self)¶
- to_mask(self, dims=None)¶
Convert this polygon to a mask
Todo
[ ] currently not efficient
- Parameters
dims (Tuple) – height and width of the output mask
- Returns
kwimage.Mask
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1).scale(128) >>> mask = self.to_mask((128, 128)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> mask.draw(color='blue') >>> mask.to_multi_polygon().draw(color='red', alpha=.5)
- to_relative_mask(self)¶
Returns a translated mask such that the mask dimensions are minimal.
In other words, we move the polygon all the way to the top-left and return a mask just big enough to fit the polygon.
- Returns
kwimage.Mask
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random().scale(8).translate(100, 100) >>> mask = self.to_relative_mask() >>> assert mask.shape <= (8, 8) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> mask.draw(color='blue') >>> mask.to_multi_polygon().draw(color='red', alpha=.5)
- fill(self, image, value=1)¶
Inplace fill in an image based on this polygon.
- Parameters
image (ndarray) – image to draw on
value (int | Tuple[int], default=1) – value to fill in with
- Returns
the image that has been modified in place
- Return type
ndarray
- _to_cv_countours(self)¶
OpenCV polygon representation, which is a list of points. Holes are implicitly represented when another polygon is drawn over an existing polygon via cv2.fillPoly.
- Returns
- where each ndarray is of shape [N, 1, 2],
where N is the number of points on the boundary, the middle dimension is always 1, and the trailing dimension represents x and y coordinates respectively.
- Return type
List[ndarray]
- classmethod coerce(Polygon, data)¶
Try to automatically determine the format of the input polygon and coerce it into a kwimage.Polygon.
- Parameters
data (object) – some type of data that can be interpreted as a polygon.
- Returns
kwimage.Polygon
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> self.coerce(self) >>> self.coerce(self.exterior) >>> self.coerce(self.exterior.data) >>> self.coerce(self.data) >>> self.coerce(self.to_geojson())
- classmethod from_shapely(Polygon, geom)¶
Convert a shapely polygon to a kwimage.Polygon
- Parameters
geom (shapely.geometry.polygon.Polygon) – a shapely polygon
- Returns
kwimage.Polygon
- classmethod from_wkt(Polygon, data)¶
Convert a WKT string to a kwimage.Polygon
- Parameters
data (str) – a WKT polygon string
- Returns
kwimage.Polygon
Example
>>> import kwimage >>> data = 'POLYGON ((0.11 0.61, 0.07 0.588, 0.015 0.50, 0.11 0.61))' >>> self = kwimage.Polygon.from_wkt(data) >>> assert len(self.exterior) == 4
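For readers unfamiliar with WKT, the sketch below parses the simple two-dimensional POLYGON form used in the example into an exterior ring and a list of interior rings. The function parse_wkt_polygon is hypothetical and regex-based; it is not how the library parses WKT, and it does not handle EMPTY geometries, Z or M coordinates, or other geometry types.

```python
import re


def parse_wkt_polygon(text):
    """Parse a simple 2D WKT POLYGON string into (exterior, interiors).

    Hypothetical helper for illustration only.
    """
    match = re.fullmatch(
        r'\s*POLYGON\s*\((.*)\)\s*', text, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        raise ValueError('not a WKT POLYGON string')
    rings = []
    # Each ring is a parenthesised, comma separated list of "x y" pairs
    for ring_text in re.findall(r'\(([^()]*)\)', match.group(1)):
        ring = []
        for pair in ring_text.split(','):
            x, y = pair.split()
            ring.append((float(x), float(y)))
        rings.append(ring)
    if not rings:
        raise ValueError('polygon has no rings')
    # The first ring is the exterior; any remaining rings are holes
    return rings[0], rings[1:]


data = 'POLYGON ((0.11 0.61, 0.07 0.588, 0.015 0.50, 0.11 0.61))'
exterior, interiors = parse_wkt_polygon(data)
print(exterior, interiors)
```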
- classmethod from_geojson(Polygon, data_geojson)¶
Convert a geojson polygon to a kwimage.Polygon
- Parameters
data_geojson (dict) – geojson data
References
https://geojson.org/geojson-spec.html
Example
>>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=2) >>> data_geojson = self.to_geojson() >>> new = Polygon.from_geojson(data_geojson)
- to_shapely(self)¶
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> # xdoc: +REQUIRES(module:shapely) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1) >>> self = self.scale(100) >>> geom = self.to_shapely() >>> print('geom = {!r}'.format(geom))
- to_geojson(self)¶
Converts polygon to a geojson structure
- Returns
Dict[str, object]
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> print(self.to_geojson())
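The general shape of a GeoJSON Polygon can be sketched with plain Python data. The helper polygon_to_geojson is hypothetical and only follows the GeoJSON rule that a Polygon is a list of closed linear rings with the exterior first; the exact dictionary produced by to_geojson may differ in details.

```python
import json


def polygon_to_geojson(exterior, interiors=()):
    """Build a GeoJSON-style Polygon dict from rings of (x, y) points.

    Hypothetical helper for illustration only.
    """
    def close(ring):
        ring = [list(pt) for pt in ring]
        # GeoJSON linear rings repeat the first position at the end
        if ring and ring[0] != ring[-1]:
            ring.append(list(ring[0]))
        return ring

    return {
        'type': 'Polygon',
        'coordinates': [close(exterior)] + [close(h) for h in interiors],
    }


exterior = [(0, 0), (4, 0), (4, 4), (0, 4)]
hole = [(1, 1), (1, 2), (2, 2), (2, 1)]
geo = polygon_to_geojson(exterior, [hole])
print(json.dumps(geo))
```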
- to_wkt(self)¶
Convert a kwimage.Polygon to WKT string
Example
>>> import kwimage >>> self = kwimage.Polygon.random() >>> print(self.to_wkt())
- classmethod from_coco(cls, data, dims=None)¶
Accepts either new-style or old-style coco polygons
- _to_coco(self, style='orig')¶
- to_coco(self, style='orig')¶
- Returns
coco-style polygons
- Return type
List | Dict
- to_multi_polygon(self)¶
- to_boxes(self)¶
Deprecated: lossy conversion; use ‘bounding_box’ instead
- property centroid(self)¶
- bounding_box(self)¶
Returns an axis-aligned bounding box for the segmentation
- Returns
kwimage.Boxes
- bounding_box_polygon(self)¶
Returns an axis-aligned bounding polygon for the segmentation.
Notes
This Polygon will be a Box, not a convex hull! Use shapely for convex hulls.
- Returns
kwimage.Polygon
- copy(self)¶
- clip(self, x_min, y_min, x_max, y_max, inplace=False)¶
Clip polygon to image boundaries.
Example
>>> from kwimage.structs.polygon import * >>> self = Polygon.random().scale(10).translate(-1) >>> self2 = self.clip(1, 1, 3, 3) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> self2.draw(setlim=True)
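One classical way to clip a polygon to an axis-aligned rectangle is the Sutherland-Hodgman algorithm, sketched below on a plain list of xy points. This is an illustration of the general technique under that assumption; the reference does not say which algorithm clip uses. The function clip_polygon is hypothetical, ignores holes, and can produce degenerate edges for concave inputs.

```python
def clip_polygon(points, x_min, y_min, x_max, y_max):
    """Clip a ring of (x, y) points to a rectangle (Sutherland-Hodgman).

    Hypothetical helper for illustration only.
    """
    def clip_edge(pts, inside, intersect):
        out = []
        for i, cur in enumerate(pts):
            prev = pts[i - 1]  # index -1 wraps to the last point
            if inside(cur):
                if not inside(prev):
                    out.append(intersect(prev, cur))  # entering the region
                out.append(cur)
            elif inside(prev):
                out.append(intersect(prev, cur))      # leaving the region
        return out

    def cross_x(x):
        # Intersection of segment p-q with the vertical line at x
        def intersect(p, q):
            t = (x - p[0]) / (q[0] - p[0])
            return (x, p[1] + t * (q[1] - p[1]))
        return intersect

    def cross_y(y):
        # Intersection of segment p-q with the horizontal line at y
        def intersect(p, q):
            t = (y - p[1]) / (q[1] - p[1])
            return (p[0] + t * (q[0] - p[0]), y)
        return intersect

    edges = [
        (lambda p: p[0] >= x_min, cross_x(x_min)),
        (lambda p: p[0] <= x_max, cross_x(x_max)),
        (lambda p: p[1] >= y_min, cross_y(y_min)),
        (lambda p: p[1] <= y_max, cross_y(y_max)),
    ]
    pts = list(points)
    for inside, intersect in edges:
        if not pts:
            break  # nothing left to clip
        pts = clip_edge(pts, inside, intersect)
    return pts


square = [(0, 0), (4, 0), (4, 4), (0, 4)]
clipped = clip_polygon(square, 1, 1, 3, 3)
print(clipped)
```

Clipping against one rectangle side at a time keeps each step simple: every pass only needs an inside test and a line intersection for that side.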
- draw_on(self, image, color='blue', fill=True, border=False, alpha=1.0, copy=False)¶
Rasterizes a polygon on an image. See draw for a vectorized matplotlib version.
- Parameters
image (ndarray) – image to raster polygon on.
color (str | tuple) – data coercible to a color
fill (bool, default=True) – draw the center mass of the polygon
border (bool, default=False) – draw the border of the polygon
alpha (float, default=1.0) – polygon transparency (setting alpha < 1 makes this function much slower).
copy (bool, default=False) – if False only copies if necessary
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1).scale(128) >>> image = np.zeros((128, 128), dtype=np.float32) >>> image = self.draw_on(image) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.imshow(image, fnum=1)
Example
>>> import kwimage >>> import numpy as np >>> color = 'blue' >>> self = kwimage.Polygon.random(n_holes=1).scale(128) >>> image = np.zeros((128, 128), dtype=np.float32) >>> # Test drawing on all channel + dtype combinations >>> im3 = np.random.rand(128, 128, 3) >>> im_chans = { >>> 'im3': im3, >>> 'im1': kwimage.convert_colorspace(im3, 'rgb', 'gray'), >>> 'im4': kwimage.convert_colorspace(im3, 'rgb', 'rgba'), >>> } >>> inputs = {} >>> for k, im in im_chans.items(): >>> inputs[k + '_01'] = (kwimage.ensure_float01(im.copy()), {'alpha': None}) >>> inputs[k + '_255'] = (kwimage.ensure_uint255(im.copy()), {'alpha': None}) >>> inputs[k + '_01_a'] = (kwimage.ensure_float01(im.copy()), {'alpha': 0.5}) >>> inputs[k + '_255_a'] = (kwimage.ensure_uint255(im.copy()), {'alpha': 0.5}) >>> outputs = {} >>> for k, v in inputs.items(): >>> im, kw = v >>> outputs[k] = self.draw_on(im, color=color, **kw) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.figure(fnum=2, doclf=True) >>> kwplot.autompl() >>> pnum_ = kwplot.PlotNums(nCols=2, nRows=len(inputs)) >>> for k in inputs.keys(): >>> kwplot.imshow(inputs[k][0], fnum=2, pnum=pnum_(), title=k) >>> kwplot.imshow(outputs[k], fnum=2, pnum=pnum_(), title=k) >>> kwplot.show_if_requested()
- draw(self, color='blue', ax=None, alpha=1.0, radius=1, setlim=False, border=False, linewidth=2)¶
Draws polygon in a matplotlib axes. See draw_on for in-memory image modification.
- Parameters
color (str | Tuple) – coercible color
alpha (float) – fill transparency
setlim (bool) – if True, modify the x and y limits of the matplotlib axes so that the polygon can be seen.
border (bool, default=False) – if True, draws an edge border on the polygon.
linewidth (float) – width of the border
Todo
[ ] Rework arguments in favor of matplotlib standards
Example
>>> # xdoc: +REQUIRES(module:kwplot) >>> from kwimage.structs.polygon import * # NOQA >>> self = Polygon.random(n_holes=1) >>> self = self.scale(100) >>> # xdoc: +REQUIRES(--show) >>> self.draw() >>> import kwplot >>> kwplot.autompl() >>> from matplotlib import pyplot as plt >>> kwplot.figure(fnum=2) >>> self.draw(setlim=True)
- _ensure_vertex_order(self, inplace=False)¶
Fixes vertex ordering so the exterior ring is CCW and the interior rings are CW.
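One common way to test ring orientation is the sign of the shoelace (signed) area. The sketch below assumes standard mathematical axes (y pointing up), where a positive signed area means counter-clockwise; whether kwimage measures orientation this way internally is not stated here, and the helper names are hypothetical.

```python
def signed_area_sketch(ring):
    """Shoelace formula: positive for CCW rings when the y-axis points up."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0


def ensure_vertex_order_sketch(exterior, interiors):
    """Return rings with a CCW exterior and CW interiors."""
    if signed_area_sketch(exterior) < 0:
        exterior = exterior[::-1]
    interiors = [r[::-1] if signed_area_sketch(r) > 0 else r
                 for r in interiors]
    return exterior, interiors


cw_square = [(0, 0), (0, 4), (4, 4), (4, 0)]   # clockwise exterior
ccw_hole = [(1, 1), (2, 1), (2, 2), (1, 2)]    # counter-clockwise hole
ext, holes = ensure_vertex_order_sketch(cw_square, [ccw_hole])
print(signed_area_sketch(ext), signed_area_sketch(holes[0]))
```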
Example
>>> import kwimage >>> self = kwimage.Polygon.random(n=3, n_holes=2, rng=0) >>> print('self = {!r}'.format(self)) >>> new = self._ensure_vertex_order() >>> print('new = {!r}'.format(new))
>>> self = kwimage.Polygon.random(n=3, n_holes=2, rng=0).swap_axes() >>> print('self = {!r}'.format(self)) >>> new = self._ensure_vertex_order() >>> print('new = {!r}'.format(new))
- class kwimage.PolygonList¶
Bases:
kwimage.structs._generic.ObjectList
Stores and allows manipulation of multiple polygons, usually within the same image.
- to_mask_list(self, dims=None)¶
Converts all items to masks
- to_polygon_list(self)¶
- to_segmentation_list(self)¶
Converts all items to segmentation objects
- swap_axes(self, inplace=False)¶
- to_geojson(self, as_collection=False)¶
Converts a list of polygons/multipolygons to a geojson structure
- Parameters
as_collection (bool) – if True, wraps the polygon geojson items in a geojson feature collection, otherwise just return a list of items.
- Returns
items or geojson data
- Return type
List[Dict] | Dict
Example
>>> import kwimage >>> data = [kwimage.Polygon.random(), >>> kwimage.Polygon.random(n_holes=1), >>> kwimage.MultiPolygon.random(n_holes=1), >>> kwimage.MultiPolygon.random()] >>> self = kwimage.PolygonList(data) >>> geojson = self.to_geojson(as_collection=True) >>> items = self.to_geojson(as_collection=False) >>> print('geojson = {}'.format(ub.repr2(geojson, nl=-2, precision=1))) >>> print('items = {}'.format(ub.repr2(items, nl=-2, precision=1)))
- class kwimage.Segmentation(data, format=None)¶
Bases:
_WrapperObject
Holds either a MultiPolygon, Polygon, or Mask
- Parameters
data (object) – the underlying object
format (str) – either ‘mask’, ‘polygon’, or ‘multipolygon’
- classmethod random(cls, rng=None)¶
Example
>>> self = Segmentation.random() >>> print('self = {!r}'.format(self)) >>> # xdoc: +REQUIRES(--show) >>> import kwplot >>> kwplot.autompl() >>> kwplot.figure(fnum=1, doclf=True) >>> self.draw() >>> kwplot.show_if_requested()
- to_multi_polygon(self)¶
- to_mask(self, dims=None)¶
- property meta(self)¶
- classmethod coerce(cls, data, dims=None)¶
- class kwimage.SegmentationList¶
Bases:
kwimage.structs._generic.ObjectList
Store and manipulate multiple segmentations (masks or polygons), usually within the same image
- to_polygon_list(self)¶
Converts all mask objects to multi-polygon objects
- to_mask_list(self, dims=None)¶
Converts all items to mask objects
- to_segmentation_list(self)¶
- classmethod coerce(cls, data)¶
Interpret data as a list of Segmentations
- kwimage.smooth_prob(prob, k=3, inplace=False, eps=1e-09)¶
Smooths the probability map, but preserves the magnitude of the peaks.
Notes
Even if inplace is True, we still need to make a copy of the input array; however, we ensure that it is cleaned up before leaving the function scope.
sigma=0.8 @ k=3, sigma=1.1 @ k=5, sigma=1.4 @ k=7
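The three sigma values listed above agree with the default rule that OpenCV documents for `getGaussianKernel` when no sigma is given, `sigma = 0.3 * ((k - 1) * 0.5 - 1) + 0.8`. That `smooth_prob` derives its sigma this way is an assumption; the sketch only checks the arithmetic.

```python
def default_sigma_sketch(k):
    """Default Gaussian sigma for an odd kernel size k (OpenCV's documented rule)."""
    return 0.3 * ((k - 1) * 0.5 - 1) + 0.8


for k in (3, 5, 7):
    print('k={} sigma={:.1f}'.format(k, default_sigma_sketch(k)))
```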
- class kwimage.Affine(matrix)[source]¶
Bases:
Projective
Helper for making affine transform matrices.
Example
>>> self = Affine(np.eye(3)) >>> m1 = np.eye(3) @ self >>> m2 = self @ np.eye(3)
Example
>>> from kwimage.transform import * # NOQA >>> m = {} >>> # Works, and returns an Affine >>> m[len(m)] = x = Affine.random() @ np.eye(3) >>> assert isinstance(x, Affine) >>> m[len(m)] = x = Affine.random() @ None >>> assert isinstance(x, Affine) >>> # Works, and returns an ndarray >>> m[len(m)] = x = np.eye(3) @ Affine.random(3) >>> assert isinstance(x, np.ndarray) >>> # Works, and returns a Matrix >>> m[len(m)] = x = Affine.random() @ Matrix.random(3) >>> assert isinstance(x, Matrix) >>> m[len(m)] = x = Matrix.random(3) @ Affine.random() >>> assert isinstance(x, Matrix) >>> print('m = {}'.format(ub.repr2(m)))
- property shape(self)¶
- __getitem__(self, index)¶
- __json__(self)¶
- concise(self)¶
Return a concise coercible dictionary representation of this matrix
- Returns
a small serializable dict of concise parameters that can be passed to Affine.coerce() to reconstruct this object.
- Return type
Dict
Example
>>> self = Affine.random(rng=0, scale=1) >>> params = self.concise() >>> assert np.allclose(Affine.coerce(params).matrix, self.matrix) >>> print('params = {}'.format(ub.repr2(params, nl=1, precision=2))) params = { 'offset': (0.08, 0.38), 'theta': 0.08, 'type': 'affine', }
Example
>>> self = Affine.random(rng=0, scale=2, offset=0) >>> params = self.concise() >>> assert np.allclose(Affine.coerce(params).matrix, self.matrix) >>> print('params = {}'.format(ub.repr2(params, nl=1, precision=2))) params = { 'scale': 2.00, 'theta': 0.04, 'type': 'affine', }
- classmethod coerce(cls, data=None, **kwargs)¶
Attempt to coerce the data into an affine object
- Parameters
data – some data we attempt to coerce to an Affine matrix
**kwargs – some data we attempt to coerce to an Affine matrix, mutually exclusive with data.
- Returns
Affine
Example
>>> import kwimage >>> import numpy as np >>> import skimage.transform >>> kwimage.Affine.coerce({'type': 'affine', 'matrix': [[1, 0, 0], [0, 1, 0]]}) >>> kwimage.Affine.coerce({'scale': 2}) >>> kwimage.Affine.coerce({'offset': 3}) >>> kwimage.Affine.coerce(np.eye(3)) >>> kwimage.Affine.coerce(None) >>> kwimage.Affine.coerce(skimage.transform.AffineTransform(scale=30))
- decompose(self)¶
Decompose the affine matrix into its individual scale, translation, rotation, and skew parameters.
- Returns
decomposed offset, scale, theta, and shear params
- Return type
Dict
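A minimal sketch of such a decomposition, assuming the factorization T @ R @ H @ S given in the symbolic construction under Affine.affine (with shear matrix rows `[1, -sin(shear)]` and `[0, cos(shear)]`). Multiplying that out gives a first column of `(sx*cos(theta), sx*sin(theta))` and a second column of `(-sy*sin(theta + shear), sy*cos(theta + shear))`, which can be inverted with `hypot` and `atan2`. The function names are hypothetical, positive scale factors are assumed, and this is not necessarily how kwimage implements `decompose`.

```python
import math


def compose_sketch(scale, theta, shear, offset):
    """Closed form of T @ R @ H @ S as a 2x3 matrix."""
    sx, sy = scale
    tx, ty = offset
    return [
        [sx * math.cos(theta), -sy * math.sin(theta + shear), tx],
        [sx * math.sin(theta), sy * math.cos(theta + shear), ty],
    ]


def decompose_sketch(m):
    """Recover scale, theta, shear, and offset from a 2x3 affine matrix."""
    (a11, a12, a13), (a21, a22, a23) = m
    sx = math.hypot(a11, a21)              # length of the first column
    theta = math.atan2(a21, a11)           # direction of the first column
    sy = math.hypot(a12, a22)              # length of the second column
    shear = math.atan2(-a12, a22) - theta  # extra angle of the second column
    return {'scale': (sx, sy), 'theta': theta, 'shear': shear,
            'offset': (a13, a23)}


matrix = compose_sketch(scale=(2.0, 0.5), theta=0.3, shear=0.2, offset=(3.0, -1.0))
params = decompose_sketch(matrix)
recon = compose_sketch(**params)
print({k: params[k] for k in ('theta', 'shear')})
```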
References
https://math.stackexchange.com/questions/612006/decompose-affine
Example
>>> self = Affine.random() >>> params = self.decompose() >>> recon = Affine.coerce(**params) >>> params2 = recon.decompose() >>> pt = np.vstack([np.random.rand(2, 1), [1]]) >>> result1 = self.matrix[0:2] @ pt >>> result2 = recon.matrix[0:2] @ pt >>> assert np.allclose(result1, result2)
>>> self = Affine.scale(0.001) @ Affine.random() >>> params = self.decompose() >>> self.det()
- classmethod scale(cls, scale)¶
Create a scale Affine object
- Parameters
scale (float | Tuple[float, float]) – x, y scale factor
- Returns
Affine
- classmethod translate(cls, offset)¶
Create a translation Affine object
- Parameters
offset (float | Tuple[float, float]) – x, y translation factor
- Returns
Affine
- classmethod rotate(cls, theta)¶
Create a rotation Affine object
- Parameters
theta (float) – counter-clockwise rotation angle in radians
- Returns
Affine
- classmethod random(cls, rng=None, **kw)¶
Create a random Affine object
- Parameters
rng – random number generator
**kw – passed to Affine.random_params(). Can contain coercible random distributions for scale, offset, about, theta, and shear.
- Returns
Affine
- classmethod random_params(cls, rng=None, **kw)¶
- Parameters
rng – random number generator
**kw – can contain coercible random distributions for scale, offset, about, theta, and shear.
- Returns
affine parameters suitable to be passed to Affine.affine
- Return type
Dict
Todo
[ ] improve kwargs parameterization
- classmethod affine(cls, scale=None, offset=None, theta=None, shear=None, about=None)¶
Create an affine matrix from high-level parameters
- Parameters
scale (float | Tuple[float, float]) – x, y scale factor
offset (float | Tuple[float, float]) – x, y translation factor
theta (float) – counter-clockwise rotation angle in radians
shear (float) – counter-clockwise shear angle in radians
about (float | Tuple[float, float]) – x, y location of the origin
- Returns
the constructed Affine object
- Return type
Affine
Example
>>> rng = kwarray.ensure_rng(None) >>> scale = rng.randn(2) * 10 >>> offset = rng.randn(2) * 10 >>> about = rng.randn(2) * 10 >>> theta = rng.randn() * 10 >>> shear = rng.randn() * 10 >>> # Create combined matrix from all params >>> F = Affine.affine( >>> scale=scale, offset=offset, theta=theta, shear=shear, >>> about=about) >>> # Test that combining components matches >>> S = Affine.affine(scale=scale) >>> T = Affine.affine(offset=offset) >>> R = Affine.affine(theta=theta) >>> H = Affine.affine(shear=shear) >>> O = Affine.affine(offset=about) >>> # combine (note shear must be on the RHS of rotation) >>> alt = O @ T @ R @ H @ S @ O.inv() >>> print('F = {}'.format(ub.repr2(F.matrix.tolist(), nl=1))) >>> print('alt = {}'.format(ub.repr2(alt.matrix.tolist(), nl=1))) >>> assert np.all(np.isclose(alt.matrix, F.matrix)) >>> pt = np.vstack([np.random.rand(2, 1), [[1]]]) >>> warp_pt1 = (F.matrix @ pt) >>> warp_pt2 = (alt.matrix @ pt) >>> assert np.allclose(warp_pt2, warp_pt1)
- Sympy:
>>> # xdoctest: +SKIP >>> import sympy >>> # Shows the symbolic construction of the code >>> # https://groups.google.com/forum/#!topic/sympy/k1HnZK_bNNA >>> from sympy.abc import theta >>> x0, y0, sx, sy, theta, shear, tx, ty = sympy.symbols( >>> 'x0, y0, sx, sy, theta, shear, tx, ty') >>> # move the center to 0, 0 >>> tr1_ = np.array([[1, 0, -x0], >>> [0, 1, -y0], >>> [0, 0, 1]]) >>> # Define core components of the affine transform >>> S = np.array([ # scale >>> [sx, 0, 0], >>> [ 0, sy, 0], >>> [ 0, 0, 1]]) >>> H = np.array([ # shear >>> [1, -sympy.sin(shear), 0], >>> [0, sympy.cos(shear), 0], >>> [0, 0, 1]]) >>> R = np.array([ # rotation >>> [sympy.cos(theta), -sympy.sin(theta), 0], >>> [sympy.sin(theta), sympy.cos(theta), 0], >>> [ 0, 0, 1]]) >>> T = np.array([ # translation >>> [ 1, 0, tx], >>> [ 0, 1, ty], >>> [ 0, 0, 1]]) >>> # Construct the affine 3x3 about the origin >>> aff0 = np.array(sympy.simplify(T @ R @ H @ S)) >>> # move 0, 0 back to the specified origin >>> tr2_ = np.array([[1, 0, x0], >>> [0, 1, y0], >>> [0, 0, 1]]) >>> # combine transformations >>> aff = tr2_ @ aff0 @ tr1_ >>> print('aff = {}'.format(ub.repr2(aff.tolist(), nl=1)))
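The same construction can be followed numerically without sympy or numpy. This is a sketch under the ordering stated above (shift `about` to the origin, then scale, shear, rotate, translate, and shift back); the names `matmul3`, `affine_sketch`, and `apply_sketch` are hypothetical and stand in for the real classmethod.

```python
import math


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def affine_sketch(scale=(1.0, 1.0), offset=(0.0, 0.0), theta=0.0,
                  shear=0.0, about=(0.0, 0.0)):
    sx, sy = scale
    tx, ty = offset
    x0, y0 = about
    S = [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]
    H = [[1, -math.sin(shear), 0], [0, math.cos(shear), 0], [0, 0, 1]]
    R = [[math.cos(theta), -math.sin(theta), 0],
         [math.sin(theta), math.cos(theta), 0],
         [0, 0, 1]]
    T = [[1, 0, tx], [0, 1, ty], [0, 0, 1]]
    to_origin = [[1, 0, -x0], [0, 1, -y0], [0, 0, 1]]
    from_origin = [[1, 0, x0], [0, 1, y0], [0, 0, 1]]
    # Each step is applied on the left, so the rightmost factor acts first
    m = to_origin
    for step in (S, H, R, T, from_origin):
        m = matmul3(step, m)
    return m


def apply_sketch(m, pt):
    """Warp a single (x, y) point by a 3x3 affine matrix."""
    x, y = pt
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# A quarter turn about (1, 1) carries the point (2, 1) to (1, 2)
quarter_turn = affine_sketch(theta=math.pi / 2, about=(1.0, 1.0))
warped = apply_sketch(quarter_turn, (2.0, 1.0))
print(tuple(round(v, 6) for v in warped))
```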
- class kwimage.Linear(matrix)[source]¶
Bases:
Matrix
Base class for matrix-based transform.
Example
>>> from kwimage.transform import * # NOQA >>> ms = {} >>> ms['random()'] = Matrix.random() >>> ms['eye()'] = Matrix.eye() >>> ms['random(3)'] = Matrix.random(3) >>> ms['random(4, 4)'] = Matrix.random(4, 4) >>> ms['eye(3)'] = Matrix.eye(3) >>> ms['explicit'] = Matrix(np.array([[1.618]])) >>> for k, m in ms.items(): >>> print('----') >>> print(f'{k} = {m}') >>> print(f'{k}.inv() = {m.inv()}') >>> print(f'{k}.T = {m.T}') >>> print(f'{k}.det() = {m.det()}')
- class kwimage.Matrix(matrix)[source]¶
Bases:
Transform
Base class for matrix-based transform.
Example
>>> from kwimage.transform import * # NOQA >>> ms = {} >>> ms['random()'] = Matrix.random() >>> ms['eye()'] = Matrix.eye() >>> ms['random(3)'] = Matrix.random(3) >>> ms['random(4, 4)'] = Matrix.random(4, 4) >>> ms['eye(3)'] = Matrix.eye(3) >>> ms['explicit'] = Matrix(np.array([[1.618]])) >>> for k, m in ms.items(): >>> print('----') >>> print(f'{k} = {m}') >>> print(f'{k}.inv() = {m.inv()}') >>> print(f'{k}.T = {m.T}') >>> print(f'{k}.det() = {m.det()}')
- __nice__(self)¶
- __repr__(self)¶
Return repr(self).
- property shape(self)¶
- __json__(self)¶
- classmethod coerce(cls, data=None, **kwargs)¶
Example
>>> Matrix.coerce({'type': 'matrix', 'matrix': [[1, 0, 0], [0, 1, 0]]}) >>> Matrix.coerce(np.eye(3)) >>> Matrix.coerce(None)
- __array__(self)¶
Allow this object to be passed to np.asarray
- __imatmul__(self, other)¶
- __matmul__(self, other)¶
Example
>>> m = {} >>> # Works, and returns a Matrix >>> m[len(m)] = x = Matrix.random() @ np.eye(2) >>> assert isinstance(x, Matrix) >>> m[len(m)] = x = Matrix.random() @ None >>> assert isinstance(x, Matrix) >>> # Works, and returns an ndarray >>> m[len(m)] = x = np.eye(3) @ Matrix.random(3) >>> assert isinstance(x, np.ndarray) >>> # These do not work >>> # m[len(m)] = None @ Matrix.random() >>> # m[len(m)] = np.eye(3) @ None >>> print('m = {}'.format(ub.repr2(m)))
- inv(self)¶
Returns the inverse of this matrix
- Returns
Matrix
- property T(self)¶
Transpose the underlying matrix
- det(self)¶
Compute the determinant of the underlying matrix
- Returns
float
- classmethod eye(cls, shape=None, rng=None)¶
Construct an identity matrix
- classmethod random(cls, shape=None, rng=None)¶
- class kwimage.Projective(matrix)[source]¶
Bases:
Linear
Currently just a stub class that may be used to implement projective / homography transforms in the future.
- class kwimage.Transform[source]¶
Bases:
ubelt.NiceRepr
Inherit from this class and define __nice__ to “nicely” print your objects.
Defines __str__ and __repr__ in terms of the __nice__ function. Classes that inherit from NiceRepr should redefine __nice__. If the inheriting class has a __len__ method, then the default __nice__ method will return its length.
Example
>>> import ubelt as ub >>> class Foo(ub.NiceRepr): ... def __nice__(self): ... return 'info' >>> foo = Foo() >>> assert str(foo) == '<Foo(info)>' >>> assert repr(foo).startswith('<Foo(info) at ')
Example
>>> import ubelt as ub >>> class Bar(ub.NiceRepr): ... pass >>> bar = Bar() >>> import pytest >>> with pytest.warns(RuntimeWarning) as record: >>> assert 'object at' in str(bar) >>> assert 'object at' in repr(bar)
Example
>>> import ubelt as ub >>> class Baz(ub.NiceRepr): ... def __len__(self): ... return 5 >>> baz = Baz() >>> assert str(baz) == '<Baz(5)>'
Example
>>> import ubelt as ub >>> # If your nice message has a bug, it shouldn't bring down the house >>> class Foo(ub.NiceRepr): ... def __nice__(self): ... assert False >>> foo = Foo() >>> import pytest >>> with pytest.warns(RuntimeWarning) as record: >>> print('foo = {!r}'.format(foo)) foo = <...Foo ...>
- kwimage.add_homog(pts)[source]¶
Add a homogeneous coordinate to a point array
This is a convenience function; it is not particularly efficient.
- SeeAlso:
cv2.convertPointsToHomogeneous
Example
>>> pts = np.random.rand(10, 2) >>> add_homog(pts)
- Benchmark:
>>> import timerit >>> ti = timerit.Timerit(1000, bestof=10, verbose=2) >>> pts = np.random.rand(1000, 2) >>> for timer in ti.reset('kwimage'): >>> with timer: >>> kwimage.add_homog(pts) >>> for timer in ti.reset('cv2'): >>> with timer: >>> cv2.convertPointsToHomogeneous(pts) >>> # cv2 is 4x faster, but has more restrictive inputs
- kwimage.remove_homog(pts, mode='divide')[source]¶
Remove the homogeneous coordinate from a point array.
This is a convenience function; it is not particularly efficient.
- SeeAlso:
cv2.convertPointsFromHomogeneous
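The two modes can be illustrated on plain lists. This sketch assumes that 'divide' divides each point by its last coordinate before discarding it, while 'drop' discards the last coordinate without dividing; the `_sketch` helpers are hypothetical stand-ins for the real functions.

```python
def add_homog_sketch(pts):
    """Append a 1 to every point."""
    return [list(p) + [1.0] for p in pts]


def remove_homog_sketch(pts, mode='divide'):
    """Strip the last coordinate, optionally dividing by it first."""
    if mode == 'divide':
        return [[v / p[-1] for v in p[:-1]] for p in pts]
    elif mode == 'drop':
        return [list(p[:-1]) for p in pts]
    raise KeyError(mode)


homog_pts = [[2.0, 4.0, 2.0]]
print(add_homog_sketch([[1.0, 2.0]]))
print(remove_homog_sketch(homog_pts, 'divide'))
print(remove_homog_sketch(homog_pts, 'drop'))
```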
Example
>>> homog_pts = np.random.rand(10, 3) >>> remove_homog(homog_pts, 'divide') >>> remove_homog(homog_pts, 'drop')
- kwimage.subpixel_accum(dst, src, index, interp_axes=None)[source]¶
Add the source values array into the destination array at a particular subpixel index.
- Parameters
dst (ArrayLike) – destination accumulation array
src (ArrayLike) – source array containing values to add
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
Notes
- Inputs:
+---+---+---+---+---+ dst.shape = (5,)
      +---+---+ src.shape = (2,)
      |=======| index = 1.5:3.5
Subpixel shift the source by -0.5. When the index is non-integral, pad the aligned src with an extra value to ensure all dst pixels that would be influenced by the smaller subpixel shape are influenced by the aligned src. Note that we are not scaling.
    +---+---+---+ aligned_src.shape = (3,)
    |===========| aligned_index = 1:4
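The alignment described in these notes can be sketched in one dimension with plain lists. Each source value is split between its two neighboring integer cells according to the fractional part of the start index, which yields an aligned source that is one element longer. The function name `subpixel_accum_1d_sketch` is hypothetical, and the real function handles N-dimensional arrays and tensors.

```python
import math


def subpixel_accum_1d_sketch(dst, src, start):
    """Add src into dst beginning at a possibly fractional start index."""
    base = math.floor(start)   # integer index of the first touched cell
    frac = start - base        # fraction that spills into the next cell
    aligned = [0.0] * (len(src) + 1)
    for i, value in enumerate(src):
        aligned[i] += (1.0 - frac) * value
        aligned[i + 1] += frac * value
    out = list(dst)
    for offset, value in enumerate(aligned):
        j = base + offset
        if 0 <= j < len(out):  # ignore parts that fall outside dst
            out[j] += value
    return out


print(subpixel_accum_1d_sketch([0.0] * 5, [1.0, 1.0], 1.5))
print(subpixel_accum_1d_sketch([0.0] * 5, [1.0, 1.0, 1.0], 3.25))
```

With `start=1.5` the aligned source is `[0.5, 1.0, 0.5]` placed at `1:4`, matching the diagram.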
Example
>>> dst = np.zeros(5) >>> src = np.ones(2) >>> index = [slice(1.5, 3.5)] >>> subpixel_accum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0. , 0.5, 1. , 0.5, 0. ])
Example
>>> dst = np.zeros((6, 6)) >>> src = np.ones((3, 3)) >>> index = (slice(1.5, 4.5), slice(1, 4)) >>> subpixel_accum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([[0. , 0. , 0. , 0. , 0. , 0. ], [0. , 0.5, 0.5, 0.5, 0. , 0. ], [0. , 1. , 1. , 1. , 0. , 0. ], [0. , 1. , 1. , 1. , 0. , 0. ], [0. , 0.5, 0.5, 0.5, 0. , 0. ], [0. , 0. , 0. , 0. , 0. , 0. ]]) >>> # xdoctest: +REQUIRES(module:torch) >>> dst = torch.zeros((1, 3, 6, 6)) >>> src = torch.ones((1, 3, 3, 3)) >>> index = (slice(None), slice(None), slice(1.5, 4.5), slice(1.25, 4.25)) >>> subpixel_accum(dst, src, index) >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0)) np.array([[0. , 0. , 0. , 0. , 0. , 0. ], [0. , 0.38, 0.5 , 0.5 , 0.12, 0. ], [0. , 0.75, 1. , 1. , 0.25, 0. ], [0. , 0.75, 1. , 1. , 0.25, 0. ], [0. , 0.38, 0.5 , 0.5 , 0.12, 0. ], [0. , 0. , 0. , 0. , 0. , 0. ]])
- Doctest:
>>> # TODO: move to a unit test file >>> subpixel_accum(np.zeros(5), np.ones(2), [slice(1.5, 3.5)]).tolist() [0.0, 0.5, 1.0, 0.5, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(2), [slice(0, 2)]).tolist() [1.0, 1.0, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(.5, 3.5)]).tolist() [0.5, 1.0, 1.0, 0.5, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(-1, 2)]).tolist() [1.0, 1.0, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(-1.5, 1.5)]).tolist() [1.0, 0.5, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(10, 13)]).tolist() [0.0, 0.0, 0.0, 0.0, 0.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(3.25, 6.25)]).tolist() [0.0, 0.0, 0.0, 0.75, 1.0] >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(4.9, 7.9)]).tolist() [0.0, 0.0, 0.0, 0.0, 0.099...] >>> subpixel_accum(np.zeros(5), np.ones(9), [slice(-1.5, 7.5)]).tolist() [1.0, 1.0, 1.0, 1.0, 1.0] >>> subpixel_accum(np.zeros(5), np.ones(9), [slice(2.625, 11.625)]).tolist() [0.0, 0.0, 0.375, 1.0, 1.0] >>> subpixel_accum(np.zeros(5), 1, [slice(2.625, 11.625)]).tolist() [0.0, 0.0, 0.375, 1.0, 1.0]
- kwimage.subpixel_align(dst, src, index, interp_axes=None)[source]¶
Returns an aligned version of the source tensor and destination index.
- Used as the backend to implement other subpixel functions like:
subpixel_accum, subpixel_maximum.
- kwimage.subpixel_getvalue(img, pts, coord_axes=None, interp='bilinear', bordermode='edge')[source]¶
Get values at subpixel locations
- Parameters
img (ArrayLike) – image to sample from
pts (ArrayLike) – subpixel rc-coordinates to sample
coord_axes (Sequence, default=None) – axes to perform interpolation on, if not specified the first d axes are interpolated, where d=pts.shape[-1]. IE: this indicates which axes each coordinate dimension corresponds to.
interp (str) – interpolation mode
bordermode (str) – how locations outside the image are handled
Example
>>> from kwimage.util_warp import * # NOQA >>> img = np.arange(3 * 3).reshape(3, 3) >>> pts = np.array([[1, 1], [1.5, 1.5], [1.9, 1.1]]) >>> subpixel_getvalue(img, pts) array([4. , 6. , 6.8]) >>> subpixel_getvalue(img, pts, coord_axes=(1, 0)) array([4. , 6. , 5.2]) >>> # xdoctest: +REQUIRES(module:torch) >>> img = torch.Tensor(img) >>> pts = torch.Tensor(pts) >>> subpixel_getvalue(img, pts) tensor([4.0000, 6.0000, 6.8000]) >>> subpixel_getvalue(img.numpy(), pts.numpy(), interp='nearest') array([4., 8., 7.], dtype=float32) >>> subpixel_getvalue(img.numpy(), pts.numpy(), interp='nearest', coord_axes=[1, 0]) array([4., 8., 5.], dtype=float32) >>> subpixel_getvalue(img, pts, interp='nearest') tensor([4., 8., 7.])
References
stackoverflow.com/questions/12729228/simple-binlin-interp-images-numpy
- SeeAlso:
cv2.getRectSubPix(image, patchSize, center[, patch[, patchType]])
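For readers unfamiliar with bilinear sampling, the sketch below reproduces the first numpy results of the example on a nested list. It assumes row-column coordinates and 'edge'-style clamping at the border; the helper name `bilinear_getvalue_sketch` is hypothetical and handles only a single 2D point.

```python
import math


def bilinear_getvalue_sketch(img, r, c):
    """Sample a 2D list-of-lists image at a subpixel (row, col) location."""
    h, w = len(img), len(img[0])
    # 'edge'-style border handling: clamp the location into the image
    r = min(max(r, 0.0), h - 1.0)
    c = min(max(c, 0.0), w - 1.0)
    r0, c0 = int(math.floor(r)), int(math.floor(c))
    r1, c1 = min(r0 + 1, h - 1), min(c0 + 1, w - 1)
    fr, fc = r - r0, c - c0
    # Interpolate along columns first, then along rows
    top = img[r0][c0] * (1 - fc) + img[r0][c1] * fc
    bot = img[r1][c0] * (1 - fc) + img[r1][c1] * fc
    return top * (1 - fr) + bot * fr


img = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
for r, c in [(1, 1), (1.5, 1.5), (1.9, 1.1)]:
    print(round(bilinear_getvalue_sketch(img, r, c), 4))
```

Swapping the two coordinates of the last point corresponds to `coord_axes=(1, 0)` in the example and gives 5.2 instead of 6.8.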
- kwimage.subpixel_maximum(dst, src, index, interp_axes=None)[source]¶
Take the maximum of the source values array and the destination array at a particular subpixel index. Modifies the destination array.
- Parameters
dst (ArrayLike) – destination array to index into
src (ArrayLike) – source array that agrees with the index
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
Example
>>> dst = np.array([0, 1.0, 1.0, 1.0, 0]) >>> src = np.array([2.0, 2.0]) >>> index = [slice(1.6, 3.6)] >>> subpixel_maximum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0. , 1. , 2. , 1.2, 0. ])
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> dst = torch.zeros((1, 3, 5, 5)) + .5 >>> src = torch.ones((1, 3, 3, 3)) >>> index = (slice(None), slice(None), slice(1.4, 4.4), slice(1.25, 4.25)) >>> subpixel_maximum(dst, src, index) >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0)) np.array([[0.5 , 0.5 , 0.5 , 0.5 , 0.5 ], [0.5 , 0.5 , 0.6 , 0.6 , 0.5 ], [0.5 , 0.75, 1. , 1. , 0.5 ], [0.5 , 0.75, 1. , 1. , 0.5 ], [0.5 , 0.5 , 0.5 , 0.5 , 0.5 ]])
- kwimage.subpixel_minimum(dst, src, index, interp_axes=None)[source]¶
Take the minimum of the source values array and the destination array at a particular subpixel index. Modifies the destination array.
- Parameters
dst (ArrayLike) – destination array to index into
src (ArrayLike) – source array that agrees with the index
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
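Both subpixel_maximum and subpixel_minimum can be pictured as "align the source to integer cells, then combine elementwise". The one-dimensional sketch below follows that idea with plain lists; `align_1d_sketch` and `subpixel_combine_1d_sketch` are hypothetical names, and unlike the real functions the sketch returns a new list instead of modifying `dst`.

```python
import math


def align_1d_sketch(src, start):
    """Shift src onto integer cells; returns (aligned values, first index)."""
    base = math.floor(start)
    frac = start - base
    if frac == 0:
        return [float(v) for v in src], base
    aligned = [0.0] * (len(src) + 1)
    for i, value in enumerate(src):
        aligned[i] += (1.0 - frac) * value
        aligned[i + 1] += frac * value
    return aligned, base


def subpixel_combine_1d_sketch(dst, src, start, op):
    """Combine aligned src with dst using op (for example max or min)."""
    aligned, base = align_1d_sketch(src, start)
    out = list(dst)
    for offset, value in enumerate(aligned):
        j = base + offset
        if 0 <= j < len(out):
            out[j] = op(out[j], value)
    return out


dst = [0.0, 1.0, 1.0, 1.0, 0.0]
src = [2.0, 2.0]
print([round(v, 2) for v in subpixel_combine_1d_sketch(dst, src, 1.6, max)])
print([round(v, 2) for v in subpixel_combine_1d_sketch(dst, src, 1.6, min)])
```

Edge cells of the aligned source hold only a fraction of the source value, which is why the minimum can lower a destination pixel at the boundary (0.8 in the second print).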
Example
>>> dst = np.array([0, 1.0, 1.0, 1.0, 0]) >>> src = np.array([2.0, 2.0]) >>> index = [slice(1.6, 3.6)] >>> subpixel_minimum(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0. , 0.8, 1. , 1. , 0. ])
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> dst = torch.zeros((1, 3, 5, 5)) + .5 >>> src = torch.ones((1, 3, 3, 3)) >>> index = (slice(None), slice(None), slice(1.4, 4.4), slice(1.25, 4.25)) >>> subpixel_minimum(dst, src, index) >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0)) np.array([[0.5 , 0.5 , 0.5 , 0.5 , 0.5 ], [0.5 , 0.45, 0.5 , 0.5 , 0.15], [0.5 , 0.5 , 0.5 , 0.5 , 0.25], [0.5 , 0.5 , 0.5 , 0.5 , 0.25], [0.5 , 0.3 , 0.4 , 0.4 , 0.1 ]])
- kwimage.subpixel_set(dst, src, index, interp_axes=None)[source]¶
Set the values of the destination array to the source values at a particular subpixel index.
- Parameters
dst (ArrayLike) – destination array to write into
src (ArrayLike) – source array containing values to set
index (Tuple[slice]) – subpixel slice into dst that corresponds with src
interp_axes (tuple) – specify which axes should be spatially interpolated
Todo
[ ]: allow index to be a sequence of indices
Example
>>> import kwimage >>> dst = np.zeros(5) + .1 >>> src = np.ones(2) >>> index = [slice(1.5, 3.5)] >>> kwimage.util_warp.subpixel_set(dst, src, index) >>> print(ub.repr2(dst, precision=2, with_dtype=0)) np.array([0.1, 0.5, 1. , 0.5, 0.1])
- kwimage.subpixel_setvalue(img, pts, value, coord_axes=None, interp='bilinear', bordermode='edge')[source]¶
Set values at subpixel locations
- Parameters
img (ArrayLike) – image to set values in
pts (ArrayLike) – subpixel rc-coordinates to set
value (ArrayLike) – value to place in the image
coord_axes (Sequence, default=None) – axes to perform interpolation on, if not specified the first d axes are interpolated, where d=pts.shape[-1]. IE: this indicates which axes each coordinate dimension corresponds to.
interp (str) – interpolation mode
bordermode (str) – how locations outside the image are handled
Example
>>> from kwimage.util_warp import * # NOQA >>> img = np.arange(3 * 3).reshape(3, 3).astype(float) >>> pts = np.array([[1, 1], [1.5, 1.5], [1.9, 1.1]]) >>> interp = 'bilinear' >>> value = 0 >>> print('img = {!r}'.format(img)) >>> pts = np.array([[1.5, 1.5]]) >>> img2 = subpixel_setvalue(img.copy(), pts, value) >>> print('img2 = {!r}'.format(img2)) >>> pts = np.array([[1.0, 1.0]]) >>> img2 = subpixel_setvalue(img.copy(), pts, value) >>> print('img2 = {!r}'.format(img2)) >>> pts = np.array([[1.1, 1.9]]) >>> img2 = subpixel_setvalue(img.copy(), pts, value) >>> print('img2 = {!r}'.format(img2)) >>> img2 = subpixel_setvalue(img.copy(), pts, value, coord_axes=[1, 0]) >>> print('img2 = {!r}'.format(img2))
- kwimage.subpixel_slice(inputs, index)[source]¶
Take a subpixel slice from a larger image. The returned output is left-aligned with the requested slice.
- Parameters
inputs (ArrayLike) – data
index (Tuple[slice]) – a slice to subpixel accuracy
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> import kwimage >>> import torch >>> # say we have a (576, 576) input space >>> # and a (9, 9) output space downsampled by 64x >>> ospc_feats = np.tile(np.arange(9 * 9).reshape(1, 9, 9), (1024, 1, 1)) >>> inputs = torch.from_numpy(ospc_feats) >>> # We detected a box in the input space >>> ispc_bbox = kwimage.Boxes([[64, 65, 100, 120]], 'ltrb') >>> # Get coordinates in the output space >>> ospc_bbox = ispc_bbox.scale(1 / 64) >>> tl_x, tl_y, br_x, br_y = ospc_bbox.data[0] >>> # Convert the box to a slice >>> index = [slice(None), slice(tl_y, br_y), slice(tl_x, br_x)] >>> # Note: I'm not 100% sure this works right with non-integral slices >>> outputs = kwimage.subpixel_slice(inputs, index)
Example
>>> inputs = np.arange(5 * 5 * 3).reshape(5, 5, 3) >>> index = [slice(0, 3), slice(0, 3)] >>> outputs = subpixel_slice(inputs, index) >>> index = [slice(0.5, 3.5), slice(-0.5, 2.5)] >>> outputs = subpixel_slice(inputs, index)
>>> inputs = np.arange(5 * 5).reshape(1, 5, 5).astype(float) >>> index = [slice(None), slice(3, 6), slice(3, 6)] >>> outputs = subpixel_slice(inputs, index) >>> print(outputs) [[[18. 19. 0.] [23. 24. 0.] [ 0. 0. 0.]]] >>> index = [slice(None), slice(3.5, 6.5), slice(2.5, 5.5)] >>> outputs = subpixel_slice(inputs, index) >>> print(outputs) [[[20. 21. 10.75] [11.25 11.75 6. ] [ 0. 0. 0. ]]]
- kwimage.subpixel_translate(inputs, shift, interp_axes=None, output_shape=None)[source]¶
Translates an image by a subpixel shift value using bilinear interpolation
- Parameters
inputs (ArrayLike) – data to translate
shift (Sequence) – amount to translate each dimension specified by interp_axes. Note: if inputs contains more than one “image” then all “images” are translated by the same amount. This function contains no mechanism for translating each image differently. Note that by default this is a y,x shift for 2 dimensions.
interp_axes (Sequence, default=None) – axes to perform interpolation on, if not specified the final n axes are interpolated, where n=len(shift)
output_shape (tuple, default=None) – if specified the output is returned with this shape, otherwise
Notes
This function powers most other functions in this file. Speedups here can go a long way.
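A one-dimensional sketch of the idea, assuming linear interpolation with zeros outside the input: output position `i` reads the input at `i - shift`. The helper name `subpixel_translate_1d_sketch` is hypothetical; the real function operates on N-dimensional arrays and tensors.

```python
import math


def subpixel_translate_1d_sketch(values, shift):
    """Shift a 1D sequence by a fractional amount with linear interpolation."""
    n = len(values)

    def sample(k):
        # Zero padding outside the input
        return float(values[k]) if 0 <= k < n else 0.0

    out = []
    for i in range(n):
        pos = i - shift            # where this output cell reads from
        lo = math.floor(pos)
        frac = pos - lo
        out.append((1.0 - frac) * sample(lo) + frac * sample(lo + 1))
    return out


print(subpixel_translate_1d_sketch([1, 2, 3, 4, 5], 1.5))
```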
Example
>>> inputs = np.arange(5) + 1 >>> print(inputs.tolist()) [1, 2, 3, 4, 5] >>> outputs = subpixel_translate(inputs, 1.5) >>> print(outputs.tolist()) [0.0, 0.5, 1.5, 2.5, 3.5]
Example
>>> # xdoctest: +REQUIRES(module:torch) >>> inputs = torch.arange(9).view(1, 1, 3, 3).float() >>> print(inputs.long()) tensor([[[[0, 1, 2], [3, 4, 5], [6, 7, 8]]]]) >>> outputs = subpixel_translate(inputs, (-.4, .5), output_shape=(1, 1, 2, 5)) >>> print(outputs) tensor([[[[0.6000, 1.7000, 2.7000, 1.6000, 0.0000], [2.1000, 4.7000, 5.7000, 3.1000, 0.0000]]]])
- Ignore:
>>> inputs = np.arange(5) >>> shift = -.6 >>> interp_axes = None >>> subpixel_translate(inputs, -.6) >>> subpixel_translate(inputs[None, None, None, :], -.6) >>> inputs = np.arange(25).reshape(5, 5) >>> shift = (-1.6, 2.3) >>> interp_axes = (0, 1) >>> subpixel_translate(inputs, shift, interp_axes, output_shape=(9, 9)) >>> subpixel_translate(inputs, shift, interp_axes, output_shape=(3, 4))
- kwimage.warp_points(matrix, pts, homog_mode='divide')[source]¶
Warp ND points / coordinates using a transformation matrix.
Homogeneous coordinates are added on the fly if needed. Works with both numpy and torch.
- Parameters
matrix (ArrayLike) – [D1 x D2] transformation matrix. If using homogeneous coordinates, D2=D + 1; otherwise D2=D. If using homogeneous coordinates and the matrix represents an Affine transformation, then either D1=D or D1=D2, i.e. the last row of zeros and a one is optional.
pts (ArrayLike) – [N1 x … x D] points (usually x, y). If points are already in homogeneous space, then the output will be returned in homogeneous space. D is the dimensionality of the points. The leading axes may take any shape, but usually the shape will be [N x D], where N is the number of points.
homog_mode (str, default=’divide’) – what to do for homogeneous coordinates. Can be one of divide, keep, or drop.
- Returns
new_pts (ArrayLike) – the points after being transformed by the matrix
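The shape rules for the matrix can be illustrated with a small list-based sketch. It covers only the default 'divide' behavior for points given without a homogeneous coordinate, and `warp_points_sketch` is a hypothetical name rather than the library function.

```python
def warp_points_sketch(matrix, pts):
    """Warp D-dimensional points by a [D x D], [D x D+1], or [D+1 x D+1] matrix."""
    d = len(pts[0])
    n_cols = len(matrix[0])
    out = []
    for p in pts:
        # Add the homogeneous 1 only when the matrix expects it
        vec = (list(p) + [1.0]) if n_cols == d + 1 else list(p)
        res = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        if len(res) == d + 1:
            # A full square matrix returns a homogeneous result: divide it out
            res = [v / res[-1] for v in res[:-1]]
        out.append(res)
    return out


pts = [[1.0, 1.0]]
print(warp_points_sketch([[2, 0, 0], [0, 3, 0], [0, 0, 1]], pts))  # 3x3
print(warp_points_sketch([[1, 0, 5], [0, 1, -2]], pts))            # 2x3 affine
print(warp_points_sketch([[0, -1], [1, 0]], [[1.0, 0.0]]))         # 2x2 linear
```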
Example
>>> from kwimage.util_warp import * # NOQA >>> # --- with numpy >>> rng = np.random.RandomState(0) >>> pts = rng.rand(10, 2) >>> matrix = rng.rand(2, 2) >>> warp_points(matrix, pts) >>> # --- with torch >>> # xdoctest: +REQUIRES(module:torch) >>> pts = torch.Tensor(pts) >>> matrix = torch.Tensor(matrix) >>> warp_points(matrix, pts)
Example
>>> from kwimage.util_warp import * # NOQA >>> # --- with numpy >>> pts = np.ones((10, 2)) >>> matrix = np.diag([2, 3, 1]) >>> ra = warp_points(matrix, pts) >>> # xdoctest: +REQUIRES(module:torch) >>> rb = warp_points(torch.Tensor(matrix), torch.Tensor(pts)) >>> assert np.allclose(ra, rb.numpy())
Example
>>> from kwimage.util_warp import * # NOQA >>> # test different cases >>> rng = np.random.RandomState(0) >>> # Test 3x3 style projective matrices >>> pts = rng.rand(1000, 2) >>> matrix = rng.rand(3, 3) >>> ra33 = warp_points(matrix, pts) >>> # xdoctest: +REQUIRES(module:torch) >>> rb33 = warp_points(torch.Tensor(matrix), torch.Tensor(pts)) >>> assert np.allclose(ra33, rb33.numpy()) >>> # Test opencv style affine matrices >>> pts = rng.rand(10, 2) >>> matrix = rng.rand(2, 3) >>> ra23 = warp_points(matrix, pts) >>> rb23 = warp_points(torch.Tensor(matrix), torch.Tensor(pts)) >>> assert np.allclose(ra23, rb23.numpy())
- kwimage.warp_tensor(inputs, mat, output_dims, mode='bilinear', padding_mode='zeros', isinv=False, ishomog=None, align_corners=False, new_mode=False)[source]¶
A pytorch implementation of warp affine that works similarly to cv2.warpAffine / cv2.warpPerspective.
It is possible to use 3x3 transforms to warp 2D image data. It is also possible to use 4x4 transforms to warp 3D volumetric data.
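As a rough illustration of what warping with a 3x3 matrix involves, the sketch below applies the usual backward-mapping approach to a 2D image stored as nested lists: each output pixel is sent through the inverse matrix to find its source location, which is then sampled bilinearly with zero padding. This is a minimal sketch in plain Python, not the warp_tensor implementation. The function names are hypothetical, only affine matrices (last row [0, 0, 1]) are handled, and integer coordinates are assumed to index pixels directly, as in cv2.warpAffine.

```python
import math


def invert_affine_3x3(mat):
    """Invert a 3x3 affine matrix whose last row is [0, 0, 1]."""
    (a, b, tx), (c, d, ty) = mat[0], mat[1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    # Inverse of the 2x2 linear part.
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    # Inverse translation is -A^{-1} t.
    return [
        [ia, ib, -(ia * tx + ib * ty)],
        [ic, id_, -(ic * tx + id_ * ty)],
        [0.0, 0.0, 1.0],
    ]


def bilinear_sample(image, x, y):
    """Bilinearly interpolate at (x, y); pixels outside the image are 0."""
    height, width = len(image), len(image[0])
    x0, y0 = math.floor(x), math.floor(y)
    fx, fy = x - x0, y - y0
    value = 0.0
    # Visit the four neighbouring pixels with their interpolation weights.
    for yi, wy in ((y0, 1 - fy), (y0 + 1, fy)):
        for xi, wx in ((x0, 1 - fx), (x0 + 1, fx)):
            if 0 <= xi < width and 0 <= yi < height:
                value += wx * wy * image[yi][xi]
    return value


def warp_affine_sketch(image, mat, output_dims):
    """Warp image by the forward transform mat into an (H, W) output."""
    inv = invert_affine_3x3(mat)
    out_h, out_w = output_dims
    result = []
    for v in range(out_h):
        row = []
        for u in range(out_w):
            # Map the output pixel (u, v) back into the input image.
            x = inv[0][0] * u + inv[0][1] * v + inv[0][2]
            y = inv[1][0] * u + inv[1][1] * v + inv[1][2]
            row.append(bilinear_sample(image, x, y))
        result.append(row)
    return result


# Upscale a 1x2 image by a factor of 2 along both axes.
scale = [[2, 0, 0], [0, 2, 0], [0, 0, 1]]
print(warp_affine_sketch([[0, 2]], scale, (1, 4)))  # [[0.0, 1.0, 2.0, 1.0]]
```

The last output value falls off towards zero because the sample location lies halfway between the final pixel and the zero padding.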
- Parameters
inputs (Tensor[…, *DIMS]) – tensor to warp. Up to 3 (determined by output_dims) of the trailing space-time dimensions are warped. Best practice is to use inputs with the shape in [B, C, *DIMS].
mat (Tensor) – either a 3x3 / 4x4 single transformation matrix to apply to all inputs or Bx3x3 or Bx4x4 tensor that specifies a transformation matrix for each batch item.
output_dims (Tuple[int]*) – The output space-time dimensions. This can either be in the form (W,), (H, W), or (D, H, W).
mode (str) – Can be bilinear or nearest. See torch.nn.functional.grid_sample
padding_mode (str) – Can be zeros, border, or reflection. See torch.nn.functional.grid_sample.
isinv (bool, default=False) – Set to true if mat is the inverse transform
ishomog (bool, default=None) – Set to True if the matrix is non-affine
align_corners (bool, default=False) – Note the default of False does not work correctly with grid_sample in torch <= 1.2, but using align_corners=True isn’t typically what you want either. We will be stuck with buggy functionality until torch 1.3 is released.
However, using align_corners=0 does seem to reasonably correspond with opencv behavior.
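The effect of align_corners is easiest to see in how a pixel index maps to the normalized [-1, 1] coordinates that torch.nn.functional.grid_sample consumes. The sketch below is illustrative only: the helper name pixel_to_normalized is hypothetical and the formulas follow the convention described in the PyTorch documentation, not code taken from kwimage.

```python
def pixel_to_normalized(index, size, align_corners):
    """Map a pixel index along one axis to a [-1, 1] grid coordinate.

    Assumes size > 1 when align_corners is True.
    """
    if align_corners:
        # -1 and +1 refer to the centers of the first and last pixels.
        return 2.0 * index / (size - 1) - 1.0
    # -1 and +1 refer to the outer edges of the first and last pixels,
    # so pixel centers sit half a pixel inside the [-1, 1] range.
    return (2.0 * index + 1.0) / size - 1.0


size = 4
print([pixel_to_normalized(i, size, True) for i in (0, size - 1)])   # [-1.0, 1.0]
print([pixel_to_normalized(i, size, False) for i in (0, size - 1)])  # [-0.75, 0.75]
```

With align_corners=False the mapping depends only on the pixel size, which is why it tends to line up better with pixel-based conventions such as the one used by opencv.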
Notes
Also, it may be possible to speed up the code with F.affine_grid
- KNOWN ISSUE: There appears to be some difference with cv2.warpAffine when rotation or shear are non-zero. I’m not sure what the cause is. It may just be floating point issues, but I’m not sure.
Todo
[ ] FIXME: see example in Mask.scale where this algo breaks when the matrix is 2x3
[ ] Make this algo work when matrix is 2x2
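Until the 2x3 case is handled, one possible caller-side workaround is to pad an opencv-style 2x3 affine matrix to a full 3x3 homogeneous matrix before passing it on. This is an assumption about how a caller might cope, not something the source documents as the fix, and the helper name promote_affine_to_homogeneous is hypothetical. Nested lists are used here; the same row would be appended to a tensor in practice.

```python
def promote_affine_to_homogeneous(mat_2x3):
    """Append the row [0, 0, 1] to a 2x3 affine matrix, giving a 3x3 matrix."""
    if len(mat_2x3) != 2 or any(len(row) != 3 for row in mat_2x3):
        raise ValueError("expected a 2x3 matrix")
    # Copy the rows so the caller's matrix is not modified.
    return [list(mat_2x3[0]), list(mat_2x3[1]), [0.0, 0.0, 1.0]]


cv2_style = [[0.5, 0.0, 1.0],
             [0.0, 2.0, -1.0]]
print(promote_affine_to_homogeneous(cv2_style))
```

The padded matrix represents the same affine transform, since the extra row leaves the homogeneous coordinate at 1.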
References
https://discuss.pytorch.org/t/affine-transformation-matrix-paramters-conversion/19522
https://github.com/pytorch/pytorch/issues/15386
Example
>>> # Create a relatively simple affine matrix
>>> # xdoctest: +REQUIRES(module:torch)
>>> import skimage
>>> mat = torch.FloatTensor(skimage.transform.AffineTransform(
>>>     translation=[1, -1], scale=[.532, 2],
>>>     rotation=0, shear=0,
>>> ).params)
>>> # Create inputs and an output dimension
>>> input_shape = [1, 1, 4, 5]
>>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
>>> output_dims = (11, 7)
>>> # Warp with our code
>>> result1 = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0)
>>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2)))
>>> # Warp with opencv
>>> import cv2
>>> cv2_M = mat.cpu().numpy()[0:2]
>>> src = inputs[0, 0].cpu().numpy()
>>> dsize = tuple(output_dims[::-1])
>>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR)
>>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
>>> # Ensure the results are the same (up to floating point errors)
>>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1e-2, rtol=1e-2))
Example
>>> # Create a relatively simple affine matrix
>>> # xdoctest: +REQUIRES(module:torch)
>>> import skimage
>>> mat = torch.FloatTensor(skimage.transform.AffineTransform(
>>>     rotation=0.01, shear=0.1).params)
>>> # Create inputs and an output dimension
>>> input_shape = [1, 1, 4, 5]
>>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
>>> output_dims = (11, 7)
>>> # Warp with our code
>>> result1 = warp_tensor(inputs, mat, output_dims=output_dims)
>>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2, supress_small=True)))
>>> print('result1.shape = {}'.format(result1.shape))
>>> # Warp with opencv
>>> import cv2
>>> cv2_M = mat.cpu().numpy()[0:2]
>>> src = inputs[0, 0].cpu().numpy()
>>> dsize = tuple(output_dims[::-1])
>>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR)
>>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
>>> print('result2.shape = {}'.format(result2.shape))
>>> # Ensure the results are the same (up to floating point errors)
>>> # NOTE: The floating point errors seem to be significant for rotation / shear
>>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1, rtol=1e-2))
Example
>>> # Create a random affine matrix
>>> # xdoctest: +REQUIRES(module:torch)
>>> import skimage
>>> rng = np.random.RandomState(0)
>>> mat = torch.FloatTensor(skimage.transform.AffineTransform(
>>>     translation=rng.randn(2), scale=1 + rng.randn(2),
>>>     rotation=rng.randn() / 10., shear=rng.randn() / 10.,
>>> ).params)
>>> # Create inputs and an output dimension
>>> input_shape = [1, 1, 5, 7]
>>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
>>> output_dims = (3, 11)
>>> # Warp with our code
>>> result1 = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0)
>>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2)))
>>> # Warp with opencv
>>> import cv2
>>> cv2_M = mat.cpu().numpy()[0:2]
>>> src = inputs[0, 0].cpu().numpy()
>>> dsize = tuple(output_dims[::-1])
>>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR)
>>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
>>> # Ensure the results are the same (up to floating point errors)
>>> # NOTE: The errors seem to be significant for rotation / shear
>>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1, rtol=1e-2))
Example
>>> # Test 3D warping with identity
>>> # xdoctest: +REQUIRES(module:torch)
>>> mat = torch.eye(4)
>>> input_dims = [2, 3, 3]
>>> output_dims = (2, 3, 3)
>>> input_shape = [1, 1] + input_dims
>>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
>>> result = warp_tensor(inputs, mat, output_dims=output_dims)
>>> print('result =\n{}'.format(ub.repr2(result.cpu().numpy()[0, 0], precision=2)))
>>> assert torch.all(inputs == result)
Example
>>> # Test 3D warping with scaling
>>> # xdoctest: +REQUIRES(module:torch)
>>> mat = torch.FloatTensor([
>>>     [0.8,   0,   0, 0],
>>>     [  0, 1.0,   0, 0],
>>>     [  0,   0, 1.2, 0],
>>>     [  0,   0,   0, 1],
>>> ])
>>> input_dims = [2, 3, 3]
>>> output_dims = (2, 3, 3)
>>> input_shape = [1, 1] + input_dims
>>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
>>> result = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0)
>>> print('result =\n{}'.format(ub.repr2(result.cpu().numpy()[0, 0], precision=2)))
result =
np.array([[[ 0.  ,  1.25,  1.  ],
           [ 3.  ,  4.25,  2.5 ],
           [ 6.  ,  7.25,  4.  ]],
          ...
          [[ 7.5 ,  8.75,  4.75],
           [10.5 , 11.75,  6.25],
           [13.5 , 14.75,  7.75]]], dtype=np.float32)
Example
>>> # xdoctest: +REQUIRES(module:torch)
>>> mat = torch.eye(3)
>>> input_dims = [5, 7]
>>> output_dims = (11, 7)
>>> for n_prefix_dims in [0, 1, 2, 3, 4, 5]:
>>>     input_shape = [2] * n_prefix_dims + input_dims
>>>     inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
>>>     result = warp_tensor(inputs, mat, output_dims=output_dims)
>>>     #print('result =\n{}'.format(ub.repr2(result.cpu().numpy(), precision=2)))
>>>     print(result.shape)
Example
>>> # xdoctest: +REQUIRES(module:torch)
>>> mat = torch.eye(4)
>>> input_dims = [5, 5, 5]
>>> output_dims = (6, 6, 6)
>>> for n_prefix_dims in [0, 1, 2, 3, 4, 5]:
>>>     input_shape = [2] * n_prefix_dims + input_dims
>>>     inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
>>>     result = warp_tensor(inputs, mat, output_dims=output_dims)
>>>     #print('result =\n{}'.format(ub.repr2(result.cpu().numpy(), precision=2)))
>>>     print(result.shape)
- Ignore:
import xdev
globals().update(xdev.get_func_kwargs(warp_tensor))
>>> # xdoctest: +REQUIRES(module:torch)
>>> import cv2
>>> inputs = torch.arange(9).view(1, 1, 3, 3).float() + 2
>>> input_dims = inputs.shape[2:]
>>> #output_dims = (6, 6)
>>> def fmt(a):
>>>     return ub.repr2(a.numpy(), precision=2)
>>> s = 2.5
>>> output_dims = tuple(np.round((np.array(input_dims) * s)).astype(int).tolist())
>>> mat = torch.FloatTensor([[s, 0, 0], [0, s, 0], [0, 0, 1]])
>>> inv = mat.inverse()
>>> warp_tensor(inputs, mat, output_dims)
>>> print('## INPUTS')
>>> print(fmt(inputs))
>>> print('\nalign_corners=True')
>>> print('----')
>>> print('## warp_tensor, align_corners=True')
>>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=True)))
>>> print('## interpolate, align_corners=True')
>>> print(fmt(F.interpolate(inputs, output_dims, mode='bilinear', align_corners=True)))
>>> print('\nalign_corners=False')
>>> print('----')
>>> print('## warp_tensor, align_corners=False, new_mode=False')
>>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=False)))
>>> print('## warp_tensor, align_corners=False, new_mode=True')
>>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=False, new_mode=True)))
>>> print('## interpolate, align_corners=False')
>>> print(fmt(F.interpolate(inputs, output_dims, mode='bilinear', align_corners=False)))
>>> print('## interpolate (scale), align_corners=False')
>>> print(ub.repr2(F.interpolate(inputs, scale_factor=s, mode='bilinear', align_corners=False).numpy(), precision=2))
>>> cv2_M = mat.cpu().numpy()[0:2]
>>> src = inputs[0, 0].cpu().numpy()
>>> dsize = tuple(output_dims[::-1])
>>> print('\nOpen CV warp Result')
>>> result2 = (cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR))
>>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
- 1
Created with sphinx-autoapi