:py:mod:`kwimage.util_warp`
===========================

.. py:module:: kwimage.util_warp

.. autoapi-nested-parse::

   .. todo::

      - [ ] Replace internal padded slice with kwarray.padded_slice


Module Contents
---------------

Functions
~~~~~~~~~

.. autoapisummary::

   kwimage.util_warp._coordinate_grid
   kwimage.util_warp.warp_image
   kwimage.util_warp.warp_tensor
   kwimage.util_warp.subpixel_align
   kwimage.util_warp.subpixel_set
   kwimage.util_warp.subpixel_accum
   kwimage.util_warp.subpixel_maximum
   kwimage.util_warp.subpixel_minimum
   kwimage.util_warp.subpixel_slice
   kwimage.util_warp.subpixel_translate
   kwimage.util_warp._padded_slice
   kwimage.util_warp._ensure_arraylike
   kwimage.util_warp._rectify_slice
   kwimage.util_warp._warp_tensor_cv2
   kwimage.util_warp.warp_points
   kwimage.util_warp.remove_homog
   kwimage.util_warp.add_homog
   kwimage.util_warp.subpixel_getvalue
   kwimage.util_warp.subpixel_setvalue
   kwimage.util_warp._bilinear_coords


Attributes
~~~~~~~~~~

.. autoapisummary::

   kwimage.util_warp.TORCH_GRID_SAMPLE_HAS_ALIGN


.. py:data:: TORCH_GRID_SAMPLE_HAS_ALIGN

.. py:function:: _coordinate_grid(dims, align_corners=False)

   Creates a homogeneous coordinate system.

   :Parameters: * **dims** (*Tuple[int]*) -- height / width or depth / height / width
                * **align_corners** (*bool*) -- if True, returns a grid where the left and
                  right corners are assigned the extreme values and intermediate values
                  are interpolated.

   :returns: Tensor[shape=(3, *DIMS)]

   .. rubric:: References

   https://github.com/ClementPinard/SfmLearner-Pytorch/blob/master/inverse_warp.py

   .. rubric:: Example

   >>> # xdoctest: +IGNORE_WHITESPACE
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> _coordinate_grid((2, 2))
   tensor([[[0., 1.],
            [0., 1.]],
           [[0., 0.],
            [1., 1.]],
           [[1., 1.],
            [1., 1.]]])
   >>> _coordinate_grid((2, 2, 2))
   >>> _coordinate_grid((2, 2), align_corners=True)
   tensor([[[0., 2.],
            [0., 2.]],
           [[0., 0.],
            [2., 2.]],
           [[1., 1.],
            [1., 1.]]])
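As the doctests above show, the grid is just a stack of column coordinates,
row coordinates, and a plane of ones that makes the coordinates homogeneous.
A minimal numpy sketch of the 2D case, consistent with the two doctests above
(illustrative only; the real function is torch-based and the hypothetical
align_corners scaling here is inferred from those doctests):

.. code-block:: python

   import numpy as np

   def coordinate_grid_2d(dims, align_corners=False):
       h, w = dims
       ys = np.arange(h, dtype=float)
       xs = np.arange(w, dtype=float)
       if align_corners:
           # stretch so the last index lands on the extreme value
           ys = ys * (h / (h - 1))
           xs = xs * (w / (w - 1))
       xx, yy = np.meshgrid(xs, ys)      # xx varies along columns, yy along rows
       ones = np.ones((h, w))            # the homogeneous coordinate
       return np.stack([xx, yy, ones])

   coordinate_grid_2d((2, 2))            # matches the first doctest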
.. py:function:: warp_image(inputs, mat, **kw)

.. py:function:: warp_tensor(inputs, mat, output_dims, mode='bilinear', padding_mode='zeros', isinv=False, ishomog=None, align_corners=False, new_mode=False)

   A PyTorch implementation of warp affine that works similarly to
   cv2.warpAffine / cv2.warpPerspective.

   It is possible to use 3x3 transforms to warp 2D image data.
   It is also possible to use 4x4 transforms to warp 3D volumetric data.

   :Parameters: * **inputs** (*Tensor[..., *DIMS]*) -- tensor to warp. Up to 3 (determined
                  by output_dims) of the trailing space-time dimensions are warped. Best
                  practice is to use inputs with shape [B, C, *DIMS].
                * **mat** (*Tensor*) -- either a single 3x3 / 4x4 transformation matrix to
                  apply to all inputs or a Bx3x3 or Bx4x4 tensor that specifies a
                  transformation matrix for each batch item.
                * **output_dims** (*Tuple[int]*) -- The output space-time dimensions. This
                  can either be in the form (W,), (H, W), or (D, H, W).
                * **mode** (*str*) -- Can be bilinear or nearest.
                  See `torch.nn.functional.grid_sample`.
                * **padding_mode** (*str*) -- Can be zeros, border, or reflection.
                  See `torch.nn.functional.grid_sample`.
                * **isinv** (*bool, default=False*) -- Set to True if `mat` is the inverse
                  transform.
                * **ishomog** (*bool, default=None*) -- Set to True if the matrix is
                  non-affine.
                * **align_corners** (*bool, default=False*) -- Note the default of False
                  does not work correctly with grid_sample in torch <= 1.2, but using
                  align_corners=True isn't typically what you want either. We will be
                  stuck with buggy functionality until torch 1.3 is released. However,
                  using align_corners=0 does seem to reasonably correspond with OpenCV
                  behavior.

   .. rubric:: Notes

   Also, it may be possible to speed up the code with `F.affine_grid`.

   KNOWN ISSUE: There appears to be some difference with cv2.warpAffine when
   rotation or shear are non-zero. The cause is unclear; it may just be
   floating point error.

   .. todo::

      - [ ] FIXME: see example in Mask.scale where this algo breaks when the
        matrix is `2x3`
      - [ ] Make this algo work when the matrix is 2x2

   .. rubric:: References

   https://discuss.pytorch.org/t/affine-transformation-matrix-paramters-conversion/19522
   https://github.com/pytorch/pytorch/issues/15386

   .. rubric:: Example

   >>> # Create a relatively simple affine matrix
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> import skimage
   >>> mat = torch.FloatTensor(skimage.transform.AffineTransform(
   >>>     translation=[1, -1], scale=[.532, 2],
   >>>     rotation=0, shear=0,
   >>> ).params)
   >>> # Create inputs and an output dimension
   >>> input_shape = [1, 1, 4, 5]
   >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
   >>> output_dims = (11, 7)
   >>> # Warp with our code
   >>> result1 = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0)
   >>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2)))
   >>> # Warp with opencv
   >>> import cv2
   >>> cv2_M = mat.cpu().numpy()[0:2]
   >>> src = inputs[0, 0].cpu().numpy()
   >>> dsize = tuple(output_dims[::-1])
   >>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR)
   >>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
   >>> # Ensure the results are the same (up to floating point errors)
   >>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1e-2, rtol=1e-2))

   .. rubric:: Example

   >>> # Create a relatively simple affine matrix
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> import skimage
   >>> mat = torch.FloatTensor(skimage.transform.AffineTransform(
   >>>     rotation=0.01, shear=0.1).params)
   >>> # Create inputs and an output dimension
   >>> input_shape = [1, 1, 4, 5]
   >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
   >>> output_dims = (11, 7)
   >>> # Warp with our code
   >>> result1 = warp_tensor(inputs, mat, output_dims=output_dims)
   >>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2, suppress_small=True)))
   >>> print('result1.shape = {}'.format(result1.shape))
   >>> # Warp with opencv
   >>> import cv2
   >>> cv2_M = mat.cpu().numpy()[0:2]
   >>> src = inputs[0, 0].cpu().numpy()
   >>> dsize = tuple(output_dims[::-1])
   >>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR)
   >>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
   >>> print('result2.shape = {}'.format(result2.shape))
   >>> # Ensure the results are the same (up to floating point errors)
   >>> # NOTE: The floating point errors seem to be significant for rotation / shear
   >>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1, rtol=1e-2))
   .. rubric:: Example

   >>> # Create a random affine matrix
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> import skimage
   >>> rng = np.random.RandomState(0)
   >>> mat = torch.FloatTensor(skimage.transform.AffineTransform(
   >>>     translation=rng.randn(2), scale=1 + rng.randn(2),
   >>>     rotation=rng.randn() / 10., shear=rng.randn() / 10.,
   >>> ).params)
   >>> # Create inputs and an output dimension
   >>> input_shape = [1, 1, 5, 7]
   >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
   >>> output_dims = (3, 11)
   >>> # Warp with our code
   >>> result1 = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0)
   >>> print('result1 =\n{}'.format(ub.repr2(result1.cpu().numpy()[0, 0], precision=2)))
   >>> # Warp with opencv
   >>> import cv2
   >>> cv2_M = mat.cpu().numpy()[0:2]
   >>> src = inputs[0, 0].cpu().numpy()
   >>> dsize = tuple(output_dims[::-1])
   >>> result2 = cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR)
   >>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
   >>> # Ensure the results are the same (up to floating point errors)
   >>> # NOTE: The errors seem to be significant for rotation / shear
   >>> assert np.all(np.isclose(result1[0, 0].cpu().numpy(), result2, atol=1, rtol=1e-2))

   .. rubric:: Example

   >>> # Test 3D warping with identity
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> mat = torch.eye(4)
   >>> input_dims = [2, 3, 3]
   >>> output_dims = (2, 3, 3)
   >>> input_shape = [1, 1] + input_dims
   >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
   >>> result = warp_tensor(inputs, mat, output_dims=output_dims)
   >>> print('result =\n{}'.format(ub.repr2(result.cpu().numpy()[0, 0], precision=2)))
   >>> assert torch.all(inputs == result)

   .. rubric:: Example

   >>> # Test 3D warping with scaling
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> mat = torch.FloatTensor([
   >>>     [0.8,   0,   0, 0],
   >>>     [  0, 1.0,   0, 0],
   >>>     [  0,   0, 1.2, 0],
   >>>     [  0,   0,   0, 1],
   >>> ])
   >>> input_dims = [2, 3, 3]
   >>> output_dims = (2, 3, 3)
   >>> input_shape = [1, 1] + input_dims
   >>> inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
   >>> result = warp_tensor(inputs, mat, output_dims=output_dims, align_corners=0)
   >>> print('result =\n{}'.format(ub.repr2(result.cpu().numpy()[0, 0], precision=2)))
   result = np.array([[[ 0.  ,  1.25,  1.  ],
                       [ 3.  ,  4.25,  2.5 ],
                       [ 6.  ,  7.25,  4.  ]],
                      ...
                      [[ 7.5 ,  8.75,  4.75],
                       [10.5 , 11.75,  6.25],
                       [13.5 , 14.75,  7.75]]], dtype=np.float32)

   .. rubric:: Example

   >>> # xdoctest: +REQUIRES(module:torch)
   >>> mat = torch.eye(3)
   >>> input_dims = [5, 7]
   >>> output_dims = (11, 7)
   >>> for n_prefix_dims in [0, 1, 2, 3, 4, 5]:
   >>>     input_shape = [2] * n_prefix_dims + input_dims
   >>>     inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
   >>>     result = warp_tensor(inputs, mat, output_dims=output_dims)
   >>>     #print('result =\n{}'.format(ub.repr2(result.cpu().numpy(), precision=2)))
   >>>     print(result.shape)
   .. rubric:: Example

   >>> # xdoctest: +REQUIRES(module:torch)
   >>> mat = torch.eye(4)
   >>> input_dims = [5, 5, 5]
   >>> output_dims = (6, 6, 6)
   >>> for n_prefix_dims in [0, 1, 2, 3, 4, 5]:
   >>>     input_shape = [2] * n_prefix_dims + input_dims
   >>>     inputs = torch.arange(int(np.prod(input_shape))).reshape(*input_shape).float()
   >>>     result = warp_tensor(inputs, mat, output_dims=output_dims)
   >>>     #print('result =\n{}'.format(ub.repr2(result.cpu().numpy(), precision=2)))
   >>>     print(result.shape)

   Ignore:
       import xdev
       globals().update(xdev.get_func_kwargs(warp_tensor))

       >>> # xdoctest: +REQUIRES(module:torch)
       >>> import cv2
       >>> inputs = torch.arange(9).view(1, 1, 3, 3).float() + 2
       >>> input_dims = inputs.shape[2:]
       >>> #output_dims = (6, 6)
       >>> def fmt(a):
       >>>     return ub.repr2(a.numpy(), precision=2)
       >>> s = 2.5
       >>> output_dims = tuple(np.round((np.array(input_dims) * s)).astype(int).tolist())
       >>> mat = torch.FloatTensor([[s, 0, 0], [0, s, 0], [0, 0, 1]])
       >>> inv = mat.inverse()
       >>> warp_tensor(inputs, mat, output_dims)
       >>> print('## INPUTS')
       >>> print(fmt(inputs))
       >>> print('\nalign_corners=True')
       >>> print('----')
       >>> print('## warp_tensor, align_corners=True')
       >>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=True)))
       >>> print('## interpolate, align_corners=True')
       >>> print(fmt(F.interpolate(inputs, output_dims, mode='bilinear', align_corners=True)))
       >>> print('\nalign_corners=False')
       >>> print('----')
       >>> print('## warp_tensor, align_corners=False, new_mode=False')
       >>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=False)))
       >>> print('## warp_tensor, align_corners=False, new_mode=True')
       >>> print(fmt(warp_tensor(inputs, inv, output_dims, isinv=True, align_corners=False, new_mode=True)))
       >>> print('## interpolate, align_corners=False')
       >>> print(fmt(F.interpolate(inputs, output_dims, mode='bilinear', align_corners=False)))
       >>> print('## interpolate (scale), align_corners=False')
       >>> print(ub.repr2(F.interpolate(inputs, scale_factor=s, mode='bilinear', align_corners=False).numpy(), precision=2))
       >>> cv2_M = mat.cpu().numpy()[0:2]
       >>> src = inputs[0, 0].cpu().numpy()
       >>> dsize = tuple(output_dims[::-1])
       >>> print('\nOpen CV warp Result')
       >>> result2 = (cv2.warpAffine(src, cv2_M, dsize=dsize, flags=cv2.INTER_LINEAR))
       >>> print('result2 =\n{}'.format(ub.repr2(result2, precision=2)))
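To make the align_corners discussion above concrete: `grid_sample` consumes
coordinates normalized to [-1, 1], so a pixel-space matrix has to be converted
into a normalized sampling grid. Below is a much-simplified sketch of that
recipe (hypothetical helper names; single 3x3 matrix and 2D inputs only; the
real warp_tensor also handles batched matrices, 3D volumes, and older torch
versions):

.. code-block:: python

   import torch
   import torch.nn.functional as F

   def _normalize_coords(pix, size, align_corners=False):
       # align_corners=True:  -1 and +1 are the CENTERS of the corner pixels
       # align_corners=False: -1 and +1 are the outer EDGES of the corner pixels
       if align_corners:
           return pix * (2.0 / (size - 1)) - 1.0
       return (2.0 * pix + 1.0) / size - 1.0

   def _warp_2d_sketch(inputs, mat, output_dims, align_corners=False):
       B, C, H_in, W_in = inputs.shape
       H, W = output_dims
       # homogeneous pixel coordinates of the output grid
       ys = torch.arange(H, dtype=torch.float32).view(H, 1).expand(H, W)
       xs = torch.arange(W, dtype=torch.float32).view(1, W).expand(H, W)
       dst = torch.stack([xs.reshape(-1), ys.reshape(-1), torch.ones(H * W)])
       # pull output coordinates back into input space with the inverse matrix
       src = torch.linalg.inv(mat) @ dst
       src = src[:2] / src[2:3]                   # divide out the homogeneous part
       gx = _normalize_coords(src[0], W_in, align_corners).reshape(H, W)
       gy = _normalize_coords(src[1], H_in, align_corners).reshape(H, W)
       grid = torch.stack([gx, gy], dim=-1)[None].expand(B, H, W, 2)
       return F.grid_sample(inputs, grid, align_corners=align_corners)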
.. py:function:: subpixel_align(dst, src, index, interp_axes=None)

   Returns an aligned version of the source tensor and destination index.

   Used as the backend to implement other subpixel functions like:
   subpixel_accum, subpixel_maximum.

.. py:function:: subpixel_set(dst, src, index, interp_axes=None)

   Set the source values into the destination array at a particular subpixel
   index.

   :Parameters: * **dst** (*ArrayLike*) -- destination array
                * **src** (*ArrayLike*) -- source array containing values to set
                * **index** (*Tuple[slice]*) -- subpixel slice into dst that corresponds
                  with src
                * **interp_axes** (*tuple*) -- specify which axes should be spatially
                  interpolated

   .. todo::

      - [ ] allow index to be a sequence of indices

   .. rubric:: Example

   >>> import kwimage
   >>> dst = np.zeros(5) + .1
   >>> src = np.ones(2)
   >>> index = [slice(1.5, 3.5)]
   >>> kwimage.util_warp.subpixel_set(dst, src, index)
   >>> print(ub.repr2(dst, precision=2, with_dtype=0))
   np.array([0.1, 0.5, 1. , 0.5, 0.1])

.. py:function:: subpixel_accum(dst, src, index, interp_axes=None)

   Add the source values array into the destination array at a particular
   subpixel index.

   :Parameters: * **dst** (*ArrayLike*) -- destination accumulation array
                * **src** (*ArrayLike*) -- source array containing values to add
                * **index** (*Tuple[slice]*) -- subpixel slice into dst that corresponds
                  with src
                * **interp_axes** (*tuple*) -- specify which axes should be spatially
                  interpolated

   .. rubric:: Notes

   Inputs::

       +---+---+---+---+---+  dst.shape = (5,)
             +---+---+        src.shape = (2,)
             |=======|        index = 1.5:3.5

   Subpixel shift the source by -0.5. When the index is non-integral, pad the
   aligned src with an extra value to ensure all dst pixels that would be
   influenced by the smaller subpixel shape are influenced by the aligned src.
   Note that we are not scaling::

       +---+---+---+          aligned_src.shape = (3,)
       |===========|          aligned_index = 1:4

   .. rubric:: Example

   >>> dst = np.zeros(5)
   >>> src = np.ones(2)
   >>> index = [slice(1.5, 3.5)]
   >>> subpixel_accum(dst, src, index)
   >>> print(ub.repr2(dst, precision=2, with_dtype=0))
   np.array([0. , 0.5, 1. , 0.5, 0. ])

   .. rubric:: Example

   >>> dst = np.zeros((6, 6))
   >>> src = np.ones((3, 3))
   >>> index = (slice(1.5, 4.5), slice(1, 4))
   >>> subpixel_accum(dst, src, index)
   >>> print(ub.repr2(dst, precision=2, with_dtype=0))
   np.array([[0. , 0. , 0. , 0. , 0. , 0. ],
             [0. , 0.5, 0.5, 0.5, 0. , 0. ],
             [0. , 1. , 1. , 1. , 0. , 0. ],
             [0. , 1. , 1. , 1. , 0. , 0. ],
             [0. , 0.5, 0.5, 0.5, 0. , 0. ],
             [0. , 0. , 0. , 0. , 0. , 0. ]])
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> dst = torch.zeros((1, 3, 6, 6))
   >>> src = torch.ones((1, 3, 3, 3))
   >>> index = (slice(None), slice(None), slice(1.5, 4.5), slice(1.25, 4.25))
   >>> subpixel_accum(dst, src, index)
   >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0))
   np.array([[0.  , 0.  , 0.  , 0.  , 0.  , 0.  ],
             [0.  , 0.38, 0.5 , 0.5 , 0.12, 0.  ],
             [0.  , 0.75, 1.  , 1.  , 0.25, 0.  ],
             [0.  , 0.75, 1.  , 1.  , 0.25, 0.  ],
             [0.  , 0.38, 0.5 , 0.5 , 0.12, 0.  ],
             [0.  , 0.  , 0.  , 0.  , 0.  , 0.  ]])

   Doctest:
       >>> # TODO: move to a unit test file
       >>> subpixel_accum(np.zeros(5), np.ones(2), [slice(1.5, 3.5)]).tolist()
       [0.0, 0.5, 1.0, 0.5, 0.0]
       >>> subpixel_accum(np.zeros(5), np.ones(2), [slice(0, 2)]).tolist()
       [1.0, 1.0, 0.0, 0.0, 0.0]
       >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(.5, 3.5)]).tolist()
       [0.5, 1.0, 1.0, 0.5, 0.0]
       >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(-1, 2)]).tolist()
       [1.0, 1.0, 0.0, 0.0, 0.0]
       >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(-1.5, 1.5)]).tolist()
       [1.0, 0.5, 0.0, 0.0, 0.0]
       >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(10, 13)]).tolist()
       [0.0, 0.0, 0.0, 0.0, 0.0]
       >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(3.25, 6.25)]).tolist()
       [0.0, 0.0, 0.0, 0.75, 1.0]
       >>> subpixel_accum(np.zeros(5), np.ones(3), [slice(4.9, 7.9)]).tolist()
       [0.0, 0.0, 0.0, 0.0, 0.099...]
       >>> subpixel_accum(np.zeros(5), np.ones(9), [slice(-1.5, 7.5)]).tolist()
       [1.0, 1.0, 1.0, 1.0, 1.0]
       >>> subpixel_accum(np.zeros(5), np.ones(9), [slice(2.625, 11.625)]).tolist()
       [0.0, 0.0, 0.375, 1.0, 1.0]
       >>> subpixel_accum(np.zeros(5), 1, [slice(2.625, 11.625)]).tolist()
       [0.0, 0.0, 0.375, 1.0, 1.0]
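The alignment logic in the Notes above can be captured in a few lines of numpy
for the 1D case. This is an illustrative sketch (a hypothetical helper, not
the library implementation) that reproduces the doctests:

.. code-block:: python

   import numpy as np

   def subpixel_accum_1d(dst, src, start):
       # The fractional part of the start index determines how each source
       # value is split between the two destination cells it overlaps.
       lo = int(np.floor(start))
       shift = start - lo
       aligned = np.zeros(len(src) + 1)
       aligned[:-1] += (1 - shift) * src
       aligned[1:] += shift * src
       # Clip the aligned block to the destination bounds and accumulate.
       lo_c = max(lo, 0)
       hi_c = min(lo + len(aligned), len(dst))
       if lo_c < hi_c:
           dst[lo_c:hi_c] += aligned[lo_c - lo:hi_c - lo]
       return dst

   subpixel_accum_1d(np.zeros(5), np.ones(2), 1.5).tolist()
   # [0.0, 0.5, 1.0, 0.5, 0.0], matching the first doctest above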
.. py:function:: subpixel_maximum(dst, src, index, interp_axes=None)

   Take the elementwise maximum of the source values array and the destination
   array at a particular subpixel index. Modifies the destination array.

   :Parameters: * **dst** (*ArrayLike*) -- destination array to index into
                * **src** (*ArrayLike*) -- source array that agrees with the index
                * **index** (*Tuple[slice]*) -- subpixel slice into dst that corresponds
                  with src
                * **interp_axes** (*tuple*) -- specify which axes should be spatially
                  interpolated

   .. rubric:: Example

   >>> dst = np.array([0, 1.0, 1.0, 1.0, 0])
   >>> src = np.array([2.0, 2.0])
   >>> index = [slice(1.6, 3.6)]
   >>> subpixel_maximum(dst, src, index)
   >>> print(ub.repr2(dst, precision=2, with_dtype=0))
   np.array([0. , 1. , 2. , 1.2, 0. ])

   .. rubric:: Example

   >>> # xdoctest: +REQUIRES(module:torch)
   >>> dst = torch.zeros((1, 3, 5, 5)) + .5
   >>> src = torch.ones((1, 3, 3, 3))
   >>> index = (slice(None), slice(None), slice(1.4, 4.4), slice(1.25, 4.25))
   >>> subpixel_maximum(dst, src, index)
   >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0))
   np.array([[0.5 , 0.5 , 0.5 , 0.5 , 0.5 ],
             [0.5 , 0.5 , 0.6 , 0.6 , 0.5 ],
             [0.5 , 0.75, 1.  , 1.  , 0.5 ],
             [0.5 , 0.75, 1.  , 1.  , 0.5 ],
             [0.5 , 0.5 , 0.5 , 0.5 , 0.5 ]])

.. py:function:: subpixel_minimum(dst, src, index, interp_axes=None)

   Take the elementwise minimum of the source values array and the destination
   array at a particular subpixel index. Modifies the destination array.

   :Parameters: * **dst** (*ArrayLike*) -- destination array to index into
                * **src** (*ArrayLike*) -- source array that agrees with the index
                * **index** (*Tuple[slice]*) -- subpixel slice into dst that corresponds
                  with src
                * **interp_axes** (*tuple*) -- specify which axes should be spatially
                  interpolated

   .. rubric:: Example

   >>> dst = np.array([0, 1.0, 1.0, 1.0, 0])
   >>> src = np.array([2.0, 2.0])
   >>> index = [slice(1.6, 3.6)]
   >>> subpixel_minimum(dst, src, index)
   >>> print(ub.repr2(dst, precision=2, with_dtype=0))
   np.array([0. , 0.8, 1. , 1. , 0. ])

   .. rubric:: Example

   >>> # xdoctest: +REQUIRES(module:torch)
   >>> dst = torch.zeros((1, 3, 5, 5)) + .5
   >>> src = torch.ones((1, 3, 3, 3))
   >>> index = (slice(None), slice(None), slice(1.4, 4.4), slice(1.25, 4.25))
   >>> subpixel_minimum(dst, src, index)
   >>> print(ub.repr2(dst.numpy()[0, 0], precision=2, with_dtype=0))
   np.array([[0.5 , 0.5 , 0.5 , 0.5 , 0.5 ],
             [0.5 , 0.45, 0.5 , 0.5 , 0.15],
             [0.5 , 0.5 , 0.5 , 0.5 , 0.25],
             [0.5 , 0.5 , 0.5 , 0.5 , 0.25],
             [0.5 , 0.3 , 0.4 , 0.4 , 0.1 ]])
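Both subpixel_maximum and subpixel_minimum follow the same pattern: align the
source to the pixel grid (subpixel_align is the documented backend for this),
then combine with an elementwise np.maximum / np.minimum. Working the first
subpixel_maximum doctest above by hand, reusing the weighting from the
subpixel_accum sketch (illustrative numpy only):

.. code-block:: python

   import numpy as np

   dst = np.array([0, 1.0, 1.0, 1.0, 0])
   src = np.array([2.0, 2.0])
   start = 1.6                          # from index = [slice(1.6, 3.6)]
   lo = int(np.floor(start))            # 1
   shift = start - lo                   # 0.6
   aligned = np.zeros(len(src) + 1)
   aligned[:-1] += (1 - shift) * src    # partial weight for the leading cell
   aligned[1:] += shift * src           # partial weight for the trailing cell
   # aligned is now [0.8, 2.0, 1.2]
   dst[lo:lo + len(aligned)] = np.maximum(dst[lo:lo + len(aligned)], aligned)
   print(dst)                           # [0.  1.  2.  1.2 0. ]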
.. py:function:: subpixel_slice(inputs, index)

   Take a subpixel slice from a larger image. The returned output is
   left-aligned with the requested slice.

   :Parameters: * **inputs** (*ArrayLike*) -- data
                * **index** (*Tuple[slice]*) -- a slice to subpixel accuracy

   .. rubric:: Example

   >>> # xdoctest: +REQUIRES(module:torch)
   >>> import kwimage
   >>> import torch
   >>> # say we have a (576, 576) input space
   >>> # and a (9, 9) output space downsampled by 64x
   >>> ospc_feats = np.tile(np.arange(9 * 9).reshape(1, 9, 9), (1024, 1, 1))
   >>> inputs = torch.from_numpy(ospc_feats)
   >>> # We detected a box in the input space
   >>> ispc_bbox = kwimage.Boxes([[64, 65, 100, 120]], 'ltrb')
   >>> # Get coordinates in the output space
   >>> ospc_bbox = ispc_bbox.scale(1 / 64)
   >>> tl_x, tl_y, br_x, br_y = ospc_bbox.data[0]
   >>> # Convert the box to a slice
   >>> index = [slice(None), slice(tl_y, br_y), slice(tl_x, br_x)]
   >>> # Note: I'm not 100% sure this works right with non-integral slices
   >>> outputs = kwimage.subpixel_slice(inputs, index)

   .. rubric:: Example

   >>> inputs = np.arange(5 * 5 * 3).reshape(5, 5, 3)
   >>> index = [slice(0, 3), slice(0, 3)]
   >>> outputs = subpixel_slice(inputs, index)
   >>> index = [slice(0.5, 3.5), slice(-0.5, 2.5)]
   >>> outputs = subpixel_slice(inputs, index)
   >>> inputs = np.arange(5 * 5).reshape(1, 5, 5).astype(float)
   >>> index = [slice(None), slice(3, 6), slice(3, 6)]
   >>> outputs = subpixel_slice(inputs, index)
   >>> print(outputs)
   [[[18. 19.  0.]
     [23. 24.  0.]
     [ 0.  0.  0.]]]
   >>> index = [slice(None), slice(3.5, 6.5), slice(2.5, 5.5)]
   >>> outputs = subpixel_slice(inputs, index)
   >>> print(outputs)
   [[[20.   21.   10.75]
     [11.25 11.75  6.  ]
     [ 0.    0.    0.  ]]]

.. py:function:: subpixel_translate(inputs, shift, interp_axes=None, output_shape=None)

   Translates an image by a subpixel shift value using bilinear interpolation.

   :Parameters: * **inputs** (*ArrayLike*) -- data to translate
                * **shift** (*Sequence*) -- amount to translate each dimension specified
                  by `interp_axes`. Note: if inputs contains more than one "image" then
                  all "images" are translated by the same amount. This function contains
                  no mechanism for translating each image differently. Note that by
                  default this is a y,x shift for 2 dimensions.
                * **interp_axes** (*Sequence, default=None*) -- axes to perform
                  interpolation on; if not specified the final `n` axes are interpolated,
                  where `n=len(shift)`
                * **output_shape** (*tuple, default=None*) -- if specified the output is
                  returned with this shape, otherwise the input shape is used

   .. rubric:: Notes

   This function powers most other functions in this file.
   Speedups here can go a long way.

   .. rubric:: Example

   >>> inputs = np.arange(5) + 1
   >>> print(inputs.tolist())
   [1, 2, 3, 4, 5]
   >>> outputs = subpixel_translate(inputs, 1.5)
   >>> print(outputs.tolist())
   [0.0, 0.5, 1.5, 2.5, 3.5]

   .. rubric:: Example

   >>> # xdoctest: +REQUIRES(module:torch)
   >>> inputs = torch.arange(9).view(1, 1, 3, 3).float()
   >>> print(inputs.long())
   tensor([[[[0, 1, 2],
             [3, 4, 5],
             [6, 7, 8]]]])
   >>> outputs = subpixel_translate(inputs, (-.4, .5), output_shape=(1, 1, 2, 5))
   >>> print(outputs)
   tensor([[[[0.6000, 1.7000, 2.7000, 1.6000, 0.0000],
             [2.1000, 4.7000, 5.7000, 3.1000, 0.0000]]]])

   Ignore:
       >>> inputs = np.arange(5)
       >>> shift = -.6
       >>> interp_axes = None
       >>> subpixel_translate(inputs, -.6)
       >>> subpixel_translate(inputs[None, None, None, :], -.6)
       >>> inputs = np.arange(25).reshape(5, 5)
       >>> shift = (-1.6, 2.3)
       >>> interp_axes = (0, 1)
       >>> subpixel_translate(inputs, shift, interp_axes, output_shape=(9, 9))
       >>> subpixel_translate(inputs, shift, interp_axes, output_shape=(3, 4))
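For intuition, the 1D case of this bilinear translation can be written out
directly. A hypothetical sketch (loop-based for clarity; the real function is
vectorized and also handles torch tensors) that reproduces the first doctest
above:

.. code-block:: python

   import numpy as np

   def translate_1d(x, shift):
       # Split the shift into integer and fractional parts; each output cell
       # is a weighted blend of the two input cells that map onto it.
       whole = int(np.floor(shift))
       frac = shift - whole
       n = len(x)
       out = np.zeros(n, dtype=float)
       for i in range(n):
           j0, j1 = i - whole - 1, i - whole
           v0 = x[j0] if 0 <= j0 < n else 0.0   # out-of-bounds reads as zero
           v1 = x[j1] if 0 <= j1 < n else 0.0
           out[i] = frac * v0 + (1 - frac) * v1
       return out

   translate_1d(np.arange(5) + 1, 1.5).tolist()
   # [0.0, 0.5, 1.5, 2.5, 3.5], matching the doctest above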
.. py:function:: _padded_slice(data, in_slice, ndim=None, pad_slice=None, pad_mode='constant', **padkw)

   Allows slices with out-of-bound coordinates. Any out-of-bounds coordinate
   will be sampled via padding.

   .. note::

      Negative slices have a different meaning here than they usually do.
      Normally, they indicate a wrap-around or a reversed stride, but here
      they index into out-of-bounds space (which depends on the pad mode).
      For example a slice of -2:1 literally samples two pixels to the left of
      the data and one pixel from the data, so you get two padded values and
      one data value.

   :Parameters: * **data** (*Sliceable[T]*) -- data to slice into. Any channels must be
                  the last dimension.
                * **in_slice** (*Tuple[slice, ...]*) -- slice for each dimension
                * **ndim** (*int*) -- number of spatial dimensions
                * **pad_slice** (*List[int|Tuple]*) -- additional padding of the slice

   :returns: data_sliced: subregion of the input data (possibly with padding,
             depending on if the original slice went out of bounds)

             st_dims: a list indicating the low and high space-time coordinate
             values of the returned data slice.
   :rtype: Tuple[Sliceable, List]

   .. rubric:: Example

   >>> data = np.arange(5)
   >>> in_slice = [slice(-2, 7)]
   >>> data_sliced, st_dims = _padded_slice(data, in_slice)
   >>> print(ub.repr2(data_sliced, with_dtype=False))
   >>> print(st_dims)
   np.array([0, 0, 0, 1, 2, 3, 4, 0, 0])
   [(-2, 7)]
   >>> data_sliced, st_dims = _padded_slice(data, in_slice, pad_slice=(3, 3))
   >>> print(ub.repr2(data_sliced, with_dtype=False))
   >>> print(st_dims)
   np.array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 0, 0, 0, 0])
   [(-5, 10)]
   >>> data_sliced, st_dims = _padded_slice(data, slice(3, 4), pad_slice=[(1, 0)])
   >>> print(ub.repr2(data_sliced, with_dtype=False))
   >>> print(st_dims)
   np.array([2, 3])
   [(2, 4)]
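The core trick is to clip the requested slice to the data bounds and then pad
the result back up to the requested extent. A plain numpy sketch of the first
doctest above (1D, constant padding only; illustrative, not the library
implementation):

.. code-block:: python

   import numpy as np

   data = np.arange(5)
   lo, hi = -2, 7                                   # in_slice = [slice(-2, 7)]
   inner = data[max(lo, 0):min(hi, len(data))]      # the in-bounds part
   pad = (max(0, -lo), max(0, hi - len(data)))      # out-of-bounds amounts
   data_sliced = np.pad(inner, pad, mode='constant')
   print(data_sliced)   # [0 0 0 1 2 3 4 0 0]; st_dims would be [(-2, 7)]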
.. py:function:: _ensure_arraylike(data, n=None)

.. py:function:: _rectify_slice(data_dims, low_dims, high_dims, pad_slice=None)

   Given image dimensions, bounding box dimensions, and a padding, get the
   corresponding slice from the image and any extra padding needed to achieve
   the requested window size.

   :Parameters: * **data_dims** (*tuple*) -- n-dimension data sizes (e.g. 2d height,
                  width)
                * **low_dims** (*tuple*) -- bounding box low values (e.g. 2d ymin, xmin)
                * **high_dims** (*tuple*) -- bounding box high values (e.g. 2d ymax, xmax)
                * **pad_slice** (*List[int|Tuple]*) -- pad applied to (left and right) /
                  (both) sides of each slice dim

   :returns: data_slice - low and high values of a fancy slice corresponding to
             the image with shape `data_dims`. This slice may not correspond to
             the full window size if the requested bounding box goes out of
             bounds.

             extra_padding - extra padding needed after slicing to achieve the
             requested window size.
   :rtype: Tuple

   .. rubric:: Example

   >>> # Case where 2D-bbox is inside the data dims on left edge
   >>> # Comprehensive 1D-cases are in the unit-test file
   >>> data_dims = [300, 300]
   >>> low_dims = [0, 0]
   >>> high_dims = [10, 10]
   >>> pad_slice = [(10, 10), (5, 5)]
   >>> a, b = _rectify_slice(data_dims, low_dims, high_dims, pad_slice)
   >>> print('data_slice = {!r}'.format(a))
   >>> print('extra_padding = {!r}'.format(b))
   data_slice = [(0, 20), (0, 15)]
   extra_padding = [(10, 0), (5, 0)]

.. py:function:: _warp_tensor_cv2(inputs, mat, output_dims, mode='linear', ishomog=None)

   Implementation using cv2.warpAffine for a speed / correctness comparison.

   On GPU: torch is faster in both modes.
   On CPU: torch is faster for homog, but cv2 is faster for affine.

   Benchmark:
       >>> # xdoctest: +REQUIRES(module:torch)
       >>> from kwimage.util_warp import *
       >>> from kwimage.util_warp import _warp_tensor_cv2
       >>> from kwimage.util_warp import warp_tensor
       >>> import numpy as np
       >>> ti = ub.Timerit(10, bestof=3, verbose=2, unit='ms')
       >>> mode = 'linear'
       >>> rng = np.random.RandomState(0)
       >>> inputs = torch.Tensor(rng.rand(16, 10, 32, 32)).to('cpu')
       >>> mat = torch.FloatTensor([[2.5, 0, 10.5], [0, 3, 0], [0, 0, 1]])
       >>> mat[2, 0] = .009
       >>> mat[2, 2] = 2
       >>> output_dims = (64, 64)
       >>> results = ub.odict()
       >>> # -------------
       >>> for timer in ti.reset('warp_tensor(torch)'):
       >>>     with timer:
       >>>         outputs = warp_tensor(inputs, mat, output_dims, mode=mode)
       >>>         torch.cuda.synchronize()
       >>>     results[ti.label] = outputs
       >>> # -------------
       >>> inputs = inputs.cpu().numpy()
       >>> mat = mat.cpu().numpy()
       >>> for timer in ti.reset('warp_tensor(cv2)'):
       >>>     with timer:
       >>>         outputs = _warp_tensor_cv2(inputs, mat, output_dims, mode=mode)
       >>>     results[ti.label] = outputs
       >>> import itertools as it
       >>> for k1, k2 in it.combinations(results, 2):
       >>>     a = kwarray.ArrayAPI.numpy(results[k1])
       >>>     b = kwarray.ArrayAPI.numpy(results[k2])
       >>>     diff = np.abs(a - b)
       >>>     diff_stats = kwarray.stats_dict(diff, n_extreme=1, extreme=1)
       >>>     print('{} - {}: {}'.format(k1, k2, ub.repr2(diff_stats, nl=0, precision=4)))
       >>> # xdoctest: +REQUIRES(--show)
       >>> import kwplot
       >>> kwplot.autompl()
       >>> kwplot.imshow(results['warp_tensor(torch)'][0, 0], fnum=1, pnum=(1, 2, 1), title='torch')
       >>> kwplot.imshow(results['warp_tensor(cv2)'][0, 0], fnum=1, pnum=(1, 2, 2), title='cv2')
.. py:function:: warp_points(matrix, pts, homog_mode='divide')

   Warp ND points / coordinates using a transformation matrix.

   Homogeneous coordinates are added on the fly if needed. Works with both
   numpy and torch.

   :Parameters: * **matrix** (*ArrayLike*) -- [D1 x D2] transformation matrix. If using
                  homogeneous coordinates D2=D + 1, otherwise D2=D. If using homogeneous
                  coordinates and the matrix represents an Affine transformation, then
                  either D1=D or D1=D2, i.e. the last row of zeros and a one is optional.
                * **pts** (*ArrayLike*) -- [N1 x ... x D] points (usually x, y). If points
                  are already in homogeneous space, then the output will be returned in
                  homogeneous space. D is the dimensionality of the points. The leading
                  axis may take any shape, but usually, shape will be [N x D] where N is
                  the number of points.
                * **homog_mode** (*str, default='divide'*) -- what to do for homogeneous
                  coordinates. Can be either divide, keep, or drop.

   :returns: new_pts (ArrayLike): the points after being transformed by the matrix

   .. rubric:: Example

   >>> from kwimage.util_warp import *  # NOQA
   >>> # --- with numpy
   >>> rng = np.random.RandomState(0)
   >>> pts = rng.rand(10, 2)
   >>> matrix = rng.rand(2, 2)
   >>> warp_points(matrix, pts)
   >>> # --- with torch
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> pts = torch.Tensor(pts)
   >>> matrix = torch.Tensor(matrix)
   >>> warp_points(matrix, pts)

   .. rubric:: Example

   >>> from kwimage.util_warp import *  # NOQA
   >>> # --- with numpy
   >>> pts = np.ones((10, 2))
   >>> matrix = np.diag([2, 3, 1])
   >>> ra = warp_points(matrix, pts)
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> rb = warp_points(torch.Tensor(matrix), torch.Tensor(pts))
   >>> assert np.allclose(ra, rb.numpy())

   .. rubric:: Example

   >>> from kwimage.util_warp import *  # NOQA
   >>> # test different cases
   >>> rng = np.random.RandomState(0)
   >>> # Test 3x3 style projective matrices
   >>> pts = rng.rand(1000, 2)
   >>> matrix = rng.rand(3, 3)
   >>> ra33 = warp_points(matrix, pts)
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> rb33 = warp_points(torch.Tensor(matrix), torch.Tensor(pts))
   >>> assert np.allclose(ra33, rb33.numpy())
   >>> # Test opencv style affine matrices
   >>> pts = rng.rand(10, 2)
   >>> matrix = rng.rand(2, 3)
   >>> ra23 = warp_points(matrix, pts)
   >>> rb23 = warp_points(torch.Tensor(matrix), torch.Tensor(pts))
   >>> assert np.allclose(ra23, rb23.numpy())

.. py:function:: remove_homog(pts, mode='divide')

   Remove the homogeneous coordinate from a point array.

   This is a convenience function; it is not particularly efficient.

   SeeAlso:
       cv2.convertPointsFromHomogeneous

   .. rubric:: Example

   >>> homog_pts = np.random.rand(10, 3)
   >>> remove_homog(homog_pts, 'divide')
   >>> remove_homog(homog_pts, 'drop')

.. py:function:: add_homog(pts)

   Add a homogeneous coordinate to a point array.

   This is a convenience function; it is not particularly efficient.

   SeeAlso:
       cv2.convertPointsToHomogeneous

   .. rubric:: Example

   >>> pts = np.random.rand(10, 2)
   >>> add_homog(pts)

   Benchmark:
       >>> import timerit
       >>> ti = timerit.Timerit(1000, bestof=10, verbose=2)
       >>> pts = np.random.rand(1000, 2)
       >>> for timer in ti.reset('kwimage'):
       >>>     with timer:
       >>>         kwimage.add_homog(pts)
       >>> for timer in ti.reset('cv2'):
       >>>     with timer:
       >>>         cv2.convertPointsToHomogeneous(pts)
       >>> # cv2 is 4x faster, but has more restrictive inputs
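The homogeneous round trip these two helpers provide is what warp_points
builds on. A numpy sketch for the 3x3 projective case (illustrative only;
warp_points additionally handles torch tensors, affine 2x3 matrices, and
leading batch dimensions):

.. code-block:: python

   import numpy as np

   rng = np.random.RandomState(0)
   pts = rng.rand(10, 2)
   matrix = rng.rand(3, 3)
   homog = np.hstack([pts, np.ones((len(pts), 1))])   # add_homog
   new_homog = homog @ matrix.T                       # warp in homogeneous space
   new_pts = new_homog[:, 0:2] / new_homog[:, 2:3]    # remove_homog, mode='divide'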
.. py:function:: subpixel_getvalue(img, pts, coord_axes=None, interp='bilinear', bordermode='edge')

   Get values at subpixel locations.

   :Parameters: * **img** (*ArrayLike*) -- image to sample from
                * **pts** (*ArrayLike*) -- subpixel rc-coordinates to sample
                * **coord_axes** (*Sequence, default=None*) -- axes to perform
                  interpolation on; if not specified the first `d` axes are interpolated,
                  where `d=pts.shape[-1]`. I.e. this indicates which axes each coordinate
                  dimension corresponds to.
                * **interp** (*str*) -- interpolation mode
                * **bordermode** (*str*) -- how locations outside the image are handled

   .. rubric:: Example

   >>> from kwimage.util_warp import *  # NOQA
   >>> img = np.arange(3 * 3).reshape(3, 3)
   >>> pts = np.array([[1, 1], [1.5, 1.5], [1.9, 1.1]])
   >>> subpixel_getvalue(img, pts)
   array([4. , 6. , 6.8])
   >>> subpixel_getvalue(img, pts, coord_axes=(1, 0))
   array([4. , 6. , 5.2])
   >>> # xdoctest: +REQUIRES(module:torch)
   >>> img = torch.Tensor(img)
   >>> pts = torch.Tensor(pts)
   >>> subpixel_getvalue(img, pts)
   tensor([4.0000, 6.0000, 6.8000])
   >>> subpixel_getvalue(img.numpy(), pts.numpy(), interp='nearest')
   array([4., 8., 7.], dtype=float32)
   >>> subpixel_getvalue(img.numpy(), pts.numpy(), interp='nearest', coord_axes=[1, 0])
   array([4., 8., 5.], dtype=float32)
   >>> subpixel_getvalue(img, pts, interp='nearest')
   tensor([4., 8., 7.])

   .. rubric:: References

   stackoverflow.com/questions/12729228/simple-binlin-interp-images-numpy

   SeeAlso:
       cv2.getRectSubPix(image, patchSize, center[, patch[, patchType]])

.. py:function:: subpixel_setvalue(img, pts, value, coord_axes=None, interp='bilinear', bordermode='edge')

   Set values at subpixel locations.

   :Parameters: * **img** (*ArrayLike*) -- image to set values in
                * **pts** (*ArrayLike*) -- subpixel rc-coordinates to set
                * **value** (*ArrayLike*) -- value to place in the image
                * **coord_axes** (*Sequence, default=None*) -- axes to perform
                  interpolation on; if not specified the first `d` axes are interpolated,
                  where `d=pts.shape[-1]`. I.e. this indicates which axes each coordinate
                  dimension corresponds to.
                * **interp** (*str*) -- interpolation mode
                * **bordermode** (*str*) -- how locations outside the image are handled

   .. rubric:: Example

   >>> from kwimage.util_warp import *  # NOQA
   >>> img = np.arange(3 * 3).reshape(3, 3).astype(float)
   >>> pts = np.array([[1, 1], [1.5, 1.5], [1.9, 1.1]])
   >>> interp = 'bilinear'
   >>> value = 0
   >>> print('img = {!r}'.format(img))
   >>> pts = np.array([[1.5, 1.5]])
   >>> img2 = subpixel_setvalue(img.copy(), pts, value)
   >>> print('img2 = {!r}'.format(img2))
   >>> pts = np.array([[1.0, 1.0]])
   >>> img2 = subpixel_setvalue(img.copy(), pts, value)
   >>> print('img2 = {!r}'.format(img2))
   >>> pts = np.array([[1.1, 1.9]])
   >>> img2 = subpixel_setvalue(img.copy(), pts, value)
   >>> print('img2 = {!r}'.format(img2))
   >>> img2 = subpixel_setvalue(img.copy(), pts, value, coord_axes=[1, 0])
   >>> print('img2 = {!r}'.format(img2))

.. py:function:: _bilinear_coords(ptsT, impl, img, coord_axes)
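The bilinear weighting that subpixel_getvalue / subpixel_setvalue rely on (and
that the _bilinear_coords helper computes) can be worked out by hand for one
point of the subpixel_getvalue doctest. An illustrative numpy sketch:

.. code-block:: python

   import numpy as np

   img = np.arange(3 * 3).reshape(3, 3)
   r, c = 1.9, 1.1                       # rc-coordinates of the query point
   r0, c0 = int(np.floor(r)), int(np.floor(c))
   r1 = min(r0 + 1, img.shape[0] - 1)    # clamp to the edge (bordermode='edge')
   c1 = min(c0 + 1, img.shape[1] - 1)
   fr, fc = r - r0, c - c0               # fractional parts become the weights
   value = ((1 - fr) * (1 - fc) * img[r0, c0] + (1 - fr) * fc * img[r0, c1] +
            fr * (1 - fc) * img[r1, c0] + fr * fc * img[r1, c1])
   print(value)                          # 6.8, matching subpixel_getvalue(img, pts)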