TextRecogGeneralAug

class mmocr.datasets.transforms.TextRecogGeneralAug

A general geometric augmentation tool for text images, proposed in the CVPR 2020 paper “Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition”. It applies distortion, stretching, and perspective transforms to an image.

This implementation is adapted from https://github.com/RubanSeven/Text-Image-Augmentation-python/blob/master/augment.py

TODO: Split this transform into three transforms.

Required Keys:

img

Modified Keys:

img
img_shape
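
As a rough illustration of where this transform might sit, the fragment below sketches a training pipeline config. The neighbouring transforms (`LoadImageFromFile`, `LoadOCRAnnotations`, `PackTextRecogInputs`) and their arguments are assumptions chosen for illustration and are not taken from this page; adjust them to your own config.

```python
# Illustrative pipeline fragment. Every entry except TextRecogGeneralAug
# is an assumption about the surrounding config, not part of this page.
train_pipeline = [
    dict(type='LoadImageFromFile'),                    # supplies the required `img` key
    dict(type='LoadOCRAnnotations', with_text=True),   # assumed annotation loader
    dict(type='TextRecogGeneralAug'),                  # modifies `img` and `img_shape`
    dict(type='PackTextRecogInputs'),                  # assumed packing step
]
```

The only ordering constraint implied by the keys above is that something earlier in the pipeline has already placed `img` in the result dict.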

tia_distort(img, segment=4)

Image distortion.

Parameters

img (np.ndarray) – The image.
segment (int) – The number of segments into which the image is divided along its width. Defaults to 4.

Return type

numpy.ndarray
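
To make the role of `segment` more concrete, here is a minimal sketch of how control points could be laid out for a distortion, assuming points are placed at equal intervals along the top and bottom edges and then jittered. The helper name `distort_control_points` and the jitter bound are hypothetical and chosen for illustration; they are not documented on this page.

```python
import random


def distort_control_points(img_w, img_h, segment=4, rng=random):
    """Hypothetical helper: build source/target control points for a distortion."""
    cut = img_w // segment        # width of one segment
    thresh = max(cut // 3, 1)     # assumed jitter bound, illustrative only
    half = thresh * 0.5

    def jitter():
        return rng.randrange(thresh)

    # Four image corners, clockwise from the top-left.
    src_pts = [[0, 0], [img_w, 0], [img_w, img_h], [0, img_h]]
    # Target corners are nudged inward by a random amount.
    dst_pts = [
        [jitter(), jitter()],
        [img_w - jitter(), jitter()],
        [img_w - jitter(), img_h - jitter()],
        [jitter(), img_h - jitter()],
    ]
    # One pair of points (top edge, bottom edge) per interior segment boundary.
    for i in range(1, segment):
        x = cut * i
        src_pts += [[x, 0], [x, img_h]]
        dst_pts += [
            [x + jitter() - half, jitter() - half],
            [x + jitter() - half, img_h + jitter() - half],
        ]
    return src_pts, dst_pts


src_pts, dst_pts = distort_control_points(img_w=100, img_h=32, segment=4)
print(len(src_pts))  # 10: four corners plus two points per interior boundary
```

Under these assumptions, a larger `segment` yields more control points and therefore finer-grained local distortion.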

tia_perspective(img)

Image perspective transformation.

Parameters

img (np.ndarray) – The image.

Return type

numpy.ndarray

tia_stretch(img, segment=4)

Image stretching.

Parameters

img (np.ndarray) – The image.
segment (int) – The number of segments into which the image is divided along its width. Defaults to 4.

Return type

numpy.ndarray
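
For stretching, a plausible reading is that segment boundaries slide horizontally while the image corners stay fixed. The sketch below illustrates that idea only; the helper name `stretch_control_points` and the shift bound are hypothetical assumptions, not values confirmed by this page.

```python
import random


def stretch_control_points(img_w, img_h, segment=4, rng=random):
    """Hypothetical helper: target points move along the x-axis only."""
    cut = img_w // segment
    thresh = max(cut * 4 // 5, 1)   # assumed shift bound, illustrative only
    half = thresh * 0.5

    corners = [[0, 0], [img_w, 0], [img_w, img_h], [0, img_h]]
    src_pts = [list(p) for p in corners]
    dst_pts = [list(p) for p in corners]   # corners are left in place

    for i in range(1, segment):
        x = cut * i
        move = rng.randrange(thresh) - half   # same shift for top and bottom
        src_pts += [[x, 0], [x, img_h]]
        dst_pts += [[x + move, 0], [x + move, img_h]]
    return src_pts, dst_pts


src_pts, dst_pts = stretch_control_points(img_w=100, img_h=32, segment=4)
```

Because each boundary's top and bottom points share one shift, vertical lines stay vertical and only the segment widths change.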

transform(results)

Apply the augmentation to the image.

Parameters: results (dict) – Result dict from the loading pipeline.
Returns: Updated result dict.
Return type: dict
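
The key contract of `transform` (read `img`, write back `img` and `img_shape`) can be sketched with a stand-in function. `toy_transform` below is hypothetical and uses a horizontal flip as a placeholder for the real augmentation; storing `img_shape` as `(height, width)` is also an assumption.

```python
import numpy as np


def toy_transform(results):
    """Hypothetical stand-in that mimics the documented key contract."""
    img = results['img']
    # Placeholder geometric change; the real transform warps the image instead.
    results['img'] = np.ascontiguousarray(img[:, ::-1])
    # Assumed convention: img_shape holds (height, width).
    results['img_shape'] = results['img'].shape[:2]
    return results


results = toy_transform(dict(img=np.zeros((32, 100, 3), dtype=np.uint8)))
print(results['img_shape'])  # (32, 100)
```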

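warp_mls(src, src_pts, dst_pts, dst_w, dst_h, trans_ratio=1.0)

Warp the image so that the source control points move to the target control points.

Parameters

src (numpy.ndarray) – The source image.
src_pts (List[int]) – Control points in the source image.
dst_pts (List[int]) – Target positions of the control points.
dst_w (int) – Width of the output image.
dst_h (int) – Height of the output image.
trans_ratio (float) – Transformation ratio. Defaults to 1.0.

Return type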

numpy.ndarray