TextRecogGeneralAug

class mmocr.datasets.transforms.TextRecogGeneralAug[source]

A general geometric augmentation tool for text images, proposed in the CVPR 2020 paper “Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition”. It applies distortion, stretching, and perspective transforms to an image.

This implementation is adapted from https://github.com/RubanSeven/Text-Image-Augmentation-python/blob/master/augment.py

TODO: Split this transform into three transforms.

Required Keys:

  • img

Modified Keys:

  • img

  • img_shape
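As a usage illustration, the key contract above means the transform has to run after a step that has already placed `img` in the result dict. A minimal sketch of a training pipeline in the list-of-dicts config style follows; only `TextRecogGeneralAug` comes from this page, while the surrounding steps, their order, and the resize scale are assumptions chosen for illustration.

```python
# Hypothetical training pipeline for a text recognizer, written in the
# list-of-dicts config style used by MMOCR 1.x.
# Only TextRecogGeneralAug is documented on this page; every other step
# and every parameter value below is an assumption for illustration.
train_pipeline = [
    dict(type='LoadImageFromFile'),                    # provides results['img']
    dict(type='LoadOCRAnnotations', with_text=True),   # assumed annotation loader
    dict(type='TextRecogGeneralAug'),                  # rewrites img and img_shape
    dict(type='Resize', scale=(128, 32)),              # assumed target size (w, h)
    dict(type='PackTextRecogInputs'),                  # assumed packing step
]

# The augmentation must come after the step that loads the image.
step_types = [step['type'] for step in train_pipeline]
print(step_types.index('LoadImageFromFile') < step_types.index('TextRecogGeneralAug'))
```

Because the transform takes no documented constructor arguments on this page, the config entry carries only the `type` field.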

tia_distort(img, segment=4)[source]

Image distortion.

Parameters
  • img (np.ndarray) – The image.

  • segment (int) – The number of segments to divide the image along the width. Defaults to 4.

Return type

numpy.ndarray
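To make the role of `segment` concrete, here is a minimal sketch of how control points for a distortion warp could be laid out. It assumes the width is split into `segment` equal parts, with one point on the top edge and one on the bottom edge at each interior boundary, and all points randomly displaced. The helper name `distort_control_points` and the jitter range (a third of a segment width) are hypothetical, not documented MMOCR behavior; the resulting point lists would then be handed to a warp such as `warp_mls`.

```python
import random


def distort_control_points(img_w, img_h, segment=4, rng=None):
    """Build source/destination control points for a distortion warp.

    Hypothetical helper: the jitter range is an assumption, not a
    documented MMOCR constant.
    """
    rng = rng or random.Random()
    cut = img_w // segment        # width of one segment
    thresh = max(cut // 3, 1)     # assumed maximum jitter in pixels
    half = thresh * 0.5

    def jitter():
        return rng.randrange(thresh)

    # Four corners: each one is pulled inward by a random offset.
    src_pts = [[0, 0], [img_w, 0], [img_w, img_h], [0, img_h]]
    dst_pts = [
        [jitter(), jitter()],
        [img_w - jitter(), jitter()],
        [img_w - jitter(), img_h - jitter()],
        [jitter(), img_h - jitter()],
    ]

    # Interior segment boundaries: one point on the top edge and one on
    # the bottom edge, each moved by a roughly zero-centred offset in
    # both x and y.
    for i in range(1, segment):
        x = cut * i
        src_pts += [[x, 0], [x, img_h]]
        dst_pts += [
            [x + jitter() - half, jitter() - half],
            [x + jitter() - half, img_h + jitter() - half],
        ]
    return src_pts, dst_pts


src, dst = distort_control_points(img_w=100, img_h=32, segment=4,
                                  rng=random.Random(0))
# 4 corners + 2 points for each of the 3 interior boundaries
print(len(src), len(dst))  # 10 10
```

A larger `segment` adds more boundary points, so the displacement varies over shorter horizontal spans.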

tia_perspective(img)[source]

Image perspective transformation.

Parameters
  • img (np.ndarray) – The image.

Return type

numpy.ndarray

tia_stretch(img, segment=4)[source]

Image stretching.

Parameters
  • img (np.ndarray) – The image.

  • segment (int) – The number of segments to divide the image along the width. Defaults to 4.

Return type

numpy.ndarray
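Stretching can be pictured as moving the segment boundaries horizontally while the image outline stays fixed. The sketch below assumes the four corners are pinned and each interior boundary gets a single horizontal shift shared by its top and bottom points. The helper name `stretch_control_points` and the shift range are hypothetical and chosen only for illustration.

```python
import random


def stretch_control_points(img_w, img_h, segment=4, rng=None):
    """Build source/destination control points for a horizontal stretch.

    Hypothetical helper: the shift range (four fifths of a segment
    width) is an assumption, not a documented MMOCR constant.
    """
    rng = rng or random.Random()
    cut = img_w // segment              # width of one segment
    thresh = max(cut * 4 // 5, 1)       # assumed maximum shift in pixels
    half = thresh * 0.5

    # Corners are pinned, so the image outline does not change.
    corners = [[0, 0], [img_w, 0], [img_w, img_h], [0, img_h]]
    src_pts = [list(p) for p in corners]
    dst_pts = [list(p) for p in corners]

    for i in range(1, segment):
        x = cut * i
        # One shift per boundary, shared by its top and bottom points,
        # so the boundary stays vertical and only segment widths change.
        move = rng.randrange(thresh) - half
        src_pts += [[x, 0], [x, img_h]]
        dst_pts += [[x + move, 0], [x + move, img_h]]
    return src_pts, dst_pts


src, dst = stretch_control_points(img_w=100, img_h=32, segment=4,
                                  rng=random.Random(0))
for s, d in zip(src[4:], dst[4:]):
    print(s, '->', d)
```

Since no point moves vertically in this sketch, characters are widened or narrowed but not bent.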

transform(results)[source]

Apply the distortion, stretching, and perspective augmentations to the image.

Parameters

results (dict) – Result dict from loading pipeline.

Returns

Updated result dict.

Return type

dict

warp_mls(src, src_pts, dst_pts, dst_w, dst_h, trans_ratio=1.0)[source]

Warp the image.

Return type

numpy.ndarray
