mmocr.datasets
mmocr.datasets.transforms
Loading
Load an image from file. |
|
Load and process the |
|
Load and process the |
TextDet Transforms
First randomly rescale the image so that its long side and short side are around the bound, then jitter its aspect ratio. |
|
Flip the image and its bbox polygon. |
|
Pad the image to the target size. |
|
First rescale the image so that its shorter side reaches short_size, then jitter its aspect ratio, and finally rescale it so that its shape is divisible by scale_divisor. |
|
Randomly select a region and crop the image to a target size, making sure the crop contains a text region. |
|
Random crop and flip a patch in the image. |
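The shape arithmetic behind the short-side rescale described above can be sketched as follows. This is a minimal illustration, not MMOCR's implementation: the function name `short_side_jitter_shape`, the `ratio_range` and `rng` parameters, the default values, and the choice to split the jitter evenly between height and width are all assumptions. Only `short_size` and `scale_divisor` come from the description.

```python
import math
import random


def short_side_jitter_shape(height, width, short_size=736,
                            ratio_range=(0.7, 1.3), scale_divisor=32,
                            rng=None):
    """Return the (height, width) an image would be resized to.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()

    # Step 1: scale so that the shorter side reaches short_size.
    scale = short_size / min(height, width)

    # Step 2: jitter the aspect ratio. Assumption: the sampled ratio is
    # split evenly between the two axes via its square root.
    aspect = rng.uniform(*ratio_range)
    h_scale = scale * math.sqrt(aspect)
    w_scale = scale / math.sqrt(aspect)

    # Step 3: round each side to the nearest multiple of scale_divisor,
    # never going below one multiple.
    new_h = max(1, round(height * h_scale / scale_divisor)) * scale_divisor
    new_w = max(1, round(width * w_scale / scale_divisor)) * scale_divisor
    return new_h, new_w
```

With the jitter disabled (`ratio_range=(1.0, 1.0)`), a 480 x 640 image maps to 736 x 992: the short side reaches 736, and the long side is rounded from about 981 to the nearest multiple of 32.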
TextRecog Transforms
A general geometric augmentation tool for text images, proposed in the CVPR 2020 paper “Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition”. |
|
Randomly crop the image along its height, from either the top or the bottom. |
|
Jitter the image contents. |
|
Reverse image pixels. |
|
Resize the image to the base shape, downsample it with a Gaussian pyramid, and rescale it back to the original size. |
|
Only pad the image’s width. |
|
Rescale the image to the configured height, keeping the aspect ratio unchanged if possible. |
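The fixed-height rescale and width-only padding described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: the function names, the `min_width` and `max_width` clamps, and the representation of an image as a list of pixel rows are all hypothetical.

```python
def rescale_to_height_shape(height, width, target_height=32,
                            min_width=None, max_width=None):
    """Return (new_height, new_width) for a fixed-height resize.

    Hypothetical helper for illustration only.
    """
    # Keep the aspect ratio: scale the width by the same factor as the height.
    new_width = max(1, round(width * target_height / height))
    # Clamping is the case where the aspect ratio can no longer be preserved.
    if min_width is not None:
        new_width = max(new_width, min_width)
    if max_width is not None:
        new_width = min(new_width, max_width)
    return target_height, new_width


def pad_to_width(rows, target_width, pad_value=0):
    """Right-pad every pixel row to target_width; the height is untouched."""
    return [list(row) + [pad_value] * max(0, target_width - len(row))
            for row in rows]
```

For example, `rescale_to_height_shape(64, 200)` gives `(32, 100)`, while adding `max_width=80` gives `(32, 80)` and therefore distorts the aspect ratio. Rows that are already wider than `target_width` are returned unchanged by `pad_to_width`.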
OCR Transforms
Randomly crop the image, making sure the crop contains at least one intact instance. |
|
Randomly rotate the image, boxes, and polygons. |
|
Resize the image, bboxes, and polygons. |
|
Fix invalid polygons in the dataset. |
|
Remove ignored elements from the pipeline. |
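When an image is resized, its polygon and box annotations have to be scaled by the same factors. The sketch below shows that coordinate arithmetic. It assumes polygons are stored as flat `[x1, y1, x2, y2, ...]` lists and sizes as `(width, height)` tuples; both conventions, and the function names, are chosen for illustration only.

```python
def resize_polygon(polygon, old_size, new_size):
    """Scale a flat [x1, y1, x2, y2, ...] polygon between image sizes.

    Sizes are (width, height) tuples. Hypothetical helper.
    """
    w_scale = new_size[0] / old_size[0]
    h_scale = new_size[1] / old_size[1]
    # Even indices hold x coordinates, odd indices hold y coordinates.
    return [value * (w_scale if index % 2 == 0 else h_scale)
            for index, value in enumerate(polygon)]


def polygon_to_bbox(polygon):
    """Return the axis-aligned [x_min, y_min, x_max, y_max] of a polygon."""
    xs, ys = polygon[0::2], polygon[1::2]
    return [min(xs), min(ys), max(xs), max(ys)]
```

Deriving the box from the resized polygon, rather than scaling the two separately, keeps both annotations consistent with each other.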
Formatting
Pack the input data for text detection. |
|
Pack the input data for text recognition. |
|
Pack the input data for key information extraction. |
Transform Wrapper
A wrapper around imgaug (https://github.com/aleju/imgaug). |
|
A wrapper around torchvision transforms. |
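The general idea of a transform wrapper can be sketched as an adapter that lets a plain image-to-image callable run as one step of a dict-based pipeline. This is not the wrappers' actual code: the class name `ImageOnlyWrapper` and the `"img"` key are assumptions, and a wrapper around a geometric augmentation library would also have to transform boxes and polygons, which this sketch does not attempt.

```python
class ImageOnlyWrapper:
    """Adapt an image -> image callable to a dict-in, dict-out pipeline step.

    Hypothetical class for illustration only.
    """

    def __init__(self, func, key="img"):
        self.func = func
        self.key = key

    def __call__(self, results):
        # Copy the dict so the caller's data is not mutated in place.
        out = dict(results)
        out[self.key] = self.func(results[self.key])
        return out
```

A step built this way leaves every other key untouched, so it is only safe for transforms that do not move pixels, such as colour changes:

```python
invert = ImageOnlyWrapper(lambda img: [[255 - p for p in row] for row in img])
print(invert({"img": [[0, 255]], "text": "abc"}))
```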