mmocr.datasets

mmocr.datasets.transforms

Loading
TextDet Transforms
TextRecog Transforms
OCR Transforms
Formatting
Transform Wrapper
Adapter

Loading

`LoadImageFromFile`	Load an image from file.
`LoadOCRAnnotations`	Load and process the `instances` annotation provided by dataset.
`LoadKIEAnnotations`	Load and process the `instances` annotation provided by dataset.
`InferencerLoader`	Load the image in Inferencer’s pipeline.

TextDet Transforms

`BoundedScaleAspectJitter`	First randomly rescale the image so that its long side and short side are around the bound; then jitter its aspect ratio.
`RandomFlip`	Flip the image along with its bboxes and polygons.
`SourceImagePad`	Pad the image to the target size.
`ShortScaleAspectJitter`	First rescale the image so that its shorter side reaches `short_size`, then jitter its aspect ratio; the final shape is guaranteed to be divisible by `scale_divisor`.
`TextDetRandomCrop`	Randomly select a region and crop the image to a target size, making sure the crop contains a text region.
`TextDetRandomCropFlip`	Randomly crop and flip a patch in the image.
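
The shape arithmetic described for `ShortScaleAspectJitter` can be sketched in plain Python. This is a minimal illustration under stated assumptions, not MMOCR's implementation: the function name `jittered_shape` and its signature are hypothetical, the aspect ratio is passed in explicitly instead of being sampled, and rounding up to the next multiple of `scale_divisor` is one reasonable choice.

```python
import math


def jittered_shape(h, w, short_size, aspect_ratio, scale_divisor):
    """Return (new_h, new_w) after a short-side rescale with aspect jitter.

    Hypothetical helper for illustration; not part of MMOCR.
    """
    # Scale so that the shorter side reaches short_size.
    scale = short_size / min(h, w)
    # Split the aspect-ratio change evenly between the two axes.
    h_scale = scale * math.sqrt(aspect_ratio)
    w_scale = scale / math.sqrt(aspect_ratio)
    new_h = round(h * h_scale)
    new_w = round(w * w_scale)
    # Round each side up to the nearest multiple of scale_divisor.
    new_h = math.ceil(new_h / scale_divisor) * scale_divisor
    new_w = math.ceil(new_w / scale_divisor) * scale_divisor
    return new_h, new_w


# A 480x640 image, short side rescaled to 736, no jitter, divisor 32.
print(jittered_shape(480, 640, short_size=736, aspect_ratio=1.0, scale_divisor=32))
# (736, 992)
```

In a real augmentation the aspect ratio would be drawn at random from a configured range on every call; fixing it here keeps the example deterministic.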

TextRecog Transforms

`TextRecogGeneralAug`	A general geometric augmentation tool for text images, from the CVPR 2020 paper “Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition”.
`CropHeight`	Randomly crop the image’s height, either from top or bottom.
`ImageContentJitter`	Jitter the image contents.
`ReversePixels`	Reverse image pixels.
`PyramidRescale`	Resize the image to the base shape, downsample it with a Gaussian pyramid, and rescale it back to the original size.
`PadToWidth`	Only pad the image’s width.
`RescaleToHeight`	Rescale the image to the configured height, keeping the aspect ratio unchanged if possible.
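
Text recognition pipelines often rescale to a fixed height and then pad the width. The sketch below shows that arithmetic and why the aspect ratio is preserved only "if possible": width bounds can override it. The helper names `rescale_to_height_shape` and `pad_to_width_amount`, and the `min_width`/`max_width` parameters as used here, are assumptions for illustration and not MMOCR's API.

```python
import math


def rescale_to_height_shape(h, w, height, min_width=None, max_width=None):
    """Return (new_h, new_w) for a fixed-height rescale.

    Hypothetical helper: the width follows the aspect ratio unless a
    bound forces it to a different value.
    """
    # Width that keeps the original aspect ratio at the new height.
    new_w = math.ceil(height / h * w)
    # Clamping to the bounds is where the aspect ratio may change.
    if min_width is not None:
        new_w = max(min_width, new_w)
    if max_width is not None:
        new_w = min(max_width, new_w)
    return height, new_w


def pad_to_width_amount(w, target_width):
    """Return how many columns of padding are needed to reach target_width."""
    return max(target_width - w, 0)


new_h, new_w = rescale_to_height_shape(48, 200, height=32, max_width=160)
print(new_h, new_w, pad_to_width_amount(new_w, 160))
# 32 134 26
```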

OCR Transforms

`RandomCrop`	Randomly crop the image, making sure the crop contains at least one intact instance.
`RandomRotate`	Randomly rotate the image, boxes, and polygons.
`Resize`	Resize the image, bboxes, and polygons.
`FixInvalidPolygon`	Fix invalid polygons in the dataset.
`RemoveIgnored`	Remove ignored elements from the pipeline.
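
Geometric transforms such as `Resize` have to apply the same mapping to the annotations as to the image. A minimal sketch of rescaling one polygon follows; it assumes a polygon is stored as a flat sequence of alternating x and y coordinates, and the helper name `rescale_polygon` is hypothetical, not part of MMOCR.

```python
def rescale_polygon(polygon, w_scale, h_scale):
    """Scale a flat [x1, y1, x2, y2, ...] polygon by per-axis factors.

    Hypothetical helper for illustration only.
    """
    if len(polygon) % 2 != 0:
        raise ValueError("polygon must hold an even number of coordinates")
    # Even indices are x values, odd indices are y values.
    return [
        value * (w_scale if index % 2 == 0 else h_scale)
        for index, value in enumerate(polygon)
    ]


# A 10x20 rectangle; the image width doubles and the height halves.
print(rescale_polygon([0, 0, 10, 0, 10, 20, 0, 20], w_scale=2.0, h_scale=0.5))
# [0.0, 0.0, 20.0, 0.0, 20.0, 10.0, 0.0, 10.0]
```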

Formatting

`PackTextDetInputs`	Pack the input data for text detection.
`PackTextRecogInputs`	Pack the input data for text recognition.
`PackKIEInputs`	Pack the input data for key information extraction.
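
Transforms are usually chained in a config pipeline, with a loading step first and a packing step last. The fragment below is a sketch of a text detection pipeline in the `dict(type=...)` config style; the specific arguments (`with_bbox`, `with_polygon`, `with_label`, `scale`, `keep_ratio`, `meta_keys`) are assumptions based on common MMOCR 1.x configs and should be checked against each class's own documentation.

```python
# Sketch of a text detection training pipeline (arguments are assumed).
train_pipeline = [
    # 1. Read the image from disk.
    dict(type='LoadImageFromFile'),
    # 2. Load the `instances` annotations that detection needs.
    dict(
        type='LoadOCRAnnotations',
        with_bbox=True,
        with_polygon=True,
        with_label=True),
    # 3. Resize the image together with its boxes and polygons.
    dict(type='Resize', scale=(640, 640), keep_ratio=True),
    # 4. Pack everything into the format the detector consumes.
    dict(
        type='PackTextDetInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape')),
]

print([step['type'] for step in train_pipeline])
```

A recognition pipeline would follow the same pattern but end with `PackTextRecogInputs`.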

Transform Wrapper

`ImgAugWrapper`	A wrapper around imgaug (https://github.com/aleju/imgaug).
`TorchVisionWrapper`	A wrapper around torchvision transforms.
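
The wrappers let a config reference third-party augmentations by name. The entries below are a sketch of how they might be declared; the `args` and `op` keys and the chosen augmenters are assumptions modelled on typical MMOCR 1.x detection configs, so verify them against the wrapper classes before use.

```python
# Sketch of wrapper entries inside a pipeline (keys are assumed).
wrapped_augmentations = [
    # torchvision transform selected by class name, remaining keys
    # forwarded as that transform's keyword arguments.
    dict(
        type='TorchVisionWrapper',
        op='ColorJitter',
        brightness=32.0 / 255,
        saturation=0.5),
    # imgaug augmenters given either as [name, *args] lists or as
    # dicts with a `cls` key plus keyword arguments.
    dict(
        type='ImgAugWrapper',
        args=[
            ['Fliplr', 0.5],
            dict(cls='Affine', rotate=[-10, 10]),
            ['Resize', [0.5, 3.0]],
        ]),
]

print([step['type'] for step in wrapped_augmentations])
```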

Adapter

`MMDet2MMOCR`	Convert the data format of transforms from MMDet to MMOCR.
`MMOCR2MMDet`	Convert the data format of transforms from MMOCR to MMDet.