TextDetDataPreprocessor

class mmocr.models.textdet.TextDetDataPreprocessor(mean=None, std=None, pad_size_divisor=1, pad_value=0, bgr_to_rgb=False, rgb_to_bgr=False, batch_augments=None)[source]

Image pre-processor for detection tasks.

Compared with mmengine.ImgDataPreprocessor:

  1. It supports batch augmentations.

  2. It additionally appends batch_input_shape and pad_shape to data_samples to account for the object detection task.

It provides the following data pre-processing steps:

  • Collate and move data to the target device.

  • Pad inputs to the maximum size of the current batch using the defined pad_value. The padded size is made divisible by the defined pad_size_divisor.

  • Stack inputs to batch_inputs.

  • Convert inputs from BGR to RGB if the shape of the input is (3, H, W).

  • Normalize images with the defined std and mean.

  • Do batch augmentations during training.
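
The padding step above can be illustrated with a small, framework-free sketch. It is not the library's implementation: the helper name padded_batch_shape is hypothetical, and the sketch only assumes that each side is padded up to the batch maximum and then rounded up to the next multiple of pad_size_divisor.

```python
def padded_batch_shape(shapes, pad_size_divisor=1):
    """Return the common (H, W) that every image in a batch is padded to.

    `shapes` is a list of (height, width) pairs. This helper is hypothetical
    and only mirrors the documented behaviour, not the MMOCR source code.
    """
    max_h = max(h for h, _ in shapes)
    max_w = max(w for _, w in shapes)
    # Round each side up to the next multiple of pad_size_divisor
    # using integer ceiling division.
    pad_h = -(-max_h // pad_size_divisor) * pad_size_divisor
    pad_w = -(-max_w // pad_size_divisor) * pad_size_divisor
    return pad_h, pad_w


shapes = [(480, 640), (500, 600)]
print(padded_batch_shape(shapes, pad_size_divisor=32))  # (512, 640)
print(padded_batch_shape(shapes, pad_size_divisor=1))   # (500, 640)
```

With pad_size_divisor=1 the batch is padded only to its largest height and width; a larger divisor such as 32 is useful when the network downsamples the input by that factor.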

Parameters
  • mean (Sequence[Number], optional) – The pixel mean of R, G, B channels. Defaults to None.

  • std (Sequence[Number], optional) – The pixel standard deviation of R, G, B channels. Defaults to None.

  • pad_size_divisor (int) – The size of padded image should be divisible by pad_size_divisor. Defaults to 1.

  • pad_value (Number) – The padded pixel value. Defaults to 0.

  • pad_mask (bool) – Whether to pad instance masks. Defaults to False.

  • mask_pad_value (int) – The padded pixel value for instance masks. Defaults to 0.

  • pad_seg (bool) – Whether to pad semantic segmentation maps. Defaults to False.

  • seg_pad_value (int) – The padded pixel value for semantic segmentation maps. Defaults to 255.

  • bgr_to_rgb (bool) – Whether to convert the image from BGR to RGB. Defaults to False.

  • rgb_to_bgr (bool) – Whether to convert the image from RGB to BGR. Defaults to False.

  • batch_augments (list[dict], optional) – Batch-level augmentations. Defaults to None.

Return type

None
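
As a rough illustration of how these parameters fit together, the pre-processor could be declared in an OpenMMLab-style config dict as sketched below. The mean and std values are the commonly used ImageNet RGB statistics and pad_size_divisor=32 is an arbitrary choice; both are assumptions for this example, not defaults of the class.

```python
# Hypothetical config fragment; the numeric values are illustrative assumptions.
data_preprocessor = dict(
    type='TextDetDataPreprocessor',
    mean=[123.675, 116.28, 103.53],   # per-channel mean in R, G, B order
    std=[58.395, 57.12, 57.375],      # per-channel std in R, G, B order
    bgr_to_rgb=True,                  # images are assumed to be loaded as BGR
    pad_size_divisor=32,              # padded H and W become multiples of 32
)
```

Because mean and std are given in R, G, B order, bgr_to_rgb should be enabled when the images are loaded in BGR order.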

forward(data, training=False)[source]

Perform normalization, padding and BGR-to-RGB conversion based on BaseDataPreprocessor.

Parameters
  • data (dict) – Data sampled from the dataloader.

  • training (bool) – Whether to enable training time augmentation.

Returns

Data in the same format as the model input.

Return type

dict
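
The channel conversion and normalization that forward performs can be sketched on a single pixel in plain Python. This is a simplified illustration rather than the actual tensor-based implementation: the function normalize_pixel is hypothetical, and the sketch assumes the channel flip is applied before normalization so that mean and std can be given in R, G, B order.

```python
def normalize_pixel(pixel, mean, std, bgr_to_rgb=False):
    """Normalize one 3-channel pixel (hypothetical helper, illustration only)."""
    if bgr_to_rgb:
        # Reverse the channel order: (B, G, R) -> (R, G, B).
        pixel = pixel[::-1]
    # Subtract the per-channel mean and divide by the per-channel std.
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


mean = [123.675, 116.28, 103.53]  # assumed R, G, B statistics
std = [58.395, 57.12, 57.375]

# A BGR pixel whose value equals the mean maps to zero in every channel.
bgr_pixel = [103.53, 116.28, 123.675]
print(normalize_pixel(bgr_pixel, mean, std, bgr_to_rgb=True))  # [0.0, 0.0, 0.0]
```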
