TextDetDataPreprocessor

class mmocr.models.textdet.TextDetDataPreprocessor(mean=None, std=None, pad_size_divisor=1, pad_value=0, bgr_to_rgb=False, rgb_to_bgr=False, batch_augments=None)[source]

Image pre-processor for detection tasks.

Compared with mmengine.ImgDataPreprocessor:

1. It supports batch augmentations.

2. It additionally appends batch_input_shape and pad_shape to data_samples, as required by the object detection task.

It performs the following pre-processing steps:

Collate and move data to the target device.
Pad inputs to the maximum size of the current batch with the defined pad_value. The padded size is rounded up so that it is divisible by the defined pad_size_divisor.
Stack inputs to batch_inputs.
Convert inputs from BGR to RGB if the shape of the input is (3, H, W).
Normalize image with defined std and mean.
Do batch augmentations during training.
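
The padding step above can be illustrated with a little arithmetic. The following is a minimal sketch, not the library's implementation: `padded_batch_shape` is a hypothetical helper, and it assumes the target size is the per-batch maximum height and width, each rounded up to the next multiple of pad_size_divisor.

```python
import math


def padded_batch_shape(shapes, pad_size_divisor=1):
    """Return the (H, W) that every image in the batch would be padded to.

    `shapes` is a list of (height, width) pairs, one per image.
    Hypothetical helper for illustration only.
    """
    # Largest height and width found in the batch.
    max_h = max(h for h, _ in shapes)
    max_w = max(w for _, w in shapes)
    # Round each dimension up to a multiple of pad_size_divisor.
    target_h = math.ceil(max_h / pad_size_divisor) * pad_size_divisor
    target_w = math.ceil(max_w / pad_size_divisor) * pad_size_divisor
    return target_h, target_w


print(padded_batch_shape([(600, 800), (640, 736)], pad_size_divisor=32))  # (640, 800)
print(padded_batch_shape([(600, 790)], pad_size_divisor=32))              # (608, 800)
```

With the default pad_size_divisor of 1, the result is simply the largest height and width in the batch.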

Parameters

mean (Sequence[Number], optional) – The pixel mean of R, G, B channels. Defaults to None.
std (Sequence[Number], optional) – The pixel standard deviation of R, G, B channels. Defaults to None.
pad_size_divisor (int) – The size of padded image should be divisible by pad_size_divisor. Defaults to 1.
pad_value (Number) – The padded pixel value. Defaults to 0.
pad_mask (bool) – Whether to pad instance masks. Defaults to False.
mask_pad_value (int) – The padded pixel value for instance masks. Defaults to 0.
pad_seg (bool) – Whether to pad semantic segmentation maps. Defaults to False.
seg_pad_value (int) – The padded pixel value for semantic segmentation maps. Defaults to 255.
bgr_to_rgb (bool) – Whether to convert the image from BGR to RGB. Defaults to False.
rgb_to_bgr (bool) – Whether to convert the image from RGB to BGR. Defaults to False.
batch_augments (list[dict], optional) – Batch-level augmentations. Defaults to None.
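
As a sketch of how these parameters might appear in a config file, the dict below follows the usual OpenMMLab convention of a `type` key naming the class. The mean and std values are the commonly used ImageNet statistics and the divisor of 32 is arbitrary; all of them are illustrative assumptions, not values prescribed by this page.

```python
# Illustrative config fragment; the numeric values are assumptions.
data_preprocessor = dict(
    type='TextDetDataPreprocessor',
    mean=[123.675, 116.28, 103.53],  # per-channel mean, in R, G, B order
    std=[58.395, 57.12, 57.375],     # per-channel std, in R, G, B order
    bgr_to_rgb=True,                 # loaded images are assumed to be BGR
    pad_size_divisor=32,             # padded H and W become multiples of 32
)

# Setting both conversion flags at once would be contradictory,
# so this sketch checks that at most one is enabled.
assert not (data_preprocessor.get('bgr_to_rgb', False)
            and data_preprocessor.get('rgb_to_bgr', False))
```

Because mean and std are documented in R, G, B order, bgr_to_rgb is enabled here on the assumption that images arrive in BGR order.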

Return type

None

forward(data, training=False)[source]

Perform normalization, padding and BGR-to-RGB conversion based on BaseDataPreprocessor.

Parameters

data (dict) – Data sampled from the dataloader.
training (bool) – Whether to enable training time augmentation.

Returns

Data in the same format as the model input.

Return type

dict
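
The channel conversion and normalization performed by forward can be pictured on a single pixel. This is a pure-Python sketch under stated assumptions, not the actual tensor code: `preprocess_pixel` is a hypothetical helper, and it applies the channel flip before normalization, matching the order of the steps listed for this class.

```python
def preprocess_pixel(pixel, mean, std, bgr_to_rgb=False):
    """Flip channel order if requested, then normalize each channel.

    `pixel`, `mean` and `std` are 3-element sequences.
    Hypothetical helper for illustration only.
    """
    if bgr_to_rgb:
        # Reverse (B, G, R) into (R, G, B).
        pixel = pixel[::-1]
    # Normalize channel by channel: (value - mean) / std.
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# A BGR pixel whose flipped values equal the RGB means normalizes to zeros.
print(preprocess_pixel([10, 20, 30], mean=[30, 20, 10], std=[2, 2, 2],
                       bgr_to_rgb=True))  # [0.0, 0.0, 0.0]
```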