TextRecogDataPreprocessor
- class mmocr.models.textrecog.TextRecogDataPreprocessor(mean=None, std=None, pad_size_divisor=1, pad_value=0, bgr_to_rgb=False, rgb_to_bgr=False, batch_augments=None)[source]
Image pre-processor for recognition tasks.
Compared with mmengine.ImgDataPreprocessor:
1. It supports batch augmentations.
2. It additionally appends batch_input_shape and valid_ratio to data_samples, which the recognition task needs.
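To make the two extra fields concrete, here is a minimal sketch of how they could be derived from per-image shapes. It assumes that batch_input_shape is the padded (H, W) shared by the whole batch and that valid_ratio is the fraction of the padded width covered by real image content. The helper name describe_batch is hypothetical, and the library's exact formula may differ.

```python
import math


def describe_batch(img_shapes, pad_size_divisor=1):
    """Return (batch_input_shape, valid_ratios) for a list of (H, W) shapes.

    Hypothetical helper for illustration only; not part of MMOCR.
    """
    # Padded size: the batch maximum, rounded up to a multiple of the divisor.
    max_h = max(h for h, _ in img_shapes)
    max_w = max(w for _, w in img_shapes)
    pad_h = math.ceil(max_h / pad_size_divisor) * pad_size_divisor
    pad_w = math.ceil(max_w / pad_size_divisor) * pad_size_divisor
    batch_input_shape = (pad_h, pad_w)

    # Assumed definition: share of the padded width occupied by the image.
    valid_ratios = [w / pad_w for _, w in img_shapes]
    return batch_input_shape, valid_ratios


shape, ratios = describe_batch([(32, 100), (32, 160)], pad_size_divisor=32)
print(shape, ratios)
```

Under this assumed definition, the widest image in a batch gets a ratio of 1.0 whenever its width is already a multiple of pad_size_divisor, and narrower images get proportionally smaller values.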
It performs the following pre-processing steps:
Collate and move data to the target device.
Pad inputs to the maximum size of the current batch with the defined pad_value. The padded size is made divisible by the defined pad_size_divisor.
Stack the padded inputs into a single batched inputs tensor.
Convert inputs from BGR to RGB if the shape of the input is (3, H, W).
Normalize image with defined std and mean.
Do batch augmentations during training.
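The steps above can be sketched roughly as follows. This is not MMOCR's implementation: NumPy arrays stand in for tensors, so the device transfer and batch augmentations are left out, the function name preprocess_batch is hypothetical, and the step order simply follows the list as written (the library may order them differently).

```python
import numpy as np


def preprocess_batch(images, mean=None, std=None, pad_size_divisor=1,
                     pad_value=0, bgr_to_rgb=False):
    """Pad, stack, reorder channels, and normalize a list of (C, H, W) arrays.

    Illustrative only; a hypothetical stand-in, not the library's code.
    """
    # Target size: the batch maximum rounded up to a multiple of the divisor.
    max_h = max(img.shape[1] for img in images)
    max_w = max(img.shape[2] for img in images)
    pad_h = -(-max_h // pad_size_divisor) * pad_size_divisor
    pad_w = -(-max_w // pad_size_divisor) * pad_size_divisor

    # Pad each image on the bottom and right, then stack into (N, C, H, W).
    padded = [
        np.pad(img.astype(np.float32),
               ((0, 0), (0, pad_h - img.shape[1]), (0, pad_w - img.shape[2])),
               constant_values=pad_value)
        for img in images
    ]
    batch = np.stack(padded)

    # Reverse the channel axis for 3-channel inputs (BGR -> RGB).
    if bgr_to_rgb and batch.shape[1] == 3:
        batch = batch[:, ::-1]

    # Normalize with per-channel statistics, broadcast over N, H, and W.
    if mean is not None and std is not None:
        mean_arr = np.asarray(mean, dtype=np.float32).reshape(1, -1, 1, 1)
        std_arr = np.asarray(std, dtype=np.float32).reshape(1, -1, 1, 1)
        batch = (batch - mean_arr) / std_arr
    return batch


imgs = [np.zeros((3, 30, 90)), np.ones((3, 32, 100))]
out = preprocess_batch(imgs, mean=[127.5] * 3, std=[127.5] * 3,
                       pad_size_divisor=4)
print(out.shape)
```

Because this sketch pads before normalizing, the padded pixels are normalized along with the image content; whether that matches the library's behavior is an assumption worth checking against the source.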
- Parameters
mean (Sequence[Number], optional) – The pixel mean of R, G, B channels. Defaults to None.
std (Sequence[Number], optional) – The pixel standard deviation of R, G, B channels. Defaults to None.
pad_size_divisor (int) – The size of the padded image should be divisible by pad_size_divisor. Defaults to 1.
pad_value (Number) – The padded pixel value. Defaults to 0.
bgr_to_rgb (bool) – Whether to convert the image from BGR to RGB. Defaults to False.
rgb_to_bgr (bool) – Whether to convert the image from RGB to BGR. Defaults to False.
batch_augments (list[dict], optional) – Batch-level augmentations. Defaults to None.
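As a usage illustration, the parameters above might appear in a config file like the fragment below, written in the dict style used by MMOCR and MMEngine configs. The mean and std values are placeholders chosen for illustration, not recommended settings, and the remaining entries just restate the documented defaults.

```python
# Config fragment; typically supplied as the data_preprocessor entry of a
# model config. All values here are illustrative assumptions.
data_preprocessor = dict(
    type='TextRecogDataPreprocessor',
    mean=[127.5, 127.5, 127.5],  # placeholder per-channel mean
    std=[127.5, 127.5, 127.5],   # placeholder per-channel std
    pad_size_divisor=1,
    pad_value=0,
    bgr_to_rgb=False,
    rgb_to_bgr=False,
    batch_augments=None,
)
```

Since bgr_to_rgb and rgb_to_bgr describe opposite conversions, at most one of them should be enabled in a given config.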
- Return type