PackTextRecogInputs
- class mmocr.datasets.transforms.PackTextRecogInputs(meta_keys=('img_path', 'ori_shape', 'img_shape', 'pad_shape', 'valid_ratio'))[source]
Pack the input data for text recognition.
The output is a dict with the following keys:
- inputs: The image as a tensor of shape (C, H, W).
- data_samples: Two components of TextRecogDataSample will be updated:
  - gt_text (LabelData):
    - item (str): The ground-truth text. Renamed from ‘gt_texts’.
  - metainfo (dict): ‘metainfo’ is always populated. Its contents depend on meta_keys. By default it includes:
    - “img_path”: Path to the image file.
    - “ori_shape”: Shape of the preprocessed image as a tuple (h, w).
    - “img_shape”: Shape of the image input to the network as a tuple (h, w). Note that the image may be zero-padded afterward on the bottom/right if the batch tensor is larger than this shape.
    - “valid_ratio”: The proportion of valid (unpadded) image content along the x-axis. Defaults to 1 if not set in the pipeline.
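The selection of metainfo entries described above can be sketched in plain Python. This is only an illustration under stated assumptions, not MMOCR's implementation: `pack_metainfo` is a hypothetical helper, the sample file path is made up, and raising `KeyError` for any missing key other than `valid_ratio` is an assumption of this sketch. The only behavior taken from the description is that `valid_ratio` falls back to 1 when the pipeline did not set it.

```python
from typing import Any, Dict, Sequence


def pack_metainfo(results: Dict[str, Any],
                  meta_keys: Sequence[str]) -> Dict[str, Any]:
    """Hypothetical helper: copy the requested keys into a metainfo dict."""
    metainfo = {}
    for key in meta_keys:
        if key == "valid_ratio":
            # Documented fallback: 1 if no earlier transform set it.
            metainfo[key] = results.get("valid_ratio", 1)
        else:
            # Assumption of this sketch: other missing keys raise KeyError.
            metainfo[key] = results[key]
    return metainfo


# A results dict as it might look just before packing (values are made up).
results = {
    "img_path": "demo/word_1.png",  # hypothetical path
    "ori_shape": (32, 100),
    "img_shape": (32, 128),
    "gt_texts": ["hello"],
}

meta = pack_metainfo(
    results, ("img_path", "ori_shape", "img_shape", "valid_ratio"))
print(meta)
```

Keys that are not listed in `meta_keys`, such as `gt_texts` here, do not end up in the metainfo; the ground-truth text is carried separately in `gt_text`.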
- Parameters
meta_keys (Sequence[str], optional) – Meta keys to be converted to the metainfo of TextRecogDataSample. Defaults to ('img_path', 'ori_shape', 'img_shape', 'pad_shape', 'valid_ratio').
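For context, this transform is typically placed as the last step of a data pipeline in a config file. The fragment below is a minimal sketch, assuming the MMOCR 1.x dict-based config style; the preceding transforms and their arguments (`LoadImageFromFile`, `LoadOCRAnnotations`, `Resize` and the chosen scale) are illustrative assumptions and are not specified by this page.

```python
# Illustrative pipeline; only the final entry is documented on this page.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadOCRAnnotations', with_text=True),
    dict(type='Resize', scale=(100, 32), keep_ratio=False),
    # Restrict the packed metainfo to a custom subset of keys.
    dict(
        type='PackTextRecogInputs',
        meta_keys=('img_path', 'ori_shape', 'img_shape', 'valid_ratio')),
]
```

Omitting `meta_keys` keeps the default tuple shown in the signature above.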