SDMGR
- class mmocr.models.kie.SDMGR(backbone=None, roi_extractor=None, neck=None, kie_head=None, dictionary=None, data_preprocessor=None, init_cfg=None)[source]
The implementation of the paper: Spatial Dual-Modality Graph Reasoning for Key Information Extraction. https://arxiv.org/abs/2103.14470.
- Parameters
backbone (dict, optional) – Config of backbone. If None, None will be passed to kie_head during training and testing. Defaults to None.
roi_extractor (dict, optional) – Config of roi extractor. Only applicable when backbone is not None. Defaults to None.
neck (dict, optional) – Config of neck. Defaults to None.
kie_head (dict) – Config of KIE head. Defaults to None.
dictionary (dict, optional) – Config of dictionary. Defaults to None.
data_preprocessor (dict or ConfigDict, optional) – The pre-process config of BaseDataPreprocessor. It usually includes pad_size_divisor, pad_value, mean and std. It has to be None when working in non-visual mode. Defaults to None.
init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.
- Return type
None
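As a usage illustration, the sketch below builds a non-visual SDMGR (no backbone, so data_preprocessor stays None) through MMOCR's model registry. The specific head, loss, postprocessor and dictionary settings (SDMGRHead, SDMGRModuleLoss, SDMGRPostProcessor, num_classes, the dictionary file path) are assumptions modelled on typical MMOCR KIE configs, not values stated on this page.

```python
from mmocr.registry import MODELS  # registry import; assumes the MMOCR 1.x layout

# Hypothetical config for a non-visual SDMGR: backbone, roi_extractor and neck
# are omitted, so extract_feat() will return None and data_preprocessor must be None.
model_cfg = dict(
    type='SDMGR',
    kie_head=dict(
        type='SDMGRHead',                          # assumed head name
        num_classes=26,                            # assumed number of entity classes
        module_loss=dict(type='SDMGRModuleLoss'),
        postprocessor=dict(type='SDMGRPostProcessor')),
    dictionary=dict(
        type='Dictionary',
        dict_file='path/to/sdmgr_dict.txt',        # placeholder path
        with_padding=True),
)

model = MODELS.build(model_cfg)
```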
- extract_feat(img, gt_bboxes)[source]
Extract features from images if self.backbone is not None. It returns None otherwise.
- Parameters
img (torch.Tensor) – The input image with shape (N, C, H, W).
gt_bboxes (list[torch.Tensor]) – A list of ground truth bounding boxes, each of shape (N_i, 4).
- Returns
The extracted features with shape (N, E).
- Return type
torch.Tensor
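A minimal sketch of calling extract_feat on a visual model; the model variable, image size and box counts are illustrative assumptions.

```python
import torch

# Assumes `model` is an SDMGR instance configured WITH a backbone/roi_extractor.
imgs = torch.randn(2, 3, 512, 512)                             # (N, C, H, W)
gt_bboxes = [torch.rand(5, 4) * 512, torch.rand(3, 4) * 512]   # one (N_i, 4) tensor per image

feats = model.extract_feat(imgs, gt_bboxes)  # (N, E) features, or None when backbone is None
```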
- forward(inputs, data_samples=None, mode='tensor', **kwargs)[source]
The unified entry for a forward process in both training and test.
The method should accept three modes: "tensor", "predict" and "loss":
- "tensor": Forward the whole network and return a tensor or tuple of tensors without any post-processing, same as a common nn.Module.
- "predict": Forward and return the predictions, which are fully processed to a list of DetDataSample.
- "loss": Forward and return a dict of losses according to the given inputs and data samples.
Note that this method handles neither back propagation nor optimizer updating; these are done in train_step().
- Parameters
inputs (torch.Tensor) – The input tensor with shape (N, C, …) in general.
data_samples (list[DetDataSample], optional) – The annotation data of every sample. Defaults to None.
mode (str) – Return what kind of value. Defaults to 'tensor'.
- Returns
The return type depends on mode:
- If mode="tensor", return a tensor or a tuple of tensors.
- If mode="predict", return a list of DetDataSample.
- If mode="loss", return a dict of tensors.
- Return type
torch.Tensor, list[DetDataSample], or dict, depending on mode
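A sketch of the three dispatch modes; `model` and `data_samples` are assumed placeholders (a built SDMGR and a list of prepared KIEDataSample objects with ground-truth instances populated).

```python
import torch

imgs = torch.randn(2, 3, 512, 512)  # placeholder batch of images

raw = model(imgs, data_samples, mode='tensor')     # raw outputs, no post-processing
preds = model(imgs, data_samples, mode='predict')  # list of fully processed data samples
losses = model(imgs, data_samples, mode='loss')    # dict of loss tensors for training
```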
- loss(inputs, data_samples, **kwargs)[source]
Calculate losses from a batch of inputs and data samples.
- Parameters
inputs (torch.Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
data_samples (list[KIEDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images.
- Returns
A dictionary of loss components.
- Return type
dict[str, torch.Tensor]
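A sketch of a manual optimization step built on loss(); `model`, `optimizer`, `batch_imgs` and `batch_data_samples` are assumed placeholders, and in practice mmengine's train_step() wraps this logic.

```python
loss_dict = model.loss(batch_imgs, batch_data_samples)  # dict of named loss tensors
total_loss = sum(v for v in loss_dict.values())         # reduce to a single scalar

optimizer.zero_grad()
total_loss.backward()
optimizer.step()
```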
- predict(inputs, data_samples, **kwargs)[source]
Predict results from a batch of inputs and data samples with post-processing.
- Parameters
inputs (torch.Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.
data_samples (list[KIEDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images.
- Returns
A list of datasamples of prediction results. Results are stored in pred_instances.labels and pred_instances.edge_labels.
- Return type
List[KIEDataSample]
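A sketch of running inference and reading the prediction fields named above; `model`, `batch_imgs` and `batch_data_samples` are assumed placeholders.

```python
import torch

with torch.no_grad():
    results = model.predict(batch_imgs, batch_data_samples)

for sample in results:
    node_labels = sample.pred_instances.labels        # per-text-instance class labels
    edge_labels = sample.pred_instances.edge_labels   # pairwise key-value linking labels
```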