
SDMGR

class mmocr.models.kie.SDMGR(backbone=None, roi_extractor=None, neck=None, kie_head=None, dictionary=None, data_preprocessor=None, init_cfg=None)[source]

The implementation of the paper Spatial Dual-Modality Graph Reasoning for Key Information Extraction (https://arxiv.org/abs/2103.14470).

Parameters
  • backbone (dict, optional) – Config of backbone. If None, None will be passed to kie_head during training and testing (non-visual mode). Defaults to None.

  • roi_extractor (dict, optional) – Config of roi extractor. Only applicable when backbone is not None. Defaults to None.

  • neck (dict, optional) – Config of neck. Defaults to None.

  • kie_head (dict) – Config of KIE head. Defaults to None.

  • dictionary (dict, optional) – Config of dictionary. Defaults to None.

  • data_preprocessor (dict or ConfigDict, optional) – The pre-process config of BaseDataPreprocessor. It usually includes pad_size_divisor, pad_value, mean, and std. It has to be None when working in non-visual mode. Defaults to None.

  • init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.

Return type

None
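
As a hedged illustration of these parameters, a minimal non-visual configuration might look like the sketch below. SDMGRHead and Dictionary are registered model components in MMOCR 1.x, but the field values shown (in particular the dictionary file path) are placeholders, not the reference config.

    # Hedged config sketch for a non-visual SDMGR model. The dict file path
    # is a placeholder and the head options are omitted; consult the shipped
    # SDMGR configs for the reference settings.
    model = dict(
        type='SDMGR',
        backbone=None,             # non-visual mode: no image features
        roi_extractor=None,        # only applicable when backbone is set
        kie_head=dict(type='SDMGRHead'),
        dictionary=dict(
            type='Dictionary',
            dict_file='path/to/dict.txt',  # placeholder path
        ),
        data_preprocessor=None,    # has to be None in non-visual mode
    )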

extract_feat(img, gt_bboxes)[source]

Extract features from images if self.backbone is not None; otherwise, return None.

Parameters
  • img (torch.Tensor) – The input image with shape (N, C, H, W).

  • gt_bboxes (list[torch.Tensor]) – A list of ground truth bounding boxes, each of shape (N_i, 4).

Returns

The extracted features with shape (N, E), or None if self.backbone is None.

Return type

torch.Tensor or None
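
A shape sketch of calling extract_feat, assuming model is an already-built SDMGR instance with a backbone configured; the tensors below are random illustrative data, not real inputs.

    import torch

    # Illustrative shapes only; `model` is assumed to be a built SDMGR
    # instance (e.g. via mmocr.registry.MODELS.build) with a backbone.
    img = torch.randn(2, 3, 512, 512)                 # (N, C, H, W), batch of two images
    gt_bboxes = [torch.rand(5, 4), torch.rand(3, 4)]  # one (N_i, 4) box tensor per image
    feats = model.extract_feat(img, gt_bboxes)        # (N, E) features; None without a backbone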

forward(inputs, data_samples=None, mode='tensor', **kwargs)[source]

The unified entry for a forward process in both training and test.

The method should accept three modes: “tensor”, “predict” and “loss”:

  • “tensor”: Forward the whole network and return a tensor or a tuple of tensors without any post-processing, same as a common nn.Module.

  • “predict”: Forward and return the predictions, which are fully processed to a list of DetDataSample.

  • “loss”: Forward and return a dict of losses according to the given inputs and data samples.

Note that this method handles neither back-propagation nor optimizer updating; both are done in train_step().

Parameters
  • inputs (torch.Tensor) – The input tensor with shape (N, C, …) in general.

  • data_samples (list[DetDataSample], optional) – The annotation data of every sample. Defaults to None.

  • mode (str) – Which kind of value to return; one of 'tensor', 'predict', or 'loss'. Defaults to 'tensor'.

Returns

The return type depends on mode.

  • If mode="tensor", return a tensor or a tuple of tensors.

  • If mode="predict", return a list of DetDataSample.

  • If mode="loss", return a dict of tensor.

Return type

torch.Tensor, list[DetDataSample], or dict, depending on mode
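
A hedged usage sketch of the three modes, assuming model, inputs, and data_samples have been prepared as described in the parameter list above.

    # One entry point, three return types depending on `mode`.
    raw = model(inputs, data_samples, mode='tensor')     # tensor or tuple of tensors
    preds = model(inputs, data_samples, mode='predict')  # list of post-processed data samples
    losses = model(inputs, data_samples, mode='loss')    # dict of loss tensors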

loss(inputs, data_samples, **kwargs)[source]

Calculate losses from a batch of inputs and data samples.

Parameters
  • inputs (torch.Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.

  • data_samples (list[KIEDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images.

Returns

A dictionary of loss components.

Return type

dict[str, Tensor]
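
For illustration, the usual MMEngine convention is to sum the entries whose keys contain 'loss' into a single scalar for back-propagation; the sketch below assumes each value in the returned dict is a scalar tensor.

    # Combine per-component losses; keys without 'loss' in their name
    # (e.g. accuracies) are typically logged only, not back-propagated.
    losses = model.loss(inputs, data_samples)
    total_loss = sum(v for k, v in losses.items() if 'loss' in k)
    total_loss.backward()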

predict(inputs, data_samples, **kwargs)[source]

Predict results from a batch of inputs and data samples with post-processing.

Parameters
  • inputs (torch.Tensor) – Input images of shape (N, C, H, W). Typically these should be mean centered and std scaled.

  • data_samples (list[KIEDataSample]) – A list of N datasamples, containing meta information and gold annotations for each of the images.

Returns

A list of datasamples of prediction results. Results are stored in pred_instances.labels and pred_instances.edge_labels.

Return type

List[KIEDataSample]
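
A short sketch of consuming the predictions; per the docstring, node and edge results live in pred_instances. Here model, inputs, and data_samples are assumed to be prepared as above.

    # Node labels classify each text region; edge labels mark key-value links.
    results = model.predict(inputs, data_samples)
    for sample in results:
        node_labels = sample.pred_instances.labels
        edge_labels = sample.pred_instances.edge_labels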
