MMDetWrapper

class mmocr.models.textdet.MMDetWrapper(cfg, text_repr_type='poly')

A wrapper of MMDet’s model.

Parameters

cfg (dict) – The config of the model.
text_repr_type (str) – The boundary encoding type, either ‘poly’ or ‘quad’. Defaults to ‘poly’.

Return type

None
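As a rough illustration of how the two constructor arguments fit together, the sketch below assembles a registry-style config dict of the kind OpenMMLab projects commonly use. The helper `build_wrapper_cfg` and the inner `type='MaskRCNN'` placeholder are assumptions made for this example and are not taken from this page; a real MMDet model config needs many more fields (backbone, neck, heads, and so on).

```python
# Hypothetical helper, not part of MMOCR: builds a config dict for MMDetWrapper.
# The allowed values mirror the documented options for text_repr_type.
ALLOWED_REPR_TYPES = ('poly', 'quad')


def build_wrapper_cfg(det_cfg, text_repr_type='poly'):
    """Return a registry-style config dict wrapping an MMDet model config."""
    if text_repr_type not in ALLOWED_REPR_TYPES:
        raise ValueError(
            f"text_repr_type must be one of {ALLOWED_REPR_TYPES}, "
            f"got {text_repr_type!r}")
    return dict(type='MMDetWrapper', cfg=det_cfg, text_repr_type=text_repr_type)


# Assumed placeholder: a real MMDet config carries many more fields.
det_cfg = dict(type='MaskRCNN')
model = build_wrapper_cfg(det_cfg, text_repr_type='quad')
```

Validating `text_repr_type` up front is a design choice of this sketch; whether the wrapper itself performs such a check is not stated here.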

adapt_predictions(data, data_samples)

Convert instance data from MMDet into MMOCR’s format.

Parameters

data (List[mmdet.structures.det_data_sample.DetDataSample]) – Detection results of the input images. Each DetDataSample usually contains ‘pred_instances’, which usually contains the following keys:
  - scores (Tensor): Classification scores, with shape (num_instances, ).
  - labels (Tensor): Labels of bboxes, with shape (num_instances, ).
  - bboxes (Tensor): With shape (num_instances, 4), where the last dimension is arranged as (x1, y1, x2, y2).
  - masks (Tensor, Optional): With shape (num_instances, H, W).
data_samples (list[TextDetDataSample]) – The annotation data of every sample.

Returns

A list of N data samples containing ground truth and prediction results. The polygon results are saved in TextDetDataSample.pred_instances.polygons. The confidence scores are saved in TextDetDataSample.pred_instances.scores.

Return type

list[TextDetDataSample]
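To make the documented layouts concrete, here is a minimal sketch that uses plain dicts and lists in place of DetDataSample and TextDetDataSample. Both functions are hypothetical and do not reproduce MMOCR’s implementation, which may derive polygons from masks rather than from boxes. The sketch only shows how a box arranged as (x1, y1, x2, y2) can be expanded into a flat polygon, and how polygons and scores end up side by side in the output.

```python
def bbox_to_polygon(bbox):
    """Expand (x1, y1, x2, y2) into a flat 8-value polygon.

    Vertex order: top-left, top-right, bottom-right, bottom-left.
    """
    x1, y1, x2, y2 = bbox
    return [x1, y1, x2, y1, x2, y2, x1, y2]


def adapt_predictions_sketch(det_results):
    """Toy stand-in: map per-image boxes and scores to polygons and scores."""
    adapted = []
    for result in det_results:
        polygons = [bbox_to_polygon(b) for b in result['bboxes']]
        adapted.append({'polygons': polygons,
                        'scores': list(result['scores'])})
    return adapted


# One image with a single detected box and its confidence score.
det_results = [{'bboxes': [(10, 20, 50, 40)], 'scores': [0.9]}]
adapted = adapt_predictions_sketch(det_results)
```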

forward(inputs, data_samples=None, mode='tensor', **kwargs)

The unified entry for a forward process in both training and test.

The method works in three modes: “tensor”, “predict” and “loss”:

- “tensor”: Forward the whole network and return a tensor or a tuple of tensors without any post-processing, same as a common nn.Module.
- “predict”: Forward and return the predictions, which are fully processed into a list of TextDetDataSample.
- “loss”: Forward and return a dict of losses according to the given inputs and data samples.

Note that this method handles neither backpropagation nor parameter updates, which are supposed to be done in train_step().

Parameters

inputs (torch.Tensor) – The input tensor with shape (N, C, …) in general.
data_samples (Optional[Union[List[mmocr.structures.textdet_data_sample.TextDetDataSample], List[mmdet.structures.det_data_sample.DetDataSample]]]) – The annotation data of every sample. When in “predict” mode, it should be a list of TextDetDataSample. Otherwise, it is a list of DetDataSample. Defaults to None.
mode (str) – Running mode. Defaults to ‘tensor’.

Returns

The return type depends on mode.

If mode="tensor", return a tensor or a tuple of tensors.
If mode="predict", return a list of TextDetDataSample.
If mode="loss", return a dict of tensors.

Return type

Union[Dict[str, torch.Tensor], List[mmdet.structures.det_data_sample.DetDataSample], Tuple[torch.Tensor], torch.Tensor]
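The three-mode dispatch described above can be sketched with a toy class. `ToyWrapper`, its `_extract` helper, and the arithmetic inside are all hypothetical; they use plain Python numbers instead of tensors and data samples, and only illustrate how one entry point can return a different type per mode.

```python
class ToyWrapper:
    """Hypothetical stand-in mimicking a three-mode forward()."""

    def _extract(self, inputs):
        # Placeholder for the network: doubles each input value.
        return [2 * x for x in inputs]

    def forward(self, inputs, data_samples=None, mode='tensor'):
        if mode == 'tensor':
            # Raw outputs, no post-processing.
            return self._extract(inputs)
        if mode == 'predict':
            # Post-processed results, one record per input.
            return [{'pred': feat} for feat in self._extract(inputs)]
        if mode == 'loss':
            if data_samples is None:
                raise ValueError("data_samples is required in 'loss' mode")
            feats = self._extract(inputs)
            # Mean absolute error against the targets, returned as a dict.
            total = sum(abs(f - t) for f, t in zip(feats, data_samples))
            return {'loss': total / len(feats)}
        raise RuntimeError(
            f"Invalid mode {mode!r}; expected 'tensor', 'predict' or 'loss'")


wrapper = ToyWrapper()
raw = wrapper.forward([1, 2])
preds = wrapper.forward([1, 2], mode='predict')
losses = wrapper.forward([1, 2], data_samples=[2, 5], mode='loss')
```

As in the documented method, this sketch performs no backpropagation or parameter update; a training loop would consume the returned loss dict.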