

class mmocr.models.textdet.MMDetWrapper(cfg, text_repr_type='poly')[source]

A wrapper of MMDet’s model.

  • cfg (dict) – The config of the model.

  • text_repr_type (str) – The boundary encoding type, either ‘poly’ or ‘quad’. Defaults to ‘poly’.
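As a rough illustration of how the two arguments fit together, the sketch below writes a model config in the dictionary style used by OpenMMLab config files. The inner `detector_cfg` is a deliberately truncated, hypothetical placeholder: a real MMDet model config carries many more fields (backbone, heads, train and test settings), none of which are shown here.

```python
# Hypothetical, heavily truncated MMDet detector config (placeholder only;
# a usable config needs the full backbone/head definitions).
detector_cfg = dict(type='MaskRCNN')

# Config for the wrapper: `cfg` carries the MMDet model config and
# `text_repr_type` selects the boundary encoding ('poly' or 'quad').
model = dict(
    type='MMDetWrapper',
    cfg=detector_cfg,
    text_repr_type='poly',
)
```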



adapt_predictions(data, data_samples)[source]

Convert instance data from MMDet into MMOCR’s format.

  • data (List[mmdet.structures.det_data_sample.DetDataSample]) – Detection results of the input images. Each DetDataSample usually contains ‘pred_instances’, which usually contains the following keys:

    • scores (Tensor): Classification scores, with shape (num_instances, ).

    • labels (Tensor): Labels of bboxes, with shape (num_instances, ).

    • bboxes (Tensor): Has a shape (num_instances, 4); the last dimension is arranged as (x1, y1, x2, y2).

    • masks (Tensor, Optional): Has a shape (num_instances, H, W).

  • data_samples (list[TextDetDataSample]) – The annotation data of every sample.


A list of N data samples containing ground truth and prediction results. The polygon results are saved in TextDetDataSample.pred_instances.polygons. The confidence scores are saved in TextDetDataSample.pred_instances.scores.
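The conversion can be pictured with a minimal sketch. It is not MMOCR's implementation: plain dicts and lists stand in for DetDataSample, TextDetDataSample and tensors, the helper names `bbox_to_polygon` and `adapt_box_predictions` are hypothetical, and only the box-only case is covered (predictions that carry masks are ignored here).

```python
from typing import Dict, List


def bbox_to_polygon(bbox: List[float]) -> List[float]:
    """Expand (x1, y1, x2, y2) into a flat 4-point polygon.

    Points run top-left, top-right, bottom-right, bottom-left.
    """
    x1, y1, x2, y2 = bbox
    return [x1, y1, x2, y1, x2, y2, x1, y2]


def adapt_box_predictions(det_results: List[Dict],
                          text_samples: List[Dict]) -> List[Dict]:
    """Copy box predictions into the text samples as polygons plus scores.

    Plain dicts are assumed stand-ins for the real data sample classes.
    """
    for det, sample in zip(det_results, text_samples):
        pred = det['pred_instances']
        sample['pred_instances'] = {
            'polygons': [bbox_to_polygon(b) for b in pred['bboxes']],
            'scores': list(pred['scores']),
        }
    return text_samples


det_results = [{'pred_instances': {'bboxes': [[0, 0, 10, 5]], 'scores': [0.9]}}]
text_samples = [{'metainfo': {'img_path': 'demo.jpg'}}]
adapted = adapt_box_predictions(det_results, text_samples)
print(adapted[0]['pred_instances']['polygons'])  # [[0, 0, 10, 0, 10, 5, 0, 5]]
```

The input samples are updated in place and returned, so any metadata they already hold is preserved alongside the new predictions.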



forward(inputs, data_samples=None, mode='tensor', **kwargs)[source]

The unified entry for a forward process in both training and test.

The method works in three modes: “tensor”, “predict” and “loss”:

  • “tensor”: Forward the whole network and return a tensor or a tuple of tensors without any post-processing, same as a common nn.Module.

  • “predict”: Forward and return the predictions, which are fully processed into a list of TextDetDataSample.

  • “loss”: Forward and return a dict of losses according to the given inputs and data samples.

Note that this method handles neither back propagation nor parameter updates, which are supposed to be done in train_step().


  • data_samples (list[DetDataSample] or list[TextDetDataSample]) – The annotation data of every sample. When in “predict” mode, it should be a list of TextDetDataSample. Otherwise, it should be a list of DetDataSample. Defaults to None.


The return type depends on mode.

  • If mode="tensor", return a tensor or a tuple of tensors.

  • If mode="predict", return a list of TextDetDataSample.

  • If mode="loss", return a dict of tensors.


Union[Dict[str, torch.Tensor], List[mmdet.structures.det_data_sample.DetDataSample], Tuple[torch.Tensor], torch.Tensor]
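To make the three-mode contract concrete, here is a toy sketch of the dispatch pattern. `ToyModel` is a hypothetical stand-in, not MMOCR or MMDet code: lists of floats replace tensors, dicts replace data samples, and the "network" simply doubles its inputs.

```python
from typing import Dict, List, Optional, Union


class ToyModel:
    """Hypothetical stand-in that mimics the three-mode forward contract."""

    def forward(
        self,
        inputs: List[float],
        data_samples: Optional[List[Dict]] = None,
        mode: str = 'tensor',
    ) -> Union[List[float], List[Dict], Dict[str, float]]:
        feats = [2.0 * x for x in inputs]  # pretend network output
        if mode == 'tensor':
            # Raw outputs, no post-processing.
            return feats
        if mode == 'predict':
            # Attach one prediction to a copy of each data sample.
            return [dict(s, pred=f) for s, f in zip(data_samples, feats)]
        if mode == 'loss':
            # Mean squared error against the ground truth in the samples.
            targets = [s['gt'] for s in data_samples]
            mse = sum((f - t) ** 2 for f, t in zip(feats, targets)) / len(feats)
            return {'loss': mse}
        raise ValueError(f'Invalid mode "{mode}"')


model = ToyModel()
samples = [{'gt': 2.0}, {'gt': 5.0}]
print(model.forward([1.0, 2.0]))                        # [2.0, 4.0]
print(model.forward([1.0, 2.0], samples, 'predict'))    # samples with 'pred' added
print(model.forward([1.0, 2.0], samples, mode='loss'))  # {'loss': 0.5}
```

As in the real method, the sketch only computes and returns values; any gradient step or parameter update would live in a separate training routine.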