

class mmocr.datasets.transforms.LoadOCRAnnotations(with_bbox=False, with_label=False, with_polygon=False, with_text=False, **kwargs)[source]

Load and process the instances annotation provided by the dataset.

Each item in instances is annotated in the following format:

        # List of 4 numbers representing the bounding box of the
        # instance, in (x1, y1, x2, y2) order.
        # Used in text detection or text spotting tasks.
        'bbox': [x1, y1, x2, y2],

        # Label of the instance, usually 0.
        # Used in text detection or text spotting tasks.
        'bbox_label': 0,

        # List of 2n numbers representing the n vertices of the
        # instance's polygon, in (x1, y1, ..., xn, yn) order.
        # Used in text detection or text spotting tasks.
        'polygon': [x1, y1, x2, y2, ... xn, yn],

        # The flag indicating whether the instance should be ignored.
        # Used in text detection or text spotting tasks.
        'ignore': False,

        # The ground truth text of the instance.
        # Used in text recognition or text spotting tasks.
        'text': 'tmp',

After this transform, the annotation is converted to the format below:

    # In (x1, y1, x2, y2) order, np.float32 type. N is the number of bboxes.
    'gt_bboxes': np.ndarray(N, 4)
    # In np.int64 type.
    'gt_bboxes_labels': np.ndarray(N, )
    # In (x1, y1, ..., xk, yk) order, np.float32 type.
    'gt_polygons': list[np.ndarray(2k, )]
    # In np.bool_ type.
    'gt_ignored': np.ndarray(N, )
    # In list[str] type.
    'gt_texts': list[str]
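
The conversion described above can be sketched in plain NumPy. This is only an illustration of the input and output formats, not MMOCR's implementation: the function name load_ocr_annotations_sketch is hypothetical, and the sketch assumes every field is present on every instance, as if all of with_bbox, with_label, with_polygon and with_text were enabled.

```python
import numpy as np


def load_ocr_annotations_sketch(results):
    """Hypothetical stand-in for the transform; for illustration only."""
    instances = results['instances']
    out = dict(results)  # shallow copy so the input dict is left untouched
    # (N, 4) float32 array of boxes in (x1, y1, x2, y2) order.
    out['gt_bboxes'] = np.array(
        [inst['bbox'] for inst in instances],
        dtype=np.float32).reshape(-1, 4)
    # (N, ) int64 array of box labels.
    out['gt_bboxes_labels'] = np.array(
        [inst['bbox_label'] for inst in instances], dtype=np.int64)
    # One flat float32 array of length 2k per polygon.
    out['gt_polygons'] = [
        np.array(inst['polygon'], dtype=np.float32) for inst in instances
    ]
    # (N, ) boolean array of ignore flags.
    out['gt_ignored'] = np.array(
        [inst['ignore'] for inst in instances], dtype=np.bool_)
    # Plain list of transcriptions.
    out['gt_texts'] = [inst['text'] for inst in instances]
    return out


results = {
    'instances': [
        {'bbox': [0, 0, 10, 20], 'bbox_label': 0,
         'polygon': [0, 0, 10, 0, 10, 20, 0, 20],
         'ignore': False, 'text': 'tmp'},
        {'bbox': [5, 5, 30, 15], 'bbox_label': 0,
         'polygon': [5, 5, 30, 5, 30, 15, 5, 15],
         'ignore': True, 'text': 'abc'},
    ]
}
out = load_ocr_annotations_sketch(results)
```

Polygons are kept as a list rather than stacked into one array because different instances may have different numbers of vertices.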

Required Keys:

  • instances

    • bbox (optional)

    • bbox_label (optional)

    • polygon (optional)

    • ignore (optional)

    • text (optional)

Added Keys:

  • gt_bboxes (np.float32)

  • gt_bboxes_labels (np.int64)

  • gt_polygons (list[np.float32])

  • gt_ignored (np.bool_)

  • gt_texts (list[str])

Parameters:

  • with_bbox (bool) – Whether to parse and load the bbox annotation. Defaults to False.

  • with_label (bool) – Whether to parse and load the label annotation. Defaults to False.

  • with_polygon (bool) – Whether to parse and load the polygon annotation. Defaults to False.

  • with_text (bool) – Whether to parse and load the text annotation. Defaults to False.
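
As a rough usage sketch, the flags are typically set in a pipeline config entry. The example below assumes the usual OpenMMLab convention in which the type key names the registered transform; the variable names load_det_ann and load_recog_ann are illustrative only.

```python
# Text detection: boxes, labels and polygons are needed; transcriptions
# are not, so with_text keeps its default of False.
load_det_ann = dict(
    type='LoadOCRAnnotations',
    with_bbox=True,
    with_label=True,
    with_polygon=True)

# Text recognition: only the transcription is needed.
load_recog_ann = dict(type='LoadOCRAnnotations', with_text=True)
```

Since every flag defaults to False, an entry that sets none of them would load no annotations, so each task enables only the fields it consumes.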




Function to load multiple types of annotations.


results (dict) – Result dict from OCRDataset.


The dict containing the loaded bounding box, label, polygon and text annotations.

