LoadOCRAnnotations

class mmocr.datasets.transforms.LoadOCRAnnotations(with_bbox=False, with_label=False, with_polygon=False, with_text=False, **kwargs)

Load and process the instance annotations provided by the dataset.

The annotation format is as follows:

{
    'instances':
    [
        {
        # List of 4 numbers representing the bounding box of the
        # instance, in (x1, y1, x2, y2) order.
        # Used in text detection or text spotting tasks.
        'bbox': [x1, y1, x2, y2],

        # Label of the instance, usually 0.
        # Used in text detection or text spotting tasks.
        'bbox_label': 0,

        # List of 2n numbers representing the polygon of the
        # instance, in (x1, y1, ..., xn, yn) order.
        # Used in text detection or text spotting tasks.
        'polygon': [x1, y1, x2, y2, ..., xn, yn],

        # Flag indicating whether the instance should be ignored.
        # Used in text detection or text spotting tasks.
        'ignore': False,

        # The ground-truth text.
        # Used in text recognition or text spotting tasks.
        'text': 'tmp',
        }
    ]
}

After this transform is applied, the annotations are converted to the format below:

{
    # In (x1, y1, x2, y2) order, np.float32 type.
    # N is the number of bboxes.
    'gt_bboxes': np.ndarray(N, 4),
    # In np.int64 type.
    'gt_bboxes_labels': np.ndarray(N, ),
    # In (x1, y1, ..., xk, yk) order, each array in np.float32 type.
    'gt_polygons': list[np.ndarray(2k, )],
    # In np.bool_ type.
    'gt_ignored': np.ndarray(N, ),
    # The ground-truth text of each instance.
    'gt_texts': list[str],
}
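The conversion between the two formats can be illustrated with a small NumPy-only sketch. This is not MMOCR's implementation: the function name `sketch_load_ocr_annotations` is hypothetical, and two behaviours are assumptions made for the illustration, namely that `gt_ignored` is produced regardless of the `with_*` switches and that a missing `ignore` key is treated as False.

```python
import numpy as np


def sketch_load_ocr_annotations(results, with_bbox=False, with_label=False,
                                with_polygon=False, with_text=False):
    """Illustrative re-implementation of the documented format change."""
    instances = results['instances']
    out = dict(results)  # shallow copy, so the input dict is not modified

    if with_bbox:
        # (N, 4) array in (x1, y1, x2, y2) order.
        out['gt_bboxes'] = np.array(
            [inst['bbox'] for inst in instances],
            dtype=np.float32).reshape(-1, 4)
    if with_label:
        out['gt_bboxes_labels'] = np.array(
            [inst['bbox_label'] for inst in instances], dtype=np.int64)
    if with_polygon:
        # One flat (2k,) array per instance; k may differ between instances.
        out['gt_polygons'] = [
            np.array(inst['polygon'], dtype=np.float32)
            for inst in instances
        ]
    if with_text:
        out['gt_texts'] = [inst['text'] for inst in instances]

    # Assumption: ignore flags are always emitted, defaulting to False.
    out['gt_ignored'] = np.array(
        [inst.get('ignore', False) for inst in instances], dtype=np.bool_)
    return out


sample = {
    'instances': [
        {'bbox': [0, 0, 10, 5], 'bbox_label': 0,
         'polygon': [0, 0, 10, 0, 10, 5, 0, 5],
         'ignore': False, 'text': 'tmp'},
        {'bbox': [2, 3, 8, 9], 'bbox_label': 0,
         'polygon': [2, 3, 8, 3, 8, 9, 2, 9],
         'ignore': True, 'text': 'abc'},
    ]
}
loaded = sketch_load_ocr_annotations(
    sample, with_bbox=True, with_label=True,
    with_polygon=True, with_text=True)
print(loaded['gt_bboxes'].shape, loaded['gt_bboxes'].dtype)
print(loaded['gt_texts'])
```

Polygons stay in a Python list rather than a single array because each instance may have a different number of vertices.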

Required Keys:

  • instances

    • bbox (optional)

    • bbox_label (optional)

    • polygon (optional)

    • ignore (optional)

    • text (optional)

Added Keys:

  • gt_bboxes (np.float32)

  • gt_bboxes_labels (np.int64)

  • gt_polygons (list[np.float32])

  • gt_ignored (np.bool_)

  • gt_texts (list[str])

Parameters
  • with_bbox (bool) – Whether to parse and load the bbox annotation. Defaults to False.

  • with_label (bool) – Whether to parse and load the label annotation. Defaults to False.

  • with_polygon (bool) – Whether to parse and load the polygon annotation. Defaults to False.

  • with_text (bool) – Whether to parse and load the text annotation. Defaults to False.
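As a usage illustration, the flags above might be set in a data pipeline config as follows. This is a sketch assuming the usual OpenMMLab convention of describing each transform as a `dict(type=...)` entry; the variable name `train_pipeline` and the choice of flags (a text detection setup that does not need transcriptions) are illustrative rather than taken from this page.

```python
# Hypothetical pipeline fragment for a text detection config.
train_pipeline = [
    # Load the image first so later transforms can use it.
    dict(type='LoadImageFromFile'),
    # Parse bboxes, labels and polygons; with_text keeps its default (False).
    dict(
        type='LoadOCRAnnotations',
        with_bbox=True,
        with_label=True,
        with_polygon=True,
    ),
]
```

For a text recognition pipeline, `with_text=True` would be the relevant flag instead, since only the transcription is needed there.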

Return type

None

transform(results)

Load multiple types of annotations.

Parameters

results (dict) – Result dict from OCRDataset.

Returns

The dict containing the loaded bounding box, label, polygon and text annotations.

Return type

dict
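When debugging a pipeline, it can help to check that a returned dict matches the documented dtypes and shapes. The helper below is a hypothetical sketch, not part of MMOCR; it uses only NumPy and inspects whichever of the documented keys are present.

```python
import numpy as np


def find_format_problems(results):
    """Return a list of mismatches against the documented output format."""
    problems = []

    bboxes = results.get('gt_bboxes')
    if bboxes is not None:
        ok = (isinstance(bboxes, np.ndarray) and bboxes.dtype == np.float32
              and bboxes.ndim == 2 and bboxes.shape[1] == 4)
        if not ok:
            problems.append('gt_bboxes should be a float32 array of shape (N, 4)')

    labels = results.get('gt_bboxes_labels')
    if labels is not None:
        if not (isinstance(labels, np.ndarray) and labels.dtype == np.int64):
            problems.append('gt_bboxes_labels should be an int64 array')

    # Each polygon is a flat array of x, y pairs, so its size must be even.
    for i, poly in enumerate(results.get('gt_polygons', [])):
        ok = (isinstance(poly, np.ndarray) and poly.dtype == np.float32
              and poly.ndim == 1 and poly.size % 2 == 0)
        if not ok:
            problems.append(f'gt_polygons[{i}] should be a flat float32 array of even size')

    ignored = results.get('gt_ignored')
    if ignored is not None:
        if not (isinstance(ignored, np.ndarray) and ignored.dtype == np.bool_):
            problems.append('gt_ignored should be a bool array')

    texts = results.get('gt_texts')
    if texts is not None:
        if not all(isinstance(t, str) for t in texts):
            problems.append('gt_texts should be a list of str')

    return problems


example = {
    'gt_bboxes': np.zeros((2, 4), dtype=np.float32),
    'gt_ignored': np.array([False, True]),
    'gt_texts': ['tmp', 'abc'],
}
print(find_format_problems(example))
```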
