DRRGHead

class mmocr.models.textdet.DRRGHead(in_channels, k_at_hops=(8, 4), num_adjacent_linkages=3, node_geo_feat_len=120, pooling_scale=1.0, pooling_output_size=(4, 3), nms_thr=0.3, min_width=8.0, max_width=24.0, comp_shrink_ratio=1.03, comp_ratio=0.4, comp_score_thr=0.3, text_region_thr=0.2, center_region_thr=0.2, center_region_area_thr=50, local_graph_thr=0.7, module_loss={'type': 'DRRGModuleLoss'}, postprocessor={'link_thr': 0.85, 'type': 'DRRGPostprocessor'}, init_cfg={'mean': 0, 'override': {'name': 'out_conv'}, 'std': 0.01, 'type': 'Normal'})[source]

The class for DRRG head: Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection.

Parameters
  • in_channels (int) – The number of input channels.

  • k_at_hops (tuple(int)) – The number of i-hop neighbors, i = 1, 2. Defaults to (8, 4).

  • num_adjacent_linkages (int) – The number of linkages when constructing the adjacency matrix. Defaults to 3.

  • node_geo_feat_len (int) – The length of embedded geometric feature vector of a component. Defaults to 120.

  • pooling_scale (float) – The spatial scale of rotated RoI-Align. Defaults to 1.0.

  • pooling_output_size (tuple(int)) – The output size of RRoI-Aligning. Defaults to (4, 3).

  • nms_thr (float) – The locality-aware NMS threshold of text components. Defaults to 0.3.

  • min_width (float) – The minimum width of text components. Defaults to 8.0.

  • max_width (float) – The maximum width of text components. Defaults to 24.0.

  • comp_shrink_ratio (float) – The shrink ratio of text components. Defaults to 1.03.

  • comp_ratio (float) – The reciprocal of aspect ratio of text components. Defaults to 0.4.

  • comp_score_thr (float) – The score threshold of text components. Defaults to 0.3.

  • text_region_thr (float) – The threshold for text region probability map. Defaults to 0.2.

  • center_region_thr (float) – The threshold for text center region probability map. Defaults to 0.2.

  • center_region_area_thr (int) – The threshold for filtering small-sized text center region. Defaults to 50.

  • local_graph_thr (float) – The threshold to filter identical local graphs. Defaults to 0.7.

  • module_loss (dict) – The config of loss that DRRGHead uses. Defaults to dict(type='DRRGModuleLoss').

  • postprocessor (dict) – The config of the postprocessor for DRRG. Defaults to dict(type='DRRGPostprocessor', link_thr=0.85).

  • init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to dict(type='Normal', override=dict(name='out_conv'), mean=0, std=0.01).
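
As an illustration of how the parameters above fit together, the following is a minimal sketch of a config fragment for this head, written in the dict style used by the documented defaults. The value `in_channels=32` is an assumption made for the example only; it has to match the channel count of whatever feature map is fed to the head. All other values are the documented defaults.

```python
# Hypothetical config fragment for DRRGHead (a sketch, not an official config).
# in_channels=32 is an assumed value; every other value is a documented default.
det_head = dict(
    type='DRRGHead',
    in_channels=32,               # assumption: must match the incoming feature map
    k_at_hops=(8, 4),             # 8 one-hop neighbors, 4 two-hop neighbors
    num_adjacent_linkages=3,
    node_geo_feat_len=120,
    pooling_scale=1.0,
    pooling_output_size=(4, 3),
    nms_thr=0.3,
    min_width=8.0,
    max_width=24.0,
    comp_shrink_ratio=1.03,
    comp_ratio=0.4,
    comp_score_thr=0.3,
    text_region_thr=0.2,
    center_region_thr=0.2,
    center_region_area_thr=50,
    local_graph_thr=0.7,
    module_loss=dict(type='DRRGModuleLoss'),
    postprocessor=dict(type='DRRGPostprocessor', link_thr=0.85),
)

# Simple sanity checks on the values before handing the dict to a model builder.
assert det_head['min_width'] <= det_head['max_width']
assert len(det_head['k_at_hops']) == 2
```

Spelling out the defaults is optional; listing them mainly makes it easier to see which thresholds are available for tuning.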

Return type

None

forward(inputs, data_samples=None)[source]

Run DRRG head in prediction mode, and return the raw tensors only.

Parameters
  • inputs (Tensor) – Shape of \((1, C, H, W)\).

  • data_samples (list, optional) – A list of data samples. Defaults to None.

Returns

A tuple (edge, score, text_comps):

  • edge (ndarray): The edge array of shape \((N_{edges}, 2)\) where each row is a pair of text component indices that makes up an edge in graph.

  • score (ndarray): The score array of shape \((N_{edges},)\), corresponding to the edge above.

  • text_comps (ndarray): The text components of shape \((M, 9)\) where each row corresponds to one box and its score: (x1, y1, x2, y2, x3, y3, x4, y4, score).
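
To make the layout of the three returned arrays concrete, here is a small NumPy sketch that consumes arrays with the documented shapes. It splits `text_comps` into corner points and scores, then groups components whose linking edge scores at least `link_thr`, using a union-find. The helper `group_components` and the toy arrays are hypothetical and exist only for this example; this is not the `DRRGPostprocessor` implementation, only one plausible way to read the outputs.

```python
import numpy as np


def group_components(edge, score, num_comps, link_thr=0.85):
    """Group component indices joined by edges with score >= link_thr.

    Hypothetical helper for illustration; uses a plain union-find.
    """
    parent = list(range(num_comps))

    def find(i):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (a, b), s in zip(edge, score):
        if s >= link_thr:
            root_a, root_b = find(int(a)), find(int(b))
            if root_a != root_b:
                parent[root_b] = root_a

    groups = {}
    for i in range(num_comps):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data with the documented shapes: (N_edges, 2), (N_edges,), (M, 9).
edge = np.array([[0, 1], [1, 2], [2, 3]])
score = np.array([0.95, 0.90, 0.40])
text_comps = np.zeros((4, 9), dtype=np.float32)

# Each row is (x1, y1, x2, y2, x3, y3, x4, y4, score).
boxes = text_comps[:, :8].reshape(-1, 4, 2)   # (M, 4, 2) corner points
comp_scores = text_comps[:, 8]                # (M,) component scores

groups = group_components(edge, score, num_comps=len(text_comps))
print(groups)  # [[0, 1, 2], [3]]
```

The edge between components 2 and 3 scores 0.40, below the threshold, so component 3 ends up in its own group.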

Return type

tuple

loss(inputs, data_samples)[source]

Loss function.

Parameters
  • inputs (Tensor) – Shape of \((N, C, H, W)\).

  • data_samples (List[TextDetDataSample]) – List of data samples.

Returns

  • pred_maps (Tensor): Prediction map with shape \((N, 6, H, W)\).

  • gcn_pred (Tensor): Prediction from GCN module, with shape \((N, 2)\).

  • gt_labels (Tensor): Ground-truth label of shape \((m, n)\) where \(m * n = N\).

Return type

tuple(pred_maps, gcn_pred, gt_labels)
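
The shape relation between `gcn_pred` and `gt_labels` can be checked with a short NumPy sketch. It assumes that flattening `gt_labels` row by row lines each label up with one row of `gcn_pred`; that alignment is an assumption made for illustration, and the toy values below are made up. NumPy arrays stand in for the tensors so the snippet runs without a deep learning framework.

```python
import numpy as np

# Toy sizes: gt_labels has shape (m, n) and gcn_pred has shape (N, 2), N = m * n.
m, n = 2, 3
N = m * n

# Made-up two-class scores: put a high score on one class per row.
gcn_pred = np.zeros((N, 2), dtype=np.float32)
gcn_pred[np.arange(N), [1, 0, 1, 1, 0, 0]] = 5.0

gt_labels = np.array([[1, 0, 1],
                      [0, 0, 0]])

# Assumption: row-major flattening aligns labels with rows of gcn_pred.
flat_labels = gt_labels.reshape(-1)
assert flat_labels.shape[0] == gcn_pred.shape[0]

# Compare the predicted class of each row against its label.
pred_classes = gcn_pred.argmax(axis=1)
accuracy = float((pred_classes == flat_labels).mean())
```

With these values, five of the six predicted classes match their labels.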
