DRRGModuleLoss

class mmocr.models.textdet.DRRGModuleLoss(ohem_ratio=3.0, downsample_ratio=1.0, orientation_thr=2.0, resample_step=8.0, num_min_comps=9, num_max_comps=600, min_width=8.0, max_width=24.0, center_region_shrink_ratio=0.3, comp_shrink_ratio=1.0, comp_w_h_ratio=0.3, text_comp_nms_thr=0.25, min_rand_half_height=8.0, max_rand_half_height=24.0, jitter_level=0.2, loss_text={'eps': 1e-05, 'fallback_negative_num': 100, 'type': 'MaskedBalancedBCEWithLogitsLoss'}, loss_center={'type': 'MaskedBCEWithLogitsLoss'}, loss_top={'reduction': 'none', 'type': 'SmoothL1Loss'}, loss_btm={'reduction': 'none', 'type': 'SmoothL1Loss'}, loss_sin={'type': 'MaskedSmoothL1Loss'}, loss_cos={'type': 'MaskedSmoothL1Loss'}, loss_gcn={'type': 'CrossEntropyLoss'})

Implements the DRRG loss. Partially adapted from https://github.com/GXYM/DRRG, which is licensed under the MIT license.

DRRG: Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection.

Parameters
  • ohem_ratio (float) – The negative-to-positive sample ratio used in OHEM (online hard example mining). Defaults to 3.0.

  • downsample_ratio (float) – Downsample ratio. Defaults to 1.0. TODO: remove it.

  • orientation_thr (float) – The threshold for distinguishing between head edge and tail edge among the horizontal and vertical edges of a quadrangle. Defaults to 2.0.

  • resample_step (float) – The step size for resampling the text center line. Defaults to 8.0.

  • num_min_comps (int) – The minimum number of text components, which should be larger than k_hop1 as defined in the paper. Defaults to 9.

  • num_max_comps (int) – The maximum number of text components. Defaults to 600.

  • min_width (float) – The minimum width of text components. Defaults to 8.0.

  • max_width (float) – The maximum width of text components. Defaults to 24.0.

  • center_region_shrink_ratio (float) – The shrink ratio of text center regions. Defaults to 0.3.

  • comp_shrink_ratio (float) – The shrink ratio of text components. Defaults to 1.0.

  • comp_w_h_ratio (float) – The width to height ratio of text components. Defaults to 0.3.

  • min_rand_half_height (float) – The minimum half-height of random text components. Defaults to 8.0.

  • max_rand_half_height (float) – The maximum half-height of random text components. Defaults to 24.0.

  • jitter_level (float) – The jitter level of text component geometric features. Defaults to 0.2.

  • loss_text (dict) – The loss config used to calculate the text loss. Defaults to dict(type='MaskedBalancedBCEWithLogitsLoss', fallback_negative_num=100, eps=1e-5).

  • loss_center (dict) – The loss config used to calculate the center loss. Defaults to dict(type='MaskedBCEWithLogitsLoss').

  • loss_top (dict) – The loss config used to calculate the top loss, which is a part of the height loss. Defaults to dict(type='SmoothL1Loss', reduction='none').

  • loss_btm (dict) – The loss config used to calculate the bottom loss, which is a part of the height loss. Defaults to dict(type='SmoothL1Loss', reduction='none').

  • loss_sin (dict) – The loss config used to calculate the sin loss. Defaults to dict(type='MaskedSmoothL1Loss').

  • loss_cos (dict) – The loss config used to calculate the cos loss. Defaults to dict(type='MaskedSmoothL1Loss').

  • loss_gcn (dict) – The loss config used to calculate the GCN loss. Defaults to dict(type='CrossEntropyLoss').

  • text_comp_nms_thr (float) – The NMS threshold for text components. Defaults to 0.25.
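
To make the ohem_ratio parameter concrete, the sketch below shows a generic form of online hard example mining: for a given number of positives, at most ohem_ratio times as many negatives are kept, chosen by highest loss. This is a plain-Python illustration and not MMOCR's implementation. The function name select_ohem_negatives is hypothetical, and using a fallback count when there are no positives is only an assumption about what the fallback_negative_num key in loss_text means.

```python
def select_ohem_negatives(losses, labels, ohem_ratio=3.0,
                          fallback_negative_num=100):
    """Return the indices of negatives kept by a generic OHEM rule.

    losses: per-sample loss values (list of float)
    labels: 1 for positive (text), 0 for negative (background)
    """
    num_pos = sum(1 for y in labels if y == 1)
    neg_indices = [i for i, y in enumerate(labels) if y == 0]

    if num_pos > 0:
        # Keep at most ohem_ratio negatives per positive.
        budget = int(ohem_ratio * num_pos)
    else:
        # Assumption: with no positives, fall back to a fixed budget.
        budget = fallback_negative_num

    num_keep = min(len(neg_indices), budget)
    # Hardest negatives are the ones with the largest loss.
    hardest = sorted(neg_indices, key=lambda i: losses[i], reverse=True)
    return sorted(hardest[:num_keep])


# Illustrative values only: one positive and six negatives.
losses = [0.9, 0.1, 0.8, 0.05, 0.7, 0.6, 0.2]
labels = [1, 0, 0, 0, 0, 0, 0]
kept = select_ohem_negatives(losses, labels, ohem_ratio=3.0)
print(kept)  # [2, 4, 5]
```

With one positive and a ratio of 3.0, only the three highest-loss negatives contribute, which keeps easy background samples from dominating the text loss.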

Return type

None
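
The loss_* defaults above are written as dict(type=...) entries, and the same convention can describe this class in a config file. The sketch below builds such an entry from documented defaults only. It assumes the class is registered under its own name, DRRGModuleLoss, and the variable name module_loss is purely illustrative.

```python
# Sketch of a config entry. Every value is a documented default, so
# leaving a key out should not change behaviour.
module_loss = dict(
    type='DRRGModuleLoss',  # assumed registry name (same as the class name)
    ohem_ratio=3.0,
    resample_step=8.0,
    num_min_comps=9,
    num_max_comps=600,
    min_width=8.0,
    max_width=24.0,
    loss_text=dict(
        type='MaskedBalancedBCEWithLogitsLoss',
        fallback_negative_num=100,
        eps=1e-5),
    loss_center=dict(type='MaskedBCEWithLogitsLoss'),
    loss_gcn=dict(type='CrossEntropyLoss'),
)
```

Overriding a single nested key, for example a different eps in loss_text, means supplying the whole nested dict again, since each loss_* argument is passed as one config.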

forward(preds, data_samples)

Compute the DRRG loss.

Parameters
  • preds (tuple) – The prediction tuple (pred_maps, gcn_pred, gt_labels), with shapes \((N, 6, H, W)\), \((N, 2)\) and \((m, n)\) respectively, where \(m * n = N\).

  • data_samples (list[TextDetDataSample]) – The data samples.
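
One way to read the relation m * n = N is that gt_labels holds one label per graph node in an (m, n) layout, and flattening it gives N labels, one per row of gcn_pred. The pure-Python sketch below illustrates that alignment with nested lists standing in for tensors. The flattening step is an assumption about how labels are paired with predictions, and all sizes and values are placeholders.

```python
# Placeholder sizes for illustration.
m, n = 2, 3
N = m * n

# gt_labels with shape (m, n): one class label (0 or 1) per node.
gt_labels = [[1, 0, 1],
             [0, 0, 1]]

# gcn_pred with shape (N, 2): two class scores per node.
gcn_pred = [[0.2, 0.8], [0.9, 0.1], [0.3, 0.7],
            [0.6, 0.4], [0.7, 0.3], [0.1, 0.9]]

# Flatten (m, n) -> (N,) so that label i lines up with row i of gcn_pred.
flat_labels = [label for row in gt_labels for label in row]
assert len(flat_labels) == len(gcn_pred) == N

# Predicted class is the index of the larger of the two scores.
pred_classes = [row.index(max(row)) for row in gcn_pred]
accuracy = sum(p == y for p, y in zip(pred_classes, flat_labels)) / N
```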

Returns

A loss dict with loss_text, loss_center, loss_height, loss_sin, loss_cos, and loss_gcn.

Return type

dict
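
A dict with these six keys is usually reduced to a single scalar before backpropagation. The sketch below sums the entries, with plain floats standing in for tensors. The values are made up for illustration, and the equal weighting of the terms is an assumption that this page does not confirm.

```python
# Made-up values; in practice each entry would be a scalar tensor.
losses = {
    'loss_text': 0.5,
    'loss_center': 0.25,
    'loss_height': 1.0,
    'loss_sin': 0.125,
    'loss_cos': 0.125,
    'loss_gcn': 0.5,
}

# The six keys named in the return description.
expected_keys = {'loss_text', 'loss_center', 'loss_height',
                 'loss_sin', 'loss_cos', 'loss_gcn'}
assert set(losses) == expected_keys

# Equal-weight sum (an assumption, not a documented rule).
total_loss = sum(losses.values())
```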

get_targets(data_samples)

Generate loss targets from data samples.

Parameters

data_samples (list[TextDetDataSample]) – Ground truth data samples.

Returns

A tuple of 8 lists of tensors used as DRRG targets. See the docstring of _get_target_single for details.

Return type

tuple
