
TextSnakeModuleLoss

class mmocr.models.textdet.TextSnakeModuleLoss(ohem_ratio=3.0, downsample_ratio=1.0, orientation_thr=2.0, resample_step=4.0, center_region_shrink_ratio=0.3, loss_text={'eps': 1e-05, 'fallback_negative_num': 100, 'type': 'MaskedBalancedBCEWithLogitsLoss'}, loss_center={'type': 'MaskedBCEWithLogitsLoss'}, loss_radius={'type': 'MaskedSmoothL1Loss'}, loss_sin={'type': 'MaskedSmoothL1Loss'}, loss_cos={'type': 'MaskedSmoothL1Loss'})[source]

The class for implementing TextSnake loss. This is partially adapted from https://github.com/princewang1994/TextSnake.pytorch.

TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes.

Parameters
  • ohem_ratio (float) – The negative-to-positive sample ratio used in online hard example mining (OHEM). Defaults to 3.0.

  • downsample_ratio (float) – Downsample ratio. Defaults to 1.0. TODO: remove it.

  • orientation_thr (float) – The threshold for distinguishing between head edge and tail edge among the horizontal and vertical edges of a quadrangle. Defaults to 2.0.

  • resample_step (float) – The step size used when resampling. Defaults to 4.0.

  • center_region_shrink_ratio (float) – The shrink ratio of the text center region. Defaults to 0.3.

  • loss_text (dict) – The loss config used to calculate the text loss.

  • loss_center (dict) – The loss config used to calculate the center loss.

  • loss_radius (dict) – The loss config used to calculate the radius loss.

  • loss_sin (dict) – The loss config used to calculate the sin loss.

  • loss_cos (dict) – The loss config used to calculate the cos loss.

Return type

None
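As a rough illustration of the parameters above, the sketch below writes the default values from the class signature as a plain config dict. The variable name `module_loss` and the idea that the dict is resolved through a `type` key are assumptions for illustration; check your own config files for how the loss is actually wired into a detector.

```python
# A minimal sketch: the defaults from the class signature, written as a
# plain Python dict. `module_loss` is a hypothetical variable name.
module_loss = dict(
    type='TextSnakeModuleLoss',
    ohem_ratio=3.0,                  # negative/positive ratio for OHEM
    downsample_ratio=1.0,
    orientation_thr=2.0,
    resample_step=4.0,
    center_region_shrink_ratio=0.3,
    loss_text=dict(
        type='MaskedBalancedBCEWithLogitsLoss',
        fallback_negative_num=100,
        eps=1e-5,
    ),
    loss_center=dict(type='MaskedBCEWithLogitsLoss'),
    loss_radius=dict(type='MaskedSmoothL1Loss'),
    loss_sin=dict(type='MaskedSmoothL1Loss'),
    loss_cos=dict(type='MaskedSmoothL1Loss'),
)

# Overriding a single value leaves the remaining defaults untouched.
custom_loss = dict(module_loss, ohem_ratio=2.0)
```

Copying the dict before overriding a key keeps the default configuration reusable across experiments.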

forward(preds, data_samples)[source]
Parameters
  • preds (Tensor) – The prediction map of shape \((N, 5, H, W)\), where the five channels are, in order, the “text_region”, “center_region”, “sin_map”, “cos_map”, and “radius_map” maps.

  • data_samples (list[TextDetDataSample]) – The data samples.

Returns

A loss dict with loss_text, loss_center, loss_radius, loss_sin and loss_cos.

Return type

dict
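To make the channel layout of preds concrete, here is a small pure-Python sketch that splits an \((N, 5, H, W)\) nested list into named maps. The helper `split_channels` and the constant `CHANNEL_NAMES` are hypothetical names and are not part of the library; only the channel order is taken from the description above.

```python
# Channel order as documented for `preds`.
CHANNEL_NAMES = ("text_region", "center_region", "sin_map", "cos_map", "radius_map")


def split_channels(preds):
    """Split a nested sequence shaped (N, 5, H, W) into named maps.

    Returns a dict mapping each channel name to a list of N (H, W) maps.
    """
    maps = {name: [] for name in CHANNEL_NAMES}
    for sample in preds:
        if len(sample) != len(CHANNEL_NAMES):
            raise ValueError("expected 5 channels per sample")
        for name, channel_map in zip(CHANNEL_NAMES, sample):
            maps[name].append(channel_map)
    return maps


# Toy batch with N=1 and H=W=2; each channel is filled with its own index.
preds = [[[[float(c)] * 2 for _ in range(2)] for c in range(5)]]
maps = split_channels(preds)
```

With a real tensor, the same split would normally be done by indexing the channel axis, for example `preds[:, 0]` for the text region map.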

get_targets(data_samples)[source]

Generate loss targets from data samples.

Parameters

data_samples (list[TextDetDataSample]) – Ground truth data samples.

Returns

tuple(gt_text_masks, gt_masks, gt_center_region_masks, gt_radius_maps, gt_sin_maps, gt_cos_maps): A tuple of six lists of ndarrays as the targets.

Return type

Tuple

vector_angle(vec1, vec2)[source]

Compute the angle between two vectors.

Parameters
  • vec1 (numpy.ndarray) –

  • vec2 (numpy.ndarray) –

Return type

numpy.ndarray
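The angle between two vectors follows from the dot-product identity \(\cos\theta = \frac{v_1 \cdot v_2}{\|v_1\|\,\|v_2\|}\). The sketch below is a standalone pure-Python version for single vectors, written under that assumption; the library method works on numpy arrays and may differ in details such as batching or how zero-length vectors are handled.

```python
import math


def vector_angle(vec1, vec2):
    """Return the angle in radians between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(vec1, vec2))
    norm1 = math.sqrt(sum(a * a for a in vec1))
    norm2 = math.sqrt(sum(b * b for b in vec2))
    if norm1 == 0.0 or norm2 == 0.0:
        # Assumption: treat a zero-length vector as an error in this sketch.
        raise ValueError("angle is undefined for a zero-length vector")
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.acos(cos_theta)


right_angle = vector_angle((1.0, 0.0), (0.0, 1.0))
opposite = vector_angle((1.0, 0.0), (-2.0, 0.0))
```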

vector_cos(vec)[source]

Compute the cos of the angle between vector and x-axis.

Parameters

vec (numpy.ndarray) –

Return type

float

vector_sin(vec)[source]

Compute the sin of the angle between vector and x-axis.

Parameters

vec (numpy.ndarray) –

Return type

float
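For a 2D vector \((x, y)\), the cosine and sine of its angle with the x-axis are \(x / \|v\|\) and \(y / \|v\|\). The following is a minimal standalone sketch of both quantities in pure Python; raising an error for a zero-length vector is an assumption of this sketch, and the library methods may handle that case differently.

```python
import math


def _norm(vec):
    """Euclidean length of a 2D vector; rejects the zero vector."""
    x, y = vec
    length = math.hypot(x, y)
    if length == 0.0:
        # Assumption: a zero-length vector has no defined direction.
        raise ValueError("direction is undefined for a zero-length vector")
    return length


def vector_cos(vec):
    """Cosine of the angle between `vec` and the x-axis."""
    return vec[0] / _norm(vec)


def vector_sin(vec):
    """Sine of the angle between `vec` and the x-axis."""
    return vec[1] / _norm(vec)


cos_value = vector_cos((3.0, 4.0))
sin_value = vector_sin((3.0, 4.0))
```

The two values always satisfy \(\sin^2 + \cos^2 = 1\), which is a quick sanity check when inspecting generated sin and cos target maps.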

vector_slope(vec)[source]

Compute the slope of a vector.

Parameters

vec (numpy.ndarray) –

Return type

float
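The slope of a 2D vector \((x, y)\) is \(y / x\). The sketch below is a standalone pure-Python illustration; returning a signed infinity for vertical vectors is an assumption made here for clarity, and the library method may treat that case (and the sign of the result) differently.

```python
import math


def vector_slope(vec):
    """Return dy/dx for a 2D vector, or a signed infinity if it is vertical."""
    x, y = vec
    if x == 0.0:
        if y == 0.0:
            # Assumption: the zero vector has no defined slope.
            raise ValueError("slope is undefined for a zero-length vector")
        return math.copysign(math.inf, y)
    return y / x


gentle = vector_slope((2.0, 1.0))
vertical = vector_slope((0.0, 1.0))
```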
