FCEModuleLoss

class mmocr.models.textdet.FCEModuleLoss(fourier_degree, num_sample, negative_ratio=3.0, resample_step=4.0, center_region_shrink_ratio=0.3, level_size_divisors=(8, 16, 32), level_proportion_range=((0, 0.4), (0.3, 0.7), (0.6, 1.0)), loss_tr={'type': 'MaskedBalancedBCELoss'}, loss_tcl={'type': 'MaskedBCELoss'}, loss_reg_x={'reduction': 'none', 'type': 'SmoothL1Loss'}, loss_reg_y={'reduction': 'none', 'type': 'SmoothL1Loss'})[source]

The class for implementing FCENet loss.

FCENet(CVPR2021): Fourier Contour Embedding for Arbitrary-shaped Text Detection

Parameters
  • fourier_degree (int) – The maximum Fourier transform degree k.

  • num_sample (int) – The number of points sampled for the regression loss. If it is too small, FCENet tends to overfit.

  • negative_ratio (float or int) – Maximum ratio of negative samples to positive ones in OHEM. Defaults to 3.

  • resample_step (float) – The step size for resampling the text center line (TCL). It’s better not to exceed half of the minimum width.

  • center_region_shrink_ratio (float) – The shrink ratio of text center region.

  • level_size_divisors (tuple(int)) – The downsample ratio on each level.

  • level_proportion_range (tuple(tuple(float))) – The range of text sizes assigned to each level.

  • loss_tr (dict) – The loss config used to calculate the text region loss. Defaults to dict(type='MaskedBalancedBCELoss').

  • loss_tcl (dict) – The loss config used to calculate the text center line loss. Defaults to dict(type='MaskedBCELoss').

  • loss_reg_x (dict) – The loss config used to calculate the regression loss on the x axis. Defaults to dict(type='SmoothL1Loss', reduction='none').

  • loss_reg_y (dict) – The loss config used to calculate the regression loss on the y axis. Defaults to dict(type='SmoothL1Loss', reduction='none').
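
As a rough illustration of how fourier_degree and num_sample interact, the sketch below samples contour points from a Fourier series of degree k, following the contour formulation of the FCENet paper. The function name sample_contour and the coefficient ordering (from -k to k) are assumptions made for this example; it is not the library's implementation.

```python
import cmath


def sample_contour(coeffs, num_sample):
    """Sample points on a closed contour from its Fourier coefficients.

    coeffs: 2k + 1 complex coefficients, ordered from degree -k to k
            (ordering is an assumption of this sketch).
    num_sample: number of points sampled uniformly over one period.
    """
    degree = (len(coeffs) - 1) // 2
    points = []
    for n in range(num_sample):
        t = n / num_sample
        # f(t) = sum over j in [-k, k] of c_j * exp(2 * pi * i * j * t)
        z = sum(c * cmath.exp(2j * cmath.pi * j * t)
                for j, c in zip(range(-degree, degree + 1), coeffs))
        points.append((z.real, z.imag))
    return points


# Degree 1 with only c_0 and c_1 set: a circle of radius 2 centred at (10, 5).
circle = sample_contour([0, 10 + 5j, 2], num_sample=8)
```

A larger num_sample compares the predicted and target contours at more points, which is consistent with the note above that too few samples can lead to overfitting.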

Return type

None
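
In OpenMMLab-style configs, a module is usually described by a dict whose type key names the class. The fragment below is a minimal sketch of such a dict for this loss. The values for fourier_degree and num_sample are illustrative choices, not documented defaults; the remaining values repeat the defaults from the signature above.

```python
# Illustrative config fragment for FCEModuleLoss.
# fourier_degree=5 and num_sample=50 are example values chosen for this sketch.
module_loss = dict(
    type='FCEModuleLoss',
    fourier_degree=5,
    num_sample=50,
    negative_ratio=3.0,
    resample_step=4.0,
    center_region_shrink_ratio=0.3,
    level_size_divisors=(8, 16, 32),
    level_proportion_range=((0, 0.4), (0.3, 0.7), (0.6, 1.0)),
    loss_tr=dict(type='MaskedBalancedBCELoss'),
    loss_tcl=dict(type='MaskedBCELoss'),
    loss_reg_x=dict(type='SmoothL1Loss', reduction='none'),
    loss_reg_y=dict(type='SmoothL1Loss', reduction='none'),
)

# Sanity check: one proportion range per feature level.
num_levels = len(module_loss['level_size_divisors'])
assert num_levels == len(module_loss['level_proportion_range'])
```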

forward(preds, data_samples)[source]

Compute FCENet loss.

Parameters
  • preds (list[dict]) – A list of dicts with keys cls_res and reg_res, corresponding to the classification result and regression result computed from the input tensor with the same index. They have the shapes \((N, C_{cls,i}, H_i, W_i)\) and \((N, C_{out,i}, H_i, W_i)\), respectively.

  • data_samples (list[TextDetDataSample]) – The data samples.

Returns

The dict for FCENet losses with keys loss_text, loss_center, loss_reg_x and loss_reg_y.

Return type

dict
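
To make the expected layout of preds concrete, the sketch below computes the shapes listed above for a given input size. The helper expected_pred_shapes is hypothetical. The channel counts (4 classification channels and 2 * (2k + 1) regression channels for Fourier degree k) are assumptions based on the usual FCENet head design and are not stated on this page, so check them against the head you actually use.

```python
def expected_pred_shapes(batch, img_h, img_w, fourier_degree,
                         divisors=(8, 16, 32)):
    """Return the assumed (N, C, H_i, W_i) shapes of cls_res and reg_res."""
    c_cls = 4                             # assumption: 2 channels for TR + 2 for TCL
    c_reg = 2 * (2 * fourier_degree + 1)  # assumption: x and y Fourier coefficients
    shapes = []
    for divisor in divisors:
        h, w = img_h // divisor, img_w // divisor
        shapes.append({
            'cls_res': (batch, c_cls, h, w),
            'reg_res': (batch, c_reg, h, w),
        })
    return shapes


# Shapes for a batch of two 640x640 images with fourier_degree=5.
shapes = expected_pred_shapes(batch=2, img_h=640, img_w=640, fourier_degree=5)
```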

forward_single(pred, gt)[source]

Compute loss for one feature level.

Parameters
  • pred (dict) – A dict with keys cls_res and reg_res, corresponding to the classification result and regression result from one feature level.

  • gt (Tensor) – Ground truth for one feature level. Cls and reg targets are concatenated along the channel dimension.

Returns

A list of losses for each feature level.

Return type

list[Tensor]
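
Since gt concatenates the classification and regression targets along the channel dimension, it can help to picture how the channels might be split. The layout below is an assumption made for illustration (text region mask, center line mask, training mask, then the x and y regression maps) and the helper gt_channel_layout is hypothetical; verify the real ordering against the source before relying on it.

```python
def gt_channel_layout(fourier_degree):
    """Return assumed channel slices of the concatenated ground truth."""
    k = 2 * fourier_degree + 1  # number of Fourier coefficients per axis
    return {
        'tr_mask': slice(0, 1),               # text region mask
        'tcl_mask': slice(1, 2),              # text center line mask
        'train_mask': slice(2, 3),            # mask of pixels used in training
        'x_map': slice(3, 3 + k),             # regression targets on the x axis
        'y_map': slice(3 + k, 3 + 2 * k),     # regression targets on the y axis
    }


layout = gt_channel_layout(fourier_degree=5)
total_channels = layout['y_map'].stop  # channels needed under this assumed layout
```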

get_targets(data_samples)[source]

Generate loss targets for FCENet from data samples.

Parameters

data_samples (list[TextDetDataSample]) – Ground truth data samples.

Returns

A tuple of three tensors from three different feature levels as FCENet targets.

Return type

tuple[Tensor]
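
The targets are produced per feature level, and level_proportion_range decides which text instances go to which level. The sketch below shows one plausible assignment rule. The helper assign_levels, the use of a single text size divided by the image size, and the strict inequalities are all assumptions for illustration, not the documented behaviour.

```python
def assign_levels(text_size, image_size,
                  level_proportion_range=((0, 0.4), (0.3, 0.7), (0.6, 1.0))):
    """Return the indices of the levels whose range contains the text proportion."""
    proportion = text_size / image_size
    return [
        level
        for level, (low, high) in enumerate(level_proportion_range)
        if low < proportion < high
    ]


small = assign_levels(100, 800)    # proportion 0.125
medium = assign_levels(280, 800)   # proportion 0.35, inside two overlapping ranges
large = assign_levels(520, 800)    # proportion 0.65, inside two overlapping ranges
```

Because the default ranges overlap, a text instance whose proportion falls in an overlap would be assigned to two adjacent levels under this rule.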
