FCEModuleLoss
- class mmocr.models.textdet.FCEModuleLoss(fourier_degree, num_sample, negative_ratio=3.0, resample_step=4.0, center_region_shrink_ratio=0.3, level_size_divisors=(8, 16, 32), level_proportion_range=((0, 0.4), (0.3, 0.7), (0.6, 1.0)), loss_tr={'type': 'MaskedBalancedBCELoss'}, loss_tcl={'type': 'MaskedBCELoss'}, loss_reg_x={'reduction': 'none', 'type': 'SmoothL1Loss'}, loss_reg_y={'reduction': 'none', 'type': 'SmoothL1Loss'})
The class for implementing FCENet loss.
FCENet (CVPR 2021): Fourier Contour Embedding for Arbitrary-shaped Text Detection
- Parameters
fourier_degree (int) – The maximum Fourier transform degree k.
num_sample (int) – The number of points sampled for the regression loss. If it is too small, FCENet tends to overfit.
negative_ratio (float or int) – Maximum ratio of negative samples to positive ones in OHEM. Defaults to 3.
resample_step (float) – The step size for resampling the text center line (TCL). It should preferably not exceed half of the minimum width.
center_region_shrink_ratio (float) – The shrink ratio of text center region.
level_size_divisors (tuple(int)) – The downsample ratio on each level.
level_proportion_range (tuple(tuple(float))) – The range of text sizes assigned to each level.
loss_tr (dict) – The loss config used to calculate the text region loss. Defaults to dict(type='MaskedBalancedBCELoss').
loss_tcl (dict) – The loss config used to calculate the text center line loss. Defaults to dict(type='MaskedBCELoss').
loss_reg_x (dict) – The loss config used to calculate the regression loss on the x axis. Defaults to dict(type='SmoothL1Loss', reduction='none').
loss_reg_y (dict) – The loss config used to calculate the regression loss on the y axis. Defaults to dict(type='SmoothL1Loss', reduction='none').
- Return type
None
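
To make fourier_degree and num_sample more concrete: in the Fourier contour embedding, a closed text contour is described by complex coefficients c_k for k = -K, ..., K (with K = fourier_degree), and contour points are recovered by evaluating the inverse Fourier series at num_sample uniformly spaced positions. The following is a minimal pure-Python sketch of that idea. The function name `sample_contour` and the coefficient ordering are assumptions made for illustration and are not taken from MMOCR's implementation.

```python
import cmath
import math


def sample_contour(coeffs, num_sample):
    """Sample a closed contour from Fourier coefficients.

    coeffs: list of 2K + 1 complex numbers ordered c_{-K}, ..., c_0, ..., c_K
            (this ordering is an assumption made for the sketch).
    num_sample: number of points sampled uniformly over one period.
    """
    degree = (len(coeffs) - 1) // 2  # K, i.e. fourier_degree
    points = []
    for n in range(num_sample):
        t = n / num_sample
        # Inverse Fourier series: sum of c_k * exp(2 * pi * i * k * t).
        z = sum(
            c * cmath.exp(2j * math.pi * k * t)
            for k, c in zip(range(-degree, degree + 1), coeffs)
        )
        points.append((z.real, z.imag))
    return points


# A circle of radius 2 centred at (5, 3): c_0 gives the centre, c_1 the radius.
fourier_degree = 1
coeffs = [0j, complex(5, 3), complex(2, 0)]  # c_{-1}, c_0, c_1
contour = sample_contour(coeffs, num_sample=8)
```

With more samples the reconstructed contour is compared against the ground truth at more positions, which is consistent with the note that a very small num_sample encourages overfitting.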
- forward(preds, data_samples)
Compute FCENet loss.
- Parameters
preds (list[dict]) – A list of dicts with keys cls_res and reg_res, which correspond to the classification result and regression result computed from the input tensor with the same index. They have the shapes \((N, C_{cls,i}, H_i, W_i)\) and \((N, C_{out,i}, H_i, W_i)\), respectively.
data_samples (list[TextDetDataSample]) – The data samples.
- Returns
The dict of FCENet losses, with keys loss_text, loss_center, loss_reg_x and loss_reg_y.
- Return type
dict
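
As a rough illustration of how per-level results could be combined into the returned dict, the sketch below sums one loss tuple per feature level into the four keys named above. Both the summation and the tuple order (text region, text center line, x regression, y regression) are assumptions made for this example; plain floats stand in for tensors, and `aggregate_level_losses` is a hypothetical helper, not part of MMOCR.

```python
def aggregate_level_losses(level_losses):
    """Sum per-level loss tuples into one dict.

    level_losses: one (text, center, reg_x, reg_y) tuple per feature level.
    """
    keys = ("loss_text", "loss_center", "loss_reg_x", "loss_reg_y")
    totals = dict.fromkeys(keys, 0.0)
    for losses in level_losses:
        for key, value in zip(keys, losses):
            totals[key] += value
    return totals


# Three feature levels, each contributing four loss terms.
level_losses = [
    (0.5, 0.25, 1.0, 2.0),
    (0.25, 0.25, 0.5, 0.5),
    (0.25, 0.5, 0.5, 0.5),
]
loss_dict = aggregate_level_losses(level_losses)
```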
- forward_single(pred, gt)
Compute loss for one feature level.
- Parameters
pred (dict) – A dict with keys cls_res and reg_res, which correspond to the classification result and regression result from one feature level.
gt (Tensor) – Ground truth for one feature level. Cls and reg targets are concatenated along the channel dimension.
- Returns
A list of losses for each feature level.
- Return type
list[Tensor]
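
The negative_ratio parameter bounds how many negative samples online hard example mining (OHEM) keeps relative to the positives. The sketch below shows a generic form of that rule on plain Python lists: all positives are kept, and only the highest-loss negatives are kept, up to negative_ratio times the number of positives. It is a simplified illustration with a hypothetical helper name `ohem_select`, not MMOCR's code, and it keeps no negatives when there are no positives, a case a real implementation would likely handle with a fallback.

```python
def ohem_select(losses, labels, negative_ratio=3.0):
    """Return (positive indices, hard negative indices) under a simple OHEM rule.

    losses: per-sample loss values.
    labels: 1 for positive samples, 0 for negative samples.
    """
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    # Cap the number of negatives at negative_ratio times the positives.
    max_neg = min(len(neg), int(negative_ratio * len(pos)))
    # Keep the negatives with the largest loss (the "hard" ones).
    hard_neg = sorted(neg, key=lambda i: losses[i], reverse=True)[:max_neg]
    return pos, hard_neg


labels = [1, 0, 0, 0, 0, 0]
losses = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]
pos_idx, neg_idx = ohem_select(losses, labels, negative_ratio=3.0)
# pos_idx == [0], neg_idx == [2, 4, 3]
```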
- get_targets(data_samples)
Generate loss targets for fcenet from data samples.
- Parameters
data_samples (list(TextDetDataSample)) – Ground truth data samples.
- Returns
A tuple of three tensors, one from each of the three feature levels, used as FCENet targets.
- Return type
tuple[Tensor]
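
The interplay of level_size_divisors and level_proportion_range can be sketched as follows: each text instance is routed to every feature level whose proportion range contains its relative size, and each level works at the input resolution divided by its divisor. In this sketch the relative size is passed in directly and the range bounds are treated as exclusive; both choices, along with the helper name `assign_levels` and the 640-pixel input size, are assumptions made for illustration.

```python
def assign_levels(proportions, level_proportion_range):
    """Map each text instance to the feature levels whose range contains it.

    proportions: text size relative to image size, one value per instance
                 (how the size is measured is an assumption of this sketch).
    """
    assignments = []
    for p in proportions:
        levels = [
            idx
            for idx, (low, high) in enumerate(level_proportion_range)
            if low < p < high
        ]
        assignments.append(levels)
    return assignments


level_size_divisors = (8, 16, 32)
level_proportion_range = ((0, 0.4), (0.3, 0.7), (0.6, 1.0))

# Feature map side length per level for a hypothetical 640-pixel input.
level_sizes = [640 // d for d in level_size_divisors]  # [80, 40, 20]

levels = assign_levels([0.1, 0.35, 0.65, 0.9], level_proportion_range)
# levels == [[0], [0, 1], [1, 2], [2]]
```

Because the default ranges overlap, a mid-sized instance (for example 0.35) contributes targets to two adjacent levels.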