TextSnakeHead

class mmocr.models.textdet.TextSnakeHead(in_channels, out_channels=5, downsample_ratio=1.0, module_loss={'type': 'TextSnakeModuleLoss'}, postprocessor={'text_repr_type': 'poly', 'type': 'TextSnakePostprocessor'}, init_cfg={'mean': 0, 'override': {'name': 'out_conv'}, 'std': 0.01, 'type': 'Normal'})

The detection head for TextSnake, introduced in "TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes".

Parameters

in_channels (int) – Number of input channels.
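out_channels (int) – Number of output channels. Defaults to 5.
downsample_ratio (float) – Downsample ratio. Defaults to 1.0.
module_loss (dict) – Config of the module loss. Defaults to dict(type='TextSnakeModuleLoss').
postprocessor (dict) – Config of the postprocessor for TextSnake. Defaults to dict(type='TextSnakePostprocessor', text_repr_type='poly').
init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to dict(type='Normal', override=dict(name='out_conv'), mean=0, std=0.01).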
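
As an illustration of how the parameters above fit together, the following is a minimal sketch of a config fragment for this head. The variable name `det_head` and the value `in_channels=32` are assumptions chosen for the example; `in_channels` has no default and must match the channel count produced by whatever module feeds the head. All other values simply restate the defaults from the signature.

```python
# Hypothetical config fragment for TextSnakeHead.
# `in_channels=32` is an assumed value, not a documented default: set it to
# the number of channels output by the preceding module in your model.
det_head = dict(
    type='TextSnakeHead',
    in_channels=32,
    out_channels=5,          # default from the signature
    downsample_ratio=1.0,    # default from the signature
    module_loss=dict(type='TextSnakeModuleLoss'),
    postprocessor=dict(
        type='TextSnakePostprocessor',
        text_repr_type='poly',
    ),
)
```

Because every argument except `in_channels` has a default, a shorter fragment that only sets `type` and `in_channels` should be equivalent under the same assumptions.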

Return type

None

forward(inputs, data_samples=None)

Parameters

inputs (torch.Tensor) – Shape \((N, C_{in}, H, W)\), where \(C_{in}\) is in_channels. \(H\) and \(W\) should match the height and width of the backbone input.
data_samples (list[TextDetDataSample], optional) – A list of data samples. Defaults to None.

Returns

A tensor of shape \((N, 5, H, W)\), where the five channels represent [0]: text score, [1]: center score, [2]: sin, [3]: cos, and [4]: radius.

Return type

Tensor
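
To make the five-channel layout concrete, here is a small sketch that decodes the values of a single pixel in plain Python, so it runs without PyTorch. The helper names (`CHANNELS`, `sigmoid`, `decode_pixel`) are hypothetical. Applying a sigmoid to the two score channels and rescaling sin and cos to unit length are assumptions about how the raw outputs might be interpreted downstream; this page does not state whether the head's output is already activated or normalized.

```python
import math

# Channel order as documented for the forward() output.
CHANNELS = ('text_score', 'center_score', 'sin', 'cos', 'radius')


def sigmoid(x):
    """Logistic function mapping a raw value to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_pixel(values, eps=1e-8):
    """Decode the 5 channel values of one pixel into named quantities.

    Assumptions (not confirmed by this page): the score channels are raw
    logits, and sin/cos need rescaling so that sin^2 + cos^2 is about 1.
    """
    if len(values) != len(CHANNELS):
        raise ValueError(f'expected {len(CHANNELS)} values, got {len(values)}')
    text, center, sin, cos, radius = values
    norm = math.sqrt(sin * sin + cos * cos) + eps  # eps avoids division by zero
    return {
        'text_score': sigmoid(text),
        'center_score': sigmoid(center),
        'sin': sin / norm,
        'cos': cos / norm,
        'radius': radius,
    }


# Example values for one pixel, in channel order [0]..[4].
pixel = decode_pixel([0.0, 2.0, 3.0, 4.0, 7.5])
```

On a real output tensor of shape \((N, 5, H, W)\), the same per-channel logic would apply along dimension 1.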