
STN

class mmocr.models.textrecog.STN(in_channels, resized_image_size=(32, 64), output_image_size=(32, 100), num_control_points=20, margins=[0.05, 0.05], init_cfg=[{'type': 'Xavier', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': 'BatchNorm2d'}])

Implements the STN module from ASTER: An Attentional Scene Text Recognizer with Flexible Rectification (https://ieeexplore.ieee.org/abstract/document/8395027/).

Parameters
  • in_channels (int) – The number of input channels.

  • resized_image_size (Tuple[int, int]) – The resized image size. The input image is downsampled to this size to obtain a better rectified result. Defaults to (32, 64).

  • output_image_size (Tuple[int, int]) – The size of the output image for TPS. Defaults to (32, 100).

  • num_control_points (int) – The number of control points. Defaults to 20.

  • margins (Tuple[float, float]) – The margins between the control points and the top and bottom sides of the image for TPS. Defaults to [0.05, 0.05].

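  • init_cfg (Optional[Union[Dict, List[Dict]]]) – The initialization config. Defaults to Xavier initialization for Conv2d layers and a constant value of 1 for BatchNorm2d layers.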
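
The roles of num_control_points and margins can be illustrated with a small sketch. It assumes the layout described in the ASTER paper: half of the points are evenly spaced along a line near the top edge and half along a line near the bottom edge, in normalized [0, 1] coordinates. The helper name build_control_points is hypothetical, and reading the two margins as (horizontal, vertical) is an assumption rather than something this page states.

```python
def build_control_points(num_control_points=20, margins=(0.05, 0.05)):
    """Return a list of (x, y) control points in normalized [0, 1] coordinates.

    Assumed layout (not confirmed by this page): the first half of the points
    lies near the top edge, the second half near the bottom edge.
    """
    if num_control_points < 2 or num_control_points % 2 != 0:
        raise ValueError("num_control_points must be a positive even number")

    margin_x, margin_y = margins  # assumed order: horizontal, vertical
    per_side = num_control_points // 2

    if per_side == 1:
        xs = [0.5]
    else:
        # Evenly spaced x coordinates between the left and right margins.
        step = (1.0 - 2 * margin_x) / (per_side - 1)
        xs = [margin_x + i * step for i in range(per_side)]

    top = [(x, margin_y) for x in xs]
    bottom = [(x, 1.0 - margin_y) for x in xs]
    return top + bottom


points = build_control_points()
print(len(points))  # number of control points
print(points[0])    # first point on the top line
print(points[10])   # first point on the bottom line
```

With the defaults, this gives 10 points per side, starting 5% in from the left edge and 5% in from the top and bottom edges.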

forward(img)

Forward function of STN.

Parameters

img (Tensor) – The input image tensor.

Returns

The rectified image tensor.

Return type

Tensor
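
As a rough illustration of how the constructor arguments relate to forward, the sketch below declares the module as an MMOCR-style config dict and computes the output shape one would expect. The keys mirror the signature above. The shape contract, (N, C, H, W) in and (N, C, out_h, out_w) out, is an assumption inferred from output_image_size, and expected_output_shape is a hypothetical helper, not part of MMOCR.

```python
# Config fragment in the dict style used by MMOCR configs; the values are the
# defaults from the class signature, with in_channels=3 chosen for RGB input.
stn_cfg = dict(
    type='STN',
    in_channels=3,
    resized_image_size=(32, 64),
    output_image_size=(32, 100),
    num_control_points=20,
    margins=[0.05, 0.05],
)


def expected_output_shape(input_shape, cfg):
    """Assumed shape contract: batch size and channels are preserved, and the
    spatial size becomes cfg['output_image_size']."""
    n, c, _, _ = input_shape
    out_h, out_w = cfg['output_image_size']
    return (n, c, out_h, out_w)


# A batch of 8 RGB images of size 48 x 160.
print(expected_output_shape((8, 3, 48, 160), stn_cfg))
```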

init_stn(stn_fc2)

Initialize the output linear layer of the STN so that the initial source points lie along the top and bottom sides of the image, which helps optimization.

Parameters

stn_fc2 (nn.Linear) – The output linear layer of stn.

Return type

None
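
One common way to achieve this kind of initialization, used in the ASTER reference implementation, is to zero the weights of the output layer and store the desired points in its bias, so that the layer returns those points for any input at the start of training. The sketch below shows the idea with a plain-Python stand-in for nn.Linear. The function names and the example points are hypothetical, and the actual init_stn may differ in details.

```python
def init_output_layer(in_features, control_points):
    """Build (weight, bias) for a linear layer whose initial output equals the
    flattened control points regardless of its input."""
    flat = [coord for point in control_points for coord in point]
    weight = [[0.0] * in_features for _ in flat]  # zero weights ignore the input
    bias = list(flat)                             # bias carries the initial points
    return weight, bias


def linear(x, weight, bias):
    """Plain-Python equivalent of y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


# Four illustrative points: two on the top side (y=0), two on the bottom (y=1).
initial_points = [(0.1, 0.0), (0.9, 0.0), (0.1, 1.0), (0.9, 1.0)]
weight, bias = init_output_layer(in_features=3, control_points=initial_points)

# Any feature vector maps to the same initial points.
print(linear([5.0, -2.0, 7.0], weight, bias))
```

Because the weights start at zero, the predicted points do not depend on the image features at first, so the initial transformation is the same for every input.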
