SATRNEncoder
- class mmocr.models.textrecog.SATRNEncoder(n_layers=12, n_head=8, d_k=64, d_v=64, d_model=512, n_position=100, d_inner=256, dropout=0.1, init_cfg=None)[source]
Implements the encoder for SATRN; see `SATRN <https://arxiv.org/abs/1910.04396>`_.
- Parameters
n_layers (int) – Number of attention layers. Defaults to 12.
n_head (int) – Number of parallel attention heads. Defaults to 8.
d_k (int) – Dimension of the key vector. Defaults to 64.
d_v (int) – Dimension of the value vector. Defaults to 64.
d_model (int) – Dimension \(D_m\) of the input from the previous model. Defaults to 512.
n_position (int) – Length of the positional encoding vector. Must be greater than max_seq_len. Defaults to 100.
d_inner (int) – Hidden dimension of feedforward layers. Defaults to 256.
dropout (float) – Dropout rate. Defaults to 0.1.
init_cfg (dict or list[dict], optional) – Initialization configs. Defaults to None.
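As an illustration of how these parameters fit together, the following is a minimal sketch of an encoder config written as a plain Python dict with the documented defaults. It assumes the registry-style convention in which a `type` key names the class. The `max_seq_len` value and the `check_n_position` helper are hypothetical and exist only to show the documented constraint on `n_position`.

```python
# Encoder config using the defaults documented above.
# Assumption: a registry-style config where `type` names the class.
encoder_cfg = dict(
    type='SATRNEncoder',
    n_layers=12,
    n_head=8,
    d_k=64,
    d_v=64,
    d_model=512,
    n_position=100,
    d_inner=256,
    dropout=0.1,
)

# Hypothetical value for illustration; max_seq_len is configured elsewhere.
max_seq_len = 25


def check_n_position(cfg, max_seq_len):
    """Return True when n_position exceeds max_seq_len, as the docs require."""
    return cfg['n_position'] > max_seq_len


if __name__ == '__main__':
    print(check_n_position(encoder_cfg, max_seq_len))
```

With the defaults, `n_head * d_k` equals `d_model` (8 × 64 = 512). The parameter list does not state this as a requirement, so treat it as an observation about the defaults only.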
- forward(feat, data_samples=None)[source]
Forward propagation of encoder.
- Parameters
feat (Tensor) – Feature tensor of shape \((N, D_m, H, W)\).
data_samples (list[TextRecogDataSample]) – Batch of TextRecogDataSample, containing valid_ratio information. Defaults to None.
- Returns
A tensor of shape \((N, T, D_m)\).
- Return type
Tensor
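The shape relationship can be sketched in plain Python, without any tensor library. This sketch rests on two assumptions not stated above: that the spatial grid is flattened so that \(T = H \times W\), and that `valid_ratio` marks the leftmost fraction of each feature row as valid (rounded up). Both helper names are hypothetical.

```python
import math


def encoder_output_shape(feat_shape):
    """Map an input shape (N, D_m, H, W) to the output shape (N, T, D_m).

    Assumption: T = H * W, i.e. the spatial grid is flattened row by row.
    """
    n, d_m, h, w = feat_shape
    return (n, h * w, d_m)


def valid_mask(h, w, valid_ratio):
    """Build a flattened 0/1 mask of length H * W for one sample.

    Assumption: the leftmost ceil(W * valid_ratio) columns of every row
    are valid, capped at W.
    """
    valid_width = min(w, math.ceil(w * valid_ratio))
    row = [1] * valid_width + [0] * (w - valid_width)
    return row * h  # row-major flattening of the (H, W) grid


if __name__ == '__main__':
    print(encoder_output_shape((2, 512, 8, 25)))  # (2, 200, 512)
    print(valid_mask(2, 4, 0.5))                  # [1, 1, 0, 0, 1, 1, 0, 0]
```

Under these assumptions, padded columns on the right of an image are excluded from attention, while the output sequence length stays fixed at \(H \times W\) for every sample in the batch.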