mmocr.models¶
models.common¶
Dictionary¶
The class generates a dictionary for recognition. |
Losses¶
This loss combines a Sigmoid layer and a masked balanced BCE loss in a single class. |
|
Masked dice loss. |
|
Masked Smooth L1 loss. |
|
Masked square dice loss. |
|
This loss combines a Sigmoid layer and a masked BCE loss in a single class. |
|
Smooth L1 loss. |
|
Cross entropy loss. |
|
Masked Balanced BCE loss. |
|
Masked BCE loss. |
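To make the idea behind the masked losses concrete, here is a minimal pure-Python sketch of a masked dice loss. It is an illustration under stated assumptions, not MMOCR's implementation: the function name `masked_dice_loss`, the flat-list inputs, and the `eps` default are all hypothetical. The point it shows is that positions where the mask is 0 contribute nothing to the loss.

```python
def masked_dice_loss(pred, gt, mask, eps=1e-6):
    """Dice loss over flat lists, counting only positions where mask is 1.

    Hypothetical helper for illustration; inputs are equal-length lists of
    floats (pred, gt in [0, 1]) and a 0/1 mask.
    """
    assert len(pred) == len(gt) == len(mask)
    # Overlap between prediction and ground truth inside the mask.
    inter = sum(p * g * m for p, g, m in zip(pred, gt, mask))
    # Masked "areas" of prediction and ground truth.
    pred_sum = sum(p * m for p, m in zip(pred, mask))
    gt_sum = sum(g * m for g, m in zip(gt, mask))
    dice = 2.0 * inter / (pred_sum + gt_sum + eps)
    return 1.0 - dice


pred = [1.0, 0.0, 1.0, 0.9]
gt = [1.0, 0.0, 0.0, 1.0]
mask = [1, 1, 0, 1]  # the third position (a mismatch) is ignored
loss = masked_dice_loss(pred, gt, mask)
```

Because the only disagreement between `pred` and `gt` that matters here sits at a masked-out position or is small (0.9 vs 1.0), the resulting loss is close to zero.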
Layers¶
Transformer Encoder Layer. |
|
Transformer Decoder Layer. |
Modules¶
Scaled Dot-Product Attention Module. |
|
Multi-Head Attention module. |
|
Two-layer feed-forward module. |
|
Fixed positional encoding with sine and cosine functions. |
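As an illustration of fixed positional encoding with sine and cosine functions, the following sketch builds the standard sinusoidal table in plain Python. The function name `sinusoid_table` and its parameters are assumptions made for this example; the actual module in MMOCR may differ in signature and tensor layout.

```python
import math


def sinusoid_table(n_position, d_model):
    """Return an n_position x d_model table of fixed positional encodings.

    Even feature indices use sine, odd indices use cosine, and each
    sine/cosine pair shares one frequency (hypothetical helper).
    """
    table = []
    for pos in range(n_position):
        row = []
        for i in range(d_model):
            # Both members of a pair (2k, 2k + 1) use exponent 2k / d_model.
            angle = pos / (10000 ** (2 * (i // 2) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table


table = sinusoid_table(n_position=4, d_model=4)
```

The table depends only on position and feature index, so it can be computed once and added to the input embeddings without any learned parameters.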
models.textdet¶
Detectors¶
The class for implementing single stage text detector. |
|
The class for implementing DBNet text detector: Real-time Scene Text Detection with Differentiable Binarization. |
|
The class for implementing PANet text detector. |
|
The class for implementing PSENet text detector: Shape Robust Text Detection with Progressive Scale Expansion Network. |
|
The class for implementing TextSnake text detector: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. |
|
The class for implementing FCENet text detector (CVPR 2021): Fourier Contour Embedding for Arbitrary-shaped Text Detection. |
|
The class for implementing DRRG text detector. |
|
A wrapper of MMDet’s model. |
Data Preprocessors¶
Image pre-processor for detection tasks. |
Necks¶
This code is from https://github.com/WenmuZhou/PAN.pytorch. |
|
FPN-like fusion module in Shape Robust Text Detection with Progressive Scale Expansion Network. |
|
FPN-like fusion module in Real-time Scene Text Detection with Differentiable Binarization. |
|
The class for implementing DRRG and TextSnake U-Net-like FPN. |
Heads¶
Base head for text detection, build the loss and postprocessor. |
|
The class for PSENet head. |
|
The class for PANet head. |
|
The class for DBNet head. |
|
The class for implementing FCENet head. |
|
The class for TextSnake head: TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes. |
|
The class for DRRG head: Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection. |
Module Losses¶
Base class for the module loss of segmentation-based text detection algorithms with some handy utilities. |
|
The class for implementing PANet loss. |
|
The class for implementing PSENet loss. |
|
The class for implementing DBNet loss. |
|
The class for implementing TextSnake loss. |
|
The class for implementing FCENet loss. |
|
The class for implementing DRRG loss. |
Postprocessors¶
Base postprocessor for text detection models. |
|
Decoding predictions of PSENet to instances. |
|
Convert scores to quadrangles via post processing in PANet. |
|
Decoding predictions of DBNet to instances. |
|
Merge text components and construct boundaries of text instances. |
|
Decoding predictions of FCENet to instances. |
|
Decoding predictions of TextSnake to instances. |
models.textrecog¶
Recognizers¶
Base class for recognizer. |
|
Base class for encode-decode recognizer. |
|
CTC-loss based recognizer. |
|
Implementation of SAR |
|
Implementation of NRTR |
|
Implementation of RobustScanner. |
|
Implementation of SATRN |
|
Implementation of Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition. |
|
Implementation of MASTER |
|
Implementation of ASTER: An Attentional Scene Text Recognizer with Flexible Rectification. |
Data Preprocessors¶
Image pre-processor for recognition tasks. |
Preprocessors¶
Implement STN module in ASTER: An Attentional Scene Text Recognizer with Flexible Rectification (https://ieeexplore.ieee.org/abstract/document/8395027/) |
BackBones¶
Implement ResNet backbone for text recognition, modified from |
|
A mini VGG backbone for text recognition, modified from VGG-VeryDeep. |
|
Modality transform in NRTR. |
|
Implement Shallow CNN block for SATRN. |
|
Implement ResNet backbone for text recognition, modified from ResNet. |
|
See mmdet.models.backbones.MobileNetV2 for details. |
Encoders¶
Implementation of the encoder module in SAR. |
|
Transformer Encoder block with self attention mechanism. |
|
Base Encoder class for text recognition. |
|
Change the channel number with a 1x1 convolutional layer. |
|
Implement encoder for SATRN, see SATRN. |
|
Implement transformer encoder for text recognition, modified from https://github.com/FangShancheng/ABINet. |
|
Implement the BiLSTM encoder module in ASTER: An Attentional Scene Text Recognizer with Flexible Rectification. |
Decoders¶
Base decoder for text recognition, build the loss and postprocessor. |
|
Transformer-based language model responsible for spell correction. Implementation of language model of ABINet. |
|
Converts visual features into text characters. |
|
A special decoder responsible for mixing and aligning visual feature and linguistic feature. |
|
Decoder for CRNN. |
|
Implementation of the Parallel Decoder module in SAR. |
|
Implementation of the Sequential Decoder module in SAR. |
|
Parallel Decoder module with beam-search in SAR. |
|
Transformer Decoder block with self attention mechanism. |
|
Sequence attention decoder for RobustScanner. |
|
Position attention decoder for RobustScanner. |
|
Decoder for RobustScanner. |
|
Decoder module in MASTER. |
|
Implement attention decoder. |
Module Losses¶
Base recognition loss. |
|
Implementation of loss module for encoder-decoder based text recognition method with CrossEntropy loss. |
|
Implementation of loss module for CTC-loss based text recognition. |
|
Implementation of ABINet multiloss that allows mixing different types of losses with weights. |
Postprocessors¶
Base text recognition postprocessor. |
|
PostProcessor for seq2seq. |
|
PostProcessor for CTC. |
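For readers unfamiliar with what a CTC postprocessor has to do, the sketch below shows the usual greedy decoding rule: collapse consecutive repeats, then drop blanks. It is a generic illustration, not the MMOCR postprocessor; the function name `ctc_greedy_decode`, the blank index of 0, and the toy `charset` are assumptions.

```python
def ctc_greedy_decode(indexes, blank=0):
    """Collapse repeated indexes and drop blanks (hypothetical helper).

    `indexes` is the per-time-step argmax of the model output; `blank` is
    the index reserved for the CTC blank symbol (assumed to be 0 here).
    """
    decoded = []
    prev = None
    for idx in indexes:
        # Keep an index only when it starts a new run and is not blank.
        if idx != prev and idx != blank:
            decoded.append(idx)
        prev = idx
    return decoded


# Toy character set for illustration only; index 0 is the blank.
charset = ["<blank>", "a", "b"]
steps = [1, 1, 0, 1, 2, 2, 0]
text = "".join(charset[i] for i in ctc_greedy_decode(steps))
```

The blank between the two runs of index 1 is what lets the decoder emit the same character twice in a row, so `steps` above decodes to two `a` characters followed by one `b`.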
models.kie¶
Extractors¶
The implementation of the paper: Spatial Dual-Modality Graph Reasoning for Key Information Extraction. |
Module Losses¶
The implementation of the key information extraction loss proposed in the paper: Spatial Dual-Modality Graph Reasoning for Key Information Extraction. |
Postprocessors¶
Postprocessor for SDMGR. |