Useful Tools

We provide some useful tools under the mmocr/tools directory.

Publish a Model

Before you upload a model to AWS, you may want to (1) convert the model weights to CPU tensors, (2) delete the optimizer states, and (3) compute the hash of the checkpoint file and append the hash id to the filename. All of these steps can be done with tools/publish_model.py.

python tools/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}

For example,

python tools/publish_model.py work_dirs/psenet/latest.pth psenet_r50_fpnf_sbn_1x_20190801.pth

The final output filename will be psenet_r50_fpnf_sbn_1x_20190801-{hash id}.pth.
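
For reference, the publishing steps amount to roughly the following. This is an assumption-based sketch of those three steps (load onto CPU, drop optimizer states, append a hash id), not a copy of tools/publish_model.py; the helper name publish and the 8-character hash suffix are illustrative choices.

    import hashlib
    import os

    import torch

    def publish(in_file, out_file):
        # Load the checkpoint onto CPU so the published weights are device-agnostic.
        checkpoint = torch.load(in_file, map_location='cpu')
        # Optimizer states are only needed to resume training, so drop them.
        checkpoint.pop('optimizer', None)
        torch.save(checkpoint, out_file)
        # Compute the file hash and append its first characters to the filename.
        with open(out_file, 'rb') as f:
            sha = hashlib.sha256(f.read()).hexdigest()
        final_file = out_file.replace('.pth', f'-{sha[:8]}.pth')
        os.rename(out_file, final_file)
        return final_file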

Convert text recognition dataset to lmdb format

Reading images or labels directly from files can be slow when the dataset is large, e.g. on the scale of millions of samples. Besides, in academia, most scene text recognition datasets are stored in lmdb format, including both images and labels. To move closer to this mainstream practice and improve data storage efficiency, MMOCR provides tools/data/utils/lmdb_converter.py to convert text recognition datasets to lmdb format.

Arguments        Type  Description
label_path       str   Path to the label file.
output           str   Output lmdb path.
--img-root       str   Path to the image root directory.
--label-only     bool  Only convert labels to lmdb.
--label-format   str   Format of the label file, either txt or jsonl.
--batch-size     int   Processing batch size. Defaults to 1000.
--encoding       str   Bytes encoding scheme. Defaults to utf8.
--lmdb-map-size  int   Maximum size the database may grow to. Defaults to 1099511627776 bytes (1 TB).

Examples

Generate a mixed lmdb file with label.txt and images in imgs/:

python tools/data/utils/lmdb_converter.py label.txt imgs.lmdb -i imgs

Generate a mixed lmdb file with label.jsonl and images in imgs/:

python tools/data/utils/lmdb_converter.py label.jsonl imgs.lmdb -i imgs -f jsonl

Generate a label-only lmdb file with label.txt:

python tools/data/utils/lmdb_converter.py label.txt label.lmdb --label-only

Generate a label-only lmdb file with label.jsonl:

python tools/data/utils/lmdb_converter.py label.jsonl label.lmdb --label-only -f jsonl
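
To sanity-check a generated file, you can read a few entries back with the lmdb Python package. The sketch below assumes the conventional text recognition lmdb layout with num-samples, image-%09d and label-%09d keys; if lmdb_converter.py uses different key names, adjust accordingly.

    # Read back one sample from the converted lmdb file (key names are an assumption).
    import lmdb

    # Pass subdir=False if imgs.lmdb is a single file rather than a directory.
    env = lmdb.open('imgs.lmdb', readonly=True, lock=False)
    with env.begin(write=False) as txn:
        num_samples = int(txn.get(b'num-samples'))
        print(f'{num_samples} samples in total')
        # Keys are 1-indexed in this convention.
        label = txn.get(b'label-000000001').decode()
        image_bytes = txn.get(b'image-000000001')  # raw encoded image bytes
        print(label, len(image_bytes) if image_bytes else 0)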

Convert annotations from Labelme

Labelme is a popular graphical image annotation tool. You can convert the labels generated by labelme to the MMOCR data format using tools/data/common/labelme_converter.py. Both detection and recognition tasks are supported.

# <tasks> can be "det" alone or both "det" and "recog"
python tools/data/common/labelme_converter.py <json_dir> <image_dir> <out_dir> --tasks <tasks>

For example, to convert the labelme annotations in tests/data/toy_dataset/labelme to the MMOCR detection label file instances_training.txt, and to crop image patches for the recognition task into tests/data/toy_dataset/crops together with the label file train_label.jsonl:

python tools/data/common/labelme_converter.py tests/data/toy_dataset/labelme tests/data/toy_dataset/imgs tests/data/toy_dataset --tasks det recog
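
For reference, the sketch below illustrates how one labelme JSON file could be mapped to a single line of the detection label file. It is an assumption-based illustration, not the actual converter code: the output fields iscrowd, category_id, bbox and segmentation reflect the jsonl-style detection format commonly used by MMOCR 0.x and may differ from what labelme_converter.py writes.

    import json

    def labelme_to_det_line(labelme_json_path):
        """Map one labelme file to one detection annotation line (illustrative only)."""
        with open(labelme_json_path) as f:
            ann = json.load(f)
        annotations = []
        for shape in ann['shapes']:
            # Flatten polygon points [[x1, y1], [x2, y2], ...] -> [x1, y1, x2, y2, ...].
            poly = [coord for point in shape['points'] for coord in point]
            xs, ys = poly[0::2], poly[1::2]
            bbox = [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]  # x, y, w, h
            annotations.append({
                'iscrowd': 0,
                'category_id': 1,
                'bbox': bbox,
                'segmentation': [poly],
            })
        return json.dumps({
            'file_name': ann['imagePath'],
            'height': ann['imageHeight'],
            'width': ann['imageWidth'],
            'annotations': annotations,
        })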

Log Analysis

You can use tools/analyze_logs.py to plot loss/hmean curves given a training log file. Run pip install seaborn first to install the dependency.

python tools/analyze_logs.py plot_curve ${LOG_FILE} [--keys ${KEYS}] [--title ${TITLE}] [--legend ${LEGEND}] [--backend ${BACKEND}] [--style ${STYLE}] [--out ${OUT_FILE}]

Arguments  Type  Description
--keys     str   The metric that you want to plot. Defaults to loss.
--title    str   Title of the figure.
--legend   str   Legend of each plot.
--backend  str   Backend of the plot.
--style    str   Style of the plot. Defaults to dark.
--out      str   Path of the output figure.

Examples:

Download the following DBNet and CRNN training logs to run demos.

wget https://download.openmmlab.com/mmocr/textdet/dbnet/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.log.json -O DBNet_log.json

wget https://download.openmmlab.com/mmocr/textrecog/crnn/20210326_111035.log.json -O CRNN_log.json

Please specify an output path with --out if you are running the code on a system without a GUI.

  • Plot loss metric.

    python tools/analyze_logs.py plot_curve DBNet_log.json --keys loss --legend loss
    
  • Plot hmean-iou:hmean metric of text detection.

    python tools/analyze_logs.py plot_curve DBNet_log.json --keys hmean-iou:hmean --legend hmean-iou:hmean
    
  • Plot 0_1-N.E.D metric of text recognition.

    python tools/analyze_logs.py plot_curve CRNN_log.json --keys 0_1-N.E.D --legend 0_1-N.E.D
    
  • Compute the average training speed.

    python tools/analyze_logs.py cal_train_time CRNN_log.json --include-outliers
    

    The output is expected to be like the following.

    -----Analyze train time of CRNN_log.json-----
    slowest epoch 4, average time is 0.3464
    fastest epoch 5, average time is 0.2365
    time std over epochs is 0.0356
    average iter time: 0.2906 s/iter
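
For reference, the JSON log can also be parsed by hand. The sketch below is an assumption-based illustration: it assumes the MMCV text-logger format, where each line of the log is a JSON dict with mode, epoch, iter and time keys (the first line holds environment metadata), and averages the per-iteration training time for each epoch.

    import json
    from collections import defaultdict

    times = defaultdict(list)
    with open('CRNN_log.json') as f:
        for line in f:
            record = json.loads(line)
            # Skip the metadata line and validation records.
            if record.get('mode') != 'train':
                continue
            times[record['epoch']].append(record['time'])

    for epoch, epoch_times in sorted(times.items()):
        print(f'epoch {epoch}: average iter time {sum(epoch_times) / len(epoch_times):.4f} s')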
    