Note

You are reading the documentation for MMOCR 0.x, which will be deprecated by the end of 2022. We recommend upgrading to MMOCR 1.0 for the new features and better performance brought by OpenMMLab 2.0. See the maintenance plan, changelog, code, and documentation of MMOCR 1.0 for details.

Key Information Extraction

Overview

The key information extraction dataset directory is organized as follows:

└── wildreceipt
  ├── class_list.txt
  ├── dict.txt
  ├── image_files
  ├── openset_train.txt
  ├── openset_test.txt
  ├── test.txt
  └── train.txt
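
Before training, it can help to confirm that the directory matches this layout. The following is a minimal sketch, not part of MMOCR: the helper name `missing_entries` and the split into two lists are assumptions made for illustration. It treats `openset_train.txt` and `openset_test.txt` as optional, since they are produced by the OpenSet conversion step rather than shipped with the dataset.

```python
from pathlib import Path

# Entries taken from the directory tree above.
CLOSESET_ENTRIES = ["class_list.txt", "dict.txt", "image_files", "train.txt", "test.txt"]
# Written by the OpenSet conversion step, so they may be absent at first.
OPENSET_ENTRIES = ["openset_train.txt", "openset_test.txt"]


def missing_entries(root, include_openset=False):
    """Return the names from the expected layout that are absent under root.

    Hypothetical helper for illustration; it only checks that each entry
    exists, not that its contents are valid.
    """
    root = Path(root)
    expected = CLOSESET_ENTRIES + (OPENSET_ENTRIES if include_openset else [])
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    # data/wildreceipt is the location used by the conversion commands on this page.
    missing = missing_entries("data/wildreceipt", include_openset=True)
    print("missing:", missing if missing else "none")
```

An empty list means every expected entry is present. Pass `include_openset=True` only once the OpenSet files have been generated.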

Preparation Steps

WildReceipt

WildReceiptOpenset

  • Step 0: Prepare WildReceipt.

  • Step 1: Convert the annotation files to OpenSet format:

# You may find more available arguments by running
# python tools/data/kie/closeset_to_openset.py -h
python tools/data/kie/closeset_to_openset.py data/wildreceipt/train.txt data/wildreceipt/openset_train.txt
python tools/data/kie/closeset_to_openset.py data/wildreceipt/test.txt data/wildreceipt/openset_test.txt
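
The two commands above differ only in the split name, so they can also be driven from a short script. This is a sketch under stated assumptions: `build_commands` and `convert_all` are hypothetical helper names, the script is assumed to run from the MMOCR repository root, and it passes only the two positional arguments shown above.

```python
import subprocess
import sys

# Path of the converter script, as used in the commands above.
SCRIPT = "tools/data/kie/closeset_to_openset.py"


def build_commands(data_root="data/wildreceipt", splits=("train", "test")):
    """Build one converter invocation per split: <split>.txt -> openset_<split>.txt."""
    return [
        [sys.executable, SCRIPT,
         f"{data_root}/{split}.txt",
         f"{data_root}/openset_{split}.txt"]
        for split in splits
    ]


def convert_all(data_root="data/wildreceipt"):
    """Run the converter for every split, stopping at the first failure."""
    for cmd in build_commands(data_root):
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(cmd, check=True)
```

Calling `convert_all()` from the repository root should produce the same two output files as running the commands by hand. `build_commands()` can be inspected first to see exactly what would be executed.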

Note

You can learn more about the key differences between CloseSet and OpenSet annotations in our tutorial.