This page describes the dataset used in the ICPR 2024 paper "BarBeR: A Barcode Benchmarking Repository". The dataset is a collection of 8 748 real images containing barcodes of different types, captured under different conditions and annotated with polygons. It was developed for training and testing barcode localization algorithms and can be used to replicate the results presented in the paper.
The table below summarizes the core information about the released dataset.
The dataset comprises 12 smaller public datasets of 1D and 2D barcodes. Metadata has been rewritten in a unified format using the VGG Image Annotator (VIA). Together, the collected datasets account for 8 748 images with 9 818 annotated barcodes: 8 062 linear and 1 756 two-dimensional. The dataset contains examples of 18 barcode types. Of these, 14 are 1D symbologies (Code 128, Code 39, EAN-2, EAN-8, EAN-13, GS1-128, IATA 2 of 5, Intelligent Mail Barcode, Interleaved 2 of 5, Japan Postal Barcode, KIX-code, PostNet, RoyalMail Code, and UPC) and 4 are 2D symbologies (Aztec, Datamatrix, PDF-417, and QR Code). Each annotation contains the polygon outlining the barcode plus 3 important characteristics of the code: its type, its PPE (pixels per element, also called PPM or pixels per module), and its encoded string.
Field | Value |
---|---|
Number of Images | 8748 |
Number of Objects | 9818 |
Number of 1D Labels | 8062 |
Number of 2D Labels | 1756 |
Number of Categories | 19 |
PPE Range 1D | [0.88, 24.33] |
PPE Range 2D | [1.21, 71.1] |
Lowest Resolution | [200, 141] |
Highest Resolution | [5984, 3376] |
Annotation Format | VGG Annotator |
The BarBeR dataset follows the VGG annotation format. The dataset consists of two components: the images in JPG format and a folder of JSON files with the annotations. Each barcode is assigned a polygon of 4 vertices. In total, the attribute "Type" takes 19 classes. Of these, 18 identify a particular barcode type (Code 128, Code 39, EAN-2, EAN-8, EAN-13, GS1-128, IATA 2 of 5, Intelligent Mail Barcode, Interleaved 2 of 5, Japan Postal Barcode, KIX-code, PostNet, RoyalMail Code, UPC, Aztec, Datamatrix, PDF-417, and QR Code). The last class, "1D", marks a 1D barcode that could not be automatically assigned to a category because it could not be decoded. When running the benchmark, all these categories are automatically divided into 2 bigger groups: 1D and 2D.
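As a rough illustration of how such annotations might be read, the Python sketch below walks a VIA-style JSON structure, extracts each 4-vertex polygon, and maps the "Type" label to the 1D or 2D group. It assumes the usual VIA 2.x layout (`regions`, `shape_attributes`, `all_points_x`, `all_points_y`, `region_attributes`). The attribute keys `PPE` and `String`, the sample file name, and all sample values are hypothetical, so check them against the actual JSON files before relying on them.

```python
import json
from collections import Counter

# The 14 linear symbologies listed above, plus the generic "1D" class.
ONE_D_TYPES = {
    "Code 128", "Code 39", "EAN-2", "EAN-8", "EAN-13", "GS1-128",
    "IATA 2 of 5", "Intelligent Mail Barcode", "Interleaved 2 of 5",
    "Japan Postal Barcode", "KIX-code", "PostNet", "RoyalMail Code", "UPC",
    "1D",  # linear barcode that could not be decoded
}
# The 4 two-dimensional symbologies listed above.
TWO_D_TYPES = {"Aztec", "Datamatrix", "PDF-417", "QR Code"}


def group_of(barcode_type):
    """Map a "Type" label to the coarse group "1D" or "2D"."""
    if barcode_type in ONE_D_TYPES:
        return "1D"
    if barcode_type in TWO_D_TYPES:
        return "2D"
    raise ValueError(f"unknown barcode type: {barcode_type!r}")


def iter_barcodes(via_project):
    """Yield (filename, type, polygon) for every region in a VIA-style dict.

    Assumes the VIA 2.x layout: one entry per image, each with a "regions"
    list whose items hold "shape_attributes" and "region_attributes".
    """
    for entry in via_project.values():
        for region in entry.get("regions", []):
            shape = region["shape_attributes"]
            polygon = list(zip(shape["all_points_x"], shape["all_points_y"]))
            yield entry["filename"], region["region_attributes"]["Type"], polygon


# Hypothetical sample: file name, coordinates, and attribute values are made up.
sample = json.loads("""
{
  "img_0001.jpg": {
    "filename": "img_0001.jpg",
    "regions": [
      {"shape_attributes": {"name": "polygon",
                            "all_points_x": [10, 110, 110, 10],
                            "all_points_y": [20, 20, 60, 60]},
       "region_attributes": {"Type": "EAN-13", "PPE": "1.5",
                             "String": "4006381333931"}},
      {"shape_attributes": {"name": "polygon",
                            "all_points_x": [200, 260, 260, 200],
                            "all_points_y": [30, 30, 90, 90]},
       "region_attributes": {"Type": "QR Code", "PPE": "3.0",
                             "String": "hello"}}
    ]
  }
}
""")

rows = list(iter_barcodes(sample))
group_counts = Counter(group_of(barcode_type) for _, barcode_type, _ in rows)
for filename, barcode_type, polygon in rows:
    print(filename, barcode_type, group_of(barcode_type), polygon)
```

To process real annotation files, the same functions could be applied to the result of `json.load` on each file in the annotation folder, provided the files follow the layout assumed here.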