Dataset

The IMPROVED Dataset for Extreme Super-Resolution

An Official Grand Challenge at IEEE ICIP 2026

Tampere, Finland • 13–17 September 2026


The IMPROVED Dataset

The challenge is built upon the proprietary IMPROVED dataset (In-the-wild with Multi-Perspective Realistic Observations for Vehicle Evidence and license-plate recognition), specifically curated to capture the diversity and complexity of real operational environments.

Dataset Characteristics
  • Short real-world video clips (up to 10 frames) of moving vehicles with French license plates
  • Multiple acquisition devices (consumer cameras, surveillance cameras, smartphones)
  • Wide range of distances (10–100 meters)
  • Variable lighting conditions
  • Different weather conditions (clear, cloudy)
  • Large viewpoint variability (frontal, oblique, steep angle)
  • Strong natural degradations (motion blur, sensor noise, compression artifacts)
Data Format

Each video clip is paired with:

  • Low-resolution input frames extracted from the video
  • Ground-truth license plate text (character string), available only for the development set
  • Frame-level metadata (Video clip ID and license plate bounding boxes)

Note: High-resolution reference images are not provided, to emphasize the real-world nature of the super-resolution task.

Acquisition Site

Acquisition Circuit

The dataset was acquired at the Saint-Laurent-de-Mûre circuit in France. The highlighted portion of the circuit below was used for capturing the video sequences with vehicles in motion under realistic conditions.


Figure 1: Saint-Laurent-de-Mûre circuit with highlighted acquisition zone.

Camera Configuration

The dataset features 17 different cameras covering a wide spectrum of imaging devices:

  • Surveillance cameras: Reolink, Instar, Hikvision, Dahua
  • Smartphones: Huawei P40 Pro, iPhone 15/15 Plus, Xiaomi Redmi Note 13, Huawei Honor 9X
  • Professional cameras: Blackmagic micro studio 4K (x2), Panasonic Lumix DMC-G70
  • Specialized: Hikvision fisheye camera (severe distortion), infrared camera

License Plate Examples

French License Plate Format

The dataset contains only French license plates, in both the old and the new format. Old plates follow the pattern 123-AAA-12 or 1234-AA-12 (3 or 4 digits, 2 or 3 letters, 2 digits). New plates follow the pattern AA-123-AA (2 letters, 3 digits, 2 letters).
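As a rough illustration of the two layouts described above, the sketch below classifies a plate string as old or new format. The function names (normalize_plate, classify_plate) are hypothetical, and the patterns encode only the digit and letter counts stated here; they do not model which letters or department codes are legally valid.

```python
import re
from typing import Optional

# Layout-only patterns, applied after hyphens and spaces are removed.
# Old format: 3 or 4 digits, 2 or 3 letters, 2 digits.
OLD_FORMAT = re.compile(r"[0-9]{3,4}[A-Z]{2,3}[0-9]{2}")
# New format: 2 letters, 3 digits, 2 letters.
NEW_FORMAT = re.compile(r"[A-Z]{2}[0-9]{3}[A-Z]{2}")


def normalize_plate(text: str) -> str:
    """Strip hyphens and whitespace, then upper-case the plate text."""
    return re.sub(r"[\s-]+", "", text).upper()


def classify_plate(text: str) -> Optional[str]:
    """Return "new", "old", or None if the text fits neither layout."""
    plate = normalize_plate(text)
    if NEW_FORMAT.fullmatch(plate):
        return "new"
    if OLD_FORMAT.fullmatch(plate):
        return "old"
    return None


print(classify_plate("AA-123-AA"))
print(classify_plate("1234-AA-12"))
```

A check like this may be useful as a sanity filter on recognizer output, for instance to flag predictions that cannot be a French plate under either layout.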


Figure 2: Examples of French license plates from the acquisition.

File Format & Structure

Detection Annotations Format

Each sequence folder (10 frames) contains a detections.json file with the following structure:

[
  {
    "frame": "000000.png",
    "license_plate_coordinates": [x1, y1, x2, y2]
  },
  {
    "frame": "000001.png",
    "license_plate_coordinates": [x1, y1, x2, y2]
  },
  ...
]

Coordinates are in [top-left x, top-left y, bottom-right x, bottom-right y] format.
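A minimal sketch of reading this file, assuming the structure shown above (a JSON list of objects with the keys frame and license_plate_coordinates). The helper names load_detections and box_size are hypothetical and not part of any official toolkit.

```python
import json
from typing import Dict, Tuple

# (top-left x, top-left y, bottom-right x, bottom-right y)
Box = Tuple[float, float, float, float]


def load_detections(path) -> Dict[str, Box]:
    """Map each frame file name to its license plate bounding box."""
    with open(path, encoding="utf-8") as f:
        entries = json.load(f)
    boxes = {}
    for entry in entries:
        x1, y1, x2, y2 = entry["license_plate_coordinates"]
        boxes[entry["frame"]] = (x1, y1, x2, y2)
    return boxes


def box_size(box: Box) -> Tuple[float, float]:
    """Return (width, height) of a box in pixels."""
    x1, y1, x2, y2 = box
    return x2 - x1, y2 - y1
```

If Pillow is used for image loading, the same four-value order is what Image.crop expects (left, upper, right, lower), so a box can be passed to it directly to extract the plate region.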

Directory Structure
dataset/
├── development/
│   └── seq_001/
│       ├── 000000.png
│       ├── ...
│       ├── 000009.png
│       └── detections.json
├── public_validation/
│   └── seq_002/
│       ├── 000000.png
│       ├── ...
│       └── detections.json
└── ground_truth.csv       (global ground truth file)

Each sequence contains exactly 10 consecutive frames.
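After downloading, it may be worth checking that every sequence folder matches this layout. The sketch below assumes the directory tree shown above; check_split is a hypothetical helper name.

```python
from pathlib import Path

EXPECTED_FRAMES = 10  # each sequence is described as having exactly 10 frames


def check_split(split_dir):
    """Return {sequence_id: [problems]} for sequence folders that look incomplete."""
    problems = {}
    sequence_dirs = sorted(p for p in Path(split_dir).iterdir() if p.is_dir())
    for seq_dir in sequence_dirs:
        issues = []
        frames = list(seq_dir.glob("*.png"))
        if len(frames) != EXPECTED_FRAMES:
            issues.append(f"expected {EXPECTED_FRAMES} frames, found {len(frames)}")
        if not (seq_dir / "detections.json").is_file():
            issues.append("missing detections.json")
        if issues:
            problems[seq_dir.name] = issues
    return problems
```

Calling check_split("dataset/development") would return an empty dictionary when every sequence folder is complete.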

Expected Prediction File Format

Participants should submit a zip file containing prediction.csv, which lists each sequence ID with the corresponding predicted license plate text, as shown below:

sequence_id,license_plate
seq_001,AB123CD
seq_002,EF456GH
seq_003,457DEX16
...
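One possible way to produce such an archive with the Python standard library is sketched below. The function name write_submission is hypothetical, and the default header simply copies the example above; it is kept as a parameter because the exact column names should be confirmed against the organizers' submission instructions.

```python
import csv
import io
import zipfile


def write_submission(predictions, zip_path, header=("sequence_id", "license_plate")):
    """Write {sequence_id: plate_text} as prediction.csv inside a zip archive."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    # A dict holds one entry per key, so each sequence ID appears exactly once.
    for sequence_id in sorted(predictions):
        writer.writerow([sequence_id, predictions[sequence_id]])
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("prediction.csv", buffer.getvalue())


write_submission({"seq_001": "AB123CD", "seq_002": "EF456GH"}, "submission.zip")
```

Plate strings are written without hyphens here, matching the example rows.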

Dataset Splits

The IMPROVED dataset is divided into three distinct sets to ensure fair evaluation and rigorous benchmarking:

Development Set

For Local Benchmarking & Validation

39 sequences (390 images)

  • Complete low-resolution video frames
  • Full ground-truth license plate labels
  • Detection coordinates in JSON format

Purpose: Allows participants to develop, test, and validate their models locally before submission.
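For local validation, a simple score can be computed against the development labels. The sketch below rests on two assumptions that are not confirmed here: that the ground-truth file uses the columns sequence_id and license_plate, and that exact-match accuracy is a reasonable stand-in for the official metric. The function names are hypothetical.

```python
import csv


def read_plates(csv_path, id_column="sequence_id", text_column="license_plate"):
    """Read a CSV file into {sequence_id: plate_text}; column names are assumed."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return {row[id_column]: row[text_column] for row in csv.DictReader(f)}


def exact_match_accuracy(truth, predictions):
    """Fraction of ground-truth sequences whose predicted text matches exactly.

    Sequences with no prediction count as errors.
    """
    if not truth:
        raise ValueError("ground truth is empty")
    correct = sum(
        1 for seq_id, plate in truth.items() if predictions.get(seq_id) == plate
    )
    return correct / len(truth)
```

A typical call would be exact_match_accuracy(read_plates("dataset/ground_truth.csv"), read_plates("prediction.csv")), adjusting the column names if the actual files differ.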

Public Validation Set

For Public Leaderboard

347 sequences (3,470 images)

  • Low-resolution frames only (no ground truth)
  • Detection coordinates in JSON format

Purpose: Enables participants to benchmark against others and track progress on the public leaderboard.

Blind Test Set

For Final Evaluation

88 unreleased sequences (880 images)

  • 16 completely unseen license plates
  • Similar diversity to development set
  • Evaluated on organizers' servers

Purpose: Ensures fair, unbiased final evaluation of all submissions on previously unseen data.

Training Policy & Guidelines

Training Guidelines
  • External data permitted: Participants may use any external data (synthetic, public, or proprietary) for training their models.
  • Model constraints: No restrictions on model architecture, size, or training methodology.
  • Pre-trained models: Use of publicly available pre-trained models is allowed and encouraged.
  • Data augmentation: Participants may augment the provided data with synthetic degradations.
  • Submission requirements: Final models must be submitted as Docker containers for reproducibility (top teams and on invitation only).
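As an illustration of the data augmentation guideline above, the sketch below applies two simple synthetic degradations (box-average downsampling and additive Gaussian noise) to a grayscale image stored as a list of rows. It is a dependency-free toy, not the degradation model of the dataset; the function names are hypothetical, and a real pipeline would more likely use NumPy or OpenCV and add blur and compression artifacts.

```python
import random


def downsample(image, factor):
    """Box-average a grayscale image (list of rows) by an integer factor."""
    height = len(image) // factor
    width = len(image[0]) // factor
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            block = [
                image[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def add_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise and clamp values to the range [0, 255]."""
    return [
        [min(255.0, max(0.0, value + rng.gauss(0.0, sigma))) for value in row]
        for row in image
    ]


rng = random.Random(0)  # fixed seed so the augmentation is repeatable
clean = [[0, 0, 255, 255], [0, 0, 255, 255]]
degraded = add_noise(downsample(clean, 2), sigma=5.0, rng=rng)
```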
Important Notes
  • The blind test set will not be released to participants at any time
  • Public leaderboard rankings are indicative only
  • Final ranking is based exclusively on blind test set performance and expert evaluation
  • All submissions must follow the sequence_id,predicted_lp CSV format
  • Dataset license agreement must be signed during registration