Hackathon Description and Data Download

Dataset Description

This dataset contains data gathered across 6 different hospitals in northern Italy during the first outbreak of SARS-CoV-2 (March-June 2020) with the coordination of Centro Diagnostico Italiano (CDI): https://aiforcovid.radiomica.it/. All included subjects had a confirmed diagnosis of COVID-19. Disease outcome was updated at a later stage and is reported here as either severe (if the patient required mechanical ventilation or died) or mild (all other outcomes).
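The outcome definition above reduces to a simple rule. As a minimal sketch, assuming two hypothetical boolean flags (`ventilated` and `died`) that are used here only for illustration and are not fields of the distributed dataset, the label could be expressed as:

```python
def outcome_label(ventilated: bool, died: bool) -> str:
    """Return 'severe' if the patient required mechanical ventilation or died, else 'mild'.

    Both arguments are hypothetical flags used for illustration only;
    the dataset itself reports the final severe/mild label.
    """
    return "severe" if (ventilated or died) else "mild"


print(outcome_label(ventilated=True, died=False))
print(outcome_label(ventilated=False, died=False))
```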

During triage, a set of clinical tests was performed, generating a number of clinical parameters, 16 of which were deemed relevant for outcome prediction and included in the dataset. The collected items are listed below with a brief description. Not all items are available for all subjects.

  • Patient’s age (years)
  • Patient’s sex (0 – male, 1 – female)
  • Body Temperature (°C): patient temperature at admission
  • Patient had intense tightening in the chest, air hunger, difficulty breathing, breathlessness or feeling of suffocation
  • White blood cell count (10^9/L)
  • C-reactive protein concentration (mg/dL)
  • Fibrinogen concentration in blood (mg/dL)
  • Lactate dehydrogenase concentration in blood (U/L)
  • D-dimer amount in blood
  • Oxygen percentage in blood
  • Partial pressure of oxygen in arterial blood (mmHg)
  • Arterial oxygen saturation (%)
  • Blood pH
  • Cardiovascular Disease: patient had cardiovascular disease
  • Respiratory Failure: patient had respiratory failure
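Since not all clinical items are available for all subjects, any model that uses them needs a strategy for missing values. The following is a minimal sketch of one common option, median imputation with a missingness flag. The field names (`age`, `crp`) and the values are hypothetical and are not taken from the dataset files.

```python
from statistics import median


def impute_median(rows, columns):
    """Fill missing values (None) with the per-column median and add a 0/1 flag per column.

    `rows` is a list of dicts; `columns` lists the numeric fields to impute.
    New dicts are returned and the input rows are left unchanged.
    """
    medians = {}
    for col in columns:
        observed = [r[col] for r in rows if r.get(col) is not None]
        medians[col] = median(observed) if observed else None

    imputed = []
    for r in rows:
        new = dict(r)
        for col in columns:
            missing = r.get(col) is None
            new[col + "_missing"] = int(missing)  # keep track of what was imputed
            if missing:
                new[col] = medians[col]
        imputed.append(new)
    return imputed


# Hypothetical field names and made-up values, for illustration only.
patients = [
    {"age": 61, "crp": 4.2},
    {"age": 75, "crp": None},
    {"age": None, "crp": 11.0},
]
filled = impute_median(patients, ["age", "crp"])
```

In practice the medians would be computed on the training set only and then reused for the test set, so that no information from the test data enters the preprocessing.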


For each patient, a single chest X-ray is provided, also collected on the first day of hospital admission. X-ray scans were often acquired in emergency conditions, so both image quality and subject position are highly variable. Furthermore, some images were acquired digitally, while others are the result of digitizing film images.

For the purpose of this challenge, data from the 6 different hospitals constitute the training set and are provided as collected (raw), for a total of 1103 subjects. The test set is composed of 486 additional entries, all collected at the same institution. The test set has only recently been curated and, unlike the training set, has not previously been made publicly available. Patient outcome is provided in all instances for the training set and never for the test set.
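Because the training data come from 6 hospitals while the test entries were all collected at the same institution, one way to estimate how a model transfers across sites is to hold out one hospital at a time during validation. The sketch below assumes a per-subject hospital identifier is available, which is not stated in this description; the hospital codes are hypothetical.

```python
from collections import defaultdict


def leave_one_hospital_out(hospital_ids):
    """Yield (held_out_hospital, train_indices, validation_indices) for each hospital."""
    by_hospital = defaultdict(list)
    for idx, hospital in enumerate(hospital_ids):
        by_hospital[hospital].append(idx)

    for held_out in sorted(by_hospital):
        val_idx = by_hospital[held_out]
        train_idx = sorted(
            i for h, idxs in by_hospital.items() if h != held_out for i in idxs
        )
        yield held_out, train_idx, val_idx


# Hypothetical hospital codes for six subjects.
hospitals = ["A", "A", "B", "C", "B", "C"]
folds = list(leave_one_hospital_out(hospitals))
for held_out, train_idx, val_idx in folds:
    print(held_out, train_idx, val_idx)
```

Each fold trains on the remaining hospitals and validates on the held-out one, which mimics evaluation on data from an institution the model has not seen.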

Challenge Description

The objective of this hackathon is to classify subjects according to disease outcome.

Proposed solutions will be considered if they meet the following two requirements:

  • Use of chest X-ray images: while it is possible to provide tentative classifications of outcome based on clinical data alone, the main focus of this hackathon is the development of solutions that primarily exploit X-ray images, possibly together with clinical data. Therefore, solutions that ignore the available images entirely and rely only on clinical data will not be considered in the final ranking;
  • Inclusion of a methods description: a short description of the algorithm that generated the results (max two A4 pages) has to be submitted by the end date of the hackathon (February 28, 2022). If other image databases are employed for pre-training or similar purposes, they should be listed in this document. While the document will not directly influence the final ranking, the organizers might decide to exclude from the competition algorithms with implausible descriptions.

Furthermore, the winners in each category (see below) will be required to make all the relevant code publicly available. There are no constraints on the licensing scheme, but the solution should be reproducible.

The score used for ranking is the accuracy value of the proposed classification on the test set. Please note that the proportion of classes may vary between training and test sets.
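As an illustration of the ranking metric, the sketch below computes accuracy as the fraction of correctly classified subjects and shows why a shift in class proportions matters. The labels are made up for the example, and the majority-class baseline is only a reference point, not part of the challenge rules.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_train, n):
    """Predict the most frequent training label for n test subjects."""
    most_common = Counter(y_train).most_common(1)[0][0]
    return [most_common] * n


# Made-up labels: 'mild' dominates training, 'severe' dominates the test set.
y_train = ["mild", "mild", "mild", "severe"]
y_test = ["severe", "severe", "mild", "severe"]

baseline = majority_baseline(y_train, len(y_test))
print(accuracy(y_test, baseline))  # 0.25
```

A classifier that leans on the training class proportions can therefore score well in validation and poorly on the test set if the proportions differ.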

A second prize will be awarded for the approach with the highest level of explainability. This decision will be taken by a panel composed of clinicians and computer vision scientists.

Download the Data

Only registered (and authenticated) users can download the dataset. Please register or log in here.