Two types of ground-truth proliferation scores are provided with the training dataset. These correspond to two challenge tasks that will be evaluated separately. Participants can submit results for either or both of these tasks. A third task evaluates the performance of automatic mitosis detection on a different dataset.
Task 1: Prediction of proliferation score based on mitosis counting
The ground truth for this task is the proliferation score assigned by a pathologist. Details on how this scoring is performed in clinical practice can be found here. In short, when scoring a slide, the pathologist first identifies a region of interest (the most invasive part of the tumor), counts the mitoses, and assigns a score based on the density of mitoses in the tissue. The score can be 1, 2 or 3, ranging from good to bad prognosis. Score 1 means that the pathologist counted fewer than 6 mitoses in 10 consecutive microscope high power fields (an area of approximately 2 mm²). Score 3 means that the pathologist counted more than 10 mitoses in 10 consecutive microscope high power fields. Intermediate cases (6 to 10 mitoses) are assigned score 2.
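The thresholds above can be written as a small lookup. This is a minimal sketch in Python; the function name `proliferation_score` and its argument are hypothetical and only illustrate the rule stated in the text, not the challenge's own tooling.

```python
def proliferation_score(mitoses_per_10_hpf: int) -> int:
    """Map a mitotic count in 10 consecutive high power fields to a score.

    Thresholds follow the task description: fewer than 6 mitoses gives
    score 1, more than 10 gives score 3, anything in between gives score 2.
    """
    if mitoses_per_10_hpf < 0:
        raise ValueError("mitotic count cannot be negative")
    if mitoses_per_10_hpf < 6:
        return 1  # fewer than 6 mitoses
    if mitoses_per_10_hpf > 10:
        return 3  # more than 10 mitoses
    return 2  # 6 to 10 mitoses
```

For example, a count of 6 or 10 falls in the intermediate range and maps to score 2, while a count of 11 maps to score 3.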
Evaluation: Methods will be ranked by their agreement with the ground truth, as measured by the quadratic weighted Cohen's kappa. This GitHub repository provides implementations of this metric in several programming languages.
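To make the metric concrete, the following is a minimal pure-Python sketch of the quadratic weighted kappa for scores in the range 1 to 3. It is not the implementation from the repository mentioned above. The function name, its parameters, and the convention of returning 1.0 when both raters use a single identical category (where the statistic is otherwise undefined) are assumptions of this sketch.

```python
def quadratic_weighted_kappa(rater_a, rater_b, min_rating=1, max_rating=3):
    """Quadratic weighted Cohen's kappa between two lists of integer scores."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("at least two rating categories are required")
    total = len(rater_a)

    # Observed confusion matrix: rows index rater_a, columns index rater_b.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(rater_a, rater_b):
        observed[a - min_rating][b - min_rating] += 1

    # Marginal histograms of each rater's scores.
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    numerator = 0.0    # weighted observed disagreement
    denominator = 0.0  # weighted disagreement expected by chance
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2
            expected = hist_a[i] * hist_b[j] / total
            numerator += weight * observed[i][j]
            denominator += weight * expected

    if denominator == 0:
        # Both raters used one identical category; kappa is undefined here.
        # Returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return 1.0 - numerator / denominator
```

Because the weights grow with the squared distance between scores, predicting 3 for a ground-truth score of 1 is penalized four times as heavily as predicting 2.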
Task 2: Prediction of proliferation score based on molecular data
This proliferation score is calculated as the mean RNA expression of 11 proliferation-associated genes and is a more objective measure than the proliferation score based on mitosis counting. It was initially described in this paper. Higher values of the proliferation score indicate faster tumor proliferation. The molecular proliferation score correlates well with the proliferation score based on mitosis counting, but the agreement is not perfect.
Evaluation: Methods will be ranked by their agreement with the ground truth, as measured by Spearman's correlation coefficient.
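Spearman's correlation coefficient is the Pearson correlation of the rank-transformed values, so it rewards predictions that order the cases correctly regardless of scale. The sketch below is a minimal pure-Python illustration that assigns average ranks to ties. The function names are hypothetical and this is not the challenge's official evaluation code.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_correlation(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x = sum(rx) / len(rx)
    mean_y = sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Any monotonically increasing transformation of the predicted scores leaves the result unchanged, since only the ranks enter the computation.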
Task 3: Mitosis detection
This task will evaluate the performance of mitosis detection algorithms in given tumor regions.
Evaluation: Methods will be ranked according to the overall F1-score and the F1-score computed for each case separately, similar to the AMIDA13 challenge.
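The two F1 variants can be illustrated with a short Python sketch. It assumes that detections have already been matched to ground-truth mitoses, so that each case is summarized by its true positive, false positive and false negative counts. The matching criterion is not specified here, the overall score is assumed to be computed from counts pooled over all cases, and the function names and the convention of returning 0.0 for a case with no detections and no ground-truth mitoses are hypothetical choices.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    # Assumption of this sketch: an empty case scores 0.0 rather than raising.
    return 2 * tp / denominator if denominator else 0.0


def evaluate_detection(counts_per_case):
    """Compute overall and per-case F1 from a dict: case_id -> (tp, fp, fn)."""
    per_case = {case: f1_score(*counts) for case, counts in counts_per_case.items()}
    total_tp = sum(c[0] for c in counts_per_case.values())
    total_fp = sum(c[1] for c in counts_per_case.values())
    total_fn = sum(c[2] for c in counts_per_case.values())
    overall = f1_score(total_tp, total_fp, total_fn)
    return overall, per_case
```

With counts pooled in this way, cases containing many mitoses dominate the overall score, whereas the per-case scores weight every case equally.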