This paper describes an intensive care unit mortality prediction model for the PhysioNet/Computing in Cardiology Challenge 2012, built using a novel Bayesian ensemble learning algorithm.

Methods: Data pre-processing was performed automatically, based upon domain knowledge, to remove artefacts and erroneous recordings, e.g. physiologically invalid entries and unit conversion errors. A range of diverse features was extracted from the original time series signals, including standard statistical descriptors such as the minimum, maximum, median, first value, last value, and number of values. A new Bayesian ensemble scheme comprising 500 weak learners was then developed to classify the data samples. Each weak learner was a decision tree of depth two, which randomly assigned an intercept and gradient to a single randomly selected feature. The parameters of the ensemble learner were determined using a custom Markov chain Monte Carlo sampler.

Results: The model was trained using 4000 observations from the training set, and was evaluated by the organisers of the competition on two new datasets with 4000 observations each (set b and set c). The outcomes of these datasets were unavailable to the competitors. The competition was judged on two events by two scores. Score 1 was the minimum of the positive predictive value and sensitivity for binary model predictions, and the model achieved 0.5310 and 0.5353 on the unseen datasets. Score 2, a range-normalized Hosmer-Lemeshow C statistic, evaluated to 26.44 and 29.86. The model was re-developed using the updated datasets from phase 2 after the competition, and achieved a score 1 of 0.5374 and a score 2 of 18.20 on set c.

Conclusion: The proposed prediction model performs favourably on both the provided and hidden datasets (set a and set b), and has the potential to be used effectively for patient-specific predictions. © 2012 CCAL.
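The two simplest quantities in the abstract, the per-signal statistical descriptors and score 1, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names `summarise` and `score1`, the dictionary keys, the 0/1 label encoding, and the handling of empty inputs and zero denominators are all assumptions made for illustration.

```python
from statistics import median


def summarise(values):
    """Return the six descriptors named in the abstract for one time series.

    `values` is assumed to be a time-ordered list of numeric measurements.
    Returning None for an empty series is an assumption, not from the paper.
    """
    if not values:
        return None
    return {
        "min": min(values),
        "max": max(values),
        "median": median(values),
        "first": values[0],
        "last": values[-1],
        "count": len(values),
    }


def score1(y_true, y_pred):
    """Minimum of positive predictive value and sensitivity.

    Labels are assumed to be 1 (died) and 0 (survived). A zero denominator
    is mapped to 0.0 here, which is an assumption of this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return min(ppv, sensitivity)
```

Because score 1 takes the minimum of the two measures, a model cannot raise it by trading sensitivity for positive predictive value or the reverse; both have to be high at the chosen decision threshold.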


Journal article


Computing in Cardiology

Publication Date

Pages

249 - 252