Take, for example, an investigative study to predict the progression of diabetes in a group of patients diagnosed with the disease. The dataset was collected for an academic study and is based on just 10 demographic and clinical measurements.
As we might expect, because the data does not incorporate information about genetic predisposition, lifestyle, diet or comorbidities, we found that training a global predictive model on the data did not yield useful predictions about disease progression for all patients. This is not down to a lack of machine learning sophistication; the data simply does not contain enough discriminating information. Two things put us, rather than the machine, in a position to respond. Firstly, because humans are firmly grounded in reality and machines are not, we are best placed to apply our domain knowledge to recognise this limitation. Secondly, unlike machine learning algorithms, we aren’t restricted to a pre-programmed remit, so we have the freedom to think outside the box.
On re-examining the accuracy of the predicted disease progression across the cohort, we came up with the idea that some groups of patients might be more predictable than others. Returning to our machine learning toolkit, we applied an unsupervised learning algorithm to map out the patient cohort and discovered that there are indeed sub-groups that exhibit superior prediction accuracy. The visualisation below renders them in bolder colours: blue for low, yellow for intermediate and red for high disease progression, each with a higher predictive certainty. The fainter and paler areas represent sub-groups with lower prediction accuracy.
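To make the idea of comparing accuracy across sub-groups concrete, here is a minimal sketch of one way it could be done. It is not the method used in the study: the numbers are made-up toy values, the sub-group labels are assumed to have come from some clustering step, and the helper name `error_by_subgroup` and the choice of mean absolute error are our own illustrative assumptions.

```python
from collections import defaultdict
from statistics import mean


def error_by_subgroup(y_true, y_pred, labels):
    """Return the mean absolute prediction error for each sub-group label."""
    errors = defaultdict(list)
    for actual, predicted, label in zip(y_true, y_pred, labels):
        errors[label].append(abs(actual - predicted))
    return {label: mean(errs) for label, errs in errors.items()}


# Hypothetical toy values, NOT taken from the study.
y_true = [100, 110, 250, 240, 150, 90]   # observed progression scores
y_pred = [105, 108, 245, 243, 90, 160]   # predictions from a global model
labels = ["a", "a", "b", "b", "c", "c"]  # sub-group assigned to each patient

subgroup_mae = error_by_subgroup(y_true, y_pred, labels)
overall_mae = mean(abs(a - p) for a, p in zip(y_true, y_pred))

# Sub-groups whose error is below the cohort-wide error are the ones
# where the global model's predictions are comparatively trustworthy.
reliable = sorted(g for g, err in subgroup_mae.items() if err < overall_mae)
print(subgroup_mae, reliable)
```

In this toy cohort the model predicts sub-groups "a" and "b" far more closely than sub-group "c", which is the kind of contrast between bold and faint regions described above.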
To tie this back to real-life applications, this partitioning of the cohort is analogous to the practice of ‘stratified healthcare’ or ‘precision medicine’, where the treatment prescribed to patients is not one-size-fits-all but is selected for each individual on the basis of a clinical assessment. A clinician could use the bolder areas of the map to advise those patients whether they are currently managing their diabetes safely or are at medium or high risk and require intervention. Patients in fainter areas may be advised that more information is required from them, which could include tracking diet, exercise or other activities alongside further clinical measurements.
While this small investigative study doesn’t represent a fully-fledged clinical project, we can take note of a few nuggets of insight. Machine learning is tremendously useful, capable of resolving patterns and relationships beyond the reach of human perception. It is, however, locked into its digital domain, working on an imperfect reflection of reality. Only we have the ability to bridge the gap between reality and the digital realm by thinking outside the box to refine our analysis and by applying domain knowledge to interpret the significance and application of the results. Having undertaken this study, we realise that we have improved our own understanding of reality and our analysis of it. It makes sense that not only is human intelligence a key component of machine learning, but that machine learning firmly reciprocates our own learning.
Lee Sedol, acclaimed as the Roger Federer of Go, has said that “robots will never understand the beauty of the game the same way that we humans do”. After all, AlphaGo does not really ‘play’ Go, or exhibit any passion for the game. So, if we team up with machine learning, our new cool-headed but keen learning companion, then by exercising our own capacity to experiment and discover, we become better equipped to engineer applications and solutions that work well in everyday life.