Evidence-based medicine (EBM) is predicated on the idea of harnessing the entirety of the available evidence to inform patient care. Unfortunately, this is a challenging aim to realize in practice, for a few reasons. First, relevant evidence is primarily disseminated in unstructured, natural language articles describing the conduct and results of clinical trials. Second, the set of such articles is already massive and continues to expand rapidly. A now outdated estimate from 1999 suggests that conducting a single review requires in excess of 1000 h of (highly skilled) manual labour. More recent work estimates that conducting a review currently takes, on average, 67 weeks from registration to publication.

Clearly, existing processes are not sustainable: reviews of current evidence cannot be produced efficiently and in any case often go out of date quickly once they are published. The fundamental problem is that current EBM methods, while rigorous, simply do not scale to meet the demands imposed by the voluminous scale of the (unstructured) evidence base. This problem has been discussed at length elsewhere. Research on methods for semi-automating systematic reviews via machine learning and natural language processing now constitutes its own (small) subfield, with an accompanying body of work. In this survey, we aim to provide a gentle introduction to automation technologies for the non-computer scientist. We describe the current state of the science and provide practical guidance on which methods we believe are ready for use. We also discuss how a systematic review team might go about using them, and the strengths and limitations of each. We do not attempt an exhaustive review of research in this burgeoning field; perhaps unsurprisingly, multiple systematic reviews of such efforts already exist. Instead, we identified machine learning systems that are available for use in practice at the time of writing, through manual screening of records in SR Toolbox Footnote 1 on January 3, 2019, to identify all systematic review tools which incorporated machine learning.

SR Toolbox is a publicly available online catalogue of software tools to aid systematic review production, kept up to date through ongoing literature surveillance, direct submissions from tool developers, and social media. We have not described machine learning methods from academic papers unless a system to enact them has been made available; we likewise have not described (the very large number of) software tools for facilitating systematic reviews unless they make use of machine learning.

Box 1 Glossary of terms used in systematic review automation

Machine learning: computer algorithms which 'learn' to perform a specific task through statistical modelling of (typically large amounts of) data
Natural language processing: computational methods for automatically processing and analysing 'natural' (i.e. human) language
Text classification: automated categorization of documents into groups of interest
Data extraction: the task of identifying key bits of structured information from texts
Crowd-sourcing: decomposing work into micro-tasks to be performed by distributed workers
Micro-tasks: discrete units of work that together complete a larger undertaking
Semi-automation: using machine learning to expedite tasks, rather than complete them
Human-in-the-loop: workflows in which humans remain involved, rather than being replaced
Supervised learning: estimating model parameters using manually labelled data
Distantly supervised: learning from pseudo, noisy 'labels' derived automatically by applying rules to existing databases or other structured data
Unsupervised: learning without any labels (e.g. clustering)
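To make the glossary's notions of supervised learning, text classification, and semi-automation concrete, the following is a minimal sketch (not taken from any tool described in this survey) of how a citation-screening classifier might work. It uses scikit-learn as an assumed implementation library; the toy "abstracts" and labels are invented for illustration only.

```python
# Illustrative sketch: supervised text classification for citation screening.
# All example texts and labels below are invented for demonstration purposes.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Manually labelled training data: 1 = relevant to the review, 0 = irrelevant
train_texts = [
    "randomised controlled trial of statins for cardiovascular prevention",
    "double blind placebo controlled trial of a new antihypertensive drug",
    "case report of a rare dermatological condition",
    "narrative review of hospital management practices",
]
train_labels = [1, 1, 0, 0]

# Pipeline: TF-IDF bag-of-words features feeding a logistic regression model
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

# Semi-automation: rank unscreened citations by predicted probability of
# relevance so that human screeners can prioritise the most promising records
new_texts = [
    "a randomised placebo controlled trial of statin therapy",
    "editorial commentary on publication practices",
]
probs = model.predict_proba(new_texts)[:, 1]
for text, p in sorted(zip(new_texts, probs), key=lambda pair: -pair[1]):
    print(f"{p:.2f}  {text}")
```

In practice, such a model would be trained on thousands of labelled records, and the ranked output would feed a human-in-the-loop workflow rather than excluding citations automatically.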