This webpage summarises some of my work on biomedical pattern analysis.
Our proposal: We developed an algorithm capable of identifying the different eGFR assays used. The algorithm was tested on a subset of the QICKD data set drawn from CKD patients in England.
Findings: Our study shows that the developed algorithm can (1) identify the different eGFR reporting methods, and (2) “back-calibrate” the eGFR time series so that, after calibration, the eGFR measurements stored in a patient’s health record prior to standardised eGFR reporting become compatible with those reported using the standardised method.
Implications: The eGFR values of female patients calculated using MDRD alone would overestimate the values conforming to the NICE guidelines, whereas those of male patients would be underestimated. We therefore recommend that our proposed algorithm be used to back-calibrate eGFR measurements each time a novel eGFR reporting method is adopted.
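As a highly simplified illustration of the back-calibration idea, assuming the switch point between reporting methods is already known (the `back_calibrate` helper and the multiplicative adjustment are my own illustrative assumptions, not the published algorithm, which also has to detect the switch and the assay used):

```python
import numpy as np

def back_calibrate(egfr, switch_idx):
    """Illustrative back-calibration: rescale pre-switch eGFR values so
    their mean level matches the post-switch (standardised) reporting.
    switch_idx is assumed known here; the real algorithm must detect it."""
    egfr = np.asarray(egfr, float)
    pre, post = egfr[:switch_idx], egfr[switch_idx:]
    factor = post.mean() / pre.mean()   # simple multiplicative adjustment
    return np.concatenate([pre * factor, post])
```

After this adjustment the pre-switch segment sits on the same scale as the standardised values, so the whole time series can be analysed as one sequence.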
The next step consists of sampling g, g′ and t and then estimating p(g′|g,t). This function gives the likelihood of the rate of change of eGFR given the current renal stage (that is, the current eGFR value) and the age of the patient (t). We frame this as a conditional density estimation problem in which g′, g and t are continuous. The result is the likelihood graph shown below; the same graph can also be represented as a likelihood table.
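A minimal sketch of one way to estimate p(g′|g,t) as a ratio of kernel density estimates. The synthetic decline model, sample size and `cond_density` helper are illustrative assumptions, and g′ is treated here as a follow-up eGFR value; the mechanics are the same if g′ is a rate of change:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
n = 2000
t = rng.uniform(40, 80, n)                       # patient age (synthetic)
g = 90 - 0.5 * (t - 40) + rng.normal(0, 5, n)    # current eGFR
gp = g - 1.0 + rng.normal(0, 2, n)               # follow-up eGFR

joint = gaussian_kde(np.vstack([gp, g, t]))      # estimates p(g', g, t)
marg = gaussian_kde(np.vstack([g, t]))           # estimates p(g, t)

def cond_density(gp_val, g_val, t_val):
    """Estimate p(g' | g, t) as the ratio of joint to marginal densities."""
    return float(joint([gp_val, g_val, t_val]) / marg([g_val, t_val]))
```

Evaluating `cond_density` over a grid of g′ values for fixed g and t yields one column of the likelihood table described above.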
Personalised medicine involves customising management to meet patients’ needs. In CKD at the population level there is a steady decline in renal function with increasing age, and progressive CKD has been defined as marked variation from this rate of decline.
To create visualisations of individual patients’ renal function, displaying smoothed trend lines and confidence intervals for renal function and other important covariates.
We applied advanced pattern recognition techniques developed in biometrics to routinely collected primary care data gathered as part of the Quality Improvement in Chronic Kidney Disease (QICKD) trial. We plotted trend lines, using regression, and confidence intervals for individual patients. We also created a visualisation that allowed renal function to be compared with six other covariates: glycated haemoglobin (HbA1c), body mass index (BMI), blood pressure (BP), and therapy. The outputs were reviewed by an expert panel.
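The regression trend lines and confidence intervals could be produced along these lines; the `egfr_trend` helper and its inputs are illustrative, not the code used in the study:

```python
import numpy as np

def egfr_trend(days, egfr, z=1.96):
    """Fit a least-squares trend line to eGFR measurements and return the
    fitted values with a pointwise ~95% confidence band (illustrative)."""
    x = np.asarray(days, float)
    y = np.asarray(egfr, float)
    n = len(x)
    X = np.column_stack([np.ones(n), x])          # intercept + slope design
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    yhat = X @ beta
    s2 = np.sum((y - yhat) ** 2) / (n - 2)        # residual variance
    cov = s2 * np.linalg.inv(X.T @ X)             # coefficient covariance
    se = np.sqrt(np.sum((X @ cov) * X, axis=1))   # pointwise SE of the fit
    return yhat, yhat - z * se, yhat + z * se
```

Plotting the raw measurements with `yhat` and the band makes the noisiness of eGFR, and apparent excursions beyond the “progressive CKD” criteria, easy to see for an individual patient.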
We successfully extracted and displayed data. We demonstrated that estimated glomerular filtration rate (eGFR) is a noisy variable, and showed that a large number of people would exceed the “progressive CKD” criteria. We created a data display that could be readily automated. This display was well received by our expert panel but requires extensive development before testing in a clinical setting.
It is feasible to utilise data visualisation methods developed in biometrics to examine CKD data. The criteria for defining “progressive CKD” need revisiting, as many patients exceed them. Further development work and testing are needed to explore whether this type of data modelling and visualisation might improve patient care.
Background: Medical research increasingly requires the linkage of data from different sources. Conducting a requirements analysis for a new application is an established part of software engineering, but rarely reported in the biomedical literature; and no generic approaches have been published as to how to link heterogeneous health data.
Methods: Literature review, followed by a consensus process to define how requirements for research using multiple data sources might be modelled.
Results: We have developed a requirements analysis process, i-ScheDULEs. The first components of the modelling process are indexing and creating a rich picture of the research study. Secondly, we developed a series of reference models of progressive complexity: data flow diagrams (DFDs) to define data requirements; unified modelling language (UML) use case diagrams to capture study-specific and governance requirements; and finally, business process models using business process modelling notation (BPMN).
Discussion: These requirements and their associated models should become part of research study protocols.
Abstract: When electronic health records (EHRs) are used for secondary purposes such as service evaluation and epidemiological research, data are increasingly aggregated from EHRs from different clinics and hospitals, over time, and from different EHR vendors. The sheer size of the data means that they are increasingly difficult to manage, and our experiential learning in diabetes and chronic kidney disease (CKD) suggests that simplistic processing can lead to errors. In this paper we propose an agile data management process that avoids the need to import and process data in a relational database, reducing combined processing and analysis time. We carried out a demonstration study to identify how blood pressure varied between patients included in and excluded from quality targets. We describe a novel specification language that allows clinicians to focus on identifying the variables they need to extract useful information from EHRs. Data to answer a research question were available in <1 hour, rather than the much longer times previously required to extract, assemble and process data from our SQL database.
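The flat-file approach can be sketched as a simple streaming aggregation over an extract; the column names and the `bp_by_target_status` helper are hypothetical illustrations, not the specification language described in the paper:

```python
import csv
from collections import defaultdict
from statistics import mean

def bp_by_target_status(path):
    """Stream a flat CSV extract and average systolic BP by quality-target
    inclusion status, without loading the data into a relational database.
    Column names are illustrative, not those of the QICKD extract."""
    groups = defaultdict(list)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            groups[row["in_quality_target"]].append(float(row["systolic_bp"]))
    return {status: mean(values) for status, values in groups.items()}
```

Because the file is processed row by row, memory use stays flat regardless of extract size, which is what makes the approach practical for large aggregated EHR data.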
Secure transmission of patient data

We have produced technical reports: