Projects
Some of the projects I have led or been involved in
Algorithmic fairness in disease surveillance systems: assessing the fairness of transformer-based NLP systems for real-time disease classification in hospital emergency departments [Databricks, Azure]
Cancer risk prediction models and electronic decision support: development of machine learning models for cancer risk prediction in patients with non-specific symptoms, for deployment in clinical decision support tools [SQL, Hugging Face BERT models]
Development of data linkage infrastructure: in my previous role I helped develop linked data infrastructure to support cancer research, bringing together data from primary care, hospitals and cancer registries. This has been a big team effort and is now the first such resource in Australia. It gives researchers and policy makers a more complete picture of patients’ interactions with the health system, with the aim of improving cancer outcomes. A recent paper describing one of these resources can be found here.
Geolocation of high-risk COVID-19 cases: development and deployment of a geolocation system that uses scalable record linkage tools to match locations in COVID-19 contact tracing data to high-risk locations [PostgreSQL]
Open source tools for record linkage and geocoding: developed the whereabouts package for fast, accurate geocoding [DuckDB]