In my day job I am a data scientist who works with linked healthcare data for cancer research, applying machine learning and statistical methods to improve cancer outcomes.

This blog covers general tech, programming and data topics rather than academia; its purpose is to record interesting things I learn, which will hopefully be useful to others too.

My background is in maths and physics, and since around 2015 I have been in the data world, working on projects involving signal processing, geospatial analysis and, over the past few years, clinical data science. My main interests are in data science, particularly the methodological aspects of statistics and machine learning, and also in tools that help to facilitate the grunt work that is common in data science (especially in healthcare).


Here is a GitHub project I’ve been working on for open-source geocoding, which I started because I couldn’t find any good open-source geocoding libraries and prefer not to pay:

  • whereabouts: a fast open-source geocoding package built using Python and DuckDB that can be run in your own environment.

Some other things I like: mountains, cycling, Middle Eastern cooking experiments, reading and music.