Written by Christopher Kelly
Dec. 20, 2017
Christopher: Hello and welcome to the Nourish Balance Thrive podcast. My name is Christopher Kelly and today I am joined by Dr. Gari Clifford. Say hi, Gari.
Christopher: Dr. Clifford has worked in signal processing and machine learning applied to biomedical and clinical problems for over 20 years in academia and industry. Beginning in streaming data analytics in critical care, he diversified into mobile health in 2006 as a principal research scientist at MIT, where he co-founded MindChild Medical and pushed its fetal monitoring system through FDA approval to market. Subsequently, as faculty at Oxford, he co-founded the Sleep and Circadian Neuroscience Institute, pioneering machine learning in digital psychiatry. As Chair of BMI at Emory and Georgia Tech, he leads the informatics behind large-scale mobile health studies. That is incredibly impressive. Congratulations on everything you've achieved, Gari.
Gari: Thanks very much, Chris.
Christopher: Tell me about how you first became interested in machine learning in medicine and mobile health.
Gari: It's a long route. I started off in Physics. After I had done my Master's, I was working in a research lab in London, and I decided that I wanted to learn more about artificial intelligence. It was the mid-90s and things were starting to take off, but they were still very slow and pedestrian. I liked the idea of it, so I Googled around -- well, I say Googled; it was whatever we had at the time, so I AltaVista'd around, which was like wading through soup. I found this brilliant guy at Oxford called Lionel Tarassenko who was one of the few people in the country looking at neural networks and artificial intelligence, and I emailed him. I think he was so surprised that I found him on the Web that he asked me to come and visit, and then he offered me a PhD position there, and that's sort of how my entrée into the field began. I shifted from Theoretical Physics into Artificial Intelligence, and the first project that I did was applied to healthcare, looking at deteriorations in hospitals.
So I started looking at that, and then during my PhD, I was lucky enough to go on a long trip around China. I gave a couple of lectures there, and I started to see -- it was about the year 2000 -- I started to see mobile phones appearing everywhere. I asked somebody in the audience at one of my lectures how far mobile phones were penetrating into China, and one of the students looked at me and said, "Well, if you can drive a truck to the end of the road, there will be a cellphone tower at the end of the road." That switched a light on inside me and I thought, wow, this is really going to change healthcare, especially in developing countries, especially in low- to middle-income settings. Since then I've been working to push both Western medicine and resource-poor medicine, which is really the way I think of all healthcare: how do you reach a much larger community who have less money to spend?
Christopher: What was it about artificial intelligence that compelled you enough to draw you away from theoretical physics?
Gari: That's a good question. Like everybody, I'm interested in how the brain learns, what separates us from things that can't learn, how we create something that simulates that, and what the essential building blocks of learning organisms are. I think that's what drove me to look at it. I was still a theoretical physicist at the time, and the theory part of it really was the thing that catalyzed my interest, but as soon as I started to see where we could apply it, I became very focused on how we use this to change healthcare. I don't know when the switch came of wanting to use it for a more immediate purpose, but that happened sometime during my PhD.
Christopher: I was lucky enough to see you present at the San Francisco Data Institute Conference a few weeks ago. The Data Institute has been an important part of my recent education. I learned how to do machine learning from Jeremy Howard, who teaches courses there, and I had a fantastic time learning from him. You can find Jeremy's website at fast.ai. I thought, why not come back and do the conference, especially because there was a medical track which appeals to me greatly. Can you talk about your involvement with the Data Institute and the topic of your presentation there?
Gari: Sure. I've known David Uminsky, one of the directors at the Data Institute, for many years. We go back to our Boston days. After I did my PhD, I moved to Boston and did my post-doc there. Our wives are both anthropologists, and I met David through that -- like everything in the world, it comes down to socializing. He's actually a mathematician who does a lot of work with theoretical physicists, so we had a lot in common, and we would talk about this kind of stuff at length. After he took a faculty position in San Francisco and started directing the Data Science Institute, our worlds converged when we realized that we were working on the same thing. David got me to come over and share a couple of sessions on AI applied to healthcare.
It was really good timing for me because I had just finished presenting and wrapping up an international competition in France the week before, where I awarded the prizes. It's a competition that my old mentor at MIT, Roger Mark, and I run each year. It's called the PhysioNet/Computing in Cardiology Challenge, and we come up with something in the area of medicine, usually applied to cardiology, that is a current challenge for the community. We post a lot of labeled data, and we challenge the community to try and solve a problem.
This year we challenged the community to solve the problem of detecting atrial fibrillation in electrocardiograms. The electrocardiogram, in case your listeners don't know, is the electrical signal we record from the heart to tell us what rhythm the person is in. AF is a conduction abnormality that causes the atria, [0:05:44] [Indiscernible] at the top of your heart, not to fire in the proper sequence. You can see some evidence of this in the electrocardiogram through unusual patterns: long and short intervals between each heartbeat, wobbles in the baseline, and this kind of stuff. The challenge was how to detect that accurately.
We had been given 12,000 ECGs by a company called AliveCor. They built this $99 hand-held device for measuring your own ECG. They have written their own algorithms, but the interesting thing is they're a very forward-thinking company, and they wanted to see what the research community could do with the data. So we had over 100 people enter this competition, and people tried all sorts of things, from random forests to support vector machines to sets of heuristics to, of course, deep learning. The competition ran for about eight months, January through September. Then we awarded the prizes. We always keep about 30% of the data hidden, and we don't let anybody assess their code on that final tranche of data until right at the end of the competition, so they get one shot at the hidden data. Then we report the scores at the conference and award the prizes.
This year we had some extra fun machine learning. We were doing meta machine learning, if you like. We were using the contestants' own algorithms to bootstrap the quality of the data. One of the dirty secrets in medicine is that we don't all agree with each other, and the inter-expert variability can be as high as 20, 30 or even 40%. So one of your problems in machine learning in medicine is that if you don't know what the error rate is in your labels, you don't really know what the ceiling is on your ability to do predictions. You end up with this problem of: if 30% of my data is incorrectly labeled, then I'm never going to do better than 70%, and I'm probably going to do much worse than that. So you're hampering yourself right from the get-go. So what we decided to do was -- sorry, go ahead.
Christopher: I just wanted to say maybe we need to explain some of the terms that you use. The ground truth that you're talking about is the electrocardiogram, and this is something you could print out. You would see the QRS complex waving up and down, like everybody's familiar with, on a piece of paper. Then you could show that to a cardiologist, and they would tell you whether it's a normal sinus rhythm or some sort of an arrhythmia. That's what you mean by creating the label: it's what the cardiologist says is true.
Gari: That's right, yeah. That's right. But not every cardiologist agrees with every other cardiologist. For our recordings, we broke it down, in the same way that the company did, into four very basic classes: either the recording is in AF; it's normal sinus rhythm, which is what a healthy person has most of the time; it's another rhythm that isn't AF or sinus rhythm, which could be any rhythm; or it's too noisy to read. So there are basically four classes in this problem.
We would sometimes get eight experts; two would say it's AF, two would say it's another rhythm, two would say it's normal sinus rhythm, and two would say it's noisy. At that point, what label can you give it? It gets very, very difficult. In fact, it turns out that you need 20 or 30 experts before you can asymptotically end up with a stable diagnosis for that individual. Now, your gold standard tends to be some kind of work-up on the patient over a long period of time. You might do blood tests. You might do a stress test. You might do a long-term Holter monitor. There's a bunch of different ways of getting a definitive diagnosis, but getting one in 30 seconds out of a hand-held machine is much more difficult, and people are going to argue quite considerably about that.
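Gari's point about needing 20 or 30 experts can be illustrated with a quick simulation. Everything below is an illustrative assumption rather than a figure from the challenge: each simulated expert mislabels 30% of recordings (the upper end of the variability Gari cites), and the panel's answer is a plurality vote. With a couple of annotators the vote is unstable; with a large panel it converges.

```python
from collections import Counter
import random

CLASSES = ["AF", "normal sinus", "other rhythm", "noisy"]  # the four classes

def majority_label(labels):
    """Aggregate a panel's labels by plurality vote (ties break arbitrarily)."""
    return Counter(labels).most_common(1)[0][0]

def simulate_expert(truth, error_rate, rng):
    """A hypothetical expert who picks a wrong class with probability error_rate."""
    if rng.random() < error_rate:
        return rng.choice([c for c in CLASSES if c != truth])
    return truth

rng = random.Random(0)
for panel_size in (2, 8, 30):
    # Fraction of 1,000 simulated recordings where the vote recovers the truth.
    hits = sum(
        majority_label([simulate_expert("AF", 0.30, rng)
                        for _ in range(panel_size)]) == "AF"
        for _ in range(1000)
    )
    print(panel_size, hits / 1000)
```

With 30 simulated experts the vote is right nearly every time, while a panel of two hovers around the individual expert's own accuracy, which is the asymptotic-stability effect he describes.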
Christopher: Okay, I should connect the dots for my listeners a little bit, many of whom are endurance athletes. A few months ago, I interviewed Dr. Peter Backx, who is a senior scientist at the Toronto General Hospital. Dr. Backx is investigating this exact type of arrhythmia in endurance athletes. So there are lots of risk factors for this pathology, one of which is overexercising or maybe exercising too hard. I'm not sure everything is known, but I just wanted to point that out so that people listening would understand the importance and relevance of this diagnosis. Was there anything else that you'd like to add to that, Gari?
Gari: Yeah. So the fun part of this was that we took all the entrants in the competition and ranked all of our recordings by how much everybody disagreed on the classification. The more disagreement we had -- in other words, the lower the agreement level across all of our 100 entrants -- the more we worried that the individual recording was mislabeled and was causing some problems. So we ranked them, and then we went back and asked experts to relabel these difficult recordings to improve the quality of the data. That was part of the fun: using the people who were actually entering the competition to help us improve the quality of it.
The final bit, the final piece of fun, of course, is that we applied another machine learning algorithm on top of all the final algorithms. So we had 100 algorithms, and we trained another machine learning algorithm to vote all of those 100 algorithms together to come up with a better classification. It turns out that we could improve the score from about 83% to about 87% just by doing that.
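As a toy sketch of that "vote the algorithms together" idea: one simple combiner weights each base algorithm by its accuracy on a held-out validation set and takes a weighted plurality. The challenge organizers' actual combiner was a trained model; the weighting scheme, labels, and names below are my own illustrative assumptions.

```python
from collections import Counter

def learn_weights(predictions, truth):
    """Weight each base algorithm by its accuracy on a validation set.
    predictions[i] holds algorithm i's label for every recording."""
    return [
        sum(p == t for p, t in zip(algo_preds, truth)) / len(truth)
        for algo_preds in predictions
    ]

def weighted_vote(labels, weights):
    """Combine one recording's labels from all algorithms by weighted plurality."""
    scores = Counter()
    for label, w in zip(labels, weights):
        scores[label] += w
    return scores.most_common(1)[0][0]

# Toy example: three base "algorithms" voting on four recordings.
truth = ["AF", "normal", "other", "AF"]
preds = [
    ["AF",    "normal", "other", "AF"],      # strong algorithm
    ["AF",    "normal", "AF",    "normal"],  # mediocre
    ["noisy", "normal", "other", "AF"],      # mediocre
]
w = learn_weights(preds, truth)
combined = [weighted_vote([p[i] for p in preds], w) for i in range(len(truth))]
print(combined)  # the weighted vote recovers the ground truth here
```

Even this crude scheme lets the stronger algorithm outvote the weaker ones on the recordings where they disagree, which is the mechanism behind the 83%-to-87% improvement Gari describes.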
Christopher: That's amazing, so it's the wisdom of crowds on top of the wisdom of crowds.
Christopher: How does that work? I'm interested in this. It seems to be a thing in machine learning that I've not really seen in software engineering before. Maybe that's just because I've been trapped in a cupboard too long. But it seems like in machine learning, these competitions are really popular. One website that I'd already heard of was Kaggle, and I knew about the PhysioNet website, but I didn't know that they organized competitions. What is it about machine learning practitioners that makes this competition idea appeal to them? It's possible for you to just post the description and some labeled data on a website, and people will have a go and submit entries.
Gari: I think it's the nature where we are in research at the moment. It's very difficult to find the data, curate the data in a meaningful sense and label the data in an accurate sense. There really is a paucity of very good data sets out there, and the ones that have been used have been beaten to death and are generally not as useful as you would imagine or at least are not solving critical real world problems as often as you would hope. So when you post a data set with a particular challenge, it often attracts a lot of people to it for the excitement of entering it and also for the glory of winning it, I suppose, but also because it represents something new. There's a potential for a completely new innovation to occur, and you can be the first one that will be publishing in that area. I think people are driven by lots of different reasons, but being the first one to innovate in the given area has always been key to researchers, I think.
It's becoming easier and easier nowadays, with all these different toolboxes out there for people to download. Of course, there's a danger on the flip side, which is that it's far too easy in some sense. You get a lot of people just downloading data and downloading code, throwing it together and stirring it around until they get an answer. I think the competitions are important because we hold a whole set of data out and we hide it so that nobody gets a chance to overtrain on it. Pretty much anything that you read where they've taken public data and cross-validated it has probably overtrained without realizing it. We found in the competition that we gave people only five chances to classify 30% of our hidden data and then one chance to classify 100% of the hidden data. On the 30% of the data, they got better and better every time, and then as soon as we gave them 100% of the data to classify, their scores all dropped by 3 or 4% on average. That shows that they'd been overtraining on the hidden data. I think that's very important. If you try your algorithm out more than once on your hidden data, then you're starting to overtrain on it.
Christopher: Okay, so for our listeners: these algorithms learn by example. They become more and more sophisticated the more data they see, the more examples they see. The danger is that the model is over-fit -- it has too many parameters, it's too complex for the training data -- and so it may not perform very well on future unseen data. Is that correct?
Gari: That's right.
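The effect Gari describes, where leaderboard scores creep up on a hidden subset you're allowed to query repeatedly and then drop on the full hidden set, falls out of a simple simulation. Everything here is an illustrative toy: the "submissions" are literally coin flips with no real skill, so any score above 50% on the leaderboard subset is pure selection effect.

```python
import random

rng = random.Random(0)
N = 1000
leaderboard = range(300)  # the ~30% subset you may score a few times

# Hidden binary "ground truth" labels.
truth = [rng.randint(0, 1) for _ in range(N)]

def accuracy(preds, idx):
    """Fraction of recordings in idx that preds labels correctly."""
    return sum(preds[i] == truth[i] for i in idx) / len(idx)

# Five submissions, each a pure guess (true skill is exactly 50%); keep the
# one that happens to look best on the leaderboard subset -- this mimics
# tuning against a hidden set you are allowed to query repeatedly.
submissions = [[rng.randint(0, 1) for _ in range(N)] for _ in range(5)]
best = max(submissions, key=lambda s: accuracy(s, leaderboard))

print(round(accuracy(best, leaderboard), 3))  # flattered by selection
print(round(accuracy(best, range(N)), 3))     # closer to the true 50%
```

Picking the best of several submissions on the same subset inflates the apparent score, and the fresh 70% of the data pulls it back down, which is exactly the 3-4% drop the challenge observed.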
Christopher: So talk about how these models -- say, an ensemble of the winning entries -- are going to be used in the future.
Gari: Well, it's a good question whether we then use an ensemble of all the models to try and improve AF detection, or whether we break down the individual algorithms, decide what worked well and what didn't, and build new classifiers out of that. I think what we demonstrated from the competition, though, is that 10,000 ECGs are not enough, because what we saw was that the four winners in the competition had four completely different approaches. All four top entries scored about the same, an F1 score of around 0.83, which is a little bit like sensitivity and specificity combined. These top four scores came from four completely independent approaches; one was deep learning, one was a random forest approach, one was a set of heuristics, and one was a standard support vector machine. A lot of it involved tailoring the data going in.
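For listeners curious what an F1 score means in a four-class problem like this: a common recipe is to compute F1 (the harmonic mean of precision and recall) per class and average. This is a hedged sketch of that standard recipe, not the challenge's official scoring code, and the tiny example labels below are made up.

```python
def per_class_f1(y_true, y_pred, cls):
    """F1 for one class: harmonic mean of precision and recall for that class."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    return 2 * tp / (2 * tp + fp + fn)

def mean_f1(y_true, y_pred, classes):
    """Average the per-class F1 scores across all classes."""
    return sum(per_class_f1(y_true, y_pred, c) for c in classes) / len(classes)

# Made-up labels over the four challenge classes.
y_true = ["AF", "normal", "other", "AF", "normal", "noisy"]
y_pred = ["AF", "normal", "AF",    "AF", "other",  "noisy"]
print(round(mean_f1(y_true, y_pred, ["AF", "normal", "other", "noisy"]), 3))
```

Averaging per-class F1 rewards a classifier for doing well on every rhythm class, including the rare ones, rather than just the majority class, which is why it is preferred to raw accuracy for imbalanced medical data.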
One thing that was clear was that anybody who just dropped the data into a deep neural network and cranked the handle didn't do very well. We've seen that over and over again. Identifying what parts of the signal should be removed improved a lot of people's entries. Also, people who used other, external data sets to bootstrap their entry -- to pre-train it, essentially -- did better as well. What that tells me is that there's still a long way to go before deep learning and other complex approaches, where you auto-encode your features, will show some promise on these data sets. One of the questions there, of course, is how large does the data set have to be, and how much human annotation do we need, before deep learning can be useful.
Christopher: That's really interesting you should say that. You're not the first person to say it. I interviewed Pedro Domingos, and I talked about some of the problems I was trying to solve. He said, "You should try XGBoost," which is a bit like the random forest algorithm that you mentioned earlier, and it changed everything. Just that one word totally changed that part of what I was working on and made my life so much easier, and we ended up coming up with something really useful where the deep learning algorithm hadn't really been getting anywhere before. So it's interesting that you should say that.
Gari: Actually, one of the winners used XGBoost in part of their framework, but it's interesting that it didn't do any better than any of the other approaches either. I think that's because we had reached the limit of what you could do with that size of data, and we needed larger data sets. The promise of deep learning is that if you have enough data, you should be able to learn the optimal feature set, but the question is, how large does that data set have to be? I would say it's at least ten times larger than the amount that we had.
Christopher: Okay, interesting. At the conference you showed me a little hand-held device that you could put your fingers on, and it would do a little mini ECG. Is that the AliveCor device?
Gari: Yes, that's the AliveCor device that I had been talking about, yeah.
Christopher: So are we going to see these algorithms built into the firmware or software for that device, so that as an athlete or someone else who is worried about arrhythmia, I'll be able to take that little ECG in the morning and then get a classification from the algorithm?
Gari: Yeah, that's very much the hope. We expect that one of these algorithms, or some variant of them, or maybe even an ensemble of them, will be built into the hardware or, probably more likely, into a cloud-based system. You take a recording, you upload it, and a few seconds later you get a diagnosis locally. I can see that being the way we move forward with healthcare in these particular instances, anyway.
Christopher: Could the problem be mechanical turked? I know you used those words on the PhysioNet website. Just for people who don't understand that term, the idea of mechanical turk is literally doing it by hand -- doing what a computer would do, but by hand. Would it be possible to mechanical turk the ECG? Could you send the data via the Internet and then have, I don't know, three cardiologists look at it and label it in real time? Or maybe not quite real time but close to real time.
Gari: That's right. The business model, I believe, that AliveCor uses -- I should point out that I have no shares and I'm not being paid by them, so this is not an advert, and I can't speak for them either -- but I know that if you use their software, you have a couple of options: either have a cardiologist diagnose it within a certain period of time, or a Holter technician, a trained ECG reader, will read it for slightly less money and turn it around in less time. The turnaround time is still around 24 hours, I believe. So I don't know whether you would count that as mechanical turking because it's just one-to-one. Usually what we think of with mechanical turking is that we get more than one person -- sometimes 5, 10, 15, 20 people -- to do the same job, and then we try to aggregate the data together to get a better estimate of what the class is or what the measurement is. I think there's enormous potential for doing that with medical data if it's broken down into a small enough set of simple tasks.
I convinced an old colleague and friend of mine that you could do this with ECGs even with non-experts. We had this argument for a while, and then I was sitting in his kitchen one evening and I said, "Let's try this experiment." I drew some ECGs on a piece of paper -- just so you know, I can draw them pretty easily. The task was to label the start of the heart contracting and the end of it relaxing; these are called the QRS onset and the T wave offset. It's basically the start of the ventricles contracting and the point where the ventricles have fully relaxed. This period is called the QT interval. If it gets prolonged, then it means that you're probably going to have a heart attack, so it's very, very important and it's often measured. It's actually difficult to get people to measure it properly, and you usually have to lump people together and average lots of estimates.
So I called over a couple of people who happened to be in his house at the time and had never seen an ECG in their life. I'll explain why I knew that in a minute. I asked them to annotate the data -- to find the start point of the QRS complex and the end of the T wave. I showed them three ECGs and showed them how to do it, and I said, "Mark the start of the really big bump and the end of the last bump." Subject A did a really good job of it, and Subject B didn't do such a good job of it. The only difference between Subject A and Subject B was that Subject A was five-and-a-half years old and Subject B was three-and-a-half years old.
So I managed to convince him at this point -- this is how I knew they had no electrocardiogram reading expertise; they were these very young kids at the time -- I managed to convince him that we could really mechanical turk this stuff. You could really put medical data in front of somebody and ask them to recognize patterns in it. In fact, people have done that with all sorts of subjects. If you look up Galaxy Zoo, it's a way that the public can label different types of galaxies that the Hubble telescope is taking pictures of. They built a taxonomy of these galaxies, and they say, "This is a spiral galaxy," giving people lots of different phenotypes. They're basically patterns that you look at and then try to recognize. We could do the same with malignant tumors -- not that we're actually crowdsourcing and mechanical turking malignant tumors at this point, but I don't see a reason that we couldn't do that, as long as we had a quality system in the background that aggregated things together properly.
Actually, that's where the machine learning comes in. You build a machine learning system to do exactly that: work out when somebody is good at doing this and when they're bad at it, and then reward them appropriately. So I think medicine could be machine learned in that way. It could be crowdsourced; it could be mechanical turked in that way. I could get my ECG recorded on my device and then send it to you and say, "Hey, Chris, what do you think of this? You've recorded a few of these now. Does this look too noisy to you or is this okay?" You could give it a label, and then you could send it down to your --
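A minimal sketch of that kind of quality weighting for a continuous annotation, such as crowd-sourced clicks on a T-wave offset: start from the median, then down-weight annotators far from the consensus. The numbers and the weighting scheme here are illustrative assumptions; real systems learn each annotator's reliability across many tasks rather than per measurement.

```python
import statistics

def aggregate_annotations(annotations, n_iter=3):
    """Crude quality-weighted aggregation of continuous annotations (e.g. a
    T-wave offset time in ms). Annotators close to the current consensus get
    more weight; iterate a few times to let the consensus settle."""
    consensus = statistics.median(annotations)
    for _ in range(n_iter):
        # Weight falls off with distance from the current consensus.
        weights = [1.0 / (1.0 + abs(a - consensus)) for a in annotations]
        consensus = sum(w * a for w, a in zip(weights, annotations)) / sum(weights)
    return consensus

# Four careful annotators and one who has never seen an ECG.
clicks = [402.0, 398.0, 400.0, 401.0, 520.0]
print(round(aggregate_annotations(clicks), 1))  # stays near the careful cluster
```

A plain mean of those clicks lands around 424 ms, dragged off by the one wild annotation; the weighted consensus stays near 401 ms, which is the "work out when somebody is bad at this" idea in miniature.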
Christopher: Humans are really good at recognizing patterns.
Gari: Yeah, humans are great at recognizing patterns and images. That's why our brain is fantastic. It's a beautiful neural network for doing that. We're very good at doing it with very few training patterns that --
Christopher: It's pre-trained.
Gari: Yeah, well, there's also something more complex about the human brain than there is about your average deep learning neural network. We do some very intricate and poorly understood things that boost our ability to do pattern recognition. We're not quite sure what that is.
Christopher: How would that work for numerical data? Say that I just had a blood test done and I got back a bunch of numbers, could you think of a way to mechanical turk that for pattern recognition?
Gari: What you could do is -- you're thinking more along the lines of: do I diagnose somebody with hypertension here, and should I treat them? So you could present all of the signs and symptoms to a bunch of interns and ask them all to decide what they should prescribe the person and what the next treatment should be. You could give them a little website with a description of what the protocol is. If you had enough interns read that protocol and look at the person, you should end up with a pretty accurate diagnosis and follow-on treatment. So you could do things more complex than just looking at pictures.
Christopher: Okay, so to give you some more background, I've been really interested in heart rate variability for quite a long time now. I've been tracking my own. I've seen data from some of the other athletes that we work with. I've been interviewed by a guy called Jason Moore, who has a start-up called Elite HRV; they make devices and software for tracking HRV and readiness to train. Jason has been on my podcast as well -- I can link to that in the show notes for this episode. Jason and I talked about the possibility of me developing some software that does exactly what your PhysioNet competition winners are doing, and so I was very excited about that.
Before I saw that, I actually gave up on the task. I thought that I wouldn't be able to do it because I saw this paper that was titled, "Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks." One of the authors was Andrew Ng who is a very prestigious guy in the world of machine learning, and I thought, well if he's doing it then there's no point in me even trying. Could you talk about that paper? Was that entry included in your competition?
Gari: I'm not sure whether I should talk about who entered the competition or not. There were some entries based upon that approach that came in very early in the competition and then gave up. It was interesting to see that just taking that [0:25:03] [Indiscernible] approach to the problem didn't do very well. In fact, there was one group that reproduced that exact approach and came in 33rd in the competition. I think the issue is that, when you just take the raw data and present it to a deep neural network, you're only able to explore all the infinite possibilities if you have an infinite amount of data. So just assuming that the convolutional neural network is going to create exactly the right number of filters to identify the episodes is a mythical hope, I think.
There were some other things that weren't clear in the paper. It's very difficult to disentangle because it's a preprint; it hasn't been peer reviewed, so there are obviously a lot of things in there that haven't been explored properly in print. There's a lot of information missing, but from what I can tell, they took these very long Holter recordings, 14 days long, and then had humans pick examples out of them, which ultimately infuses a bit of a bias into the data, because in the real world you don't have any humans there to pick these episodes out. Then in the test data, I think there were only ten or 20 examples of some of the rhythms, and so you can't really say very much about its performance on unseen data.
Of course, it also goes back to my comment that I don't know how many times they ran it on their unseen test data, and that's a really important point. If you don't enter the competition and you're not willing to run the algorithm against real hidden data, then it seems a bit suspicious to me, so I'm going to reserve judgment about the utility of that until I see it peer reviewed and I see how their algorithm performs on real hidden data. We have 4,000 examples sitting, waiting to be tested, and 10,000 for them to train on, so the challenge is still open and they're welcome to use it. If I were reviewing the paper, I would require that they run their algorithm on that, because the data is there for them to prove that it works.
I think that also points to another one of the problems in the field, which is that none of the data used is public, so there's no way anybody can repeat what they've done. The code wasn't posted publicly and the data wasn't posted publicly, so it's very difficult to work out what they did and whether any of it is repeatable. If you don't put the code and the data out there, then it's hard to know whether any mistakes were made. It doesn't matter who you are. You shouldn't be given the benefit of the doubt. You should have to prove yourself in the scientific literature.
Christopher: That's really good information for me. I've got the word science in my degree title, but I don't really feel like a scientist. I don't really feel like I've ever done any science. So it's an important lesson for me to learn: just because one paper makes some claims doesn't mean they're necessarily true.
Gari: Yeah, and everybody makes mistakes. I always tell my students that every time I publish a paper, I fix some of the mistakes that I made previously. No paper you put out is perfect, and so if you don't see a history of somebody attacking a problem over and over again with different variants, then they probably don't have much domain knowledge about it, and it's much easier for them to have made a mistake.
Christopher: Speaking of domain knowledge, how do you know that an ECG is the best way to predict this type of arrhythmia? Has anyone tried anything else like subjective questioning or maybe some other biomarker like blood pressure or maybe some inflammatory markers? When I did a podcast with Peter Backx, he talked about the role of TNF alpha as a possible mechanosensor, and I wonder whether anyone has tried to measure that at such a level with great enough sensitivity that they can start using it to predict an arrhythmia.
Gari: Yeah, although to be clear, we're not predicting arrhythmias here; we're detecting them. So if you have this arrhythmia and you happen to record it at the time that it manifests, then this is usually one of the better ways to detect this particular problem. But it's not the only way. There are blood marker tests. There are other ways of sensing the activity of the heart, and then there are surrogates for it, such as signs and symptoms like light-headedness and problems during exercise.
I think one of the take-home points is that the electrocardiogram is a very easy thing to measure and gives you a fairly accurate diagnosis, but it's not the only way to do it. If you combine lots of other things together -- essentially, you could throw them into the machine learning framework, you could put them into your deep neural network as covariates much higher up in the model -- you would improve your ability to do this kind of detection.
Christopher: Yes, so I was thinking, with Elite HRV, we've got athletes who are taking this measurement every morning. They're putting on the heart rate monitor strap and checking their HRV every morning. I know that endurance athletes are at greater risk for AF in particular, and so I was wondering whether you could just do the detection whilst the person was doing something else. I just wonder what the compliance would be like for somebody who had no other reason to put their fingers onto that AliveCor device.
Gari: It's a very important point. You have to look at the fact that these algorithms have been trained on patient populations where there was a reason to suspect that they may have AF. If you then take a larger population with no previous signs or symptoms that would lead you to suspect they have AF, then your false positive rate is going to go through the roof, because the prevalence of AF in the general population is much lower than in the population the algorithms were trained on, and the type of electrocardiograms that you would record would be different as well. So you haven't trained on the right population at that point.
Christopher: That's really good information for me. I'm just thinking about what we've done so far, training models on data that was collected from athletes who came to us with some specific complaints, and now I'm working on a data set where that wasn't the case. It was blood data collected from people who had a fancy gym membership where testing was a routine thing, and, again, you're selecting another population. So those are some really interesting things for me to think about. Where would you send someone like me to learn more about that stuff?
Gari: I think the area that you should look up is called propensity matching. If you look up the literature on that, there's some great work on identifying the right populations to compare in order to prove that their particular test works.
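At its simplest, propensity matching pairs each subject from one population with the most similar subject from the other, using a single "propensity score" (typically estimated by a logistic regression that predicts group membership from covariates). The sketch below assumes the scores are already estimated; the data and threshold are hypothetical:

```python
# Minimal sketch of 1:1 propensity-score matching. The scores would normally
# come from a model predicting group membership from covariates; here they
# are given directly, and the caliper value is an arbitrary illustration.

def match_controls(treated_scores, control_scores, caliper=0.1):
    """Greedy nearest-neighbour matching on propensity score.

    Returns (treated_index, control_index) pairs. Each control is used at
    most once, and only if its score is within `caliper` of the treated one.
    """
    available = dict(enumerate(control_scores))  # index -> score, still unmatched
    pairs = []
    for t_idx, t_score in enumerate(treated_scores):
        if not available:
            break
        c_idx = min(available, key=lambda i: abs(available[i] - t_score))
        if abs(available[c_idx] - t_score) <= caliper:
            pairs.append((t_idx, c_idx))
            del available[c_idx]  # each control matched at most once
    return pairs

# E.g. athletes who presented with complaints vs. routine gym-member screens:
pairs = match_controls([0.80, 0.35], [0.33, 0.90, 0.78])
print(pairs)  # [(0, 2), (1, 0)]
```

Real studies use more careful variants (matching with replacement, caliper tuning, balance diagnostics), but the idea is the same: compare like with like before claiming the test works.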
Christopher: Excellent, that's perfect. You've just given me the magic keyword that unlocks lots more knowledge. Thank you.
Gari: You're welcome.
Christopher: Science is like that. You just need to know the keyword.
Gari: That's right.
Christopher: Okay, so let's shift gears a little bit. I wanted you to talk about whether or not we should be sharing our medical data.
Gari: That is a good question. A lot of people are sensitive about sharing it, and there are certainly dangers to doing so. There can be repercussions. At the worst end of things, you can imagine that somebody sees a spike in HIV patterns in a particular area and decides to go and victimize the people in that area. Or they find their neighbor has HIV and they start victimizing their neighbor because they don't understand the illness and they think they're going to catch it. They start to believe lots of bad stereotypes about their neighbor. So there can be dangers in sharing your medical data. People are also paranoid about insurance companies denying reimbursement for preexisting conditions and that kind of thing. On the flip side, there's the enormous potential we have to do mass-scale medical research if we can get high quality data across a large population. So the question of how we share our medical data becomes a really interesting one.
I personally think that we should be brokers for our own healthcare data in the same way that we're brokers for our own social media data. Whether we know it or not, we sell our social media data to Google and Facebook, if we use those services, in exchange for some extremely useful or, in a lot of cases, extremely entertaining software. We made this implicit decision to do that. I'm very happy with the services that I get for the data that I give them, and I trust them not to release the data or use it for any nefarious means, at least Google, and I'm happy with that. Now the question is, would I do the same with my medical record? I think you have to put some boundaries in place with the medical record, which is, I would like to know who touched my medical record and why. These kinds of systems have been in place for a long period of time.
I did a summer internship at the Benefits Agency in the UK. You know what that is, Chris, but for the listeners, that's basically the agency in the UK that provides unemployment benefit and other social welfare. One of my jobs was to check whether claimants were alive or dead and then find the paperwork associated with them and file it away. So I would type in somebody's Social Security number -- we call it a National Insurance number in the UK -- and I would find these records. When I started this job, they warned me. They said, "Now don't be tempted to search for famous people in there, because we know exactly whose data you're looking at and when you're looking at it. If you start looking at Margaret Thatcher's National Insurance record, then you're going to get sacked." She was Prime Minister at the time, so this dates me a little bit.
It's amazing to me that even in the earliest days of computerizing these kinds of records, we had this trail of who touched the data, when they touched it, and why, and we used to audit that. Nowadays we tend not to do that. We don't worry about who is touching our data and why, yet for medical data I think that's very important. I want to know which clinician looked at my data and why they were looking at it, because their suspicion of illness is something that's important to me. If it was to see whether I was a potential recruit for a study, then I'm okay with that, but if it's to see whether I was somebody they could sell services to that I may not need, then I'm not okay with that. So the way that you gate access to your medical data matters.
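The audit trail Gari describes is simple to sketch: every read of a record appends a who/when/why entry before any data is returned. The structure below is hypothetical and not modeled on any real EHR system's API:

```python
# Sketch of an access audit trail: a record can only be read through a
# function that first logs who accessed it, when, and for what reason.
# All names and record contents here are invented for illustration.
import datetime

audit_log = []

def read_record(patient_id, clinician_id, reason, records):
    """Return a patient record, but only after appending an audit entry."""
    audit_log.append({
        "patient": patient_id,
        "accessed_by": clinician_id,
        "reason": reason,
        "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })
    return records.get(patient_id)

records = {"p1": {"dx": "AF suspected"}}
read_record("p1", "dr_b", "study recruitment screening", records)
print(audit_log[0]["accessed_by"], "-", audit_log[0]["reason"])
```

The point of the design is that the log is written on every access path, so the data owner can later review each entry and decide which reasons they consider acceptable.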
Okay, if a drug company came in and said, "Well, we really like people with your type of medical data. We've matched you with this algorithm that Google wrote, or that you wrote, or that a healthcare company wrote, so we'd like to use your data and we'll give you a $50 fee for it," I'd be okay with that as well. I'd be quite happy to sell my medical data in an anonymized fashion to a third party. Then for research, I would give it away for free, because I think it's important for that to go forward. So researchers at an academic institute who are not likely to profit from it should get it for free and in an anonymized format, and a drug company should have to pay for it.
Christopher: Wouldn't it be wonderful if you could transition from doctor to doctor? When you go from Dr. A to Dr. B, you have to start from scratch with all your medical records -- how much does that suck? Wouldn't it be amazing if you could frictionlessly move between clinicians?
Gari: Well, the Holy Grail of a personalized, portable medical record has been idealized for decades, but we're so far from that because the medical industry likes to silo the data. The less it can be exchanged, the more likely you are to have to stick with the current system. I think the barriers are there on purpose, to be honest, to help protect the current industries. There's also a certain amount of non-nefarious momentum around this. Historically we came from an age when we didn't need to exchange this data, so the data was naturally siloed. Nowadays, there has been no top-down push from the government. The closest we got was the Health Insurance Portability and Accountability Act, which has been hijacked by lawyers to try and prevent us from sharing data. Even though the word portability comes first in that name and the act was designed to make healthcare more portable, it has actually had the reverse effect, I think.
Christopher: Right. So who is going to build the systems then? That's the nice thing about Facebook: what it does is provide a repository for my data and then allow me to control who gets access to it. But who is going to do that for medical records?
Gari: That is the billion dollar question. I think it's starting to happen with a new protocol called FHIR that's enabling interchange between electronic medical records, and somebody is finally going to come along and build your portable healthcare record. We've had Microsoft HealthVault and Google Health for a while. These have been repositories of your personal data, but there has never really been a meaningful way to push and pull data in and out of them from other systems. Until those mechanisms are created, we won't have a personal healthcare record.
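To make FHIR concrete: the unit of exchange is a JSON "resource" with a standardized layout, which is what lets different systems push and pull the same data. Below is a minimal heart-rate Observation following the FHIR R4 resource structure; the patient reference and values are made up for illustration:

```python
# A minimal FHIR "Observation" resource, the kind of JSON document FHIR
# servers exchange. Field names follow the FHIR R4 specification; the
# subject reference and measurement values are invented for this example.
import json

observation = {
    "resourceType": "Observation",
    "status": "final",
    "code": {
        "coding": [{
            "system": "http://loinc.org",
            "code": "8867-4",          # LOINC code for heart rate
            "display": "Heart rate",
        }]
    },
    "subject": {"reference": "Patient/example-123"},  # hypothetical patient
    "valueQuantity": {
        "value": 54,
        "unit": "beats/minute",
        "system": "http://unitsofmeasure.org",
        "code": "/min",
    },
}

print(json.dumps(observation, indent=2))
```

Because every vendor agrees on this shape, a reading recorded in one system can be pulled into another without a bespoke converter, which is exactly the interchange problem HealthVault and Google Health never solved.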
Christopher: Right. Yeah, the number one problem that I face is that all of the data is in PDF files, so the first thing I have to do is extract the numbers out of the PDF before I can do anything with it at all. Surely people up and down the country are all doing tests where the data comes back to them in a similar format, so it's not even really electronic at that point.
Gari: Yeah, I've seen some crazy stuff, people literally printing out pieces of paper and then scanning them back in.
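The lab-report problem Christopher describes usually ends up as pattern matching: once text has been pulled out of the PDF with whatever extraction tool you use, the analyte names, values, and units still have to be scraped out line by line. The report format below is invented for illustration:

```python
# Sketch of scraping numbers from lab-report text that has already been
# extracted from a PDF. The report layout here is hypothetical; real
# reports vary wildly, which is why this step is so painful in practice.
import re

extracted_text = """
Glucose        92  mg/dL
Ferritin       45  ng/mL
TSH          2.10  uIU/mL
"""

# One line per analyte: name, numeric value, unit.
pattern = re.compile(r"^(\w+)\s+([\d.]+)\s+(\S+)\s*$", re.MULTILINE)
results = {name: (float(value), unit)
           for name, value, unit in pattern.findall(extracted_text)}

print(results["Glucose"])  # (92.0, 'mg/dL')
```

A structured feed (or a FHIR interface) would make this whole step unnecessary, which is Christopher's point: the data is only nominally electronic.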
Christopher: So what can anyone do that's listening to -- say you agree that the idea of shared medical records is a good idea but we just need the proper control and auditing systems -- what can someone do to help make that happen?
Gari: You could start a business that used blockchain technology to create a marketplace for our own medical data. You would have to partner with different insurance agencies and different industries to be able to build it. I'm trying to think of some good starting points. There are a couple of not-for-profits out there that might be interested in doing this. Everybody gets very flighty when we start to talk about pushing and pulling medical data. It's not a trivial nut to crack.
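The core property a blockchain would bring to such a marketplace is tamper evidence: each entry commits to the previous one, so altering any past access record breaks every later hash. Here is a toy hash-chain illustration (not a real distributed ledger, and the payloads are invented):

```python
# Minimal hash-chain sketch of the blockchain idea. Each block stores the
# previous block's hash, so rewriting history is detectable. This is a toy
# single-machine illustration, not a distributed consensus system.
import hashlib
import json

def add_block(chain, payload):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})

def verify(chain):
    """Recompute every hash; any edit to an earlier block fails the check."""
    for i, block in enumerate(chain):
        prev_hash = chain[i - 1]["hash"] if i else "0" * 64
        body = json.dumps({"prev": prev_hash, "payload": block["payload"]},
                          sort_keys=True)
        if (block["prev"] != prev_hash
                or block["hash"] != hashlib.sha256(body.encode()).hexdigest()):
            return False
    return True

chain = []
add_block(chain, {"who": "drug_co_x", "action": "bought anonymized record"})
add_block(chain, {"who": "research_lab", "action": "free academic access"})
print(verify(chain))            # True
chain[0]["payload"]["who"] = "someone_else"
print(verify(chain))            # False -- tampering detected
```

That tamper evidence is what would let patients trust the audit trail of who bought or accessed their data, even without trusting the marketplace operator.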
Christopher: Blockchain, that's the keyword there. I'll link to some really useful resources in the show notes for this episode. Well Dr. Clifford this has been fantastic. Where can people go to find out more about your work?
Gari: It's easy to Google me, Gari Clifford or gdclifford.info. I've got things scattered around the web, so it's pretty easy to find things. Luckily I have a strange Welsh spelling to my name, so I'm probably the only Gari Clifford you'll find out there.
Christopher: That's fantastic. I will, of course, link to everything that I've used to prepare for this interview in the show notes for this episode, including the PDF presentation you were kind enough to share. So if people want to see that, they can find the link in the show notes as well. Well, this has been fantastic. Is there anything else you'd want people to know about?
Gari: No. I enjoyed the chat, and thanks very much for the opportunity, Chris.
[0:39:58] End of Audio