
Probabilities for Physicians Part I: Randomness

At Elsa Health we work with many healthcare providers with very different levels of exposure to probabilities - from community healthcare providers with no experience at all to health researchers who are comfortable with confidence intervals and p-values (more on these later!). Across all these encounters we have noticed a common trend: there is a lack of an intuitive understanding of probabilities and of the randomness of the world. This is particularly true when the probabilities seem counterintuitive (see the Monty Hall Problem and examples here and here).

We are starting a simple, casual, and intuitive series of short posts on probabilities for healthcare providers who wish to get a better understanding or dust off their probability know-how! The posts will include content that is needed to develop your own health algorithm using the Elsa Open Health Algorithm Platform. We will try as much as possible to keep the content approachable and free of the dense mathematics that might turn many away from applying the beautiful concepts of probabilities.

Getting Started

Many of us develop a basic understanding of probabilities as we go through life and put numbers on how likely certain things are to happen. We see probabilities in weather forecasts, election polls, and sports odds, and even in casual conversations with statements like "It is likely that ..." or "I'm pretty sure that ...", or from the one friend who is always certain and says: "I am 1,000% sure that ...". All of these are different ways we express probabilities and certainty or uncertainty in our daily lives.

It is worth noting that anyone who is one thousand percent sure of anything either has insider information or is trying to sell you something. Either way, something is fishy!

With that said, we can move forward with the definition: "A probability is how likely something is to happen", where probabilities range from 0 (an impossible event) to 1 (a guaranteed event).

Often we deal with probabilities in the form of percentages between 0% and 100%. However, we will stick to the more appropriate scale of 0 - 1 for the rest of this series.

Probabilities and Randomness

The world around us is full of random events that happen from the quantum level to the migration patterns of wildebeests in the Serengeti. If you pick up the phone and call anyone in your contacts there is a probability value for whether or not they will pick up. Furthermore, that probability might depend on who you call, what time you call, the status of your relationship, or even whether or not they owe you money. We can think of these as factors that affect how "likely" the person is to pick up when you call.

In a more clinical scenario, imagine a world where half (1/2 or 0.5 or 50%) of the population have a certain disease. Let's call this disease Probabitis. We are also going to assume people either have the disease or they don't. In this strange world, the doctor could randomly diagnose each patient as either having the disease or not, without talking to or even seeing the patient, and the doctor would still be right around 50% of the time in the long run!
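We can check this claim with a small simulation. This is a minimal sketch in Python (the variable names and the imaginary disease are just for illustration): we simulate many patients, let a "doctor" guess completely at random, and count how often the guess matches the truth.

```python
import random

random.seed(42)

N = 100_000  # simulated patients in our imaginary world

# In this world, each patient truly has Probabitis with probability 0.5.
truth = [random.random() < 0.5 for _ in range(N)]

# The doctor "diagnoses" each patient at random, without seeing them.
guess = [random.random() < 0.5 for _ in range(N)]

# How often does the random diagnosis happen to be correct?
accuracy = sum(t == g for t, g in zip(truth, guess)) / N
print(f"Random diagnosis accuracy: {accuracy:.3f}")  # close to 0.5
```

In the long run the accuracy settles around 0.5, exactly as the thought experiment suggests.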

There are countless more examples of randomness at work, and even more examples of how humans have internalized randomness into simple mental models that frequently come in handy. Mathematically, probabilities are the tools we use to handle and deal with randomness in the world. Probabilities allow us to describe a world that is vague and constantly changing.

But What About Statistics?

There is a blurry but significant difference between statistics and probability. Probability deals with predicting the likelihood of future events, while statistics involves the analysis of the frequency of past events[reference].

Statistics can tell us what percentage of a population with headaches found relief after taking aspirin, while probability can tell us how likely it is that you will find relief from your headache after taking aspirin.

Random and Not Random Things

We can describe things as being either random or not random. When things are random they are called Stochastic, and when they are not they are called Deterministic. We will dive deeper into these in a later post; for now, it is enough to know the terms.

Additionally, we are going to use the term "variable" to mean a "thing" from now on. So don't let that throw you off.

An example of a random variable is a person's age. If you walk into a bar and ask everyone how old they are, you are likely to receive a wide range of answers. Those answers are random. However, we can use probabilities to describe the randomness in the answers. For instance, we know that:

  1. No one is below 0 years of age
  2. It is pretty safe to assume there are no neonates or children (unless it's a child-friendly bar)
  3. It's also unlikely that there will be many elderly people (again, depends on context)
  4. No one is going to be 200 years old

Using all this information, we can make pretty good guesses for the ages we will hear in the responses. This is our internal probability intuition at work, and when we add mathematical rigor to this intuition, we end up with more reliable and scalable "guesses".

"The theory of probabilities is at bottom nothing but common sense reduced to calculus; it enables us to appreciate with exactness that which accurate minds feel with a sort of instinct for which ofttimes they are unable to account."
— Pierre-Simon Laplace (1749-1827)

Interactive Example: The Gardener & the Statistician

Let's imagine there are two people, a gardener and a statistician. The gardener grows two kinds of flowers, purple and red flowers. The statistician works for a fancy research group and is trying to study different gardens to quantify the ratio of purple to red flowers.

For this scenario we will introduce our first type of random variable: the Bernoulli. We will cover it in more detail in the next post, but for now we can describe this distribution as that of a coin toss. With a fair coin, the outcomes are equally likely to be heads or tails. However, if the coin is heavier on one side, the tosses will favor one outcome over the other. Here the probability of a success (heads) is called "p", and it ranges from 0 to 1, where p = 0.5 corresponds to a fair coin.
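A Bernoulli draw is easy to sketch in code. Here is a minimal Python version (the function name `bernoulli` is ours, not a library function): it returns 1 ("heads"/success) with probability p and 0 otherwise, and we compare a fair coin against a weighted one.

```python
import random

def bernoulli(p):
    """Return 1 ('heads'/success) with probability p, else 0."""
    return 1 if random.random() < p else 0

random.seed(0)

# A fair coin (p = 0.5) versus a coin weighted toward heads (p = 0.8).
for p in (0.5, 0.8):
    flips = [bernoulli(p) for _ in range(10_000)]
    fraction = sum(flips) / len(flips)
    print(f"p = {p}: observed fraction of heads = {fraction:.3f}")
```

Over many flips, the observed fraction of heads settles close to p, which is exactly what the parameter means.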

Back to our example and simulation. Let's describe everything we need to know/assume.

  1. Let's say our gardener has 80 purple flower seeds
  2. She also has 20 red flower seeds
  3. Sometimes, out of random bad luck, some seeds do not grow into flowers

Given assumptions 1 and 2, we expect our statistician to observe 80 purple flowers and 20 red flowers. When we include assumption 3, the scenario becomes more realistic: the statistician might now observe results that vary randomly around those numbers (80 purple vs 20 red). This type of randomness, present everywhere in the world, can be described by probability and random variables. In this case we represent the probability of a given flower being purple as:

Bernoulli(p = 0.8)

We can simulate what the statistician is likely to observe with different p values below:

[Interactive simulation: Bernoulli's Garden — choose the probability of purple, p, grow 100 flowers, and see a summary of how many came up purple vs red. Try re-running the experiment without changing the probability of purple.]
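If you prefer to run the experiment yourself, here is a small Python sketch of the same idea (the function `grow_garden` is our own illustrative helper, not part of any library): each seed becomes a purple flower with probability p, and re-running the simulation with the same p still gives slightly different counts.

```python
import random

def grow_garden(n_seeds=100, p_purple=0.8):
    """Simulate one garden: each flower is purple with probability p_purple."""
    purple = sum(random.random() < p_purple for _ in range(n_seeds))
    return {"purple": purple, "red": n_seeds - purple}

# Re-running the "experiment" with the same p gives different counts each time.
for trial in range(3):
    counts = grow_garden(p_purple=0.8)
    print(f"Trial {trial + 1}: {counts['purple']} purple, {counts['red']} red")
```

Each trial plays the role of the statistician visiting the garden in a different "alternate universe": same underlying probability, different observed flowers.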

What is Happening Above?

Since probabilities describe random events and the likelihood of those events happening, it is possible to observe different results even with a fixed probability. You can test this for yourself by setting p to a specific value and re-running the simulation to see what the statistician would observe in an alternate universe. You will notice that the summary statistics vary slightly every time you re-run the simulation. That is randomness, and probabilities let us tame this feature of the universe.

Randomness in Medicine

In clinical and medical research we often encounter randomness in the results of our experiments; this is clear from the small differences we observe between findings in the literature. For example, one study might find that 80% of patients with malaria have a headache, while another finds that only 66% do. Both results can be correct from a statistical perspective, and the more studies we do and the more observations we make, the closer our estimates get to the true probability value.
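This "more observations, better estimates" behavior is easy to see in a simulation. In the sketch below (the true headache rate of 0.8 is an assumption made up for illustration), each "study" samples a different number of patients, and the larger studies land closer to the true value.

```python
import random

random.seed(1)

TRUE_P = 0.8  # assumed true probability that a malaria patient has a headache

# Each simulated "study" observes its own sample and reports its own rate.
for n_patients in (30, 300, 3000, 30_000):
    observed = sum(random.random() < TRUE_P for _ in range(n_patients)) / n_patients
    print(f"Study with {n_patients:>6} patients: {observed:.1%} report headache")
```

Small studies can easily report 66% or 90% by chance alone; as the sample grows, the observed rate converges toward the underlying probability.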

Check out the open health platform we are building!

Upcoming Articles

  1. More on the Bernoulli distribution & random variables
  2. The Beta distribution & random variables
  3. The Normal distribution & random variables

Reach Out to Us

To learn more about our work, or if you are interested in working together, please reach out to us through our website, or follow us on social media!

To contact us directly visit our site: