Gaussian Process in daily life


My home is in a tropical region of India, between two rivers. Very scenic, I know. But it also means that I have experienced a lot of rain and flooding where I live.

Exhibit A: River flooding my ancestral home


One of those rainy days (when the river hadn't yet reached the house), I set out to measure the rise and fall of the water level at the river. It had been raining heavily for days, and school was closed because of the flood risk. With nothing better to do, I decided to check the water level near the river. I would fix my ruler into the ground next to the river and read off the markings against the waterline, checking the readings every 1-2 hours or so.
I have a collection of river water readings from those days. I would bet with my grandmom on whether the water level was going to increase or decrease based on those readings lol. Anyway, it was just a fun experiment for me then, but as a researcher now, I can abstract those measurements and look for deeper insights. I will describe my measurement process here (albeit not exactly as I did it then, since I need to connect this story to our actual topic, Gaussian Processes). Let us see how we can relate tropical rainfall and floods to the temperate Gaussian Processes.

So the story starts here. One Monday morning, I went to the river to check the waterline. The water had already seeped onto the river bank the evening before; however, there wasn't much rain that night. Before I actually took the measurement, I had guessed that the reading would be between 6 and 8 cm, as it was around 8 cm the evening before. This is called the prior. Since I'm not sure of the exact value, I treat my prior information as a Gaussian random variable (GRV).
How does considering the measurement as a GRV help me capture its value well?
Gaussian random variables have two parameters: mean $\mu$ and standard deviation $\sigma$. $\mu$ represents the most likely value the variable would take, which is my guess of 7 cm. $\sigma$ expresses how far from 7 cm I should still consider plausible. For a GRV, about 99.7% of the probability lies within 3 standard deviations of the mean, and values beyond 6 standard deviations (up or down) are so unlikely that their probability is practically zero (though never exactly zero). Hence, this GRV effectively covers the entire region where the actual measurement could be! Moreover, it also gives a measure of uncertainty around my guess. Isn't that much better than having a single value?
Prior distribution of water level measurement
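
To get a feel for how much probability such a prior puts near my guess, here is a small Python sketch. The mean of 7 cm comes from the story; the standard deviation of 0.5 cm is purely my assumption for illustration (it makes the 6 to 8 cm range span two standard deviations on either side), and the helper names are made up.

```python
import math

def normal_cdf(x, mu, sigma):
    """P(X <= x) for a Gaussian random variable X ~ N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def prob_between(lo, hi, mu, sigma):
    """Probability that the GRV falls inside the interval [lo, hi]."""
    return normal_cdf(hi, mu, sigma) - normal_cdf(lo, mu, sigma)

PRIOR_MU = 7.0     # my guess for the water level, in cm
PRIOR_SIGMA = 0.5  # ASSUMED spread in cm, not a measured value

# Probability that the level lies in my guessed range of 6 to 8 cm (mean +/- 2 sigma)
p_guess_range = prob_between(6.0, 8.0, PRIOR_MU, PRIOR_SIGMA)

# Probability within 3 standard deviations of the mean
p_three_sigma = prob_between(PRIOR_MU - 3 * PRIOR_SIGMA,
                             PRIOR_MU + 3 * PRIOR_SIGMA,
                             PRIOR_MU, PRIOR_SIGMA)

print(f"P(6 cm <= level <= 8 cm) = {p_guess_range:.4f}")
print(f"P(within 3 sigma)        = {p_three_sigma:.4f}")
```

With this assumed spread, roughly 95% of the prior probability sits inside my 6 to 8 cm guess, and nearly all of it within three standard deviations.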

I decided to check my scale reading at 9 AM. It showed 7.8 cm. Using Bayes' rule, I can update the prior with the new measurement information I've got!
Here's the distribution of my GRV $x_{1}$ measured at 9 AM.

Distribution of random variable: $x_{1}$: water level at 9 AM
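
When both the prior and the measurement noise are Gaussian, Bayes' rule has a simple closed form: precisions (inverse variances) add up, and the new mean is a precision-weighted average of the prior mean and the reading. Here is a minimal sketch of that update. The prior spread of 0.5 cm and the ruler noise of 0.3 cm are my assumptions for illustration; only the 7 cm guess and the 7.8 cm reading come from the story.

```python
def gaussian_update(prior_mu, prior_sigma, reading, noise_sigma):
    """Combine a Gaussian prior with one noisy Gaussian reading.

    Returns the mean and standard deviation of the updated (posterior) GRV.
    """
    prior_precision = 1.0 / prior_sigma ** 2
    noise_precision = 1.0 / noise_sigma ** 2
    post_var = 1.0 / (prior_precision + noise_precision)
    post_mu = post_var * (prior_precision * prior_mu + noise_precision * reading)
    return post_mu, post_var ** 0.5

# Prior guess: 7 cm with an ASSUMED spread of 0.5 cm.
# 9 AM reading: 7.8 cm with an ASSUMED ruler noise of 0.3 cm.
post_mu, post_sigma = gaussian_update(prior_mu=7.0, prior_sigma=0.5,
                                      reading=7.8, noise_sigma=0.3)

print(f"Updated water level: {post_mu:.2f} cm +/- {post_sigma:.2f} cm")
```

Under these assumptions the updated mean lands at roughly 7.6 cm, closer to the reading than to my guess because I trusted the ruler more than the prior, and the updated spread is smaller than either of the two I started with.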

I make another measurement $x_{2}$ at 10 AM: 9 cm, which could be off by 2-3 cm as the rain has become stronger and the water is rough. (P.S. The standard deviation isn't actually 2-3 cm; it's around 1 cm.)

                    


Distribution of random variable: $x_{2}$: water level at 10 AM

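However, now I have two random variables. Instead of representing them separately as marginal distributions, I represent them with a joint distribution of $x_{1}$ and $x_{2}$. The distribution is now bivariate: the single mean values become the mean vector $\mu = [\mu_{1}, \mu_{2}]$, and the standard deviations (variances) become the covariance matrix $\Sigma = \begin{bmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{bmatrix}$. The diagonal entries $\Sigma_{11}$ and $\Sigma_{22}$ are the variances of $x_{1}$ and $x_{2}$, while the off-diagonal entries $\Sigma_{12} = \Sigma_{21}$ capture how the two readings vary together.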




Distribution of random variables: $x_{1}$ and $x_{2}$ water levels
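
To make the mean vector and covariance matrix concrete, here is a small sketch that builds them for the two readings and then asks what the joint distribution says about $x_{2}$ once $x_{1}$ is known. The means (7.8 cm and 9 cm) and the roughly 1 cm spread of $x_{2}$ come from the story; the 0.3 cm spread of $x_{1}$ and the correlation of 0.8 between the two readings are assumptions I picked for illustration.

```python
def covariance_matrix(sigma1, sigma2, rho):
    """2x2 covariance matrix from two standard deviations and a correlation."""
    cov12 = rho * sigma1 * sigma2
    return [[sigma1 ** 2, cov12],
            [cov12, sigma2 ** 2]]

def condition_on_x1(mean, cov, x1_observed):
    """Mean and variance of x2 given that x1 took the value x1_observed."""
    mu1, mu2 = mean
    s11, s12 = cov[0]
    s22 = cov[1][1]
    cond_mu = mu2 + (s12 / s11) * (x1_observed - mu1)
    cond_var = s22 - s12 * s12 / s11
    return cond_mu, cond_var

mean = [7.8, 9.0]  # mu_1 (9 AM) and mu_2 (10 AM), in cm
cov = covariance_matrix(sigma1=0.3, sigma2=1.0, rho=0.8)  # sigma1 and rho are ASSUMED

# Hypothetical scenario: the 9 AM level was actually 8.1 cm, a bit above its mean.
cond_mu, cond_var = condition_on_x1(mean, cov, x1_observed=8.1)

print("Covariance matrix:", cov)
print(f"x2 given x1 = 8.1 cm: mean {cond_mu:.2f} cm, variance {cond_var:.2f}")
```

Because the two readings are positively correlated here, learning that $x_{1}$ was higher than expected pulls the expected value of $x_{2}$ up as well and shrinks its variance. With zero correlation the off-diagonal terms vanish and knowing $x_{1}$ tells me nothing about $x_{2}$.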

This bivariate setup can similarly be extended to a multivariate one (up to n measurements). What if we wanted a continuous measurement at any time of the day, rather than the discrete measurements I took every hour? That is when a Gaussian Process comes into the picture!
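
As a rough sketch of where this leads: a Gaussian Process replaces the hand-written covariance matrix with a kernel function that gives the covariance between the water levels at any two times. Below I use a squared-exponential (RBF) kernel, which is a common choice but my own pick here, with an assumed length scale of 1 hour, an assumed signal spread of 1 cm, and an assumed ruler noise of 0.3 cm. The two readings (7.8 cm at 9 AM, 9 cm at 10 AM) and the prior mean of 7 cm come from the story. Since there are only two readings, the 2x2 matrix inverse is written out by hand so the example needs nothing beyond the standard library.

```python
import math

def rbf_kernel(t1, t2, signal_sigma=1.0, length_scale=1.0):
    """Squared-exponential covariance between the levels at times t1 and t2 (hours)."""
    return signal_sigma ** 2 * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)

def gp_predict(t_query, times, readings, prior_mean, noise_sigma):
    """GP posterior mean and variance at t_query, given exactly two noisy readings."""
    (ta, tb), (ya, yb) = times, readings
    # Covariance of the two noisy readings: kernel values plus noise on the diagonal.
    kaa = rbf_kernel(ta, ta) + noise_sigma ** 2
    kbb = rbf_kernel(tb, tb) + noise_sigma ** 2
    kab = rbf_kernel(ta, tb)
    det = kaa * kbb - kab * kab
    # Covariance between the query time and each reading.
    ka = rbf_kernel(t_query, ta)
    kb = rbf_kernel(t_query, tb)
    # Weights k_*^T K^{-1}, using the closed-form inverse of a 2x2 matrix.
    wa = (ka * kbb - kb * kab) / det
    wb = (kb * kaa - ka * kab) / det
    mean = prior_mean + wa * (ya - prior_mean) + wb * (yb - prior_mean)
    var = rbf_kernel(t_query, t_query) - (wa * ka + wb * kb)
    return mean, var

times = (9.0, 10.0)    # 9 AM and 10 AM
readings = (7.8, 9.0)  # cm

# Water level at 9:30 AM, a time I never measured.
mid_mean, mid_var = gp_predict(9.5, times, readings, prior_mean=7.0, noise_sigma=0.3)

# Far from any reading (8 PM), the prediction falls back to the prior.
far_mean, far_var = gp_predict(20.0, times, readings, prior_mean=7.0, noise_sigma=0.3)

print(f"9:30 AM: {mid_mean:.2f} cm (variance {mid_var:.3f})")
print(f"8:00 PM: {far_mean:.2f} cm (variance {far_var:.3f})")
```

Between the two readings, the prediction sits between 7.8 cm and 9 cm with a small variance. Far from any reading, it drifts back to the prior mean with the full prior variance, which is exactly the "measure of uncertainty" idea from the single-variable case, now available at every time of the day.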

This reminds me of the time my uncle forced me to tally the different kinds of vehicles that passed by my house and relate it to another mathematical topic (stochastic/Poisson processes), but let me tell that story another time.
