# Distribution

## Introductory Note

There has been a lot going on at work and home, and my blogging has suffered from it. I am trying to get back into a regular blogging schedule. I will continue the Burlington County, NJ series later.

… but these are extreme events. Below is a graph of the distribution of the differences in market closes for the past year:

The vertical lines represent zero (the black line), the median (red line) of 28.090, and the first, second, and third standard deviations from the mean. What do the basic statistics look like?

| Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. | NA's |
|------|---------|--------|------|---------|------|------|
| -1175.211 | -80.350 | 28.090 | 8.812 | 118.981 | 669.400 | 1 |

Note: The NA was created when I ran the diff() function on the closing prices. It is not an actual missing observation.
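To see where that NA comes from, here is a minimal sketch in Python (the post itself was produced in R). The closing prices are made-up values for illustration, not the actual market data, and `first_differences` is a hypothetical helper. The first row has no previous close to subtract, so it gets a placeholder instead of a number.

```python
def first_differences(closes):
    """Return day-over-day changes, padded so the result aligns with the input."""
    # The first observation has no prior close, so its difference is undefined.
    diffs = [None]
    for previous, current in zip(closes, closes[1:]):
        diffs.append(current - previous)
    return diffs


# Hypothetical closing prices, for illustration only.
closes = [100.0, 102.5, 101.0, 104.0]
close_diff = first_differences(closes)
print(close_diff)  # [None, 2.5, -1.5, 3.0]
```

The padded list has the same length as the price list, which is why one undefined entry shows up in the summary.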

If you look at the graph, pretty much everything falls within three standard deviations of the mean, with the exception of the five largest declines. This is to be expected, since three standard deviations cover about 99.7% of the values in a normally distributed dataset. But all of these extreme movements are negative.
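The three-standard-deviation coverage figure can be checked directly. This is a small sketch using only the Python standard library. `normal_coverage` is a hypothetical helper name, and the calculation assumes a perfectly normal distribution, which daily market changes may not follow.

```python
import math


def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sd: {normal_coverage(k):.4f}")
# within 1 sd: 0.6827
# within 2 sd: 0.9545
# within 3 sd: 0.9973
```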

So, what do they look like and when did they happen?

|     | date       | close_diff |
|-----|------------|------------|
| 78  | 2018-02-02 | -665.7500  |
| 79  | 2018-02-05 | -1175.2109 |
| 82  | 2018-02-08 | -1032.8887 |
| 111 | 2018-03-22 | -724.4199  |
| 251 | 2018-10-10 | -831.8301  |

Today’s drop of 545.91 points is still within three standard deviations of the mean (the lower three-standard-deviation bound is -663.4169).
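As a rough check, the figures reported in this post are enough to express today's move in standard deviations. The sketch below assumes that -663.4169 is the mean minus three standard deviations, and it backs out the standard deviation from that assumption. The variable names are illustrative.

```python
mean = 8.812              # mean from the summary statistics
lower_3sd = -663.4169     # reported bound, assumed to be mean - 3 * sd
todays_change = -545.91   # today's drop, written as a signed change

# Back out the standard deviation implied by the two reported figures.
implied_sd = (mean - lower_3sd) / 3

# Express today's move as a number of standard deviations from the mean.
z_score = (todays_change - mean) / implied_sd

print(round(implied_sd, 2))  # 224.08
print(round(z_score, 2))     # -2.48
```

Under that assumption, today's drop sits about two and a half standard deviations below the mean: large, but short of the five declines listed above.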

I think an interesting research project for a later date is to answer the question: Are five extreme events like this normal for the time period covered in the data?
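One rough way to start on that question is to ask how many declines beyond three standard deviations a normal distribution would predict. The sketch below is only a starting point. It assumes daily changes are independent and normally distributed, and it takes 251 observations as a guess based on the row numbers in the table. Both helper functions are hypothetical names.

```python
import math


def lower_tail(k):
    """P(Z < -k) for a standard normal variable."""
    return 0.5 * math.erfc(k / math.sqrt(2))


def prob_at_least(m, n, p):
    """P(X >= m) for X ~ Binomial(n, p)."""
    below = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(m))
    return 1.0 - below


n_days = 251             # assumed number of daily changes in the sample
p_tail = lower_tail(3)   # chance that one day falls below mean - 3 sd

expected = n_days * p_tail
chance = prob_at_least(5, n_days, p_tail)

print(f"expected declines beyond 3 sd: {expected:.2f}")
print(f"P(at least 5 such declines): {chance:.2e}")
```

Under these assumptions the expected count is well below one, so five such declines would be hard to reconcile with a normal distribution. The standard deviation here was estimated from the same data that contains the extreme days, so a proper analysis would need more care.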

I am going to leave this here and bid everyone a good night.