Why are Normal Distributions Normal
I read a paper by Aidan Lyon, “Why are Normal Distributions Normal?”, which is apparently properly cited as Brit. J. Phil. Sci. 65 (2014), 621-649.
I don’t think this is a good paper. Lyon seems to have a beef with calling a particular probability density function “normal”. He doesn’t think that what we call a normal distribution is normal (in the sense of “it matches reality, mostly”), and tries to back up his contrary intuition.
He dislikes invoking the Central Limit Theorem as an explanation for why quantities with a lot of contributing factors turn up in near-normal distributions. I, too, dislike citing a mathematical theorem as a reason, but I do think it’s really good to know that there’s a mathematical underpinning for an observed phenomenon.
The two examples given to cast doubt on invoking the Central Limit Theorem don’t definitively reveal ignorance, but they hint at it.
Lyon gives weights of loaves of bread as an example, breaks down the constituent ingredients, and partially traces how getting to a baked loaf might change the weight. No gotcha here, but unfortunately he doesn’t note that non-professional bakers mostly measure by volume. It’s very difficult to get a specific weight of flour when measuring with cups, and bakers kneading dough by hand sprinkle an indeterminate amount of extra flour on the loaf as well, an amount that depends a lot on the temperature and humidity of the kitchen.
The strength of a metal part is a bit more irritating to me, as someone with a small amount of training in such fields. Alloying elements do affect strength, as does heat treating, as does machining. Lyon waves off all of the variation in these as not possible to quantify, and therefore not possible to add up or multiply together. The distributions of some of the factors are in fact known to be non-normal: machinists tend to stop once they’re just inside the tolerance band for some processes, like chem-milling.
The upshot is that most or all aspects of machining can be quantified. Tolerance bands, surface finish, details like correct fillet radius: all of these can be and are quantified, and they do affect a part’s strength. Heat treatment of various kinds can definitely be quantified: soaking temperature, soaking time, quench temperature and quench liquid all can be and are quantified. Lyon gives the impression he doesn’t know this.
Just inside tolerance
The aerospace industry used chem-milling quite a lot, at least up to the early 1990s. The name “chem-milling” covers up the fact that the process basically amounts to immersing pieces of metal in high-molarity acid and letting the acid eat away the un-masked metal. The rocket I worked on, Commercial Titan, had chem-milled pressure heads on its fuel and oxidizer tanks, on both first and second stages. The folks running that operation tended to pull segments of “domes” out of the acid as soon as they were within tolerance. That made the domes consistently weigh more than nominal. For a 10-foot-diameter rocket, that can add up to enough extra weight to cause problems.
I haven’t worked in the field in 35 years or more, so I don’t know whether they still do chem-milling or whether CNC machining has replaced it.
The other place where we had to acknowledge that manufacturing processes didn’t consistently attain nominal values was in calculating first body bending vibration modes. Every skinny object vibrates at some resonant frequency, which depends on its length, cross-sectional geometry, mass distribution, and the Young’s modulus of its material. It’s important to know that resonant frequency, because rocket engines produce enough energy over a wide enough frequency spectrum that they will excite the entire rocket at it. To keep the onboard guidance system from picking up false displacements, the first body bending mode gets filtered out.
The tendency of parts like longerons to have dimensions just inside tolerance meant that unless we took the slightly-thicker-than-nominal dimensions into account when calculating the first body bending mode, we got numbers wrong enough to actually affect the guidance system.
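For intuition about why a just-inside-tolerance dimension moves a bending mode, here is a toy Euler-Bernoulli calculation for a uniform free-free rectangular beam. Every number below is made up for illustration; a real launch vehicle is far more complicated than a uniform beam. For this cross-section the frequency scales linearly with the thickness h, so a dimension 1% over nominal raises the first mode by 1%.

```python
import math

def first_bending_freq(E, rho, L, b, h):
    """First free-free bending mode of a uniform rectangular beam,
    Euler-Bernoulli theory: f1 = (lambda1^2 / (2*pi*L^2)) * sqrt(E*I / (rho*A))."""
    lam1 = 4.730            # first free-free mode eigenvalue
    I = b * h**3 / 12.0     # second moment of area, m^4
    A = b * h               # cross-section area, m^2
    return (lam1**2 / (2 * math.pi * L**2)) * math.sqrt(E * I / (rho * A))

# Illustrative aluminum beam: NOT Titan data, just round numbers
E, rho, L, b = 70e9, 2700.0, 30.0, 0.10

f_nominal = first_bending_freq(E, rho, L, b, h=0.100)
f_thick = first_bending_freq(E, rho, L, b, h=0.101)  # 1% over nominal

print(f_nominal, f_thick)
print((f_thick / f_nominal - 1) * 100)  # frequency shift in percent: 1%
```

A one-percent shift sounds small, but if the guidance filter is notched around the predicted mode, a consistent bias in that direction matters.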
Highly Relevant Programming Exercise
Lyon’s example of a machine part’s strength and his advocacy for log-normal distributions got me to thinking about how distribution of tolerances might affect the distribution of the volume of a solid.
I decided to look at the easy case: how the dimensional tolerance of a cube affects the distribution of its volume. Despite my observation that machined parts’ dimensions mostly sit just inside their tolerance ranges, I used normal distributions of side length to see how the volume varies.

That’s the distribution of the volumes of 1,000 simulated cubes, each nominally 1,000 mm ±5 mm on a side. I have a differently-seeded PRNG calculating each of the sides’ lengths. Each side’s length is drawn from a normal distribution with μ = 1000, σ = 5/3 ≈ 1.67, so the ±5 mm tolerance corresponds to 3σ.
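A minimal sketch of that simulation, reconstructed from the description above (not necessarily the original code): three independently seeded generators, one per side, and 1,000 cubes.

```python
import numpy as np

# One differently-seeded PRNG per side, as described in the text
rng_x = np.random.default_rng(1)
rng_y = np.random.default_rng(2)
rng_z = np.random.default_rng(3)

N = 1000
mu, sigma = 1000.0, 5.0 / 3.0   # mm; 3-sigma matches the ±5 mm tolerance

x = rng_x.normal(mu, sigma, N)
y = rng_y.normal(mu, sigma, N)
z = rng_z.normal(mu, sigma, N)

volumes = x * y * z             # mm^3, nominal 1000^3 = 1e9

print(volumes.mean())           # close to 1e9 mm^3
print(volumes.std())            # roughly sqrt(3)*sigma*mu^2 ≈ 2.9e6 mm^3
```

From here, a histogram of `volumes` with normal and log-normal PDFs overlaid reproduces the kind of plot shown.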
One important thing to note is that a log-normal distribution and a normal distribution are essentially identical in this case. The legend shows that an appropriate normal distribution is drawn on the graph in blue, while a log-normal distribution is drawn in green. My code draws the normal distribution first, then the log-normal distribution. You can see only green because the log-normal curve almost exactly overlays the normal one.
The distribution of volumes above is pretty raggedy. How do other runs of the program look? I made a GIF animation of 4 other runs’ output. There is a certain variation in distribution, but the normal and log-normal PDFs overlap exactly, or almost exactly, for every run.

The normal distribution does indeed result from sums of non-normally distributed random variables. I tried it: summing uniformly distributed random variables does produce a normal distribution, or close enough.
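A quick way to reproduce that experiment: sum twelve Uniform(0, 1) draws per sample. The CLT predicts a mean of n/2 = 6 and a standard deviation of sqrt(n/12) = 1, and for a normal distribution about 68% of samples land within one standard deviation of the mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n_terms, n_samples = 12, 100_000

# Each sample is the sum of 12 independent Uniform(0, 1) draws
sums = rng.uniform(0.0, 1.0, size=(n_samples, n_terms)).sum(axis=1)

print(sums.mean())  # CLT prediction: n/2 = 6
print(sums.std())   # CLT prediction: sqrt(n/12) = 1

# Crude normality check: fraction of samples within one std of the mean
within = np.abs(sums - sums.mean()) < sums.std()
print(within.mean())  # about 0.68, as for a normal distribution
```

Twelve terms is already enough that a histogram of `sums` is visually indistinguishable from a Gaussian.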
In the case of volume variation, I’m multiplying random variables, which is supposed to result in a log-normal distribution: the log of a product is a sum of logs, so the Central Limit Theorem applies to the logarithm of the volume. It turns out that the log-normal and normal distributions coincide in this case, because the spread is so small relative to the mean that the logarithm is effectively linear over the whole range.
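The coincidence can be checked numerically rather than just visually, by evaluating both candidate PDFs directly. The parameter choices below follow first-order error propagation and are my own check, not the plotting code described above: with σ/μ ≈ 0.17% per side, the fitted log-normal and normal curves agree to within a few percent even three standard deviations out.

```python
import numpy as np

mu, sigma = 1000.0, 5.0 / 3.0          # side-length distribution, mm

# ln(V) = ln(x) + ln(y) + ln(z); for sigma/mu this small, ln is nearly
# linear, so ln(side) is approximately Normal(ln(mu), sigma/mu)
m = 3 * np.log(mu)
s = np.sqrt(3.0) * sigma / mu

# Matching normal parameters for the volume itself (first-order propagation)
v_mean = mu**3                          # 1e9 mm^3
v_std = np.sqrt(3.0) * sigma * mu**2    # ~2.9e6 mm^3

def normal_pdf(v, mean, std):
    return np.exp(-0.5 * ((v - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

def lognormal_pdf(v, m, s):
    return np.exp(-0.5 * ((np.log(v) - m) / s) ** 2) / (v * s * np.sqrt(2 * np.pi))

# Compare the two PDFs across the +/- 3 sigma range of the volume
v = np.linspace(v_mean - 3 * v_std, v_mean + 3 * v_std, 201)
rel_diff = (np.abs(lognormal_pdf(v, m, s) - normal_pdf(v, v_mean, v_std))
            / normal_pdf(v, v_mean, v_std))
print(rel_diff.max())  # a few percent at worst, out in the tails
```

At the center of the range the two PDFs agree almost exactly; the small disagreement in the tails is the log-normal’s slight skew, which only becomes visible when σ/μ is much larger than it is here.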
What if normal distribution isn’t “normal”?
What if a log-normal or some other distribution is a better theoretical match? In practice, ordinary normal theoretical distributions match real-life distributions fairly closely. There’s never a perfect match, as witnessed by the PRNG-generated distributions of volume above, but it’s certainly close.
If a normal distribution fits close enough, the mathematical properties (stable distribution, easy parameterization from mean and standard deviation) of normal distributions really help to make calculation and estimation easy. Assuming a normal distribution seems like a valid engineering solution to otherwise intractable problems.
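As a concrete example of that convenience: if each toleranced dimension in a stack-up is treated as an independent normal variable with its ±tolerance at 3σ, the stack combines by root-sum-square instead of worst-case addition. The dimensions below are invented for illustration.

```python
import math

# Hypothetical four-dimension tolerance stack (nominal mm, ±tolerance mm);
# each ±tolerance is treated as the 3-sigma limit of an independent normal
dims = [(25.0, 0.10), (40.0, 0.15), (12.5, 0.05), (30.0, 0.12)]

nominal = sum(d for d, _ in dims)
worst_case = sum(t for _, t in dims)           # every part at its limit
rss = math.sqrt(sum(t * t for _, t in dims))   # statistical 3-sigma stack

print(nominal)      # 107.5
print(worst_case)   # 0.42
print(rss)          # ~0.22: noticeably tighter than worst case
```

That gap between worst-case and statistical stack-up is exactly the kind of calculation that becomes intractable without an assumed distribution, and easy with a normal one.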
Explananda
Lyon uses the word “explananda” a few times in this paper.
I looked this word up, as I don’t think I’ve ever encountered it before, and I couldn’t derive a meaning from context.
According to Wikipedia, “an explanandum is a sentence describing a phenomenon that is to be explained”, and “explananda” is its plural.
The “explananda” in this paper are all the identified distributions of quantities that need an explanation.