You are sampling observations from a population which has some underlying distribution. Think about this in a statistical sense:

Say you want to determine the distribution of heights of South African females: you'd obtain observations of the heights from, say, 1000 SA females (a random sample). These observations come from some true underlying distribution, which we cannot observe directly because it would take too long to measure every woman's height in SA. "Every woman's height in SA" is then the true underlying distribution. To estimate this true underlying distribution, in statistics we would use something like regression, which "smooths" out the observed sample. The reason we do this is that samples are much smaller than the populations they are drawn from, so the sampling process introduces "noise" into the system. We don't want to model the noise; we want to model the signal, because the signal reflects the true underlying population.

The observations are the crude rates. We smooth these crude rates via graduation (just like regression!) to obtain the closest approximation we can to the true underlying distribution.
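If it helps to see this with numbers, here is a rough sketch in Python. Everything in it is made up for illustration: the "true" rates follow an assumed Gompertz-style curve, the exposure of 5000 lives per age is arbitrary, and the graduation is just a weighted straight-line fit to the log of the crude rates. Real graduation would more likely use something like splines or Whittaker-Henderson, so treat this as one simple way to smooth, not the standard method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed "true" underlying rates (Gompertz-style); in practice these are unknown.
ages = np.arange(40, 90)
true_q = 0.00005 * 1.1 ** ages

# Assumed exposure: 5000 lives observed at every age (a sample, not the population).
exposure = np.full(ages.shape, 5000)

# Sampling introduces noise: observed deaths are random around exposure * true_q.
deaths = rng.binomial(exposure, true_q)
crude_q = deaths / exposure  # the crude rates

# Graduate: fit a straight line to log(crude rate) against age, weighting
# ages with more deaths more heavily. Ages with zero deaths are skipped
# because log(0) is undefined.
observed = deaths > 0
slope, intercept = np.polyfit(
    ages[observed],
    np.log(crude_q[observed]),
    1,
    w=np.sqrt(deaths[observed]),
)
graduated_q = np.exp(intercept + slope * ages)

def rmse(estimate):
    """Root mean squared error against the (normally unobservable) true rates."""
    return float(np.sqrt(np.mean((estimate - true_q) ** 2)))

crude_error = rmse(crude_q)
graduated_error = rmse(graduated_q)
print(f"crude RMSE:     {crude_error:.5f}")
print(f"graduated RMSE: {graduated_error:.5f}")
```

In this simulation we can compare both sets of rates against the truth only because we invented the truth ourselves. The crude rates jump around from age to age, while the graduated rates follow a smooth curve that should sit closer to the underlying signal.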

Hope that helps.