Descriptive statistics are mathematical methods that summarize and interpret some of the properties of a set of data (sample) but do not infer the properties of the population from which the sample was drawn. Descriptive statistics are merely descriptive and do not involve any generalizing beyond the data at hand.
Some descriptive statistics are shown in Table 1. The table shows the average salaries for various occupations in the United States in 1999.
These descriptive statistics provide us with a certain insight into American society. It is interesting to note, for example, that we generally pay those who teach our children and protect us substantially less than those who take care of our feet or our teeth.
Central tendency is the idea that a single number may best summarize an entire set of measurements. This number is "central" to the set.
The mean (or average) is a simple measure of the central tendency of the data:
mean = sum of all the data ÷ sample size (often called n)
E.g., with the data set (1, 1, 1, 5), n = 4, the mean is 8 ÷ 4 = 2.
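The mean calculation above can be sketched in a few lines of Python. This is only an illustration: the function name mean_of is a hypothetical helper, not something from the text, and the result is cross-checked against the standard library's statistics.mean.

```python
from statistics import mean

def mean_of(data):
    """Return the sum of all the data divided by the sample size n."""
    n = len(data)  # sample size
    return sum(data) / n

data = [1, 1, 1, 5]
print(mean_of(data))  # 8 / 4 = 2.0

# Cross-check against the standard library implementation.
assert mean_of(data) == mean(data)
```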
The range is a simple measure of the spread of the data. It gives information about the distance between the most extreme data values, but does not address the issue of how frequent these extreme values are.
range = value of maximum data point minus value of minimum data point.
E.g., with the data set (1, 1, 1, 5), the range is 5 – 1 = 4.
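As a rough sketch, the range can be computed in Python as follows. The function name data_range and the second data set (1, 5, 5, 5) are assumptions added for illustration; the second set shows that two samples with very different frequencies of extreme values can share the same range.

```python
def data_range(data):
    """Return the maximum data point minus the minimum data point."""
    return max(data) - min(data)

print(data_range([1, 1, 1, 5]))  # 5 - 1 = 4

# Illustrative second data set (not from the text): the extreme value 5
# occurs three times instead of once, yet the range is unchanged.
print(data_range([1, 5, 5, 5]))  # 5 - 1 = 4
```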
The variance is a measure of spread that accounts for both how far the data points deviate from the mean and how frequently those deviations occur. For each data point, the mean is subtracted from the data point, and this value is squared. The squared values are then summed and divided by the sample size.
variance = the sum of (each data point minus the mean)² ÷ sample size
For example, with the data set (1, 1, 1, 5): (1-2)² + (1-2)² + (1-2)² + (5-2)² = 12
The variance is 12 ÷ 4 = 3.
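The variance calculation can be sketched in Python as shown here. The function name population_variance is a hypothetical helper chosen for this example. Because the formula in the text divides by the sample size n, it corresponds to statistics.pvariance in the standard library; statistics.variance divides by n - 1 instead and would give a different value for the same data.

```python
from statistics import pvariance

def population_variance(data):
    """Sum of squared deviations from the mean, divided by the sample size."""
    n = len(data)
    mean_value = sum(data) / n
    squared_deviations = [(x - mean_value) ** 2 for x in data]
    return sum(squared_deviations) / n

data = [1, 1, 1, 5]
print(population_variance(data))  # 12 / 4 = 3.0

# Cross-check against the standard library's divide-by-n variance.
assert population_variance(data) == pvariance(data)
```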
The standard deviation of the data is the square root of the variance, and for that reason it reflects both the deviation from the mean and the frequency of this deviation. Because the variance is built from squared values, its scale is frequently larger than the scale of the raw data, whereas the standard deviation is on the same scale as the data. For this reason the standard deviation is often used instead of the variance.
standard deviation = square root of the variance
For example, with the data set (1, 1, 1, 5), the standard deviation is the square root of 3, which is approximately 1.73.
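A minimal Python sketch of the standard deviation follows, assuming the same divide-by-n variance used in the text. The function name population_std is hypothetical, and the result is compared with statistics.pstdev from the standard library.

```python
import math
from statistics import pstdev

def population_std(data):
    """Square root of the variance, where the variance divides by n."""
    n = len(data)
    mean_value = sum(data) / n
    variance = sum((x - mean_value) ** 2 for x in data) / n
    return math.sqrt(variance)

data = [1, 1, 1, 5]
print(round(population_std(data), 2))  # square root of 3, rounded: 1.73

# Cross-check against the standard library (allowing for float rounding).
assert math.isclose(population_std(data), pstdev(data))
```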