# Essays on Applying ANOVA

Having completed the simulation, we can state that the aim of the analysis of variance (ANOVA) is to test the significance of differences between the mean values of different groups by comparing variances. Partitioning the total variance into several sources (associated with the different effects in the design) allows us to compare the variance caused by differences between the groups with the variance caused by within-group variability.
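For the simplest case of a single factor, this partition can be written out explicitly. The notation is our assumption and does not come from the text: $k$ groups, $n_j$ observations in group $j$, $x_{ij}$ for the $i$-th observation in group $j$, $\bar{x}_j$ for the mean of group $j$, and $\bar{x}$ for the grand mean.

$$
\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}\right)^2}_{\text{total}}
=
\underbrace{\sum_{j=1}^{k} n_j\left(\bar{x}_j-\bar{x}\right)^2}_{\text{between groups}}
+
\underbrace{\sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(x_{ij}-\bar{x}_j\right)^2}_{\text{within groups}}
$$

Dividing the between-group and within-group sums of squares by their degrees of freedom, $k-1$ and $N-k$ with $N=\sum_j n_j$, gives the two variance estimates that are compared.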

The hypothesis under test is that there is no difference between the groups. If the null hypothesis is true, the estimate of the variance associated with within-group variability should be close to the estimate of the between-group variance. If it is false, the two estimates should differ significantly.
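The comparison of the two variance estimates can be illustrated with a minimal sketch of a one-factor F ratio. The function name `one_way_f` and the data are hypothetical and chosen only for illustration; the sketch uses the Python standard library and does not compute a p-value.

```python
def one_way_f(groups):
    """Return (ms_between, ms_within, f_ratio) for a list of samples."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # total number of observations
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(x for g in groups for x in g) / n_total

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    ms_between = ss_between / (k - 1)        # between-group variance estimate
    ms_within = ss_within / (n_total - k)    # within-group variance estimate
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical data: three groups of three observations each.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
ms_b, ms_w, f_ratio = one_way_f(groups)
print(ms_b, ms_w, f_ratio)
```

For these made-up data the between-group estimate is 19 and the within-group estimate is 1, so the ratio is far from 1, which is the pattern expected when the null hypothesis is false.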

At the same time, one of the factors limiting the use of tests based on the assumption of normality is the sample size. As long as the sample is large enough (for instance, 100 or more observations), the sampling distribution of the mean can be considered approximately normal even if there are doubts that the variable is normally distributed in the population. If the sample is small, however, these tests should be used only when there is confidence that the variable actually has a normal distribution, and a small sample offers no reliable means of testing this assumption.
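The effect of sample size on the sampling distribution can be checked with a small simulation. This is a sketch under our own assumptions: the population is taken to be exponential (strongly right-skewed), the sample sizes 5 and 100 and the helper names are hypothetical, and skewness is used as a rough indicator of departure from normality (a normal distribution has skewness 0).

```python
import random


def skewness(values):
    """Population skewness: the third standardized moment."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    return sum((v - m) ** 3 for v in values) / n / var ** 1.5


def sample_mean_skewness(n, replications=2000, seed=0):
    """Skewness of the distribution of sample means for samples of size n
    drawn from an exponential population with rate 1."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n
             for _ in range(replications)]
    return skewness(means)


skew_small = sample_mean_skewness(n=5)
skew_large = sample_mean_skewness(n=100)
print(skew_small, skew_large)
```

The means of the larger samples should show noticeably less skewness than the means of the smaller ones, even though the population itself is far from normal.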

Apart from that, the use of tests based on the assumption of normality is limited by the scale of measurement. Statistical methods such as the t-test, regression, etc., assume that the source data are continuous. However, there are situations where the data are merely ranked (measured on an ordinal scale) rather than measured precisely. Therefore, nonparametric methods are used for the analysis of small samples and for data measured on weak scales.
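As an illustration of a rank-based approach, the sketch below computes the Kruskal-Wallis H statistic, a commonly used nonparametric counterpart of one-factor ANOVA. The text does not name a specific method, so this choice is ours; the function names and the ratings are hypothetical, and the sketch omits the correction for ties and the p-value.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic, without the tie correction."""
    pooled = [x for g in groups for x in g]
    ranks = average_ranks(pooled)
    n_total = len(pooled)
    total = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])  # rank sum of this group
        total += rank_sum ** 2 / len(g)
        start += len(g)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)


# Hypothetical ordinal ratings for three groups.
ratings = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h = kruskal_wallis_h(ratings)
print(h)
```

Only the order of the observations enters the statistic, which is why such a method remains applicable when the data are ranked rather than measured precisely.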

Thus, nonparametric methods are most appropriate when the sample size is small. If there is a large amount of data (e.g., n > 100), there is often little sense in using nonparametric statistics. If the sample size is very small (e.g., n = 10 or less), the significance levels of those nonparametric tests that use the normal approximation can be regarded only as rough estimates.

In general, the analysis of variance rests on the following assumptions: (a) each group is a random sample from the general population, and (b) the variances of the groups in the population are equal. However, in our opinion, the method is convenient and can be used even when normality and equality of variances are not assured. The requirement of random sampling nevertheless remains necessary in this case.

We regard the inability to identify which samples differ from the others as a disadvantage of univariate analysis. For this purpose, one should use Scheffé's method to carry out pairwise comparisons of the samples.
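A minimal sketch of pairwise comparisons by Scheffé's criterion is given below. The function name `scheffe_pairwise` and the data are hypothetical. The critical value of the F distribution is assumed to be supplied by the caller, for example from a table, because the Python standard library does not provide it.

```python
from itertools import combinations


def scheffe_pairwise(groups, f_critical):
    """Compare every pair of group means with Scheffé's criterion.

    f_critical is the upper critical value of the F distribution with
    (k - 1, N - k) degrees of freedom; this sketch does not compute it.
    Returns a list of (i, j, statistic, differs) tuples.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_within = ss_within / (n_total - k)  # pooled within-group variance

    results = []
    for i, j in combinations(range(k), 2):
        diff = means[i] - means[j]
        # Estimated variance of the difference between the two means.
        var_diff = ms_within * (1 / len(groups[i]) + 1 / len(groups[j]))
        # Scheffé statistic: squared difference over (k - 1) * variance.
        statistic = diff ** 2 / ((k - 1) * var_diff)
        results.append((i, j, statistic, statistic > f_critical))
    return results


# Hypothetical data; 5.14 is the tabulated 5% critical value of F
# with (2, 6) degrees of freedom.
groups = [[4, 5, 6], [6, 7, 8], [9, 10, 11]]
results = scheffe_pairwise(groups, f_critical=5.14)
for i, j, statistic, differs in results:
    print(i, j, round(statistic, 2), differs)
```

With these made-up data the first and second groups are not distinguished from each other, while the third group differs from both, which is the kind of detail that the overall test alone does not provide.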

Still, the remaining tabs in the results window make it possible to obtain the following additional results: the mean values of the dependent variable for the selected effect; post hoc tests; checks of the assumptions made for the analysis of variance; response / desirability profiles; the analysis of residuals; the output of the matrices used in the analysis; and options for sending the variable specifications, the analysis code and the prediction equation to the report, as well as for generating the model code in C / C++ / SVB / PMML. The results should be available both in numerical and in graphical form. It should be noted that the set of additional results depends on the type of model built, i.e., on the module used.