Second, the F test is sensitive to the assumption that the populations are normal. If you cannot be sure that both populations follow a normal distribution, the F test result is unreliable, and Levene's test or a non-parametric test can be used instead. So determine the distribution of the populations before processing the data.
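To make the F test versus Levene's test point concrete, here is a minimal Python sketch. It assumes SciPy is available. The helper name variance_f_test and the simulated samples are made up for illustration; SciPy does not provide a ready-made two-sample variance F test, so the statistic is computed by hand from the F distribution.

```python
import numpy as np
from scipy import stats

def variance_f_test(x, y):
    """Two-sided F test for equal variances (hypothetical helper).

    Only trustworthy when both samples come from normal populations.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfn, dfd = x.size - 1, y.size - 1
    # Two-sided p-value: double the smaller tail probability, capped at 1.
    tail = stats.f.cdf(f_stat, dfn, dfd)
    p_value = min(1.0, 2.0 * min(tail, 1.0 - tail))
    return float(f_stat), float(p_value)

# Simulated data, for illustration only.
rng = np.random.default_rng(0)
a = rng.normal(loc=0.0, scale=1.0, size=40)
b = rng.normal(loc=0.0, scale=1.0, size=40)

f_stat, f_p = variance_f_test(a, b)
# Levene's test centered on the median is less sensitive to non-normality.
lev_stat, lev_p = stats.levene(a, b, center="median")

print(f"F test:      F = {f_stat:.3f}, p = {f_p:.3f}")
print(f"Levene test: W = {lev_stat:.3f}, p = {lev_p:.3f}")
```

If a normality check on either sample looks doubtful, the Levene p-value is the safer of the two to rely on.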
Third, about the t-test
1. In the single-sample case, if the population deviates only slightly from a normal distribution and the sample size is large enough (judge n from the situation and experience: 30, 50, or more), the t-test's performance is not noticeably affected. At the extreme, when n is larger than 120, the t-test and the z-test give almost the same result (verify this yourself if you are interested :)). However, when the sample size is below 30 and you cannot determine whether the population is approximately normal, the t-test becomes less reliable, and a non-parametric test can replace it.
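For anyone who wants to check the t-test versus z-test claim, one simple way is to compare the two-sided critical values of the t distribution and the standard normal distribution as n grows. This is only a sketch, assuming SciPy; the function name t_vs_z_gap and the significance level of 0.05 are my own choices.

```python
from scipy import stats

def t_vs_z_gap(n, alpha=0.05):
    """Gap between the two-sided t and z critical values for a sample of size n."""
    df = n - 1  # degrees of freedom of the one-sample t-test
    t_crit = stats.t.ppf(1.0 - alpha / 2.0, df)
    z_crit = stats.norm.ppf(1.0 - alpha / 2.0)
    return float(t_crit - z_crit)

for n in (10, 30, 121, 1001):
    print(f"n = {n:4d}: t critical minus z critical = {t_vs_z_gap(n):.4f}")
```

The gap shrinks steadily as n increases, which is why the two tests become hard to tell apart for large samples.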
2. In the two-sample case:
a. If the population variances are equal, you can use the t-test as long as the sample sizes n1 and n2 are both greater than 30, even if the populations are not normally distributed. See the central limit theorem.
b. If the population variances are not equal, the populations should at least approximately follow a normal distribution. A large sample does not guarantee that the population is normal, and with a small sample it is best to check normality with a normality test, a histogram (look for a bell shape), a comparison of the median, mean, and sigma, or similar methods, so that you can at least confirm the data are approximately normal. If they really are not, use a non-parametric test or transform the data. The same applies to the paired t-test: if you verify that the data seriously violate normality, do not use the t-test.
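The check-normality-then-decide step in case b could be sketched roughly as follows, assuming SciPy. The Shapiro-Wilk test as the normality check and the Mann-Whitney U test as the non-parametric fallback are my own picks, not something fixed by the discussion above. The function name compare_two_samples and the sample arrays are hypothetical.

```python
import numpy as np
from scipy import stats

def compare_two_samples(x, y, alpha=0.05):
    """Pick a two-sample test based on a Shapiro-Wilk normality check."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Treat the data as approximately normal only if neither sample is rejected.
    looks_normal = all(stats.shapiro(s).pvalue > alpha for s in (x, y))
    if looks_normal:
        # Unequal-variance (Welch) t-test, matching case b.
        result = stats.ttest_ind(x, y, equal_var=False)
        return "welch_t", float(result.pvalue)
    # Fallback when normality is clearly violated.
    result = stats.mannwhitneyu(x, y, alternative="two-sided")
    return "mann_whitney_u", float(result.pvalue)

# Synthetic illustration data: normal quantiles versus a heavily skewed sample.
bell = stats.norm.ppf(np.linspace(0.02, 0.98, 25))
skewed = np.exp(np.linspace(0.0, 6.0, 25))

print(compare_two_samples(bell, bell + 0.5))
print(compare_two_samples(skewed, skewed * 1.5))
```

A failed normality test on a small sample is weak evidence either way, so it is still worth looking at the histogram before trusting the branch that was taken.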
Note that the two t statistics for the two-sample case are different and have different degrees of freedom, but they are often very close, so the assumption of equal population variances can feel a bit redundant. Still, whether the two population variances are equal sometimes has a big impact on the result. It is therefore well worth testing the population variances for a difference with the F test before choosing which t-test to use.
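The "F test first, then pick the t-test" procedure might look like the sketch below, assuming SciPy. The function name two_sample_t, the 0.05 threshold, and the simulated samples are illustrative assumptions. The equal_var argument of scipy.stats.ttest_ind switches between the pooled-variance t-test and the unequal-variance (Welch) version.

```python
import numpy as np
from scipy import stats

def two_sample_t(x, y, alpha=0.05):
    """Run a variance F test, then the matching two-sample t-test."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    f_stat = np.var(x, ddof=1) / np.var(y, ddof=1)
    tail = stats.f.cdf(f_stat, x.size - 1, y.size - 1)
    f_p = min(1.0, 2.0 * min(tail, 1.0 - tail))  # two-sided p-value
    # Keep the equal-variance assumption only if the F test does not reject it.
    equal_var = bool(f_p >= alpha)
    result = stats.ttest_ind(x, y, equal_var=equal_var)
    return equal_var, float(result.statistic), float(result.pvalue)

# Simulated data: a small, widely spread sample against a large, narrow one.
rng = np.random.default_rng(1)
small_wide = rng.normal(loc=0.0, scale=5.0, size=15)
large_narrow = rng.normal(loc=0.0, scale=1.0, size=60)

pooled = stats.ttest_ind(small_wide, large_narrow, equal_var=True)
welch = stats.ttest_ind(small_wide, large_narrow, equal_var=False)
equal_var, t_stat, p_val = two_sample_t(small_wide, large_narrow)

print(f"pooled: t = {pooled.statistic:.3f}, p = {pooled.pvalue:.4f}")
print(f"Welch:  t = {welch.statistic:.3f}, p = {welch.pvalue:.4f}")
print(f"F test keeps equal variances: {equal_var}")
```

Unequal sample sizes combined with unequal variances are where the two versions tend to disagree the most, which is the situation the simulated data is meant to mimic.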
Some of what I said above may be inaccurate; corrections and discussion are welcome.