April 18, 2009

A Unified Framework for Introductory Statistics

The topics usually covered in introductory statistics courses in the social sciences can be taught within a single statistical framework: the linear regression model. T-tests, correlation, analysis of variance, regression analysis, cross-classification analysis of ordinal variables, and many nonparametric statistics can all be conducted with linear regression, which standardizes concepts and notation across topics that are conventionally treated as distinct types of statistics. This framework organizes Statistical Analysis in the Social Sciences by McKee J. McClendon, an introductory statistics text for upper-level undergraduates in the social sciences.

Posted by mjm18 at 11:14 AM

April 17, 2009

Statistical Analysis in the Social Sciences by McKee J. McClendon

[Image: cover of the book]

Statistical Analysis in the Social Sciences is an introductory statistics text for upper-level undergraduates in the social sciences. It embeds the basic descriptive and inferential statistics usually covered in introductory books within a unified statistical framework. The linear regression model provides this framework. The regression model is used not only for conducting standard regression analyses but also for other statistical analyses that are ordinarily taught as separate and distinct techniques, such as correlation, difference of means, analysis of variance, and certain nonparametric tests. The extensive use of a single statistical model will increase students’ depth of understanding and give them a sense of mastery and self-confidence that they often fail to achieve in their first statistics course.

The basic statistical tools covered in this text consist of describing or summarizing the observed attributes of a single variable, describing the association between two or more observed variables, and making estimates of the likely values of each of these quantities in a population. Although these topics make up the content of most introductory statistics texts, the typical text covers this material under a rather complex typology of seemingly disparate and unrelated techniques. For example, measures and tests of association are often covered under these categories: difference of means, correlation, regression analysis, analysis of variance, association between ordinal variables (e.g., rho, tau, gamma, d), association between ordinal and nominal variables (e.g., Wilcoxon-Mann-Whitney test, Kruskal-Wallis test), and association between nominal variables (e.g., chi-square, phi, Cramer’s V, lambda).

This book presents the basic statistical tools within a more unified framework. Except in the case of nominal variables, the regression model is the statistical technique that will be used for the analysis of all associations between variables. The concepts and notation of the regression model are also consistent with those of two important univariate descriptive statistics, the mean and the standard deviation (or variance); to put it another way, the mean and the standard deviation are basic statistical concepts that lie at the very core of regression analysis. The consistent use of the regression model will provide a basic statistical framework for the cumulative development of statistical concepts and techniques. It will also provide a consistent set of statistical terminology and notation. The end result will be an increase in students’ understanding of basic statistical procedures and less reliance on memorization of rules and formulas.
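One way to see how the mean and the standard deviation sit at the core of regression is to look at the simplest possible regression, one that predicts the same constant for every case. The following is only a minimal sketch with made-up data, not an example taken from the book; the variable and function names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical scores on a single variable (illustrative data only)
y = [2.0, 4.0, 4.0, 6.0, 9.0]

def sum_sq_errors(constant, values):
    """Sum of squared deviations of the values from one predicted constant."""
    return sum((v - constant) ** 2 for v in values)

# An intercept-only regression predicts one constant for every case.
# The least-squares choice of that constant is the mean.
y_bar = mean(y)
sse_at_mean = sum_sq_errors(y_bar, y)

# Nearby constants give a larger sum of squared errors.
sse_below = sum_sq_errors(y_bar - 0.5, y)
sse_above = sum_sq_errors(y_bar + 0.5, y)

# The residual standard deviation of this model (n - 1 degrees of freedom)
# is the ordinary sample standard deviation.
residual_sd = sqrt(sse_at_mean / (len(y) - 1))

print(y_bar, sse_at_mean, sse_below, sse_above)
print(residual_sd, stdev(y))
```

In this sketch the mean plays the role of the regression prediction and the standard deviation plays the role of the typical prediction error, which is the sense in which both carry over to models with predictors.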

Another benefit of using a unifying statistical model is that there will be less of a cookbook approach to statistical analysis. The typical text teaches the student to first identify the level(s) of measurement of the variables to be analyzed and the general type of analysis to be conducted (central tendency, variation, or association). After this is done, the student can look up the specific statistical technique with which to conduct the analysis. In this approach, the researcher finds the statistical technique that fits the types of variables and data to be analyzed; that is, the technique is fitted to the data. Although the use of a single unifying model will not obviate the need to pay careful attention to the levels of measurement of the variables, the emphasis will be on fitting the data to the model rather than the other way around. Fitting the data to the model will often involve transforming the data so that they can be validly used by the regression technique. For example, nominal variables will be recoded into dummy variables (Chapter 11) and ordinal variables will be transformed into rank-scored variables (Chapter 12).
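The two recodings mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not the book's procedure: the helper names `dummy_code` and `rank_scores` are hypothetical, the data are invented, and tied ordinal categories are assumed to receive the average (mid) rank of the cases in that category.

```python
def dummy_code(values, reference):
    """Recode a nominal variable into 0/1 dummies, omitting the reference category."""
    categories = sorted(set(values) - {reference})
    return {c: [1 if v == c else 0 for v in values] for c in categories}

def rank_scores(values, order):
    """Give each case the average rank of its ordinal category (assumed midranks)."""
    counts = {c: values.count(c) for c in order}
    scores, below = {}, 0
    for c in order:
        if counts[c]:
            # Cases in this category occupy ranks below+1 .. below+counts[c];
            # their average is below + (counts[c] + 1) / 2.
            scores[c] = below + (counts[c] + 1) / 2
        below += counts[c]
    return [scores[v] for v in values]

# Hypothetical nominal variable with three categories
religion = ["Protestant", "Catholic", "None", "Catholic"]
dummies = dummy_code(religion, reference="Protestant")

# Hypothetical ordinal variable with an explicit category order
agreement = ["low", "high", "medium", "low", "high"]
ranked = rank_scores(agreement, order=["low", "medium", "high"])

print(dummies)
print(ranked)
```

Once recoded this way, both kinds of variables are numeric columns that a regression routine can accept, which is what fitting the data to the model amounts to in practice.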

How can the regression model be justified as a unifying framework for most types of association between variables? The most widely used parametric statistical techniques are based on a general linear model. In particular, tests for differences of means, bivariate and partial correlations, regression analysis, and analysis of variance are all manifestations of the same general linear model (e.g., Fox 1997, pp. 204–219). Regression is the most general and flexible of these techniques. It can be used to test a difference of means (the t-test of a dummy variable slope), to compute and test correlations (the standardized regression slope), and to conduct an analysis of variance (e.g., the F-test for a set of dummy variables). In addition, regression analysis is the most frequently used statistical technique in the social sciences. Thus, regression fully warrants a more extended treatment, and greater reliance, than it normally receives in introductory texts. Yet despite widespread knowledge of the general linear model and the versatility of regression analysis, there are no texts suitable for an undergraduate introductory statistics course in the social sciences that take a general linear model approach to teaching parametric statistics.
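Two of these equivalences can be checked numerically. The sketch below uses made-up data and a hand-written bivariate least-squares helper (`ols` and `pearson` are illustrative names, not functions from any package). It compares the slope of a dummy variable with the difference of group means, the slope's t statistic with the classic pooled-variance t statistic, and the standardized slope with the Pearson correlation.

```python
from math import sqrt
from statistics import mean, stdev

def ols(x, y):
    """Bivariate least squares: return (intercept, slope, standard error of slope)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = sqrt(sse / (len(x) - 2) / sxx)
    return intercept, slope, se_slope

def pearson(x, y):
    """Pearson correlation computed directly from deviations."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: a 0/1 group dummy and an interval-level outcome
group = [0, 0, 0, 0, 1, 1, 1, 1]
score = [3.0, 4.0, 5.0, 4.0, 6.0, 7.0, 8.0, 7.0]

intercept, slope, se_slope = ols(group, score)
t_regression = slope / se_slope

# Classic two-sample t-test with a pooled variance estimate
y0 = [s for g, s in zip(group, score) if g == 0]
y1 = [s for g, s in zip(group, score) if g == 1]
mean_difference = mean(y1) - mean(y0)
pooled_var = ((len(y0) - 1) * stdev(y0) ** 2 + (len(y1) - 1) * stdev(y1) ** 2) / (
    len(y0) + len(y1) - 2
)
t_classic = mean_difference / sqrt(pooled_var * (1 / len(y0) + 1 / len(y1)))

# Standardized slope: the slope rescaled by the ratio of standard deviations
standardized_slope = slope * stdev(group) / stdev(score)
r = pearson(group, score)

print(slope, mean_difference)
print(t_regression, t_classic)
print(standardized_slope, r)
```

The intercept is the mean of the group coded 0, and the slope is the amount added to that mean for the group coded 1, which is why the two t statistics coincide.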

Regression analysis will also be used to execute a number of the more commonly used nonparametric techniques that are designed for ordinal dependent variables. The validity of using regression in this capacity is not nearly as well known as its validity for parametric analyses. This method for conducting nonparametric analyses consists of applying a parametric technique, such as regression, to ordinal variables that have been scored for ranks (Conover and Iman 1981; Conover 1980). For example, when two ordinal variables are scored for ranks, the Pearson correlation of the ranks equals the Spearman correlation. And when an ordinal dependent variable scored for ranks is regressed on a dummy variable, the analysis is equivalent to the Wilcoxon-Mann-Whitney test. The use of regression for conducting nonparametric tests involving ordinal variables is covered in Chapter 12.
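The rank-based equivalences can be illustrated with a short sketch. It assumes made-up data with no tied values, so the simple `ranks` helper and the textbook shortcut formula for Spearman's rho both apply; all names here are illustrative. The second half shows that the slope of ranks regressed on a dummy variable is an exact linear function of the Mann-Whitney U statistic, namely N(U / (n0 n1) - 1/2), which is the sense in which the regression carries the same information as the Wilcoxon-Mann-Whitney test.

```python
from math import sqrt
from statistics import mean

def ranks(values):
    """Rank from 1 (smallest) to n; this sketch assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def pearson(x, y):
    """Pearson correlation computed directly from deviations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def ols_slope(x, y):
    """Least-squares slope of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Part 1: Pearson correlation of ranks versus Spearman's shortcut formula
x = [10, 20, 30, 40, 50]
y = [1.2, 3.4, 2.2, 5.0, 4.1]
rx, ry = ranks(x), ranks(y)
r_on_ranks = pearson(rx, ry)
n = len(x)
d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
spearman = 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Part 2: ranks regressed on a dummy variable versus the Mann-Whitney U
group = [0, 0, 0, 1, 1, 1, 1]
outcome = [2, 5, 1, 7, 9, 4, 8]
outcome_ranks = ranks(outcome)
slope = ols_slope(group, outcome_ranks)

n0, n1 = group.count(0), group.count(1)
rank_sum_1 = sum(r for g, r in zip(group, outcome_ranks) if g == 1)
u1 = rank_sum_1 - n1 * (n1 + 1) / 2          # U for the group coded 1
slope_from_u = (n0 + n1) * (u1 / (n0 * n1) - 0.5)

print(r_on_ranks, spearman)
print(slope, slope_from_u, u1)
```

With tied values, the ranks would need to be replaced by average ranks and the shortcut formula for Spearman's rho would no longer apply, although the Pearson correlation of the ranks would still serve.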

The implementation of the proposed framework simply involves giving the students more instruction in regression and omitting some of the other traditional techniques. In this case, however, it is not a matter of substituting depth for breadth because the increased depth in regression is for the purpose of giving the students the tools needed to handle the same range of statistical problems that are covered in a more traditional fashion.

Posted by mjm18 at 07:22 PM