Inference with regression.

December 9, 2009

I discussed the regression model as a model for the mean of a response variable expressed as a mathematical function of an explanatory variable. I discussed point and interval estimation and significance tests for the slope parameter for a linear regression model, and the concept of a prediction interval as a kind of confidence interval for predictions made from an estimated regression model.
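As a rough sketch of these calculations (not the example used in class), the code below computes the slope estimate, its standard error, the t statistic for testing a zero slope, a confidence interval for the slope, and a prediction interval for a new observation. The data are made up for illustration, the function name `slope_inference` is my own, and the critical value `t_crit` is assumed to be looked up in a t table with n - 2 degrees of freedom.

```python
from math import sqrt

def slope_inference(x, y, t_crit, x_new):
    """Inference for the slope of a simple linear regression.

    t_crit: two-sided critical value from a t table with n - 2 degrees
            of freedom (an assumed input, not computed here).
    x_new:  value of the explanatory variable at which to predict.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least squares estimates of slope and intercept.
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar

    # Residual standard deviation, using n - 2 degrees of freedom.
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = sqrt(sse / (n - 2))

    # Standard error, test statistic, and confidence interval for the slope.
    se_b1 = s / sqrt(sxx)
    t_stat = b1 / se_b1
    ci_slope = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)

    # Prediction interval for a single new response at x_new.
    y_hat = b0 + b1 * x_new
    half_width = t_crit * s * sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    prediction = (y_hat - half_width, y_hat + half_width)

    return {"slope": b1, "se": se_b1, "t_stat": t_stat,
            "ci_slope": ci_slope, "prediction": prediction}

# Made-up data; 3.182 is the 95% two-sided t critical value for df = 3.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
result = slope_inference(x, y, t_crit=3.182, x_new=3)
print(result)
```

The extra "1 +" under the square root is what makes a prediction interval wider than a confidence interval for the mean response: it accounts for the variability of a single new observation around the regression line.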

No class on Friday.


Parameter estimation for regression models, least squares, and correlation.

December 7, 2009

Today we focused on how to use the available data to infer the unknown parameters of a linear regression function. Estimates can be derived using the principle of least squares. We also discussed the correlation coefficient and its relationship to the regression line.
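A minimal sketch of the least squares formulas, using made-up data rather than anything from class. The function name `least_squares` is hypothetical. It also shows one way to see the link between the correlation coefficient and the regression line: the slope equals r times the ratio of the standard deviations of y and x.

```python
from math import sqrt

def least_squares(x, y):
    """Return (slope, intercept, r) for a simple linear regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx                    # minimizes the sum of squared residuals
    intercept = y_bar - slope * x_bar    # the line passes through (x_bar, y_bar)
    r = sxy / sqrt(sxx * syy)            # correlation coefficient
    return slope, intercept, r

# Made-up data for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r = least_squares(x, y)

# Ratio of standard deviations s_y / s_x (the n - 1 divisors cancel).
n = len(x)
sd_ratio = sqrt(sum((yi - sum(y) / n) ** 2 for yi in y) /
                sum((xi - sum(x) / n) ** 2 for xi in x))
print(slope, intercept, r, r * sd_ratio)
```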

Here is the handout regarding the final examination.


Introduction to regression.

December 4, 2009

Our last topic for the course will be regression. Regression is a family of statistical methods appropriate when the response variable and the explanatory variable(s) are all quantitative (although, time permitting, we may see that regression can be generalized to handle categorical response and/or explanatory variables). We’ll start with simple linear regression, where the response variable is predicted using a linear function of a single explanatory variable. Today we discussed the basic concept of regression and prediction. Starting Monday we’ll get more into the inferential details.


Design and analysis.

December 2, 2009

Analysis of variance developed in response to the need to statistically process data from a variety of experimental designs. In turn, ANOVA helped researchers understand the advantages and disadvantages of various designs. There are many different types of experimental and quasi-experimental designs, but we considered three common and fundamental designs: completely randomized, factorial, and randomized block designs. I gave an overview of each design and how ANOVA decomposes the variability differently for each design.


Follow-up analysis to ANOVA.

November 30, 2009

Today we focused on how we might follow up an ANOVA to determine which pairs of group means are significantly different, concentrating on Fisher’s method. We also discussed some of the issues in conducting multiple comparisons and, briefly, the underlying assumptions of ANOVA and follow-up inferences.
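A rough sketch of the pairwise comparisons, assuming that "Fisher's method" here means Fisher's least significant difference (LSD) procedure. Two group means are declared different when their absolute difference exceeds t times the square root of MSE(1/n_i + 1/n_j), where MSE is the within-groups mean square. The data are made up, the function name `fisher_lsd` is my own, and the critical value is assumed to come from a t table. The procedure is usually applied only after the overall F-test is significant.

```python
from itertools import combinations
from math import sqrt

def fisher_lsd(groups, t_crit):
    """Compare every pair of group means using Fisher's LSD.

    t_crit is the two-sided t critical value with N - g degrees of
    freedom, looked up in a table (an assumed input).
    Returns {(i, j): (difference, lsd, significant)}.
    """
    n_total = sum(len(grp) for grp in groups)
    means = [sum(grp) / len(grp) for grp in groups]

    # Within-groups (error) mean square from the ANOVA table.
    ss_within = sum((v - m) ** 2 for grp, m in zip(groups, means) for v in grp)
    ms_within = ss_within / (n_total - len(groups))

    results = {}
    for i, j in combinations(range(len(groups)), 2):
        lsd = t_crit * sqrt(ms_within * (1 / len(groups[i]) + 1 / len(groups[j])))
        diff = means[i] - means[j]
        results[(i, j)] = (diff, lsd, abs(diff) > lsd)
    return results

# Made-up data: three groups of three observations, so df = 9 - 3 = 6.
groups = [[4, 5, 6], [5, 6, 7], [9, 10, 11]]
comparisons = fisher_lsd(groups, t_crit=2.447)  # 95% level, df = 6
for pair, (diff, lsd, significant) in comparisons.items():
    print(pair, round(diff, 2), round(lsd, 2), significant)
```

Because each comparison is made at the same level, the chance of at least one false positive grows with the number of pairs, which is one of the multiple-comparison issues mentioned above.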

Here is the take-home quiz if you didn’t get a hard copy.


Introduction to ANOVA — continued. Mean squares & the F-test.

November 20, 2009

We discussed the ANOVA (summary) table and the interpretation of the mean squares. We saw how the (expected) mean squares measure the variability in the data and how we can use these measurements to decide if there is evidence for a “treatment” effect — i.e., does it matter that we treated the g groups differently? In terms of the underlying g population means, we can view this as an extension of the problem of trying to determine if the difference between two sample means is statistically significant, and we discussed the relationship between the t-test and the F-test when g = 2.
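The relationship between the two tests when g = 2 can be checked numerically: the square of the pooled two-sample t statistic equals the ANOVA F statistic. The sketch below uses made-up data, and the function names `pooled_t` and `anova_f` are my own.

```python
from math import sqrt

def pooled_t(a, b):
    """Two-sample t statistic using the pooled variance estimate."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    ss = sum((v - mean_a) ** 2 for v in a) + sum((v - mean_b) ** 2 for v in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    return (mean_a - mean_b) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))

def anova_f(groups):
    """One-way ANOVA F statistic: MS(between) / MS(within)."""
    n_total = sum(len(grp) for grp in groups)
    grand_mean = sum(sum(grp) for grp in groups) / n_total
    means = [sum(grp) / len(grp) for grp in groups]
    ss_between = sum(len(grp) * (m - grand_mean) ** 2
                     for grp, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for grp, m in zip(groups, means) for v in grp)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n_total - len(groups))
    return ms_between / ms_within

# Made-up data for two groups.
a = [4, 5, 6]
b = [6, 7, 8]
t = pooled_t(a, b)
f = anova_f([a, b])
print(t ** 2, f)  # the two values agree up to rounding error
```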

Read: Chapter 14, although you can ignore discussions where ANOVA is related to regression (a topic we’ve not yet discussed).


Introduction to ANOVA.

November 16, 2009

Today I introduced the conceptual foundation of the statistical technique of the analysis of variance (ANOVA). It is a very general method for the analysis of data from a wide variety of experimental and observational study designs. Today we focused on the idea of accounting for the various sources of variability in a response variable and the measurement of between- versus within-groups variability through the corresponding mean square quantities.
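To make the two mean squares concrete, here is a small sketch with made-up data. The function name `mean_squares` is hypothetical. The between-groups mean square measures how far the group means fall from the grand mean, while the within-groups mean square measures how far observations fall from their own group mean.

```python
def mean_squares(groups):
    """Return (MS between, MS within) for a one-way layout."""
    g = len(groups)
    n_total = sum(len(grp) for grp in groups)
    grand_mean = sum(sum(grp) for grp in groups) / n_total
    means = [sum(grp) / len(grp) for grp in groups]

    # Between-groups sum of squares: group means around the grand mean,
    # weighted by group size; g - 1 degrees of freedom.
    ss_between = sum(len(grp) * (m - grand_mean) ** 2
                     for grp, m in zip(groups, means))

    # Within-groups sum of squares: observations around their own group
    # mean; N - g degrees of freedom.
    ss_within = sum((v - m) ** 2 for grp, m in zip(groups, means) for v in grp)

    return ss_between / (g - 1), ss_within / (n_total - g)

# Made-up data: three groups with means 5, 7, and 9.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
ms_between, ms_within = mean_squares(groups)
print(ms_between, ms_within)
```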


Simpson’s paradox.

November 13, 2009

Simpson’s paradox can occur in many different situations, but it is easiest to understand it in the context of the analysis of the independence/dependence of two categorical variables. We looked at several real-world illustrations of this so-called paradox. We discussed why the paradox occurs and what measures can be taken to deal with it.
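As a small illustration (not necessarily one of the examples shown in class), the sketch below uses the kidney-stone treatment counts that are often cited for this paradox. Treatment A has the higher success rate within each stone-size group, yet treatment B has the higher success rate when the groups are pooled, because A was used far more often on the harder, large-stone cases. The helper names `rate` and `pooled_rate` are my own.

```python
# (successes, patients) by stone size for each treatment.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, total):
    """Success proportion within one stratum."""
    return successes / total

def pooled_rate(strata):
    """Success proportion after collapsing over the strata."""
    successes = sum(s for s, _ in strata.values())
    total = sum(n for _, n in strata.values())
    return successes / total

for size in ("small", "large"):
    print(size,
          round(rate(*counts["A"][size]), 3),
          round(rate(*counts["B"][size]), 3))

print("pooled",
      round(pooled_rate(counts["A"]), 3),
      round(pooled_rate(counts["B"]), 3))
```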

Read: 10.5 and 11.3 for some background. You might also look at the Wikipedia entry for Simpson’s paradox, which includes several classic examples.


Tests of independence — continued.

November 11, 2009

We looked at some additional examples of tests of independence for two categorical variables. We also saw that the test works the same way under two sampling schemes. In one, we compare two or more samples stratified by an explanatory variable. In the other, there is no clear explanatory or response variable; we simply sample observations and classify each on two categorical variables. Both schemes imply the same expected counts, test statistic, degrees of freedom, and hence the same p-value and decision.


The test of independence/homogeneity for two categorical variables.

November 9, 2009

Today I demonstrated the test of independence/homogeneity for two categorical variables using a 1979 paper published in the Journal of Consulting and Clinical Psychology that reported an experiment comparing four types of therapy for the treatment of depression. The question was whether there was evidence that the outcome level of therapy was related to the kind of therapy used. We discussed the test procedure, the hypotheses, the concept of a conditional distribution, and what it means to say that one variable is independent of another.
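A minimal sketch of the test statistic, using made-up counts for four therapies and two outcome levels (these are not the data from the 1979 paper). Under independence, each expected count is (row total × column total) / grand total, and the statistic sums (observed - expected)² / expected over all cells, with (rows - 1)(columns - 1) degrees of freedom. The function name `chi_square_independence` is my own.

```python
def chi_square_independence(table):
    """Return (expected counts, chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected counts if the row and column variables are independent.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    statistic = sum((obs - exp) ** 2 / exp
                    for obs_row, exp_row in zip(table, expected)
                    for obs, exp in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return expected, statistic, df

# Made-up counts: rows are four therapies, columns are
# (improved, not improved).
observed = [
    [12, 8],
    [15, 5],
    [9, 11],
    [14, 6],
]
expected, statistic, df = chi_square_independence(observed)
print(expected[0], round(statistic, 2), df)
```

The p-value would then come from a chi-square table with `df` degrees of freedom.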

Homework: 11.1, 11.4, 11.8-11.11, 11.15, 11.17, 11.18, 11.21, 11.22.