For your paper this week, pick a handful of continuous variables (organize around a central question) and examine their correlations. Include a table - see the gender paper, P30.
(Don't worry about z-scoring variables here.)
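For anyone who wants to see what a correlation table contains, here is a minimal sketch in Python rather than SPSS. The variable names and values are made up for illustration and are not from the course data set; the helper names `pearson_r` and `correlation_table` are hypothetical too.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_table(data):
    """Return {(name_a, name_b): r} for every pair of variables in data."""
    names = list(data)
    return {(a, b): pearson_r(data[a], data[b]) for a in names for b in names}

# Made-up values for illustration only (assumption, not course data).
data = {
    "ses":      [1.0, 2.0, 3.0, 4.0, 5.0],
    "reading":  [2.0, 4.0, 6.0, 8.0, 10.0],
    "absences": [5.0, 4.0, 3.0, 2.0, 1.0],
}
table = correlation_table(data)

# Print one row per variable, one column per variable.
for a in data:
    print(a.ljust(10), "  ".join(f"{table[(a, b)]:6.2f}" for b in data))
```

Each cell is the correlation between the row variable and the column variable, so the diagonal is always 1 and the table is symmetric.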
6 comments:
Hi - A couple of comments on some questions I've received that might be relevant for the rest of you:
1) Remember that for correlations, your variable should be normally distributed. If it's skewed, it might make sense to transform the variable using a log, but be sure to check the histogram again after transforming to see if it worked.
Also, I had a question about this error message when log-transforming a variable (s2discat):
>Warning # 602
>The argument for the natural log function is less than or equal to
>zero. The result has been set to the system-missing value.
>Command line: 950 Current case: 1 Current splitfile group: 1
The problem in this case is that you can't take the log of 0. One way to deal with this is to compute a new variable by adding one to the original variable and then taking the log (assuming there are no negative numbers in the variable's coding scheme).
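As a rough illustration of the add-one-then-log idea, here is a small Python sketch (not SPSS syntax). The scores are made up, and `skewness` and `log_plus_one` are hypothetical helper names. It also compares skewness before and after, which is a numeric stand-in for re-checking the histogram.

```python
import math

def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def log_plus_one(values):
    """Return ln(v + 1) for each value; refuses negative inputs."""
    if min(values) < 0:
        raise ValueError("negative values present; adding 1 is not enough")
    return [math.log(v + 1) for v in values]

# Made-up right-skewed scores that include zeros (illustration only).
raw = [0, 0, 1, 1, 2, 3, 5, 8, 20, 60]
logged = log_plus_one(raw)

# A skewness closer to 0 after transforming suggests the log helped.
print("skewness before:", round(skewness(raw), 2))
print("skewness after: ", round(skewness(logged), 2))
```

Zeros become ln(1) = 0 instead of system-missing, so no cases are lost to the transformation.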
And one other question that might be relevant:
I just wanted to clarify something regarding the Week 10 paper. The example correlation tables make it look like we should provide correlations between all variables (i.e., not just between SES and all the others, but between each variable we are analyzing and all the others). However, Doug said in class that we should make SES our focus and discuss the relationship between SES and the other variables in our paper.
So in my findings section I am discussing the relationship between SES and the other variables (and not the relationships among the other variables themselves), but I am including all of the relationships in my table.
Is that the correct thing to do?
-------------
In response to this question -
You should focus on the correlations between SES and all the others. But note other relationships as well: a significant correlation between two of the other variables should probably be included in your discussion if it might help explain the correlation between SES and another variable, or vice versa.
Hi Megan,
So if it still seems skewed after I do the log, does that mean I did it wrong, or should I just not use the log and stick with the original variable?
Shelly
Shelly - it means that log transforming won't work. You can also try taking the square root. But if the distribution still isn't normal either way, you would have to recode it into a categorical variable, and so you'll need to use a different variable in this analysis.
(If you're using it in your final paper, you would just use it as a categorical variable and include it in the regression.)
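Here is a minimal Python sketch of those two fallbacks, the square root and the recode into categories (again, not SPSS syntax). The scores and the cutpoints are made up for illustration, and the helper names are hypothetical; you would pick cutpoints that make sense for your own variable.

```python
import bisect
import math

def sqrt_transform(values):
    """Return the square root of each value; refuses negative inputs."""
    if min(values) < 0:
        raise ValueError("square root is undefined for negative values")
    return [math.sqrt(v) for v in values]

def recode_into_categories(values, cutpoints):
    """Recode each value as the count of cutpoints it is >= to.

    With cutpoints [1, 5]: values below 1 become 0, values from 1 up to
    (but not including) 5 become 1, and values of 5 or more become 2.
    """
    ordered = sorted(cutpoints)
    return [bisect.bisect_right(ordered, v) for v in values]

# Made-up scores and cutpoints (assumptions for illustration only).
raw = [0, 0, 1, 4, 5, 9, 60]
rooted = sqrt_transform(raw)
categories = recode_into_categories(raw, [1, 5])
print(rooted)
print(categories)
```

The recoded version keeps the ordering of the original scores but drops the distances between them, which is why it is no longer suitable for a correlation.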
Hi Megan,
Thanks for the helpful feedback on my final paper idea.
For the assignment due on Nov.18, I want to look at children's approaches to learning in the fall of kindergarten.
That measure is NOT normally distributed! I don't want to log transform it, because I won't know how to interpret the output after that.
To make the measure more normally distributed, I averaged the parent report and teacher report of learning skills from the fall. The variable looks much better after that, but it is still slightly skewed.
My question for you: how do I know if the variable is normal enough?
Thanks,
Alex
Good evening,
A question on Assignment 9. For our table, we created one table, with 5 models listed in 5 columns across. Is creating one table for all five models OK, or should we create a separate table for each model?
Thanks much,
Ariel and Martie