Monday, September 15, 2008

First Homework Assignment: Due September 23rd

a) Using the class ECLS-K data file, produce frequencies for the following variables: gender (FEMALE), race/ethnicity (RACE5), and mother’s educational attainment (WKMOMED; note the missing data). In a short paragraph, describe the children in the sample in terms of these demographic characteristics.

b) Produce histograms for the following variables: approaches to learning, Spring (T2LEARN), externalizing behavior problems, Spring (T2EXTERN), general knowledge skills, Spring (C2GSCALE), family income (WKINCOME), and mother’s age (P1HMAGE). Create a table that provides the M and SD of these four variables. Discuss your findings, making sure to describe the distributions of each variable. As with the first part of this assignment, provide some insight into these children and their families.

Please include both your syntax and output with each assignment.

21 comments:

tracy said...

Hi,
We are having some trouble understanding the data we produced, specifically the approaches to learning, general knowledge, and externalizing behavior variables. We don't know what the values mean and are wondering where we can get information about these specific variables so that we can talk about them.

Also, are we supposed to run bell curves for our data?

Thank you, Tracy and Kathleen

Megan said...

Hi,
The variables are described in Doug's papers, and you can also find information about them in the ECLS-K user manual and the link to the ECLS-K site on this blog. As for the values, you might think about standardizing them.

Yes, you should be running histograms for (T2LEARN),(T2EXTERN), (C2GSCALE), (WKINCOME), and (P1HMAGE).

-Megan

Anonymous said...

I don't see where C2GSCALE came from. The assignment says to develop a table for 4 variables; this would be 5.

Anonymous said...

Did not mean to leave the last comment anonymous. This is my first time blogging.
I am confused about where the bycw0 fits in. Do I run descriptives on it?
Also, a question on race: it says (Race 5), but there are more than 5 categories. Are we only interested in the first 5?

Megan said...

Martie - I think the "four" is a typo - the table should be of the five variables. (Also, make sure you're looking at the most recent version of the syllabus, which is posted on classweb).
Also, I think you're using the wrong race variable - there are a couple of variables - use "race5", which has 5 categories, rather than "race" or the other dummy variables.

Anonymous said...

Thanks, that is helpful about race.
Now I am confused about integrating the weights (bycw0). I am not sure what to do with it. Just produce a histogram? Or run a test?
I saw where Doug put it in the article, but is there anything else to do with it?

kate said...

hey guys:
can anyone give me a quick rundown on how to tell which variables we are supposed to standardize/z score and which we don't? i feel like i should be standardizing t2learn, t2extern and c2gscale ... but have shaky reasoning as to why. anyone know?
thanks!!!
katie

kate said...

actually, take that back: c2gscale and p1hmage ???

Megan said...

It's useful to standardize variables where the scale isn't particularly meaningful to a wide audience - so for example, it might make more sense to talk about the general knowledge test in terms of standard deviations rather than points, because who knows what a score of 51 would mean... but for age, it's easier to understand in terms of years rather than standard deviations, so you probably wouldn't standardize age.
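
A minimal sketch of what standardizing does, written in Python rather than SPSS purely for illustration. The scores are made up (not ECLS-K values), the helper name standardize is hypothetical, and dividing by the sample SD (n - 1) is an assumed convention:

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: (value - mean) / sample SD."""
    m = mean(values)
    sd = stdev(values)  # sample SD (n - 1); an assumed convention
    return [(v - m) / sd for v in values]

# Hypothetical general knowledge scores, not real ECLS-K data.
scores = [41.0, 46.0, 51.0, 56.0, 61.0]
z_scores = standardize(scores)

# A raw score of 51 is the mean of this made-up sample, so its z-score is 0;
# the other scores are expressed as SDs above or below that mean.
for raw, z in zip(scores, z_scores):
    print(f"raw {raw:5.1f} -> z {z:+.2f}")
```

After this step a score reads as "so many SDs above or below the sample mean", which is the interpretation described in the comment above.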

Unknown said...

Can someone clarify what parts of Doug's papers we are supposed to cut and paste?
It seems like the "ECLS-K" data section from the Explaining Girls... paper would be applicable to this assignment as well as the weight section of the same paper.
As for the analytic approach section of our paper, should we just mention that we used a random subset of the ECLS-K data as it is described in the syllabus?
Thanks,
Courtney

Anonymous said...

Are we required to standardize any of the variables? I am thinking that the general skills test variable is the only one that it makes sense to standardize, but in writing my results, I feel like it is interesting to note the raw score mean, rather than the z-score mean. Please advise. Thanks.

Megan said...

It's pretty much always a good idea to standardize variables with an unknown metric. I also checked with Doug on this question (It's a good point and I can see how it might seem weird to be talking about means of 0 throughout), who stated "A [mean of] zero is every bit as "interesting" as a 22.34 (or whatever)." This will be particularly important when we start looking at differences between groups, to be able to think about differences in terms of effect sizes or to compare scores between spring and fall, for example.
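
One way to see why standardized scores help with group differences, as a rough Python sketch. The two groups and their scores are invented for illustration, and the gap here is scaled by the overall sample SD, which is only one of several effect size conventions (it is not a pooled within-group SD):

```python
from statistics import mean, stdev

# Hypothetical raw scores for two groups (not ECLS-K data).
group_a = [48.0, 52.0, 55.0, 57.0]
group_b = [40.0, 44.0, 47.0, 49.0]

# Standardize against the whole sample, not within each group.
everyone = group_a + group_b
m, sd = mean(everyone), stdev(everyone)

def z(values):
    """Convert raw scores to z-scores using the overall mean and SD."""
    return [(v - m) / sd for v in values]

# The difference in mean z-scores is the raw gap divided by the overall SD,
# so it can be read directly as "SDs apart".
gap_in_sd_units = mean(z(group_a)) - mean(z(group_b))
raw_gap_over_sd = (mean(group_a) - mean(group_b)) / sd

print(f"gap in SD units: {gap_in_sd_units:.2f}")
```

The two quantities agree, which is why a difference between group means of z-scores can be discussed in SD units without further conversion.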

Megan said...

Courtney - regarding your question: yes, sounds good.
-Megan

Anonymous said...

Megan -- Thanks for checking on that standardization question. In that case, would it make sense to standardize the scales (T2LEARN and T2EXTERN) as well as test scores?

oo said...

Hi - Did Doug say where in the paper the histograms belong? I know he wants the table(s) after everything. Megan, if he didn't say, what is your bet?

Many thanks, Karen

Megan said...

Karen -
Don't include the picture of the histograms in your write-up (just discuss them), only include the pics in the syntax/output attachments.

Also, just a reminder for everyone not to simply paste in the SPSS output for frequencies, etc. as a table - be sure to use the Word table format Doug passed out in class.

-Megan

oo said...

Megan,
Thanks for info on histograms. Another query: maybe I'm being too literal, but that handout that Doug passed out didn't have any specific places for min max values which I think are needed for standardized values. Mean and SD will always be 0 and 1 for those, or am I completely missing something? I've improvised something but just wondering. Thanks! (This blog works great!) - Karen again
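
A small Python sketch of the point about min and max values, using invented, right-skewed scores (not ECLS-K data). It assumes standardizing with the sample SD (n - 1):

```python
from statistics import mean, stdev

# Hypothetical right-skewed scores, for illustration only.
raw = [12.0, 18.0, 25.0, 30.0, 35.0, 40.0, 120.0]

m, sd = mean(raw), stdev(raw)
z = [(x - m) / sd for x in raw]

# The mean and SD of the z-scores carry no information (0 and 1 by
# construction), but the minimum and maximum still describe the shape.
z_mean, z_sd = mean(z), stdev(z)
z_min, z_max = min(z), max(z)

print(f"mean {z_mean:.2f}, SD {z_sd:.2f}, min {z_min:.2f}, max {z_max:.2f}")
```

In this made-up example the maximum lies much further from 0 than the minimum does, which is the kind of skew a table with only M and SD would hide.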

First Flight said...

Good evening, all.

The gender measure in the manual on the ECLS-K data set states that girls=1, and boys=2. We were under the impression that dummy variables are always 1 and 0. Should we refer to this variable as a dummy variable if it isn't 1 and 0, or just as a dichotomous categorical variable of 1 and 2?

Ariel and Le

Megan said...

Hi Ariel- Yes, the "dichotomous categorical variable" label is more correct. In the future - once we learn this - I think you'd transform the variable into a dummy (female).
-Megan
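
For anyone curious what that transformation might look like, here is a hedged Python sketch (not SPSS syntax). It assumes the coding described in the question (1 = girl, 2 = boy), uses made-up values, and treats None as a missing response:

```python
# Hypothetical gender codes: 1 = girl, 2 = boy, None = missing.
gender = [1, 2, None, 1, 1]

# Recode into a 0/1 dummy named female: 1 = girl, 0 = boy.
# Missing values stay missing rather than being counted as boys.
female = [None if g is None else (1 if g == 1 else 0) for g in gender]

# The mean of a 0/1 dummy over non-missing cases is the proportion of girls.
observed = [f for f in female if f is not None]
share_female = sum(observed) / len(observed)

print(female, share_female)
```

The 0/1 coding is convenient because the variable's mean is directly interpretable as a proportion, which a 1/2 coding does not give you.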

martie boulton said...

Hi,
Can someone remind me about effect size?
We do not do it if the findings are not significant,
and we only use it on z-scaled numbers?
Thanks,
martie

Megan said...

Yes, that's correct.