I've just lost at least 4 braincells, 2 hours and my laptop is melting so storytime!
I have this class at Uni that specifically teaches how to write and format research papers/reports/theses. For our semester-long project each student had to pick a random article from a science magazine and write a few pages on its topic, along with some pictures and graphs.
Mistake number 1: The article was in English. The project was supposed to be in Polish (my native language). And since the topic of the article was "Social outcomes of education" while the subject of my studies is nowhere near that field of science, I figured I'd just translate the thing, format it as the professor likes and be done with that course. It's not like I have to do my own research (that doesn't come anywhere near my future engineering degree) in this class that focuses purely on mastering Word, Excel and PowerPoint, right? Wrong! I have to write the project myself and use real, accurate data to make my own graphs.
Mistake number 2: Remember when teachers in high school would tell you "You have to cite your sources. Wikipedia is NOT a source"? I've never had to use it but it was a well known lifehack to cite Wikipedia's sources instead. So where did my mind go? Just see where the article got its data and use it yourself. Fuck the text, I'll rewrite it later, now let's make some graphs! The sources are right there, it's that easy!
So I went to the European Social Survey website, feeling so good that my data will be up to date, downloaded the results of the latest survey, opened it with my spreadsheet,
which started lagging immensely,
after a few minutes I could move the mouse again.
There were THOUSANDS of variables and TEN TIMES that many values. The only words I could recognize, while searching through the program at 2 seconds per frame, were names of countries.
I decided that even for the course's standards this was way out of my league. Unfortunately I had a hard time finding any more casual-user-friendly data on the topic so I was back at the ESS. This time around I saw I could manually select only the variables I actually need. And they have descriptions of what their values mean! This will go smoothly!
It did not.
Mistake number 3 and the reason I decided to share this: I open a clean spreadsheet with only 2 variables - educational attainment and life satisfaction. Entries are divided by country. Goal: Make a graph that shows people with higher education feel happier. I need to take the average score (1-10) from each education group (1-5). Let's start with Belgium. I go to the data spreadsheet, filter the scores for only the group with the lowest education. Select first value, endless, laggy scrolling to the last value, shift, select. Average score: 7,6.
This made me feel weird, since it's a 1-10 scale and this score should be the lowest of the 5 groups. Oh well, maybe Belgians are just very satisfied with their lives. Jealous.
Next group's score was 7,2. Now that raised my suspicion. All 5 groups of people with different educational attainment scored within 1 point of each other. That wasn't supposed to happen. I double checked with the article and, well, in its survey the differences between Belgian groups were indeed small. So I decided to see which country had the biggest differences and check it next. What will it be? Poland. Oh, yeah, Poland do be having differences.
Back to the spreadsheet, I settled on only checking the lowest, highest and the group in the middle to save time. Group 1 scored... around 7. Similar to group 3. Didn't even check the last one, Poles can't be that satisfied with their lives, can they? I should know, I'm one! And right now I'm miserable!
Devastated, I checked the data sheet. Maybe I screwed up with the filters? Let's check. Country? Show only Poland. Education? Show only level 3. Life satisfaction? There it is, scores of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 88. Wait.
Well, it turns out that the survey, apart from the regular scores, was also accounting for answers like "I don't know" or "I won't say". And the values for these were 55, 66, 77 and so on. So no shit the average scores were coming out wrong when each group had multiple Happiness Georges who were living their best lives And. Should. Not. Have. Been. Counted. I selected the scores excluding these outliers and Poles' life satisfaction dropped to 5,5. Yeah, that's more like it.
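For anyone who'd rather skip the laggy scrolling, here's a rough sketch of the same fix in Python. Everything in it is an assumption for illustration: the rows are made up (not real ESS data), the names ROWS, VALID_SCORES and average_by_group are hypothetical, and it treats anything outside the 1-10 scale as a "don't know"/"won't say" code.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only: (country, education_level, life_satisfaction).
# Scores above 10 stand in for "don't know" / "won't say" style codes.
ROWS = [
    ("PL", 1, 4), ("PL", 1, 6), ("PL", 1, 88),
    ("PL", 3, 5), ("PL", 3, 7), ("PL", 3, 77),
]

VALID_SCORES = range(1, 11)  # assumes the 1-10 scale described above


def average_by_group(rows, drop_codes=True):
    """Average satisfaction per (country, education level)."""
    groups = defaultdict(list)
    for country, level, score in rows:
        if drop_codes and score not in VALID_SCORES:
            continue  # skip non-answers instead of averaging them in
        groups[(country, level)].append(score)
    return {key: mean(scores) for key, scores in groups.items()}


naive = average_by_group(ROWS, drop_codes=False)  # what I did at first
clean = average_by_group(ROWS)                    # what I should have done

for key in sorted(clean):
    print(key, "naive:", round(naive[key], 1), "clean:", clean[key])
```

The naive averages land way above 10, which on a 1-10 scale is the hint that something other than real answers got counted.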
Anyway that's it there's no moral I'm just really tired and wanted to vent have a great day