Final Project Proposal
The link to our data: World Happiness
Project
The data was collected from the Gallup World Poll. The
survey asked participants to rate their own life on a Cantril ladder
with a scale from 0 to 10, 10 being the best possible life and 0 being
the worst. This data set focuses on the happiness score of each country,
which ranges from 0 to 10. Each country is ranked by the average
happiness score of its participants. To help explain the happiness score
of each country, the team also recorded scores for these factors:
economy or GDP per capita, family or social support, health or life
expectancy, and freedom. These factors are scaled from 0 to 1.85
instead of 0 to 10.
The dystopia residual is the most notable of the variables and
requires additional description. It is based on the lowest national
averages for economic production, social support, life expectancy,
freedom, absence of corruption, and generosity. In other words, by
creating a hypothetical nation with the lowest value for each of these
6 scores, each country’s difference, or residual, from that
hypothetical nation can be measured.
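The idea of a hypothetical lowest-scoring nation can be sketched in a few lines of Python. This is only an illustration of the description above, not the report's exact computation: the country names, the three factor names, and all values are invented, and the helper functions `dystopia_profile` and `gaps_from_dystopia` are hypothetical.

```python
# Toy factor scores; names and values are invented purely for illustration.
countries = {
    "A": {"economy": 1.2, "family": 1.0, "health": 0.8},
    "B": {"economy": 0.4, "family": 1.3, "health": 0.5},
    "C": {"economy": 0.9, "family": 0.6, "health": 0.9},
}


def dystopia_profile(data):
    """Build the hypothetical nation: the lowest observed value per factor."""
    factors = next(iter(data.values())).keys()
    return {f: min(row[f] for row in data.values()) for f in factors}


def gaps_from_dystopia(data):
    """Return each country's per-factor difference from the hypothetical nation."""
    base = dystopia_profile(data)
    return {
        name: {f: round(row[f] - base[f], 10) for f in row}
        for name, row in data.items()
    }


dystopia = dystopia_profile(countries)
gaps = gaps_from_dystopia(countries)
print(dystopia)
print(gaps["A"])
```

Every gap is zero or positive by construction, because no country can score below the minimum for a factor.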
We have a few ideas for how we would like to use this data. First,
data manipulation is required since the variables are not completely
aligned from year to year. After the data preparation, we will visualize
how the happiness scores have changed over the five years by country and
region. We can also determine which variables have the largest effect
on happiness scores using a model. By using linear regression and
correlation tests, we will be able to observe the relationships between
different variables within given regions. In addition, we wish to
produce a world map that clearly depicts the happiness
scores for each country.
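As a rough sketch of the preparation and modeling steps, the Python below renames differently labeled columns to a common name, then computes a Pearson correlation and a one-variable least-squares line. The column labels in `COLUMN_ALIASES`, the helper names, and the three data rows are assumptions made for illustration; the real files may use other labels, and a full analysis would likely rely on a statistics library rather than hand-written formulas.

```python
from math import sqrt

# Assumed labels: yearly files may name the same variable differently.
COLUMN_ALIASES = {
    "Happiness Score": "score",
    "Happiness.Score": "score",
    "Economy (GDP per Capita)": "economy",
    "Economy..GDP.per.Capita.": "economy",
}


def harmonize(row):
    """Rename known column variants to one shared name; keep other keys as-is."""
    return {COLUMN_ALIASES.get(key, key): value for key, value in row.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


# Invented rows standing in for records taken from two differently labeled files.
rows = [
    {"Happiness Score": 7.0, "Economy (GDP per Capita)": 1.4},
    {"Happiness.Score": 5.0, "Economy..GDP.per.Capita.": 0.9},
    {"Happiness Score": 4.0, "Economy (GDP per Capita)": 0.5},
]
clean = [harmonize(r) for r in rows]
economy = [r["economy"] for r in clean]
score = [r["score"] for r in clean]

r_value = pearson(economy, score)
slope, intercept = fit_line(economy, score)
print(f"r = {r_value:.3f}, slope = {slope:.3f}, intercept = {intercept:.3f}")
```

Running the same two functions on each region's subset would give the per-region comparison described above.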
Furthermore, we wish to merge our existing dataset with other datasets to derive meaningful conclusions. For example, happiness scores can be compared to the divorce rate, the number of significant economic crises, the birth rate, etc. We have not yet chosen which data would be most appropriate; however, we believe that adding these variables to our existing dataset would provide richer results for our happiness project.
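A merge of this kind could look roughly like the Python sketch below, which joins two tables on the country name and lists the countries that appear in only one of them. The second dataset, its values, and the helpers `inner_join` and `unmatched` are hypothetical placeholders, since the external data has not been chosen.

```python
# Invented values: happiness score per country, and some external indicator.
happiness = {"A": 7.0, "B": 5.0, "C": 4.0}
external = {"A": 2.1, "C": 1.4, "D": 3.0}


def inner_join(left, right):
    """Keep only countries present in both tables, pairing their values."""
    return {key: (left[key], right[key]) for key in left if key in right}


def unmatched(left, right):
    """Countries found in exactly one table, e.g. because of naming differences."""
    return sorted(set(left) ^ set(right))


merged = inner_join(happiness, external)
print(merged)
print(unmatched(happiness, external))
```

Checking the unmatched list first is a cheap way to catch country names that are spelled differently across sources before any comparison is made.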