Final Project Proposal

The link to our data: World Happiness Project


The data was collected from Gallup World Poll. Their survey consisted of questions that asked participants to rank their own life on a Cantril ladder with a scale from 1 to 10, 10 being the best ideal way of living and 0 being the worst. This data set focuses on the happiness score of each country, which ranges from 0 to 10. Each country is ranked based on that averaged happiness score for participants. The team recorded scores for these factors: economy or GDP per Capita, family or social support, health or life expectancy, and freedom to help explain the happiness score of each country and these factors are scaled from 0 to 1.85 instead of 0 to 10.

The dystopia residual variable is most notable among the variables, which requires additional description. It measures the lowest national average considering economic production, social support, life expectancy, freedom, absence of corruption, and generosity parameters. In other words, by creating a hypothetical nation with the lowest percentage (for each of 6 different scores), each country’s difference or residual with the given hypothetical nation can be measured.

We have a few ideas of how we would like to use this data. First, data manipulation is required since the variables are not completely aligned year by year. After the data preparation, we will visualize how the happiness scores have changed over the five years by country and region. Also, we can determine which variables have the highest effect on happiness scores using a model. By utilizing linear regression and correlation tests, we will be able to observe the relationships between different variables among given regions. In addition, we wish to demonstrate a world map with clear visualization depicting the happiness scores for each country.

Furthermore, we wish to merge our existing dataset with other datasets to derive meaningful conclusions. For example happiness scores can be compared to divorce rate, the number of significant economic crises, birth rate, etc.. We have not chosen which data would be most adequate; however, we believe that adding these variables to our existing dataset would provide more bountiful results for our happiness project.