Does the highest year of school completed contribute to one’s income?
Variables
Using the students’ dataset, we study two variables, ‘highest year of school completed’ and ‘respondent’s income in constant dollars’, and look for any association between them. We use Pearson’s correlation and bivariate regression to examine whether the highest year of school completed predicts one’s income. Pearson’s correlation is a statistic that researchers use to measure the strength and direction of the linear association between two variables of interest. Both variables in a Pearson’s correlation analysis should be continuous, i.e. a continuous dependent variable and a continuous independent variable. A bivariate regression, like Pearson’s correlation, examines the relationship between two variables. It is a simple linear regression, that is, a regression with one predictor variable and one outcome variable (Frankfort-Nachmias & Leon-Guerrero, 2017).
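To make the two techniques concrete, the following is a minimal Python sketch of how Pearson’s correlation coefficient and a one-predictor least-squares line can be computed by hand. The function names `pearson_r` and `simple_regression` and the toy values are illustrative assumptions; they are not taken from the students’ dataset or from SPSS.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_regression(x, y):
    """Least-squares (intercept, slope) for the line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical toy values for illustration only, NOT the study data
years = [8, 10, 12, 12, 14, 16, 16, 18]
income = [15000, 18000, 24000, 30000, 28000, 45000, 38000, 60000]

r = pearson_r(years, income)
intercept, slope = simple_regression(years, income)
print(f"r = {r:.3f}, intercept = {intercept:.1f}, slope = {slope:.1f}")
```

Both functions share the same cross-product term, which is why the sign of the slope always matches the sign of the correlation.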
Research question and study objective
Our research question is: ‘Does the highest year of school completed contribute to one’s income?’ Our interest is to screen for any association between a participant’s income and their highest year of school completed. We also focus on questions such as: ‘If the value of one variable increases or decreases, what happens to the value of the other variable?’, ‘What happens to income if the highest year of school completed changes?’ and ‘If there is an association, how strong is it?’ Answering these questions helps us understand how the two variables correlate. We therefore answer our research questions by testing the hypotheses below at the 5% significance level (Kutner, Nachtsheim, Neter, & Li, 2005).
Null hypothesis (H0): A significant association between one’s highest year of school completed and their income does not exist.
Alternative hypothesis (H1): A significant association between one’s highest year of school completed and their income exists.
Results
Pearson’s Correlation Results
Table 1: Pearson’s correlation table
Correlations
| | | RESPONDENT INCOME IN CONSTANT DOLLARS | HIGHEST YEAR OF SCHOOL COMPLETED |
| RESPONDENT INCOME IN CONSTANT DOLLARS | Pearson Correlation | 1 | .353** |
| | Sig. (2-tailed) | | .000 |
| | N | 1523 | 1523 |
| HIGHEST YEAR OF SCHOOL COMPLETED | Pearson Correlation | .353** | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 1523 | 2537 |
**. Correlation is significant at the 0.01 level (2-tailed).
With the significance level set at 5%, the results above (Table 1) show a two-tailed p-value (Sig.) of .000. SPSS rounds to three decimal places, so this should be read as p < .001. Since p < α = 0.05, we have evidence to reject the null hypothesis of no significant association, and we conclude that there is a significant association between one’s highest year of school completed and their income at the 5% significance level. This relationship holds for the 1523 participants who have valid values on both variables.
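The significance test behind Table 1 can be approximately reproduced from r and N alone. The sketch below uses the standard test statistic t = r√(n − 2) / √(1 − r²). As a labeled assumption, it approximates the two-tailed p-value with the standard normal distribution rather than the exact t distribution, which is reasonable only because the degrees of freedom are large here.

```python
import math

r = 0.353   # Pearson correlation from Table 1
n = 1523    # number of paired observations from Table 1
alpha = 0.05

df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Assumption: with df = 1521 the t distribution is very close to the
# standard normal, so the normal tail area approximates the p-value.
p_approx = math.erfc(abs(t_stat) / math.sqrt(2))  # two-tailed

print(f"t({df}) = {t_stat:.3f}")
print(f"approximate two-tailed p = {p_approx:.3g}")
print("reject H0" if p_approx < alpha else "fail to reject H0")
```

The resulting t value is close to the t reported for the slope in Table 4. The small difference comes from using r rounded to three decimals.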
Having established that one’s highest year of school completed and their income are correlated, we turn to our second research question: how the two variables correlate. Pearson’s correlation coefficient between one’s highest year of school completed and their income is r = .353. The coefficient is positive (+.353), which implies that as the highest year of school completed increases, one’s income tends to increase as well. Using the general rule for interpreting correlations, we deduce that the correlation between one’s highest year of school completed and their income is a weak one (Wagner, 2012).
Bivariate Regression Results
Table 2: Bivariate model summary
Model Summary (b)
| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| 1 | .353 (a) | .125 | .124 | 31246.915 |
a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED
b. Dependent Variable: RESPONDENT INCOME IN CONSTANT DOLLARS
Table 3: ANOVA table
ANOVA (a)
| Model | | Sum of Squares | df | Mean Square | F | Sig. |
| 1 | Regression | 211453220920.304 | 1 | 211453220920.304 | 216.571 | .000 (b) |
| | Residual | 1485058299653.447 | 1521 | 976369690.765 | | |
| | Total | 1696511520573.751 | 1522 | | | |
a. Dependent Variable: RESPONDENT INCOME IN CONSTANT DOLLARS
b. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED
Table 4: Bivariate regression coefficient table
Coefficients (a)
| Model | | Unstandardized Coefficients: B | Unstandardized Coefficients: Std. Error | Standardized Coefficients: Beta | t | Sig. |
| 1 | (Constant) | -21258.539 | 3876.208 | | -5.484 | .000 |
| | HIGHEST YEAR OF SCHOOL COMPLETED | 3948.706 | 268.321 | .353 | 14.716 | .000 |
a. Dependent Variable: RESPONDENT INCOME IN CONSTANT DOLLARS
The p-value (Sig.) for the bivariate regression (Table 3) is p = .000, that is, p < .001 < α = 0.05, which again confirms that there is a significant association between the two variables. The model summary (Table 2) gives R² = .125, so the highest year of school completed accounts for about 12.5% of the variance in income. The standardized coefficient from the bivariate regression (Table 4) and the correlation coefficient from Pearson’s correlation (Table 1) are equal: β = r = .353. The two values coincide because, in a regression with a single predictor, the standardized slope is the Pearson’s correlation between the predictor and the outcome. The standardized coefficient (β = .353) therefore leads to the same interpretation: because the value is positive, one’s income increases as the highest year of school completed increases.
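As a consistency check, the main quantities in Tables 2 to 4 can be recomputed from one another. The sketch below copies the sums of squares, degrees of freedom, slope, and standard error from the tables and applies the usual textbook formulas. The variable names are illustrative, and the code is an arithmetic check rather than a re-analysis of the raw data.

```python
import math

# Values copied from Table 3 (ANOVA) and Table 4 (coefficients)
ss_regression = 211453220920.304
ss_residual = 1485058299653.447
ss_total = 1696511520573.751
df_regression = 1
df_residual = 1521
b_slope = 3948.706
se_slope = 268.321

# R Square is the share of total variation explained by the model
r_squared = ss_regression / ss_total
multiple_r = math.sqrt(r_squared)

# Adjusted R Square penalizes for the number of predictors
df_total = df_regression + df_residual
adj_r_squared = 1 - (1 - r_squared) * df_total / df_residual

# F is the ratio of the regression and residual mean squares
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# Std. Error of the Estimate is the square root of the residual mean square
std_error_estimate = math.sqrt(ms_residual)

# t for the slope is the coefficient divided by its standard error
t_slope = b_slope / se_slope

print(f"R = {multiple_r:.3f}, R^2 = {r_squared:.3f}, adj. R^2 = {adj_r_squared:.3f}")
print(f"F = {f_stat:.3f}, SEE = {std_error_estimate:.3f}, t = {t_slope:.3f}")
```

With a single predictor, the squared t for the slope equals F up to rounding, which is why the regression p-value in Table 3 matches the slope’s p-value in Table 4.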
So far we know that one’s income increases as the highest year of school completed increases, but we do not yet know by how much. The unstandardized coefficient closes this gap. From Table 4, the unstandardized coefficient is B = 3948.706. Therefore, whenever ‘highest year of school completed’ increases by one unit (one year), the predicted ‘respondent’s income’ increases by 3948.706 constant dollars. Finally, we can deduce that there is a weak positive association in which a one-unit increase in the highest year of school completed is associated with a rise in income of about $3,948.71.
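The coefficients in Table 4 define the fitted line: predicted income = −21258.539 + 3948.706 × (highest year of school completed). The short sketch below evaluates that line for two example values; the function name `predicted_income` and the chosen inputs of 12 and 16 years are illustrative assumptions.

```python
# Coefficients copied from Table 4
INTERCEPT = -21258.539   # (Constant)
SLOPE = 3948.706         # HIGHEST YEAR OF SCHOOL COMPLETED

def predicted_income(years_of_school):
    """Predicted income in constant dollars from the fitted bivariate model."""
    return INTERCEPT + SLOPE * years_of_school

# Example inputs chosen for illustration only
for years in (12, 16):
    print(f"{years} years -> {predicted_income(years):.3f} constant dollars")

# Each additional year changes the prediction by exactly the slope
print(f"difference per year: {predicted_income(13) - predicted_income(12):.3f}")
```

Because the intercept is negative, the line gives negative predicted incomes for very low values of schooling, so predictions are best read only within the range of schooling observed in the data.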
References
Frankfort-Nachmias, C., & Leon-Guerrero, A. (2017). Social statistics for a diverse society. Sage Publications.
Kutner, M. H., Nachtsheim, C. J., Neter, J., & Li, W. (2005). Applied linear statistical models (Vol. 103). Boston: McGraw-Hill Irwin.
Wagner, W. E. (2012). Using IBM® SPSS® statistics for research methods and social science statistics. Sage.