Pearson’s Correlation and Bivariate Regression

Does the highest year of school completed contribute to one’s income?

Variables

Using the students’ dataset, we study two variables, ‘highest year of school completed’ and ‘respondent’s income in constant dollars’, and look for an association between them. We use Pearson’s correlation and bivariate regression to examine whether the highest year of school completed predicts one’s income. Pearson’s correlation coefficient measures the strength and direction of the linear association between two variables. Both variables in a Pearson’s correlation analysis should be continuous, i.e. a continuous dependent and a continuous independent variable. A bivariate regression, like Pearson’s correlation, examines the relationship between two variables. It is a simple linear regression, that is, a regression with a single predictor variable and a single dependent variable (Frankfort-Nachmias & Leon-Guerrero, 2017).
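To make the two techniques concrete, the following minimal sketch computes Pearson’s r and a least-squares regression line by hand. The `years` and `income` lists are hypothetical toy values, not records from the students’ dataset, and the function names are chosen only for illustration.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_regression(x, y):
    """Least-squares intercept and slope for the line y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical toy data (assumption): years of school and income in dollars.
years = [8, 10, 12, 12, 14, 16, 16, 18]
income = [12000, 15000, 22000, 30000, 28000, 45000, 38000, 60000]

r = pearson_r(years, income)
intercept, slope = simple_regression(years, income)

# Standardized slope: the raw slope rescaled by the ratio of standard deviations.
beta = slope * statistics.stdev(years) / statistics.stdev(income)

print("r =", round(r, 3))
print("slope =", round(slope, 2), "intercept =", round(intercept, 2))
print("standardized slope =", round(beta, 3))
```

With a single predictor, the standardized slope printed on the last line matches r, which is why the two analyses report the same coefficient.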

Research question and study objective

Our research question is: ‘Does the highest year of school completed contribute to one’s income?’ Our interest is to test for an association between a participant’s income and their highest year of school completed. We also address three follow-up questions: ‘If the value of one variable increases or decreases, what happens to the value of the other variable?’, ‘What happens to income if the highest year of school completed changes?’ and ‘If there is an association, how strong is it?’ Answering these questions shows how the two variables correlate. We therefore answer our research questions by testing our hypotheses at the 5% significance level (Kutner et al., 2005).

Null hypothesis (H0): There is no significant association between one’s highest year of school completed and their income.

Alternative hypothesis (H1): There is a significant association between one’s highest year of school completed and their income.

 

Results

Pearson’s Correlation Results

Table 1: Pearson’s correlation table

Correlations

| Variable | Statistic | Respondent income in constant dollars | Highest year of school completed |
|---|---|---|---|
| Respondent income in constant dollars | Pearson Correlation | 1 | .353** |
| | Sig. (2-tailed) | | .000 |
| | N | 1523 | 1523 |
| Highest year of school completed | Pearson Correlation | .353** | 1 |
| | Sig. (2-tailed) | .000 | |
| | N | 1523 | 2537 |

**. Correlation is significant at the 0.01 level (2-tailed).

 

Having set a significance level of 5%, the results above (Table 1) show a two-tailed p-value (Sig.) of .000. The output rounds to three decimals, so this means p < .001. Since p < .001 < α = 0.05, we reject the null hypothesis of no significant association and conclude, at the 95% confidence level, that there is a significant association between one’s highest year of school completed and their income. This relationship holds for the 1523 participants included in the analysis.
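The significance test behind this p-value can be sketched from the two numbers reported in Table 1. The test statistic for a correlation is t = r·√(n − 2) / √(1 − r²), with n − 2 degrees of freedom. The sketch below assumes that, with degrees of freedom this large, the two-tailed 5% critical value of t is close to the normal value of 1.96; it does not compute an exact p-value.

```python
import math

r = 0.353   # Pearson's r reported in Table 1
n = 1523    # number of paired observations reported in Table 1

df = n - 2
t_stat = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Assumption: large-sample approximation to the two-tailed 5% critical value.
critical_value = 1.96
reject_null = abs(t_stat) > critical_value

print("df =", df)
print("t =", round(t_stat, 2))
print("reject H0 at the 5% level:", reject_null)
```

The resulting t is far above the critical value and agrees, up to rounding of r, with the t of 14.716 reported for the slope in Table 4.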

Having established that the two variables are correlated, we turn to our second research question: how the two variables correlate. Pearson’s correlation coefficient between one’s highest year of school completed and their income is r = .353. The coefficient is positive (+.353), which implies that as the highest year of school completed increases, one’s income tends to increase. Using the general rule for correlations, we deduce that the correlation between one’s highest year of school completed and their income is a weak one (Wagner, 2012).

Bivariate Regression Results

Table 2: Bivariate model summary

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .353 (a) | .125 | .124 | 31246.915 |

a. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED
b. Dependent Variable: RESPONDENT INCOME IN CONSTANT DOLLARS

 

Table 3: ANOVA table

ANOVA (a)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 211453220920.304 | 1 | 211453220920.304 | 216.571 | .000 (b) |
| | Residual | 1485058299653.447 | 1521 | 976369690.765 | | |
| | Total | 1696511520573.751 | 1522 | | | |

a. Dependent Variable: RESPONDENT INCOME IN CONSTANT DOLLARS
b. Predictors: (Constant), HIGHEST YEAR OF SCHOOL COMPLETED

 

 

Table 4: Bivariate regression coefficient table

Coefficients (a)

| Model | Term | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | -21258.539 | 3876.208 | | -5.484 | .000 |
| | HIGHEST YEAR OF SCHOOL COMPLETED | 3948.706 | 268.321 | .353 | 14.716 | .000 |

a. Dependent Variable: RESPONDENT INCOME IN CONSTANT DOLLARS

The p-value (Sig.) for the bivariate regression in the ANOVA table (Table 3) is reported as .000, that is, p < .001 < α = 0.05, which again confirms a significant association between the two variables. The model summary (Table 2) gives R² = .125, so the highest year of school completed accounts for about 12.5% of the variation in income. The standardized coefficient from the bivariate regression (Table 4) equals the Pearson’s correlation coefficient (Table 1): β = r = .353. The two values coincide because, in a regression with a single predictor, the standardized slope is the correlation between the predictor and the dependent variable. Because β = .353 is positive, the interpretation is unchanged: as the highest year of school completed increases, one’s income tends to increase.
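The regression tables are tied together by a few identities that hold for any regression with a single predictor: R² is the regression sum of squares divided by the total sum of squares, F is the regression mean square divided by the residual mean square, and F equals the square of the slope’s t statistic. The sketch below checks these identities using only the values printed in Tables 2 to 4; small discrepancies are expected because the printed values are rounded.

```python
import math

# Values copied from Table 3 (ANOVA)
ss_regression = 211453220920.304
ss_total = 1696511520573.751
ms_regression = 211453220920.304
ms_residual = 976369690.765

# Values copied from Table 4 (coefficients)
t_slope = 14.716
beta = 0.353

r_squared = ss_regression / ss_total    # compare with R Square = .125 in Table 2
multiple_r = math.sqrt(r_squared)       # compare with R = .353 and Beta = .353
f_stat = ms_regression / ms_residual    # compare with F = 216.571 in Table 3
t_squared = t_slope ** 2                # close to F when there is one predictor

print("R squared =", round(r_squared, 3))
print("R =", round(multiple_r, 3))
print("F =", round(f_stat, 3))
print("t squared =", round(t_squared, 3))
```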

So far we know that income increases as the highest year of school completed increases, but we do not yet know by how much. The unstandardized coefficient closes this gap. From Table 4, B = 3948.706. Therefore, each one-unit (one-year) increase in ‘highest year of school completed’ is associated with a predicted increase in ‘respondent’s income’ of 3948.706 constant dollars. In conclusion, there is a weak positive association in which each additional year of school completed corresponds to a rise in predicted income of about $3,948.71.
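The fitted line implied by Table 4 is: predicted income = -21258.539 + 3948.706 × (highest year of school completed). As a rough illustration, the sketch below evaluates this equation for two hypothetical respondents with 12 and 16 years of school; these input values and the function name are assumptions chosen for the example, not cases drawn from the dataset.

```python
INTERCEPT = -21258.539   # (Constant) from Table 4
SLOPE = 3948.706         # unstandardized B from Table 4

def predicted_income(years_of_school):
    """Predicted income in constant dollars from the fitted regression line."""
    return INTERCEPT + SLOPE * years_of_school

# Hypothetical inputs (assumption): 12 and 16 years of school completed.
income_12 = predicted_income(12)
income_16 = predicted_income(16)

print("12 years:", round(income_12, 2))
print("16 years:", round(income_16, 2))
print("difference for 4 extra years:", round(income_16 - income_12, 2))
```

The difference between the two predictions is four times the slope, as expected. Because the intercept is negative, the line returns negative incomes for fewer than about 5.4 years of school, so it should be read as an average trend rather than a reliable prediction at the low end of the scale.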

 

 

References

Frankfort-Nachmias, C., & Leon-Guerrero, A. (2017). Social statistics for a diverse society. Sage Publications.

Kutner, M. H., Nachtsheim, C. J., Neter, J., & Li, W. (2005). Applied linear statistical models (5th ed.). Boston: McGraw-Hill Irwin.

Wagner, W. E. (2012). Using IBM® SPSS® statistics for research methods and social science statistics. Sage Publications.