Econometrics Assignment


  1. Explain the following terms

Econometrics is the quantitative measurement and analysis of actual business and economic phenomena. It attempts to quantitatively bridge the gap between economic theory and the real world.

Uses of econometrics

It is used to quantify and measure marginal effects and to estimate numbers for theoretical equations. Econometrics is used to test hypotheses about economic theory and policy. It is also used to forecast future economic activity from past data; leaders are equipped to make decisions to the extent that econometrics can shed light on the future.

Purpose of a regression estimation.

It is used to assess how well the estimated equation fits the sample data and how closely it corresponds to the theoretical equation.

Meaning of estimated coefficients

  • Error: the term added to the regression equation to account for the variation in Y that is not explained by the included X variable(s).
  • R-squared and adjusted R-squared: the coefficient of determination (R-squared) is the ratio of the explained sum of squares to the total sum of squares. Adjusted R-squared measures the percentage of the variation of Y around its mean that is explained by the regression equation, adjusted for degrees of freedom.
  • Degrees of freedom: the excess of the number of observations over the number of estimated coefficients.

Possible uses of regression in econometrics

  • To predict and quantify cause-and-effect relationships between related variables.
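A minimal sketch of the R-squared and adjusted R-squared definitions above, using made-up sample data (the numbers are illustrative, not from the assignment):

```python
import numpy as np

# Hypothetical sample: Y regressed on a single X.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.9])

# Fit Y = b0 + b1*X by OLS.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

n, k = len(y), 1                     # k = number of slope coefficients
rss = np.sum(resid**2)               # unexplained (residual) sum of squares
tss = np.sum((y - y.mean())**2)      # total sum of squares
r2 = 1 - rss / tss                   # coefficient of determination
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # adjusted for degrees of freedom

print(round(r2, 4), round(adj_r2, 4))
```

Adjusted R-squared is never larger than R-squared, since the degrees-of-freedom correction penalizes additional coefficients.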

Reason(s) for an ‘insignificant’ result for an estimated coefficient of the parameter

A coefficient is judged insignificant when its p-value is greater than the chosen significance level (commonly 0.05), that is, when the calculated t-statistic is smaller in absolute value than the critical t-value. This can happen, for example, when the sample is small, the standard error is large, or the true effect is close to zero.

Reasons and conditions to use ‘2’ as a ballpark number for a critical ‘t’ value

A critical t-value of roughly 2 is a reasonable ballpark for a two-sided test at the 5-percent significance level whenever the degrees of freedom are moderately large (about 25 or more), since the critical value is then close to 2. The two-sided test is used to test whether an estimated coefficient is significantly different from zero; it can also test whether an estimated coefficient is significantly different from a specified value.

Confidence interval, its calculation and reasons to calculate it

It is a range of values that will contain the true value of beta a specified percentage of the time.

Confidence interval = beta hat (+/-) critical t-value * standard error of beta hat.

It tells how precise a coefficient estimate is.
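A small sketch of the confidence-interval formula, with a hypothetical slope estimate, standard error, and degrees of freedom:

```python
from scipy import stats

# Hypothetical regression output (illustrative numbers).
beta_hat = 0.75     # estimated slope
se_beta = 0.20      # standard error of the estimate
df = 28             # degrees of freedom = n - k - 1

# 95% two-sided critical t-value, then CI = beta_hat +/- t * SE(beta_hat).
t_crit = stats.t.ppf(0.975, df)
lower = beta_hat - t_crit * se_beta
upper = beta_hat + t_crit * se_beta
print(round(lower, 3), round(upper, 3))
```

A narrower interval (smaller standard error) signals a more precise coefficient estimate.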

P-value and its use in determining the reliability of the estimated coefficient

If the p-value is less than the level of significance and beta hat(k) has the sign implied by the alternative hypothesis, reject the null hypothesis. If either condition is not met, do not reject the null hypothesis.

Example displaying different uses of one-sided versus two-sided t-tests

One-sided t-test (LaHuis et al., 2014). Null hypothesis: beta[1] is less than or equal to zero versus the alternative hypothesis: beta[1] is greater than zero.

Two-sided t-test. Null hypothesis: beta[0] equals zero versus the alternative hypothesis: beta[0] is not equal to zero.
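A sketch contrasting the two tests for a hypothetical calculated t-value: here the one-sided test rejects at the 5-percent level while the two-sided test does not (all numbers are made up):

```python
from scipy import stats

t_stat = 1.90   # hypothetical calculated t-value for beta hat
df = 25

# Two-sided p-value: P(|T| > t) -- tests beta = 0 vs beta != 0.
p_two = 2 * stats.t.sf(abs(t_stat), df)

# One-sided p-value: P(T > t) -- tests beta <= 0 vs beta > 0.
p_one = stats.t.sf(t_stat, df)

print(round(p_one, 4), round(p_two, 4))
```

The two-sided p-value is exactly twice the one-sided one, which is why a result can be significant one-sided yet insignificant two-sided.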

Nature, role, and importance of F-test

It is designed to deal with a null hypothesis that contains multiple hypotheses, or a single hypothesis about a group of coefficients (Zietz et al., 2008). The F-test is also used to test the significance of seasonal dummies, i.e., the hypothesis of significant seasonality in the given data.

 

 

  1. Display a hypothetical result of a regression model with its accompanying details and explain, wherever possible, the role of each detail.

Revenue = 1550 - 0.556*Rent + error

Revenue is the dependent variable; it depends on the other variables.

1550 is the intercept on the y-axis: the expected revenue when rent is zero.

-0.556 is the coefficient of rent.

Rent is the explanatory variable: a one-unit increase in rent is associated with a 0.556-unit decrease in revenue.

The error term captures the variation in revenue that rent does not explain; it is assumed to be uncorrelated with rent (its realized value for one particular observation might be, say, -100).
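A minimal sketch generating hypothetical data from this very model and showing that OLS recovers the intercept and slope (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate the hypothetical model Revenue = 1550 - 0.556*Rent + error.
rent = rng.uniform(100, 1000, size=200)
error = rng.normal(0, 10, size=200)
revenue = 1550 - 0.556 * rent + error

# Recover the intercept and slope by OLS.
X = np.column_stack([np.ones_like(rent), rent])
b0, b1 = np.linalg.lstsq(X, revenue, rcond=None)[0]
print(round(b0, 1), round(b1, 3))
```

The estimates land close to 1550 and -0.556 because the simulated error term is random, has mean zero, and is uncorrelated with rent.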

  1. Explain the OLS method and its process to yield the BEST fitting line for the sample data

The technique is the one most commonly used to obtain estimates because it is reasonably easy to use. It calculates the beta hats by minimizing the sum of the squared residuals (Zietz et al., 2008). The method is a mathematical technique that, when applied, produces estimates of the true regression coefficients of a population. The beta hats produced by the process are the estimates that minimize the squared residuals summed over all sample data points. R-squared is then used to measure the fit.
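For a linear model the minimization has a closed-form solution, beta hat = (X'X)^(-1) X'y. A small sketch with made-up numbers, including a spot check that perturbing the estimates never lowers the sum of squared residuals:

```python
import numpy as np

# Small hypothetical sample (illustrative numbers).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.1, 5.9, 8.2, 9.8])

X = np.column_stack([np.ones_like(x), x])

# OLS in closed form: solve the normal equations (X'X) beta = X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

def rss(b):
    """Sum of squared residuals for candidate coefficients b."""
    return np.sum((y - X @ b) ** 2)

# Spot check the minimization claim: nudging either coefficient
# away from beta_hat can only increase the RSS.
for d in ([0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]):
    assert rss(beta_hat) <= rss(beta_hat + d)

print(np.round(beta_hat, 3))
```

This is only a sketch of the mechanics; in practice one would use a library routine such as `numpy.linalg.lstsq`, which is numerically more stable than inverting X'X directly.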

  1. Explain the nature, functions, and characteristics of the stochastic error term.
  • The error term is random; that is, it cannot be predicted.
  • Each observation’s value of the error term is determined by chance, and the realizations may be positive or negative.
  • The error term has a distribution with a mean of zero.
  • The independent variables are all uncorrelated with the error term.
  • If variables are correlated with the error term, mainly due to “omission of some vital independent variable correlated with an included variable,” the estimates will wrongly attribute some of the variation in Y to the included X. This causes bias in the coefficient estimate of X.
  • Observations of the error term are uncorrelated with each other. Systematic correlation among them makes the OLS estimates inaccurate (Zietz et al., 2008). Such correlation occurs mostly in time-series models.
  • The variance of the error term is constant. The assumption is that observations of the error term are drawn from identical distributions (LaHuis et al., 2014). If this is not met, the variance is non-constant (heteroskedastic).
  • The error term is normally distributed. Normality does not affect OLS estimation itself, only hypothesis testing and the confidence intervals.
  1. Derivation (without deriving), meaning and the interpretation of estimated (parameter) coefficients


  • Beta hat (0) is the estimated value of the constant term beta (0).
  • Beta hat (1) is the estimate of the coefficient beta (1). It measures the impact of a one-unit change in the independent variable on the dependent variable.
  • The difference between Y and Y hat is the residual. The residual can be thought of as an estimate of the error term.
  1. Display on a graph for a single regression equation, explained, unexplained and total error terms. And briefly explain possible uses of unexplained error terms in the OLS.
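In place of a graph, a numeric sketch (hypothetical data) of the same decomposition: the total variation of Y around its mean splits into an explained part and an unexplained (residual) part:

```python
import numpy as np

# Hypothetical sample.
x = np.array([2.0, 4.0, 6.0, 8.0, 10.0])
y = np.array([3.1, 5.2, 6.8, 9.1, 10.9])

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta

tss = np.sum((y - y.mean()) ** 2)      # total deviation of Y around its mean
ess = np.sum((y_hat - y.mean()) ** 2)  # explained by the regression
rss = np.sum((y - y_hat) ** 2)         # unexplained (residual)

print(round(tss, 3), round(ess, 3), round(rss, 3))
```

With an intercept in the equation, TSS = ESS + RSS exactly; the unexplained part is what the residual analysis and the R-squared fit measure work with.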

 

 

  1. Explain the factors that determine the quality of the regression equation.

Collecting the data that is required to quantify the model.

The linearity of the functional form, that is, whether the estimated relationship is a straight line.

The equation must be linear in the coefficients: each slope coefficient should enter the equation linearly.

  1. Explain the standard errors that an un-savvy researcher may commit in an empirical work using regression.

The researcher might omit important variables from the equation; this omission produces unexplained variation in the dependent variable. He might also make errors when measuring the dependent variable, thus changing the overall linear model (LaHuis et al., 2014). He might use a functional form different from the correct theoretical equation. Finally, researchers might add an unpredictable element that alters the given model.

  1. Explain the interplay of Economics and Statistics in the process of an empirical work based on regression.

Economics supplies the theoretical relationships; regression, a statistical technique, is used to quantify the relationships between one variable and other variables and to identify how close and well determined the estimated relationship is.

  1. Explain briefly the steps of an empirical work using regression.

  1. Examine the data structure to understand it fully.
  2. Check the sampling.
  3. Check for missing values.
  4. Study the qualitative background to form hypotheses while selecting key dependent variables.
  5. Select groups of explanatory variables that are the main causes, and compute the basic descriptive statistics: the mean, standard deviation, and variance, among others.
  6. Do measurement work to form models.
  7. Use local models to explore the relationships, using linear and logistic regression.
  8. Conduct partial correlation analysis to aid model specification.
  9. Propose structural equation models using the obtained results.
  10. Run initial fits using statistical software.
  11. Run diagnostics of the distributions, residuals, and curves.
  12. Final model estimation and explanation of the models.
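A compressed sketch of several of these steps (descriptive statistics, initial fit, residual diagnostics) on simulated data; the numbers and the simple linear specification are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data standing in for a collected sample.
x = rng.normal(50, 10, size=100)
y = 5 + 0.8 * x + rng.normal(0, 2, size=100)

# Basic descriptive statistics (mean, standard deviation).
desc = {"mean_x": x.mean(), "sd_x": x.std(ddof=1), "mean_y": y.mean()}

# Initial fit: simple linear model by OLS.
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Residual diagnostics: with an intercept, residuals average to zero.
print(np.round(beta, 2), round(resid.mean(), 6))
```

A real workflow would add the sampling checks, missing-value screening, and partial-correlation analysis listed above before trusting the final model.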
  1. Explain briefly the possible errors a researcher using regression may commit.

An omission of an important independent variable that is correlated with an included independent variable.

An assumption that a lower level of significance is always better. However, if the level of significance is set very low, the probability of a Type II error increases.

 

  1. Explain the nature and function of the OLS assumptions.
  2. The regression model is linear: the coefficients enter the model linearly, implying the functional form is correct with no omitted variables.
  3. The error term has a population mean of zero.
  4. The explanatory variables and the error term are uncorrelated, implying the explanatory variables are independent of the error term (LaHuis et al., 2014). This helps ensure the OLS estimates are unbiased.
  5. The variance of the error term is constant. This rules out heteroskedasticity, so the OLS estimates of the standard errors are accurate.

No explanatory variable is a perfect linear function of the other explanatory variables.

The observations of the error term are uncorrelated with each other.

The error term is normally distributed. OLS estimation does not require the normality assumption, but normality helps in hypothesis testing and in constructing confidence intervals.

  1. Explain the process of hypothesis testing for the parameters.

(i) State the hypotheses to be tested, before estimating the equation. Split the hypothesis into two: the null hypothesis and the alternative hypothesis.

(ii) Perform a two-tailed test to test an alternative hypothesis with values on both sides of the null hypothesis.

(iii) Use a standard technique to hypothesize an expected value for every coefficient, then determine whether to reject the null hypothesis. Two errors might occur: a Type I error (rejecting a null hypothesis that is true) or a Type II error (failing to reject a null hypothesis that is false). Alternatively, one can use the decision-rule method to decide whether to reject the null hypothesis: a sample statistic is compared with a critical value, dividing the range of possible values of the beta hats into two regions, an acceptance region and a rejection region.
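A sketch of the decision-rule method with hypothetical numbers: the calculated t-value falls in the rejection region because it exceeds the critical value:

```python
from scipy import stats

# Hypothetical regression output (illustrative numbers).
beta_hat = 0.9
se = 0.35
df = 30

t_calc = beta_hat / se              # calculated t-value for H0: beta = 0

# 5% two-sided critical value divides acceptance and rejection regions.
t_crit = stats.t.ppf(0.975, df)
reject = abs(t_calc) > t_crit

print(round(t_calc, 3), round(t_crit, 3), reject)
```

If `abs(t_calc)` had been below `t_crit`, the estimate would fall in the acceptance region and the null hypothesis would not be rejected.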

  1. Explain the three levels of testing to establish the complete ‘robustness’ of the results.

The t-test. It is used to test hypotheses about individual slope coefficients.

The confidence interval. It is a range of values that contains the true value of beta a certain percentage of the time. Confidence intervals are vital in telling how precise a coefficient estimate is.

The F-test is designed to handle a null hypothesis containing multiple hypotheses, or a single hypothesis about a group of coefficients. One translates the null hypothesis into constraints placed on the equation, then estimates the constrained equation with OLS. After the estimation, the fit of the constrained equation is compared with the fit of the unconstrained equation (LaHuis, 2014). The decision rule is to reject the null hypothesis if the calculated F-value is greater than the critical F-value.
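A sketch of the F-test procedure on simulated data: estimate the unconstrained and constrained (null-imposed) equations, compare their fits, and apply the decision rule; the data-generating numbers are illustrative:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 120
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(size=n)

def fit_rss(X):
    """OLS fit of y on X; return the residual sum of squares."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.sum((y - X @ beta) ** 2)

# Unconstrained: intercept + x1 + x2. Constrained by H0: beta1 = beta2 = 0.
rss_u = fit_rss(np.column_stack([np.ones(n), x1, x2]))
rss_c = fit_rss(np.ones((n, 1)))

q = 2                     # number of constraints under H0
k = 3                     # coefficients in the unconstrained model
f_stat = ((rss_c - rss_u) / q) / (rss_u / (n - k))
f_crit = stats.f.ppf(0.95, q, n - k)

print(round(f_stat, 2), round(f_crit, 2), f_stat > f_crit)
```

Imposing the constraints can only worsen the fit (rss_c >= rss_u); the F-statistic asks whether the worsening is too large to attribute to chance.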

  1. How are the Type I and Type II errors committed? What may be the issue(s) to reduce either one?
  2. A Type I error is committed when the null hypothesis is true (for example, beta is NOT positive) yet the researcher’s estimate leads to rejecting the null hypothesis.
  3. A Type II error is committed when the null hypothesis is false (for example, beta IS positive) yet the researcher’s estimate leads to not rejecting the null hypothesis.

Lowering the level of significance reduces the probability of a Type I error but increases that of a Type II error; the conventional 5-percent level is a common compromise between the two.

  1. Explain the source and derivation (w/o deriving) of the t-test, its use, and possible abuses (what it is not!).

A t-test is used to test hypotheses about individual slope coefficients. It is the most appropriate test whenever the stochastic error term is normally distributed and its variance must be estimated. The t-value for each estimated coefficient of a standard multiple regression equation is calculated (Cameron and Trivedi, 2013). The most commonly used t-statistic tests whether a particular regression coefficient is significantly different from zero. Comparing the calculated t-value with the critical t-value forms the basis on which we reject or do not reject the null hypothesis.

  • Setting up the desired null and alternative hypotheses.
  • Choosing an appropriate level of significance and hence a critical t-value.
  • Running the regression model and obtaining the calculated t-value.
  • Applying the decision rule by comparing the calculated t-value to the critical t-value, so as to decide whether to reject or not reject the null hypothesis.
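The four steps above, sketched as code for a hypothetical one-sided test of H0: beta <= 0 against HA: beta > 0 (all numbers made up):

```python
from scipy import stats

# Step 1: hypotheses are H0: beta <= 0 vs HA: beta > 0.
# Step 2: choose a significance level and hence a critical t-value.
alpha = 0.05
df = 40
t_crit = stats.t.ppf(1 - alpha, df)   # one-sided critical value

# Step 3: run the regression and obtain the calculated t-value
# (hypothetical estimate and standard error shown here).
beta_hat, se = 0.42, 0.18
t_calc = beta_hat / se

# Step 4: decision rule -- reject H0 if the calculated t exceeds the critical t.
reject = t_calc > t_crit
print(round(t_calc, 3), round(t_crit, 3), reject)
```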

Abuses

Using the t-test as if it tested the entire population, whereas it is designed to make inferences about the true value of a parameter from a sample of that population.

  1. How is the following dilemma explained: ‘There are insignificant estimated coefficients of the parameters in the equation, yet an F-test yields a robust result.’

One translates the null hypothesis into constraints that are then placed on the linear equation, estimates the constrained equation with OLS, and compares the fit of the constrained equation with the fit of the unconstrained equation. Individually insignificant coefficients alongside a significant F-test typically signal multicollinearity: the explanatory variables are jointly significant, but their high correlation inflates the individual standard errors and hence depresses the individual t-values.

 

References

Cameron, A. C., & Trivedi, P. K. (2013). Regression analysis of count data (Vol. 53). Cambridge University Press.

LaHuis, D. M., Hartman, M. J., Hakoyama, S., & Clark, P. C. (2014). Explained variance measures for multilevel models. Organizational Research Methods, 17(4), 433-451.

Osborne, J. W. (2014). Best practices in logistic regression. Sage Publications.

Zietz, J., Zietz, E. N., & Sirmans, G. S. (2008). Determinants of house prices: a quantile regression approach. The Journal of Real Estate Finance and Economics, 37(4), 317-333.