 Home
 Free Samples
 Statistics
 STAT6003 Statistics For Financial Dec...
STAT6003 Statistics for Financial Decisions: Report on Regression Analysis Assessment 4 Answer
ASSESSMENT BRIEF  
Subject Code and Title  STAT6003 : Statistics for Financial Decisions 
Assessment  Assessment 4 (individual assignment) – 30% 
Individual/Group  Group 
Length  2000 words (+/ 10%) 
Learning Outcomes 

Weighting  30% 
Total Marks  30 marks 
Context:
The main aims to develop students’ competency in statistical literacy for decision making in the local and global business environment. It reviews statistical techniques for the quantitative evaluation of data in Financial applications. Students will develop analytical and statistical skills to enable them to transform data into meaningful information for the purpose of decision making.
Objectives:
 To more broadly understand the statistical literacy for decision making.
 Interpret statistical results and communicate their statistical analysis in business reports.
Instructions:
This individual assignment requires you to apply statistical knowledge and skills learned from STAT6003 lectures between week 9, 10 and 11.
 You will specify a regression model for this assignment. This model can be based on a theory, several theories, your experience, and/or ideas.
 Please use Excel for statistical analysis in this assignment. Relevant Excel statistical output must be properly analysed and interpreted.
 Please provide a number for every table, graph or figure used and make clear reference to the table/graph/figure in your discussion.
 The word limit for this group assignment is 2,000 words (excluding Excel output).
 This assignment is due in week 12 by Wednesday 11:55 pm.
Submit copy of presentation Report in .docx, or .pdf format via the Assessment link in the main navigation menu in STAT6003. The Learning Facilitator will provide feedback with reference to the criteria below via the Grade Centre in the LMS portal. Feedback can be viewed in My Grades.
Assignment tasks:
The variables for this assignment are as follows: House Price Index (a)(b): Brisbane, Sydney and Melbourne, 2002–03 to 2016–17.
V1) Market Price ($000) V2) Sydney price Index V3) Annual % change
V4) Total number of square meters V5) Age of house (years)
1) Module 5 topic – Regression Analysis
You will specify a regression model for this assignment. This model can be based on a theory, several theories, your experience, and/or ideas from research article(s). Suggest you consider a regression model that is of interest to you or one that is related to your profession or one that you have knowledge about.
 Using Ordinary Least Square (OLS), estimate the model (below is a template for developing your regression model):
Y = b_{0} + b_{1} X_{1} + b_{2} X_{2} + b_{3} X_{3} + b_{4} X_{4} + e.
In your model, there must be one dependent variable and four independent variables
 For statistical analysis involving any hypothesis test in this assignment, you are required to:
 Formulate the null and alternative hypotheses.
 State your statistical decision using significant value (��) of 5% for each test.
 State your conclusion in context.
Assignment tasks:
 Provide an introduction section on the rationale of your model , sample size, and the dependent and independent variables (including their unit of measurement) in this model.
 Plot the dependent variable against each independent variable using scatter plot/dot function in Excel. Describe the relationship from the plots.
 Present the full model in your assignment.
 Write down the least squares regression equation and correctly interpret the equation.
 Interpret the estimated coefficients of the regression model and discuss their sig values.
 What is the value of the coefficient of determination for the relationship between the dependent and independent variables. Interpret this value accurately and in a meaningful way.
 State the 95% confidence intervals for each parameters and interpret these intervals.
 Estimate the linear regression model to investigate the relationship between the market price and the land size in total number of square meters.
 Compare the original model (question 1) and reestimated model (question 2) and evaluate the goodness of fit between them (Hint: Use R^{2} and Coefficient of determination to evaluate the goodness of fit of the model).
 Predict the market price of a house (in $) with a building area of 400 square meters.
This assignment is due in week 12 by Wednesday 11:55 pm. Please submit your assignment in Word document in the main navigation menu in STAT6003. The Learning Facilitator will provide feedback with reference to the criteria below via the Grade Centre in the LMS portal. Feedback can be viewed in My Grades.
Answer
Statistics Assignment
Report on Regression Analysis
1. Introduction
The following pages determine a regression model for the given data related to market prices for houses in Sydney. The variables included for analysis are: Market Price (in $000), Sydney price Index, Annual percentage change, Total number of square meters and Age of house (in years). Furthermore, the sample size is 18.
As can be seen from above, there are multiple variables and hence, a multiple variable regression model will be appropriate. Further, the Ordinary Least Square technique, popularly known as OLS, will be used to arrive at an appropriate regression model for provided data.
This model will have one dependent variable the value of which will be predicted through regression model. This dependent variable is the Market Price (in $000). The remaining variables will be independent variables that impact the value of dependent variable, that is, Market Price (in $000). The independent variables include: Sydney price Index, Annual percentage change, Total number of square meters and Age of house (in years).
The Sydney price index is based on base value of 100 in 201112 and calculated accordingly. The Annual percentage change is in percentage form and reflects the increase or decrease in market prices in yearonyear basis. The Total number of square meters reflects the size of the land on which house is built and it is measured in square meters. The age of the house reflects the number of years since the house was built.
The data for all the variables has been sourced from ABS6416.0, Residential Property Price Indexes for the Three Capital casitas.
To perform the regression analysis, Microsoft Excel will be utilised in order to generate the regression output.
2. Scatter Plots
The following graphs were generated using Microsoft Excel and represent relationship between the dependent variable and each of independent variables:
The above graph is a scatter plot for Sydney Price Index versus the Market Price. The independent variable (x) is the Sydney Price Index while the dependent variable (y) is the Market Price. It can be seen that there is a positive linear relationship between the two variables such that as the value of Sydney Price Index increases, the market price also increases.
The above graph is a scatter plot for Annual Percentage Change versus the Market Price. The independent variable (x) is the Annual Percentage Change while the dependent variable (y) is the Market Price. It can be seen that there is a positive linear relationship between the two variables such that as the value of Annual Percentage Change increases, the market price also increases. The trend line slope is not as steep as in case of Sydney Price Index, indicating that this variable has relatively lesser impact.
The above graph is a scatter plot for Total number of Square Meters versus the Market Price. The independent variable (x) is the Total number of Square Meters while the dependent variable (y) is the Market Price. It can be seen that there is a positive linear relationship between the two variables such that as the value of Total number of Square Meters increases, the market price also increases. The data is somewhat concentrated indicating close range values of the variables.
The above graph is a scatter plot for Age of House (years) versus the Market Price. The independent variable (x) is the Age of House (years) while the dependent variable (y) is the Market Price. It can be seen that there is a negative linear relationship between the two variables such that as the value of Age of House (years) increases, the market price decreases and vice versa. This can be related to real life where the older house will fetch a lower price and a newly constructed house will fetch a higher price.
3. Regression Model
As discussed earlier, Microsoft Excel was used to generate Regression output as follows:
SUMMARY OUTPUT  
Regression Statistics  
Multiple R  0.900  
R Square  0.810  
Adjusted R Square  0.751  
Standard Error  45.001  
Observations  18  
ANOVA  
Df  SS  MS  F  Significance F  
Regression  4  1,12,068.128  28,017.032  13.835  0.000  
Residual  13  26,326.316  2,025.101  
Total  17  1,38,394.444  
Coefficients  Standard Error  t Stat  Pvalue  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%  
Intercept  531.463  80.386  6.611  0.000  357.801  705.126  357.801  705.126  
Sydney price Index  2.264  0.560  4.043  0.001  1.054  3.473  1.054  3.473  
Annual % change  6.310  3.060  2.062  0.060  12.919  0.300  12.919  0.300  
Total number of square meters  0.507  0.315  1.607  0.132  0.174  1.187  0.174  1.187  
Age of house (years)  2.637  1.146  2.301  0.039  5.113  0.161  5.113  0.161 
4. Regression Equation
We can say that the Market Price (‘000) is regressed on Sydney price Index, Annual percentage change, Total number of square meters and Age of house (in years).
From the above output, we can interpret the least squares regression equation as follows:
y^{^ }= 531.46 + 2.264 x_{1} 6.310 x_{2} + 0.507 x_{3} – 2.637 x_{4}
where,
 Dependent variable, Market price ($000) is represented as: y^{^ }
 Independent variable, Sydney price index is represented as x_{1}
 Independent variable, Annual percentage change is represented as x_{2}
 Independent variable, Total number of Square Meters is represented as x_{3}
 Independent variable, Age of House (years) is represented as x_{4}
5. Coefficients & Sig Values
Coefficients
The above equation has a constant of 531.46 which is also known as Yintercept coefficient. It is the minimum value for y even when all x values are zero. It is the point where regression line crosses the vertical axis and is also known as ‘β_{0}’ or ‘constant’.
x_{1} and x_{3} are greater than zero indicating that the relationship is positive such that x and y increase or decrease together.
x_{2} and x_{4} are lesser than zero indicating that the relationship is negative such that as x increases, y decreases and vice versa.
Sig Value
The output above provides coefficients as well some other statistically significant information, such as pvalues for each of the coefficients. These values help in determining whether the variable has statistically significant relationship with the dependent variable or not.
Each of the pvalue entails a null hypothesis that the variable has no correlation with the dependent variable. The alternative hypothesis is that the variable has correlation with the dependent variable. This null hypothesis can be rejected if the pvalue for a coefficient is less than the significance level in consideration. Hence, in such a case, we can conclude that the particular variable has a correlation with the dependent variable.
In our case, the significance level is 0.05. The pvalues for various coefficients are:
 Intercept: 0.000, which is less than 0.05. Hence, the variable is statistically significant.
 x_{1} : 0.001, which is less than 0.05 Hence, the variable is statistically significant.
 x_{2} : 0.060, which is greater than 0.05. Hence, the variable is not statistically significant.
 x_{3} : 0.132, which is greater than 0.05. Hence, the variable is not statistically significant.
 x_{4 }: 0.039, which is less than 0.05. Hence, the variable is statistically significant.
6. Coefficients of Determination
The values of R and R^{2} help in determining the degree of relationship between independent and dependent variables. These are also called as Coefficients of Determination.
In the given case, R value is 0.900 which indicates that there is high degree of correlation between the variables in consideration. R^{2} is at 0.810 with adjusted R^{2} being 0.751. The value of R^{2} is nothing but the proportion of change in dependent variable that can be explained through the independent variables. As can be seen, at 81.0%, a high proportion of change in value of ‘Market Price’ can be attributed to change in independent variables, namely, Sydney price Index, Annual percentage change, Total number of square meters and Age of house (in years).
7. Confidence Intervals
The confidence intervals for various parameters help in determining the change in values of dependent variable with a unit change in the parameter. In the given case, the confidence interval for various parameters can be stated as:
 x_{1} : We can be 95% confident that with each increase in Sydney price index, the market price ($000) will increase between 1.054 and 3.473.
 x_{2} : We can be 95% confident that with each decrease in Annual % change, the market price ($000) will increase between 12.919 and 0.300.
 x_{3} : We can be 95% confident that with each increase in Total number of Square meters, the market price ($000) will increase between 0.174 and 1.187.
 x_{4 }: We can be 95% confident that with each increase in Age of House (years), the market price ($000) will decrease between 5.113 and 0.161.
Points that should be noted in above statements are:
 In each case, the first value is Lower 95% confidence level and the second number is Upper 95% confidence level.
 The change is in $000. Hence, for example, a change of 1.054 indicates an increase of $1,054.
8. Market Price versus Land Size
The following is the output when regression is done for single variable, that is, size in total square meters.
SUMMARY OUTPUT  
Regression Statistics  
Multiple R  0.30  
R Square  0.09  
Adjusted R Square  0.03  
Standard Error  88.83  
Observations  18  
ANOVA  
Df  SS  MS  F  Significance F  
Regression  1  12,150.69  12,150.69  1.54  0.23  
Residual  16  1,26,243.76  7,890.23  
Total  17  1,38,394.44  
Coefficients  Standard Error  t Stat  Pvalue  Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%  
Intercept  659.26  110.93  5.94  0.00  424.10  894.42  424.10  894.42 
Total number of square meters  0.64  0.52  1.24  0.23  0.46  1.74  0.46  1.74 
We can say that the Market Price (‘000) is regressed on Total number of square meters. From the above output, we can interpret the least squares regression equation as follows:
y^{^ }= 659.26 + 0.64 x_{1}
where,
 Dependent variable, Market price ($000) is represented as: y^{^ }
 Independent variable, Total number of Square Meters is represented as x_{1}
It can be seen above that Total number of square meters explains around 9% of change in the dependent variable, that is, market price. Further, when we see the pvalue for this variable, it is 0.23 which is greater than 0.05. Hence, we are unable to reject the null hypothesis that there is no correlation between the selected variables. In other words, this is not a statistically significant variable for our purpose at the significance level of 0.05.
9. Reestimated Model
Basis above analysis, we can compare the two regression models:
In case of Model 1, R value is 0.900 which indicates that there is high degree of correlation between the variables in consideration. R^{2} is at 0.810 with adjusted R^{2} being 0.751. The value of R^{2} is nothing but the proportion of change in dependent variable that can be explained through the independent variables. As can be seen, at 81.0%, a high proportion of change in value of ‘Market Price’ can be attributed to change in independent variables, namely, Sydney price Index, Annual percentage change, Total number of square meters and Age of house (in years).
In case of Model 2, R value is 0.300 which indicates that there is low degree of correlation between the variables in consideration. R^{2} is at 0.09 with adjusted R^{2} being 0.03. As can be seen, at 9.0%, a very low proportion of change in value of ‘Market Price’ can be attributed to change in independent variable, Total number of square meters.
Hence, it can be seen that Model 1 is a much better fit as compared to Model 2.
10. Prediction of Market Price
From above analysis, the regression equation in case of one variable that is, total square meters is:
y^{^ }= 659.26 + 0.64 x_{1}
where,
 Dependent variable, Market price ($000) is represented as: y^{^ }
 Independent variable, Total number of Square Meters is represented as x_{1}
When value of independent variable is given at 400 square meters, we can plug the value of x and predict a value of y as follows:
y^{^ }= 659.26 + 0.64 x_{1}
y^{^ }= 659.26 + 0.64 (400)
y^{^ }= 659.26 + 256
y^{^ }= 915.26
Hence, value of dependent variable is 915.26. In other words, when the total size is 400 square meters, the market price of the house in Sydney is expected to be $915,260.