BUMAN201A Business Maths and Statistics Assessment: Project Assessment Answer
Task 1: Sales Sheet
The above pivot table and chart present the breakdown of revenue by region, further broken down by representative and product.
The above pivot table and chart present the same breakdown of revenue by region, representative and product. The difference from the previous section is that the numbers are shown as a percentage of total revenue.
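The percentage-of-total view can be sketched outside the spreadsheet as well. The example below is a minimal illustration only: the sales records, region names, representative names and product names are made up for the sketch and are not the assignment's data.

```python
from collections import defaultdict

# Hypothetical records (region, representative, product, revenue).
# These values are invented for illustration, not taken from the Sales sheet.
sales = [
    ("North", "Rep A", "Product 1", 1200.0),
    ("North", "Rep B", "Product 2", 800.0),
    ("South", "Rep C", "Product 1", 1500.0),
    ("South", "Rep C", "Product 3", 500.0),
]

# Sum revenue per region, as a pivot table with Region in the rows would.
revenue_by_region = defaultdict(float)
for region, _rep, _product, revenue in sales:
    revenue_by_region[region] += revenue

# Express each region's revenue as a share of total revenue.
total_revenue = sum(revenue_by_region.values())
share_by_region = {
    region: revenue / total_revenue
    for region, revenue in revenue_by_region.items()
}

for region, share in share_by_region.items():
    print(f"{region}: {share:.1%} of total revenue")
```

Grouping by representative or product instead only requires changing which field of each record is used as the dictionary key.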
The above pivot tables present the units sold and the revenue for each region, categorized by representative. A filter on the column labels has been used to create a separate table for each product.
The above pie charts present region-wise percentage of revenue for each of the three products.
The above pie charts present representative-wise percentage of revenue for each of the three regions.
The above data is useful in determining factors such as which product sells the most units, which product generates the most revenue, which region has the highest demand and for which product, and which representative is performing well. Such information can be further used for forecasting, appraisal of representatives, marketing strategy, etc.
Task 2: Weight of Tub Sheet
For calculating the various probabilities, the z score has been calculated using the formula z = (x - Mean)/SD. Since it is a normal distribution, the z table has been used to find the corresponding probabilities as follows:
- Using the given data, the average is 750.75 g while the standard deviation is 7.17 g.
- There is a 51.2% probability that the tub weight will be between 745 g and 755 g.
- There is a 16.5% probability that the tub weight will be less than 740 g or more than 760 g.
- There is a 1.4% probability that the tub weight will be less than 735 g.
- There is a 2.3% probability that the tub weight will be more than 765 g.
- As seen above, there is only a 51.2% probability that the tub weight is between 745 g and 755 g. Hence, the company is not meeting its target that at least 95% of the tubs fall in this range. The shortfall reflects the variation in tub weights, as seen in the high standard deviation. This may cause dissatisfaction among customers, as they will receive different quantities despite paying the same price. Additionally, the company's quality will come into question because of so much variation. The company must make efforts to streamline the process and achieve more precision in filling the tubs.
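The probabilities above can be cross-checked without a z table. The sketch below assumes the same mean and standard deviation and uses the error function from Python's standard library to evaluate the normal cumulative distribution; the function and variable names are chosen for this illustration only.

```python
from math import erf, sqrt

MEAN = 750.75  # average tub weight in grams, from the text
SD = 7.17      # standard deviation in grams, from the text


def normal_cdf(x, mean=MEAN, sd=SD):
    """Return P(X <= x) for a normal distribution with the given mean and sd."""
    z = (x - mean) / sd  # z score: (x - Mean)/SD
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


p_between = normal_cdf(755) - normal_cdf(745)          # 745 g to 755 g
p_outside = normal_cdf(740) + (1.0 - normal_cdf(760))  # below 740 g or above 760 g
p_below_735 = normal_cdf(735)                          # below 735 g
p_above_765 = 1.0 - normal_cdf(765)                    # above 765 g

for label, p in [
    ("Between 745 g and 755 g", p_between),
    ("Below 740 g or above 760 g", p_outside),
    ("Below 735 g", p_below_735),
    ("Above 765 g", p_above_765),
]:
    print(f"{label}: {p:.1%}")
```

Rounded to one decimal place, these values agree with the z-table results of 51.2%, 16.5%, 1.4% and 2.3%.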
Task 3: Hours of Attendance
(a) The correlation tool in Excel was used:
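It can be seen that the correlation between the two variables, namely hours of attendance and marks, is positive and moderately strong at 0.6534. This indicates that marks tend to increase as hours of attendance increase, and to decrease as hours of attendance decrease. The correlation coefficient measures only the strength and direction of the linear relationship; the change in marks per additional hour of attendance is given by the regression slope, not by the correlation.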
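A correlation coefficient is unit-free, so it cannot be read as "marks gained per hour"; that quantity is the slope of the fitted line. The sketch below illustrates the distinction with a small made-up dataset (the hours and marks are hypothetical, not the assignment's data): rescaling marks changes the slope but leaves the correlation unchanged.

```python
from math import sqrt


def _sums(xs, ys):
    """Return the sums of cross-products and squares about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient between xs and ys."""
    sxy, sxx, syy = _sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    sxy, sxx, _ = _sums(xs, ys)
    return sxy / sxx


# Hypothetical data for illustration only.
hours = [20, 25, 30, 35, 40, 45]
marks = [48, 40, 62, 55, 75, 70]

r = pearson_r(hours, marks)
b = slope(hours, marks)

# Express marks as a fraction of 100: the slope shrinks, r does not change.
marks_fraction = [m / 100 for m in marks]
r_rescaled = pearson_r(hours, marks_fraction)
b_rescaled = slope(hours, marks_fraction)

print(f"r = {r:.3f}, slope = {b:.3f}")
print(f"after rescaling: r = {r_rescaled:.3f}, slope = {b_rescaled:.5f}")
```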
(b) The regression tool in Excel was used:
The above output shows the same value for R as the correlation coefficient, that is, 0.653.
The regression equation is: y^ = 15.13 + 1.37 x1
- Dependent variable, Marks is represented as: y^
- Independent variable, Hrs of Attendance is represented as x1
The above equation has a constant of 15.13, which is also known as the Y-intercept. It is the predicted value of y when x is zero, that is, the point where the regression line crosses the vertical axis, and it is also known as ‘β0’ or ‘constant’. Also, the coefficient of x1 (1.37) is greater than zero, indicating that the relationship is positive, such that x and y increase or decrease together: each additional hour of attendance is associated with a predicted increase of about 1.37 marks.
The output also provides the coefficients and other statistical information, such as the p-value for each coefficient. These values help in determining whether a variable has a statistically significant relationship with the dependent variable.
Each p-value tests the null hypothesis that the corresponding coefficient is zero, that is, that the variable has no linear relationship with the dependent variable. The alternative hypothesis is that the coefficient is not zero. In the above output, the p-values for both the intercept (0.0013) and the x variable (0.0000) are less than the assumed significance level of 0.05, so both null hypotheses are rejected, indicating a statistically significant relationship between hours of attendance and marks.
(c) The regression equation is: y^ = 15.13 + 1.37 x1
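Now, if x = 44, y^ = 15.13 + 1.37*44 = 75.29. This figure appears to use the unrounded coefficients from the Excel output; the rounded coefficients shown here give 75.41.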
Hence, a student who attended 44 hours can be predicted to get 75.29 marks.
(d) The regression equation is: y^ = 15.13 + 1.37 x1
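Now, if x = 40, y^ = 15.13 + 1.37*40 = 69.82. This figure appears to use the unrounded coefficients from the Excel output; the rounded coefficients shown here give 69.93.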
Hence, a student who attended 40 hours can be predicted to get 69.82 marks.
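The two predictions can be reproduced with a short function. The sketch below uses the rounded coefficients quoted in the regression equation, so it returns 75.41 and 69.93 rather than the 75.29 and 69.82 given above; the small gap is consistent with the quoted answers having been computed from unrounded spreadsheet coefficients, which are not available here. The function name is chosen for this illustration.

```python
INTERCEPT = 15.13  # rounded Y-intercept from the regression equation
SLOPE = 1.37       # rounded coefficient of hours of attendance


def predict_marks(hours):
    """Predicted marks for a given number of hours of attendance."""
    return INTERCEPT + SLOPE * hours


for hours in (44, 40):
    print(f"{hours} hours -> {predict_marks(hours):.2f} marks")
```

Either set of figures leads to the same conclusion: a difference of four hours of attendance corresponds to a predicted difference of roughly 5.5 marks.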
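(e) In the above output, the ANOVA table helps in judging the reliability of the regression model. The F value is 132.62 while Significance F is 0.000, which is below the significance level of 0.05. Hence, it can be concluded that the regression model as a whole is statistically significant, and the above predictions can be treated as reasonable estimates. However, since R is 0.653, R squared is about 0.43, so attendance explains only about 43% of the variation in marks and individual predictions carry considerable uncertainty.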
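How much of the variation in marks the model accounts for follows directly from the reported R. The sketch below squares R to obtain R squared and, as a consistency check only, uses the identity F = R²/(1 - R²) × (residual degrees of freedom) for a regression with one predictor to back out the implied residual degrees of freedom. That last figure is derived here under this assumption and is not quoted from the Excel output.

```python
R = 0.6534    # correlation / R reported in the output
F = 132.62    # F value reported in the ANOVA table

# Share of the variation in marks explained by hours of attendance.
r_squared = R ** 2
unexplained = 1.0 - r_squared

# For simple linear regression, F = r_squared / (1 - r_squared) * df_residual.
# Rearranged to give the residual degrees of freedom implied by F and R.
implied_df_residual = F * unexplained / r_squared

print(f"R squared: {r_squared:.3f}")
print(f"Unexplained share of variation: {unexplained:.3f}")
print(f"Implied residual degrees of freedom: {implied_df_residual:.0f}")
```

A statistically significant F value therefore sits alongside a substantial unexplained share of variation, which is why other factors affecting marks are worth considering.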
(f) Some of the other factors that can impact marks are: number of hours studied, difficulty level of questions in the exam, and number of hours of sleep before the exam.
Task 4: Census
The selected region is Bondi Beach, NSW (State Suburb). The required graphic representations are:
- The age group table is as follows:
|85 years and over||178||11659|
Median position = 11659/2 = 5829.5. This position lies in the 30-34 years group.
Q1 step 1 = 11659/4 = 2914.75. This position lies in the 25-29 years group.
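Q1 step 2 = 25 + ((2914.75-2262)/1952)*5, where 5 is the class width, since the 25-29 years group covers ages 25 up to, but not including, 30
Q1 = 26.7 years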
Q3 step 1 = 3*(11659/4) = 8744.25. This position lies in the 45-49 years group.
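Q3 step 2 = 45 + ((8744.25-8676)/684)*5, where 5 is the class width, since the 45-49 years group covers ages 45 up to, but not including, 50
Q3 = 45.5 years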
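The two-step quartile calculation is linear interpolation within the age group that contains the quartile position. The sketch below uses the cumulative counts and group frequencies from the steps above. The class width is an assumption of the sketch: it is set to 5 on the basis that a band such as 25-29 years covers five single years of age (25 up to, but not including, 30). The function and parameter names are chosen for this illustration.

```python
def grouped_quantile(position, lower, cum_before, freq, width):
    """Interpolate a quantile inside the class that contains `position`.

    position   -- rank of the quantile in the ordered data (e.g. N/4 for Q1)
    lower      -- lower limit of the class containing that rank
    cum_before -- cumulative frequency of all classes below this one
    freq       -- frequency of this class
    width      -- class width
    """
    return lower + (position - cum_before) / freq * width


N = 11659        # total population, from the cumulative column
CLASS_WIDTH = 5  # assumption: each age band spans five single years of age

q1 = grouped_quantile(N / 4, lower=25, cum_before=2262, freq=1952, width=CLASS_WIDTH)
q3 = grouped_quantile(3 * N / 4, lower=45, cum_before=8676, freq=684, width=CLASS_WIDTH)

print(f"Q1 = {q1:.1f} years")
print(f"Q3 = {q3:.1f} years")
print(f"Interquartile range = {q3 - q1:.1f} years")
```

The result scales directly with the width chosen, so the same function can be rerun with a different value of `CLASS_WIDTH` if another class-boundary convention is preferred.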
(c) As can be seen, the data for age is normally distributed: