ASSESSMENT DETAILS
Plagiarism occurs when a student passes off as the student’s own work, or copies without acknowledgement as to its authorship, the work of any other person or resubmits their own work from a previous assessment task.
Collusion occurs when a student obtains the agreement of another person for a fraudulent purpose, with the intent of obtaining an advantage in submitting an assignment or other work.
Work submitted may be reproduced and/or communicated by the university for the purpose of assuring academic integrity of submissions: https:// support/referencing/academic-integrity
Using aggregation functions for data analysis
The provided zip file contains the data file [Energy20.txt] and the R code [AggWaFit718.R] to use with the following tasks; include these in your R working directory.
Total Marks 100, Weighting 20%
Energy Prediction of Domestic Appliances Dataset
The given dataset, "Energy20.txt", can be used to create models of energy use of appliances in an energy-efficient house. The dataset provides the Energy use of appliances (denoted as Y) for 671 samples. It is a modified version of data used in the study [1]. The dataset includes 5 input variables, denoted as X1, X2, X3, X4, X5, and the output variable Y, described as follows:
X1: Temperature in kitchen area, in Celsius
X2: Humidity in kitchen area, given as a percentage
X3: Temperature outside (from weather station), in Celsius
X4: Humidity outside (from weather station), given as a percentage
X5: Visibility (from weather station), in km
Y: Energy use of appliances, in Wh
Assignment Tasks
Understand the data [20 marks]
the.data <- as.matrix(read.table("Energy20.txt"))
The variable of interest is Energy use of appliances (Y). To investigate Y, generate a subset of 350 samples, e.g. using:
my.data <- the.data[sample(1:671,350),c(1:6)]
Using scatter plots and histograms, report on the general relationship between each of the variables X1, X2, X3, X4, X5 and the variable of interest Y. Include 5 scatter plots, 6 histograms, and 1 or 2 sentences for each of the variables, including the variable of interest Y.
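The plotting step above could be sketched as follows. This is only an illustration: it uses a synthetic random matrix as a stand-in for "Energy20.txt" so that it runs on its own, and the names var.names and the 2-by-3 panel layout are assumptions, not requirements of the task.

```r
# Synthetic stand-in for the real data (assumption); in practice use
# the.data <- as.matrix(read.table("Energy20.txt"))
set.seed(1)
the.data <- matrix(runif(671 * 6), nrow = 671, ncol = 6)

# Random subset of 350 rows, all six columns (X1..X5, Y)
my.data <- the.data[sample(1:671, 350), c(1:6)]
var.names <- c("X1", "X2", "X3", "X4", "X5", "Y")  # hypothetical labels

# Five scatter plots: each input variable against Y (column 6)
par(mfrow = c(2, 3))
for (i in 1:5) {
  plot(my.data[, i], my.data[, 6],
       xlab = var.names[i], ylab = "Y",
       main = paste(var.names[i], "vs Y"))
}

# Six histograms: one per variable, including Y
par(mfrow = c(2, 3))
for (i in 1:6) {
  hist(my.data[, i], xlab = var.names[i],
       main = paste("Histogram of", var.names[i]))
}
```

With the real data, the scatter plots indicate the direction and strength of each relationship with Y, and the histograms show skew or outliers that may motivate a transformation.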
Transform the data [10 marks]
write.table(your.data,"name-transformed.txt")
where “name” is replaced with your name - you can use your surname or first name.
Briefly explain the transformations applied for the selected four variables and the variable of interest (1-2 sentences each).
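As one possible illustration of such transformations, the sketch below applies min-max scaling to four inputs and a log transform followed by min-max scaling to Y. The choice of variables (X1, X2, X3, X5), the helper name minmax, and the synthetic data are all assumptions made for the example; the appropriate transformations depend on what the histograms of the real data show.

```r
# Min-max scaling to the unit interval [0, 1] (hypothetical helper)
minmax <- function(x) (x - min(x)) / (max(x) - min(x))

# Synthetic stand-in for my.data (assumption): 5 inputs and a
# right-skewed, strictly positive Y in column 6
set.seed(2)
my.data <- cbind(matrix(runif(350 * 5, 10, 90), ncol = 5),
                 rexp(350, 1 / 100) + 10)

# Hypothetical selection: keep X1, X2, X3, X5 and Y
your.data <- my.data[, c(1, 2, 3, 5, 6)]

# Inputs: linear rescaling to [0, 1]
for (j in 1:4) {
  your.data[, j] <- minmax(your.data[, j])
}

# Y: log to reduce right skew, then rescale to [0, 1]
your.data[, 5] <- minmax(log(your.data[, 5]))
```

Keeping a record of each variable's minimum and maximum (and of any log or power applied) makes it possible to transform new inputs the same way and to map predictions back to Wh.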
source("AggWaFit718.R")
Use the fitting functions to learn the parameters for
Use your model for prediction [20 marks]
Using your best fitting model, predict the Energy use of appliances for the following input X1=17; X2=39; X3=4; X4=77; X5=32.
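A prediction of this kind usually involves three steps: transform the new input exactly as the training data was transformed, apply the fitted aggregation function, and invert the transformation of Y. The sketch below illustrates this with a plain weighted arithmetic mean. The weights, the training ranges, and the selection of X1, X2, X3, X5 are hypothetical placeholders, and the code does not call any function from AggWaFit718.R.

```r
# Hypothetical training ranges recorded during the transformation step
x.min <- c(X1 = 14, X2 = 30, X3 = -5, X5 = 1)
x.max <- c(X1 = 26, X2 = 60, X3 = 25, X5 = 66)
y.min <- 10    # hypothetical minimum of Y in Wh
y.max <- 200   # hypothetical maximum of Y in Wh

# New input from the task, restricted to the (hypothetically) selected variables
new.x <- c(X1 = 17, X2 = 39, X3 = 4, X5 = 32)

# Step 1: apply the same min-max transformation as for the training data
new.t <- (new.x - x.min) / (x.max - x.min)

# Step 2: aggregate with hypothetical fitted weights (non-negative, sum to 1)
w <- c(0.4, 0.1, 0.3, 0.2)
y.t <- sum(w * new.t)

# Step 3: invert the min-max transformation of Y to return to Wh
y.pred <- y.t * (y.max - y.min) + y.min
print(y.pred)
```

If Y was also log-transformed or raised to a power before scaling, that step must be inverted as well (for example with exp()) after undoing the min-max scaling.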
Comparing with a linear regression model [20 marks]
Linear regression is used to predict the value of an outcome variable Y based on one or more input predictor variables X. The equation is
The built-in function lm() is used to fit linear models in R.
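A minimal sketch of fitting and using a linear model with lm() is given here. The data frame df, its column names, and the coefficients used to generate it are synthetic assumptions so that the example runs on its own; with the real data, the transformed variables would take their place.

```r
# Synthetic data (assumption): four inputs on [0, 1] and a noisy linear Y
set.seed(3)
n <- 350
df <- data.frame(X1 = runif(n), X2 = runif(n), X3 = runif(n), X4 = runif(n))
df$Y <- 0.5 * df$X1 + 0.2 * df$X2 + 0.1 * df$X3 + 0.2 * df$X4 +
  rnorm(n, sd = 0.05)

# Fit Y as an intercept plus a weighted sum of the four predictors
fit <- lm(Y ~ X1 + X2 + X3 + X4, data = df)
print(summary(fit))   # coefficients, standard errors, R-squared

# Predict Y for one new (already transformed) input
new.point <- data.frame(X1 = 0.25, X2 = 0.3, X3 = 0.3, X4 = 0.5)
pred <- predict(fit, newdata = new.point)

# Root mean squared error on the training data, for comparison with other models
rmse <- sqrt(mean(residuals(fit)^2))
```

Computing the same error measure for both the linear model and the aggregation-function model gives a like-for-like basis for the comparison.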
All supporting information should be presented in the pdf report. It will be assessed for style and grammar, professional presentation of figures, tables and references. List and quote in the text the references used, including books, articles and web resources.
Use the Harvard style: https://www.deakin.edu.au/students/studying/study-support/referencing/harvard
Submit to the SIT718 CloudDeakin Dropbox. Your final submission must include the following three files:
1. A report, "name-report.pdf", in pdf format (created in any word processor), covering all of the items above (where “name” is replaced with your name - you can use your surname or first name). The report must not exceed 8 pages in total.
2. A data file named "name-transformed.txt" (where “name” is replaced with your name - just to help us distinguish them!).
3. The R code file (that you have written to produce your results) named "name-code.R" (where “name” is replaced with your name - you can use your surname or first name).