MAT10251 STATISTICAL ANALYSIS
This project leads you through a statistical analysis of residential property data from a given non-capital city or town in Australia.
The data for this project was obtained from http://www.realestate.com.au/buy during January 2020.
Part A covers parts of Topics 1 and 2; Part B covers parts of Topics 5 to 9.
You will need to work on this project throughout Session 1.
Project Data
The data for this project can be accessed from the MySCU site for MAT10251 in Task 2 – Project in Assessment Tasks and Submission under ASSESSMENT.
The data set provided contains 10 randomly chosen samples of size 100.
To obtain your data:
(1) Click on the Project Data file. This will download an Excel file.
(2) Select the 4 columns (Price $000 to Type) of data for the sample specified by the last digit of your student ID number.
(3) Copy this into a new Excel file.
There are 10 sample data sets, each of four (4) columns (Price $000 to Type).
Your sample number matches the last digit of your SCU student ID number. For example, if your student ID number ends in 2 your sample is Sample 2 and you will be analysing residential property data from Gold Coast Queensland in columns K to N and cells K2:N102.
Project Situation
Your statistical analysis of residential property data is to enable you to answer questions from a relative who is seeking to buy a property in the particular town or city of your sample and has asked you for information and advice.
In each part of the project you are required to analyse your sample data in response to given questions and provide a written answer. You can assume that each written answer is a part of a letter or email to your relative.
Project Preparation
You are expected to use Excel when completing the project.
Your written answers presenting findings and conclusions should be considered as a part of a letter or email to your relative. Each written answer should be a Word document into which your Excel output has been copied.
In addition, your statistical workings for Part B should appear as appendices to your written answer. These should include all necessary steps and appropriate Excel output.
Each part of the project should be submitted as a SINGLE Word document, with appropriate Excel output added.
Note:
- You should not need to read beyond the study guide and textbook to complete the project.
Project Submission
- Each part of the project should be submitted as a SINGLE Word file with Excel output.
- The given cover sheets should be the first pages of your submitted project and are not part of the page limit.
- DO NOT submit your appendices for Part B, which are not part of the page or word limit, as a separate file.
- Ensure that the page setup of your submitted document is A4 Portrait, with an appropriate format so that it is easily readable if printed.
- Use line spacing of at least 1.5.
- Please name your file “Family Name_First Name_Part_A/B_Campus”, for example: Jayne_Nicola_Part_A_Lismore
Penalties
Incorrect Sample
- If you use a sample that does not correspond to the last digit of your student ID number, to be entered on the cover sheet, a maximum of two marks may be deducted, as this causes the marker extra work and frustration.
Incorrect Format
- If the page setup of your submitted Word file is not as required (that is, A4 Portrait, with appropriate format so that it is easily readable if printed, and at least 1.5 line spacing), or your project is not submitted as a single Word document, a maximum of two marks may be deducted, as this causes the marker extra work and frustration.
- If your submitted file is not a Word file, for example it is a pdf or a zip file, a maximum of two marks may be deducted, as this causes the marker extra work and frustration.
- In addition, if your file is not named as requested, or the required cover sheets are not included or correctly completed, a maximum of two marks may also be deducted, as this can cause the marker extra work and frustration.
MAT10251 STATISTICAL ANALYSIS PROJECT – PART A
Due: Week 4, Tuesday 24 March 2020
If you are a late enrolment in MAT10251, email Nicola Jayne (nicola.jayne@scu.edu.au) with the date you enrolled in MAT10251 for a revised due date.
Value: 10%
Objectives: 1 to 5
Topics: 1 and 2
Purpose: To
- introduce you to the project data, situation and Excel
- use Excel to graph data and calculate statistics
- interpret and communicate Excel results
Part A Preliminary Analysis of Sample Data
Your relative is interested in buying a property in the town or city specified by your sample and asks you to obtain information on the property prices in this location.
Your relative is considering purchasing a two or three bedroom property, as they are either downsizing since their children have left home or they are new home buyers. Therefore, they are interested in the typical price of two and three bedroom properties. They are also interested in the difference in price between units and houses and the relationship between number of bedrooms and price.
Tasks – Part A Submission
Complete the following:
1) Download and save your data.
2) Download the Project Part A cover sheets, name and save this file as “Family Name_First Name_Part_A_Campus”.
3) Enter your Sample Number on page 2 of the Part A coversheets.
4) Statistical Output: For your sample perform the following tasks:
a) Price of two and three bedroom properties
Use Price $000 (1st column of data) for two and three bedroom residential properties for sale to explore the typical price of a two or three bedroom residential property, by using Excel to:
- Construct a frequency histogram or polygon for the price of two and three bedroom residential properties.
- Calculate descriptive statistics for the price of two and three bedroom residential properties.
Notes:
- The required data for two and three bedroom residential properties are in the first rows of your sample.
- Analyse the price of two and three bedroom properties together. Do NOT separate into two or three bedrooms or into units or houses.
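All submitted output must come from Excel. Purely as an optional cross-check of task a), the same quantities can be sketched in Python. The prices below are made-up illustrative values (not project data), and the class start of 250 and class width of 50 are assumptions chosen only for this example.

```python
import statistics

# Made-up prices in $000 for two and three bedroom properties (illustrative only).
prices = [285, 310, 299, 345, 420, 275, 330, 365, 305, 390]

n = len(prices)
mean_price = statistics.mean(prices)
median_price = statistics.median(prices)

# stdev divides by n - 1 (sample statistic); pstdev divides by n (population parameter).
sample_sd = statistics.stdev(prices)
population_sd = statistics.pstdev(prices)

# Frequency counts for adjacent equal-width classes, as used for a histogram.
lowest_class_start = 250   # assumed for this example
class_width = 50           # assumed for this example
class_counts = {}
for price in prices:
    class_start = lowest_class_start + ((price - lowest_class_start) // class_width) * class_width
    class_counts[class_start] = class_counts.get(class_start, 0) + 1

for class_start in sorted(class_counts):
    print(f"{class_start} to under {class_start + class_width}: {class_counts[class_start]}")
```

The sample and population standard deviations differ, which is why the marking criteria ask you to check which one you are reporting.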
b) Difference between unit and house prices
Use Price $000 (1st column of data) and Type (4th column of data) for all 100 residential properties for sale to explore the difference between unit and house prices, by using Excel to:
- Construct separate boxplots, on the same plot or separately, for house prices and for unit prices.
Hint: Sort data on Type to obtain two samples: one for house prices and the other for unit prices.
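A boxplot is drawn from a five-number summary of each group. As an optional illustration only (Excel output is still what must be submitted), the split by Type and the summaries could be sketched as below. The data are made up, the labels "House" and "Unit" are assumed spellings of the Type values, and quartile conventions differ between tools, so the quartiles may not match Excel or PhStat exactly.

```python
import statistics

# Made-up (price in $000, type) pairs (illustrative only, not project data).
properties = [
    (450, "House"), (520, "House"), (610, "House"), (480, "House"), (700, "House"),
    (300, "Unit"), (340, "Unit"), (280, "Unit"), (390, "Unit"), (320, "Unit"),
]

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) using the 'inclusive' quartile method."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)

# Splitting on Type plays the same role as sorting on Type in Excel.
house_prices = [price for price, kind in properties if kind == "House"]
unit_prices = [price for price, kind in properties if kind == "Unit"]

house_summary = five_number_summary(house_prices)
unit_summary = five_number_summary(unit_prices)
print("Houses:", house_summary)
print("Units: ", unit_summary)
```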
c) Relationship between number of bedrooms and price
Explore the relationship between number of bedrooms and price using Number of Bedrooms (2nd column of data) as the independent variable and Price $000 (1st column of data) as the dependent variable for all 100 properties by using Excel to:
- Construct a scatter plot for number of bedrooms and price.
- Calculate the correlation coefficient for number of bedrooms and price.
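For reference, the correlation coefficient that Excel reports can be reproduced from its definition. The sketch below is an optional cross-check only; the bedroom counts and prices are made-up values, and `pearson_r` is a helper name introduced here for illustration.

```python
import math

# Made-up data (illustrative only): bedrooms is the independent variable, price the dependent.
bedrooms = [1, 2, 2, 3, 3, 4, 4, 5]
prices = [250, 300, 320, 380, 400, 470, 520, 600]   # $000

def pearson_r(x, y):
    """Pearson correlation coefficient computed from sums of squared deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(bedrooms, prices)
print(f"r = {r:.4f}")
```

A value of r close to +1 indicates a strong positive linear relationship, but the scatter plot should still be inspected for shape.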
5) Written Answer – Email or letter
Using the instructions given on pages 4 and 5 of the Part A coversheets, introduce your data and the results of your preliminary investigation of residential property prices.
This should be three to five pages and 400 to 800 words.
Use an appropriate style, without statistical jargon and equations, to clearly communicate your results.
6) Complete Coversheets 1 and 2, save and submit Part A of the project online using the Project Part A link in Submit Project by the due date, Tuesday 24 March 2020.
Marking Criteria – Part A
Read the marking criteria carefully and consider them when preparing your Part A Submission.
See the marking and feedback sheet, page 3 Part A coversheets, for allocation of marks.
Statistical Calculations
- To obtain full marks your graphs and plots must be correct, including correct labels on both axes and a title.
Marks will be deducted if:
- Graph or plot incorrect. Examples:
  - Gaps between classes of non-zero frequency in a histogram for continuous data.
  - Line graph instead of a histogram.
  - Incorrect independent and dependent variables in a scatter plot.
- Excel, PhStat, Excel Workbooks, or similar, is not used.
- Axes incorrectly or not labelled.
- No title.
- For a histogram or frequency polygon inappropriate classes are used.
- Scale on axes distorts graphs.
- To obtain full marks for descriptive statistics copy the output table of the Descriptive Statistics command in Data Analysis or the Descriptive Summary and/or Boxplot command in PhStat or Descriptive workbook. You may delete unnecessary statistics in these tables.
- Marks will be deducted if any descriptive statistics are incorrect, so check:
  - Your sample size.
  - Whether you are calculating sample statistics or population parameters.
Written Answer – Email or letter
- 400 to 800 words and three to five pages – marks will be deducted if this is greatly exceeded.
- To obtain full marks the written answer must:
  - Be well structured.
  - Clearly communicate the results of the Excel output in language appropriate for your audience.
  - Include appropriate graphs and plots with appropriate statistics.
  - Provide information on the average/typical price of a two or three bedroom residential property, how the prices of two and three bedroom residential properties vary and any pattern in those prices.
  - Provide information on the difference in price of units and houses.
  - Provide information on the relationship between number of bedrooms and the price of residential properties. Comment on the strength, shape and sign of the relationship.
- Marks will be deducted if:
  - There is little or no comment on, or interpretation of, the Excel output.
  - Unnecessary statistical jargon and equations appear.
  - It is confusing or not readable.
  - It is handwritten.
- For each major spelling and/or grammatical error half a mark will be deducted, up to a maximum of two marks.
- Also up to two marks may be deducted for poor structure and/or presentation.
MAT10251 STATISTICAL ANALYSIS PROJECT – PART B
Due: Week 11, Sunday 17 May 2020
Value: 25%
Objectives: 1 to 5
Topics: 5 to 9
Purpose: To apply your knowledge of statistical inference and regression to answer questions about residential property prices by analysing the data and communicating the results.
Part B Further Analysis of Data – Using Statistical Inference and Regression Analysis
In response to your letter or email in Part A, your relative asks for further information and clarification. You use the graphs and statistics obtained in Part A and techniques from statistical inference and regression and correlation to provide this information.
Part B Submission
You should submit a single Word document consisting of:
- Part B coversheets.
- Written answer either as a letter or an email or emails. See instructions on page 4 of Part B coversheets.
- Appendices for Part B which contain full statistical working for the required statistical tasks. These should follow the format given on page 5 of Part B coversheets.
Part B Preparation
Graphs and statistics from Part A are required in the statistical and written answers in Part B. Therefore, check these and make any required corrections.
While the submission date for Part B is Sunday 17 May 2020, you should be working on Part B during Weeks 6 to 11.
It is recommended that you follow this timetable:
- Question 1, covering Topic 5, should be completed in Week 6
- Question 2, covering Topic 6, should be completed in Week 8
- Question 3, covering Topic 7, should be attempted in Week 9
- Question 4, covering Topic 8, should be attempted in Week 10
- Question 5, covering Topic 9, should be attempted in Week 11
Task 1 Part B – Appendices: Statistical Inference and Regression and Correlation Tasks (38 marks)
The following statistical tasks should appear as appendices to your written answers. These should include all necessary steps and appropriate Excel output.
These appendices should come after your written answer within your single Word document for Part B.
Statistical Inference
Choose a level of significance for any hypothesis tests and a level of confidence for any confidence intervals. Enter these values on page 2 of the Part B coversheets along with the sample number from Part A.
For your sample answer the following questions using appropriate statistical inference and regression techniques.
Question 1 – Topic 5 (5.5 marks)
Your relative is considering buying a unit, which your previous research has shown appears to be cheaper than a house. However, your relative is concerned that if they only consider units their choice will be limited.
To explore whether their choice will be limited if they restrict their search to units, use Type (4th column of your data) for all 100 residential properties for sale and an appropriate statistical inference technique to:
- Estimate the population proportion of residential properties for sale, in the location and state specified by your sample, which are units.
Hint: Sort data on Type to enable you to easily count the number of properties in your sample which are units.
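One common way to estimate a population proportion is a confidence interval based on the normal approximation. The sketch below assumes that technique and a 95% level of confidence, and uses a made-up count of 35 units out of 100; none of these choices come from the project brief, and `proportion_confidence_interval` is a hypothetical helper name. Your submitted working must still use Excel.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Made-up count: 35 of the 100 sampled properties are units (illustrative only).
low, high = proportion_confidence_interval(successes=35, n=100, confidence=0.95)
print(f"Estimated proportion of units: between {low:.3f} and {high:.3f}")
```

The normal approximation is usually considered reasonable only when both the number of units and the number of non-units in the sample are at least 5, so that assumption should be stated and checked.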
Question 2 – Topic 6 (7.5 marks)
Your relative has a maximum of $330,000 to purchase a residential property. If the average price of two or three bedroom residential properties is more than this your relative will consider the location to be too expensive.
Explore whether your relative will find the location specified by your sample too expensive by using Price $000 (1st column of data) for two and three bedroom residential properties for sale, your output from Part A, and an appropriate statistical inference technique to answer the following question:
- In the location specified by your sample, is the mean two and three bedroom residential property price more than $330,000?
Notes:
- The required data for two and three bedroom residential properties are in the first rows of your sample.
- If you have sorted your data on Type in Question 1, download your data again.
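If a one-sample t test is the technique you choose (an assumption here, since the brief leaves the choice to you), the test statistic has a simple form. The sketch below uses made-up prices and only computes the statistic and degrees of freedom; the critical value or p-value would still come from Excel or t tables at your chosen level of significance.

```python
import math
import statistics

# Made-up prices in $000 (illustrative only, not project data).
prices = [285, 310, 299, 345, 420, 275, 330, 365, 305, 390]
hypothesised_mean = 330   # $330,000 expressed in $000

n = len(prices)
sample_mean = statistics.mean(prices)
sample_sd = statistics.stdev(prices)           # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)

# H0: population mean <= 330 versus H1: population mean > 330 (upper-tail test).
t_statistic = (sample_mean - hypothesised_mean) / standard_error
degrees_of_freedom = n - 1

print(f"t = {t_statistic:.3f} with {degrees_of_freedom} degrees of freedom")
```

The null hypothesis is rejected only if the t statistic exceeds the upper-tail critical value for the stated degrees of freedom.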
Question 3 – Topic 7 (6 marks)
From your previous research you have shown that units appear to be cheaper than houses. Your relative asks you to estimate how much they would save if they purchased a unit instead of a house.
To provide a justified answer use Price $000 (1st column of data) and Type (4th column of data) for all 100 properties for sale, your output from Part A and an appropriate statistical inference technique to answer the following question:
- Estimate the mean difference in price between units and houses for sale in the location specified by your sample.
Hint: Sort data on Type to obtain two samples: one for house prices and the other for unit prices.
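An interval estimate for a difference between two means has the form (difference in sample means) ± (critical value) × (standard error). The sketch below assumes the separate-variance form of the standard error; your course may instead expect the pooled-variance form, so treat this as one possible approach. The data are made up, `mean_difference_interval` is a hypothetical helper, and the critical value of 2.0 is a placeholder rather than a looked-up t value.

```python
import math
import statistics

def mean_difference_interval(sample_1, sample_2, t_critical):
    """Return (difference, standard error, lower limit, upper limit)
    using the separate-variance standard error."""
    difference = statistics.mean(sample_1) - statistics.mean(sample_2)
    standard_error = math.sqrt(
        statistics.variance(sample_1) / len(sample_1)
        + statistics.variance(sample_2) / len(sample_2)
    )
    margin = t_critical * standard_error
    return difference, standard_error, difference - margin, difference + margin

# Made-up prices in $000 (illustrative only, not project data).
house_prices = [450, 520, 610, 480, 700]
unit_prices = [300, 340, 280, 390, 320]

# t_critical=2.0 is a placeholder; obtain the real value from Excel or t tables.
difference, standard_error, lower, upper = mean_difference_interval(
    house_prices, unit_prices, t_critical=2.0
)
print(f"Houses cost about {difference:.0f} ($000) more than units on average")
print(f"Interval with placeholder critical value: {lower:.1f} to {upper:.1f}")
```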
Questions 4 and 5 – Simple and Multiple Linear Regression (19 marks)
Your relative asks what factors influence the price of a residential property and if you can estimate the price of a residential property from these factors.
To answer this you develop a simple linear regression model to estimate price from number of bedrooms and a multiple linear regression model to estimate price from number of bedrooms, number of bathrooms and type. Then you choose and interpret the linear model that best fits your data.
Question 4 – Simple Linear Regression Model – Topic 8
Use Number of Bedrooms (2nd column of data) as the independent variable and Price $000 (1st column of data) as the dependent variable for all 100 properties and your output from Part A to develop and then explore a simple linear relationship between the two variables by:
- Calculating the least squares regression line, correlation coefficient and coefficient of determination.
- Interpreting the gradient and vertical intercept of the simple linear regression equation.
- Interpreting the coefficient of determination.
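The least squares line and coefficient of determination follow directly from sums of squared deviations. The sketch below is an optional cross-check of Excel's Regression output, using made-up data; `least_squares_line` is a helper name introduced for illustration.

```python
# Made-up data (illustrative only, not project data).
bedrooms = [1, 2, 2, 3, 3, 4, 4, 5]
prices = [250, 300, 320, 380, 400, 470, 520, 600]   # $000

def least_squares_line(x, y):
    """Return (gradient, intercept, r_squared) for the line y = intercept + gradient * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    gradient = sxy / sxx
    intercept = mean_y - gradient * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return gradient, intercept, r_squared

gradient, intercept, r_squared = least_squares_line(bedrooms, prices)
print(f"price = {intercept:.1f} + {gradient:.2f} * bedrooms, R^2 = {r_squared:.4f}")
```

With these made-up numbers, the gradient says that each extra bedroom is associated with an increase of about $89,000 in price, and the intercept is the fitted price at zero bedrooms, which may have no practical meaning.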
Question 5 – Multiple Linear Regression Model – Topic 9
To explore whether being a house or unit and the number of bathrooms also influence price, add Number of Bathrooms (3rd column of data) and Type (4th column of data) as additional independent variables to the simple linear regression model in Question 4. Then develop and explore the relationship between price and the three independent variables by:
- Calculating the multiple regression equation and coefficient of determination.
- Interpreting the values of the multiple regression coefficients.
- Interpreting the value of the coefficient of determination. Compare the value with the corresponding value for the simple linear regression model.
Then determine the best model to estimate price by:
- Using appropriate tests to determine which independent variables make a significant contribution to the regression model.
- Then stating or calculating the simple or multiple regression equation which best fits the data.
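Type is a text variable, so it has to be turned into a number before it can enter a regression. A common approach, assumed here, is a 0/1 dummy variable (1 for a house, 0 for a unit). The sketch below illustrates that transformation and fits the model by solving the normal equations; the data are made up so that price is exactly 100 + 50 × bedrooms + 30 × bathrooms + 80 × house, which lets the result be checked, and `fit_multiple_regression` is a hypothetical helper. Excel's Regression command remains the required tool.

```python
def fit_multiple_regression(predictor_rows, y):
    """Least squares coefficients [b0, b1, ...] found by solving the normal equations."""
    X = [[1.0] + [float(v) for v in row] for row in predictor_rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Made-up rows: (price $000, bedrooms, bathrooms, type). Illustrative only.
properties = [
    (230, 2, 1, "Unit"), (360, 3, 1, "House"), (310, 3, 2, "Unit"),
    (440, 4, 2, "House"), (340, 2, 2, "House"), (520, 5, 3, "House"),
    (180, 1, 1, "Unit"),
]

# Dummy coding: 1 if the property is a house, 0 if it is a unit.
predictors = [[beds, baths, 1 if kind == "House" else 0]
              for _, beds, baths, kind in properties]
prices = [price for price, _, _, _ in properties]

coefficients = fit_multiple_regression(predictors, prices)
print([round(c, 3) for c in coefficients])
```

With this coding, the coefficient on the dummy variable is read as the estimated price difference between a house and a unit with the same number of bedrooms and bathrooms.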
Notes:
- You may need to transform or manipulate the given data before using Excel for the corresponding statistical calculations.
- Use Excel for all statistical calculations. You do not need to repeat any Excel calculations by hand. However, make sure that you define your random variables and include any steps not given by Excel. For example, in a hypothesis test include the null and alternative hypotheses, along with the decision to reject or not reject the null hypothesis.
- Mention any assumptions you need to make and, where appropriate, justify these from Part A output.
- In Question 4 fit a linear model even if from your scatter plot you decide that a non-linear relationship better fits the data or that no apparent relationship exists. However, mention this in your written answer and/or corresponding appendix.
- Comment on why a test or confidence interval has been chosen. Where appropriate include and refer to Part A output.
- Make sure you interpret confidence intervals and write conclusions to hypothesis tests.
Task 2 – Written Answer – Email or letter (12 marks)
For Questions 1, 2, 3 and Questions 4 and 5 combined, present the results of your calculations, with your interpretation and conclusions, as either a letter or email/emails to your relative.
Use the instructions given on page 4 of the Part B coversheets.
This should be 400 to 900 words and two to five pages.
It should be submitted as a Word file with Excel output included.
Make sure you:
- Introduce each question and put it in context.
- Answer each question in non-statistical language.
- Present the result of your calculations and tests without unnecessary statistical jargon.
- Include a conclusion which answers the given question.
In particular, for Questions 4 and 5:
- Include and justify the best model.
- Discuss and interpret the values of the regression coefficients and coefficient of determination of the best model.
Marking Criteria – Part B
Read these marking criteria carefully and consider them when preparing Part B.
See the marking and feedback sheet, page 3 Part B coversheets, for allocation of marks.
Statistical Calculations
- For statistical inference calculations (Questions 1, 2, 3 and 5) marks will be given for:
  - Choice of appropriate statistical technique/s.
  - Random variable/s defined.
  - Correct hypotheses for tests.
  - Correct Excel output.
  - Correct interpretation of results.
- For regression coefficients and coefficient of determination (Questions 4 and 5) use either:
  - The Regression command in Data Analysis and copy the resultant tables.
  - Or the simple/multiple regression command in PhStat and copy the resultant tables.
  - Or the Simple Linear and Multiple Regression workbooks and copy the resultant tables.
- For regression coefficients and coefficient of determination (Questions 4 and 5) marks will be deducted if Excel is not used and also for incorrect equations or coefficients, so check:
  - Your independent and dependent variables.
  - Your sample size.
Written Answer – Email/Emails or Letter
- 400 to 900 words and two to five pages – marks will be deducted if this is greatly exceeded.
- To obtain full marks the written answer must:
  - Be well structured and analysed.
  - Clearly communicate the results of the Excel output in language appropriate for your audience.
  - Include an introduction to each question and a conclusion.
  - Include appropriate Excel output.
  - Answer the questions in non-statistical language.
- Marks will be deducted if:
  - There is little or no comment on, or interpretation of, the Excel output.
  - Unnecessary statistical jargon and equations appear.
  - It is confusing or not readable.
- For each major spelling and/or grammatical error half a mark will be deducted, up to a maximum of two marks.
- Also up to two marks may be deducted for poor structure and presentation.
- For Questions 1 to 3, and for Questions 4 and 5 combined (marks shown in brackets), the following rubric will be used:
Level | Mark | Description
Poor | 0 (0) | Question not introduced and/or results not presented. Confused response. Incorrect and/or inconsistent comments and conclusions. Unnecessary statistical jargon, especially symbols, equations and definitions (copied from the textbook). Question unanswered.
Acceptable | 1 (2) | Question introduced and results presented. Minimal interpretation and/or conclusions on how to use the information and/or only minimally relates information obtained to residential property prices. Only minor errors and inconsistencies in comments and conclusions. Question answered.
More than acceptable | 2 (4) | Results presented and questions introduced and answered, clearly and concisely. Includes interpretation and/or conclusions on how to use the information and/or relates information obtained to residential property prices. No errors or inconsistencies in comments and conclusions. Questions answered and justified.
