HM6007 STATISTICS FOR MANAGERS
T2 2024
Group Assignment
Student Name –
Student ID –
Table of Contents:
Descriptive Statistics Analysis and Review
Graphical Representation of Data
Correlation and Regression Analysis
a) Multiple Regression Equation
b) Interpretation of the Coefficients
c) Interpretation of the Coefficient of Determination
d) Assessing the Overall Significance of the Model
e) Assessing the Significance of Independent Variables in the Model
Data Collection
| Statistic | Price ($) | No. of bedrooms | No. of garage space | Distance (Km) | Land size (m²) |
| --- | --- | --- | --- | --- | --- |
| Mean | 2677775.00 | 3.30 | 1.70 | 1.00 | 579.43 |
| Standard Error | 579504.02 | 0.29 | 0.21 | 0.15 | 211.92 |
| Median | 1833500.00 | 3.00 | 1.00 | 0.83 | 289.50 |
| Standard Deviation | 2591620.77 | 1.31 | 0.95 | 0.65 | 947.74 |
| Kurtosis | 6.74 | -0.76 | 1.30 | 0.94 | 9.88 |
| Skewness | 2.53 | 0.26 | 1.43 | 1.15 | 3.12 |
| Range | 11111000.00 | 5.00 | 3.00 | 2.47 | 4048.00 |
| Minimum | 490000.00 | 1.00 | 1.00 | 0.12 | 54.00 |
| Maximum | 11601000.00 | 6.00 | 4.00 | 2.59 | 4102.00 |
| Sum | 53555500.00 | 66.00 | 34.00 | 20.03 | 11588.50 |
| Coefficient of variation | 0.97 | 0.40 | 0.56 | 0.65 | 1.64 |
| Count | 20.00 | 20.00 | 20.00 | 20.00 | 20.00 |
Descriptive Statistics Analysis and Review
Interpretation of mean of Houses Sold:
The mean is a central idea in mathematics and statistics. It is the average of the collected values, obtained by dividing their sum by the number of observations (Geeks for Geeks, 2024). Together with the median and the mode, the mean describes the central tendency of a distribution, and it also defines the expected value. Based on the table above, the average price of the houses is $2,677,775.00, the average number of bedrooms is 3.30, the average number of garage spaces is 1.70, the average distance between the houses and secondary schools is 1.00 km, and the average land size is 579.43 m². In other words, a typical house in the sample has about three bedrooms and between one and two garage spaces, and lies about one kilometre from a secondary school. Because the sample contains only houses that were sold, these averages describe the sampled properties rather than the preferences of Australian buyers in general.
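As a quick check, the means can be recomputed from the Sum and Count rows of the table. The sketch below is illustrative only; the dictionary keys are names chosen here, not part of the assignment data.

```python
# Sum and Count rows of the descriptive statistics table.
sums = {
    "price": 53555500.00,
    "bedrooms": 66.00,
    "garage_spaces": 34.00,
    "distance_km": 20.03,
    "land_size_m2": 11588.50,
}
count = 20

# The arithmetic mean is the total divided by the number of observations.
means = {name: total / count for name, total in sums.items()}

for name, value in means.items():
    print(f"{name}: {value:.3f}")
```

Each result agrees with the Mean row of the table once rounded to two decimal places.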
Interpretation of median of Houses Sold:
According to Britannica (2024), the median is the midpoint value of a set of data when it is sorted in order. The observations may be arranged in either ascending or descending order. The other two measures of central tendency are the mean and the mode. The mean is the sum of all observations divided by the number of observations. The mode is the value that appears most frequently in the data set. The median house price in the dataset is $1,833,500. This means that half of the homes in the sample sold above this price and the other half sold below it.
When there are outliers in the data (such as an extremely expensive house), the median provides a better central measure than the mean because it is less sensitive to extreme values. Here the median price is well below the mean price of $2,677,775, which shows that a few expensive houses pull the mean upwards. The median number of bedrooms is three, so at least half of the homes have three or fewer bedrooms and at least half have three or more. The median number of garage spaces is one. Because one is also the minimum, more than half of the homes in the sample have exactly one garage space. The median distance is 0.83 km, so half of the homes lie within 0.83 km of a secondary school and the other half lie further away. The median land size is 289.5 m², so half of the residences have lots smaller than 289.5 m² and the other half have larger lots.
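The sensitivity of the mean to extreme values can be illustrated with a small sketch. The prices below are hypothetical and are not the assignment data; they only show how a single expensive house moves the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical sale prices in dollars (illustrative only).
prices = [600_000, 750_000, 900_000, 1_100_000, 1_300_000]

# The same sample with one very expensive house added.
prices_with_outlier = prices + [11_000_000]

print("without outlier:", mean(prices), median(prices))
print("with outlier:   ", mean(prices_with_outlier), median(prices_with_outlier))
```

Adding the outlier more than doubles the mean, while the median shifts only to the midpoint of the two middle prices.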
Interpretation of Range of Houses Sold:
According to Corporate Finance Institute (2024), the range is a statistical measure of the difference between the maximum and minimum values of a dataset. It is calculated to describe the dispersion of the data. From the table above, the range of house prices is $11,111,000, while the ranges of the number of bedrooms and the number of garage spaces are 5 and 3 respectively. The range of distance is 2.47 km, which is the difference between the farthest property (2.59 km) and the nearest property (0.12 km). Lastly, the range of land size is 4048 m².
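The ranges can be reproduced from the Minimum and Maximum rows of the table. This is a minimal sketch; the dictionary keys are names chosen here for illustration.

```python
# Minimum and Maximum rows of the descriptive statistics table.
minimum = {"price": 490000.00, "bedrooms": 1.00, "garage_spaces": 1.00,
           "distance_km": 0.12, "land_size_m2": 54.00}
maximum = {"price": 11601000.00, "bedrooms": 6.00, "garage_spaces": 4.00,
           "distance_km": 2.59, "land_size_m2": 4102.00}

# The range is the largest value minus the smallest value.
ranges = {name: maximum[name] - minimum[name] for name in minimum}

for name, value in ranges.items():
    print(f"{name}: {value:.2f}")
```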
Interpretation of Standard deviation of Houses Sold:
According to BMJ (2024), the standard deviation is the positive square root of the variance. It is one of the core measures in statistical analysis. The standard deviation, denoted by the symbol σ, describes the degree to which data values deviate from the mean. The standard deviation of the sale prices of the sold residences is $2,591,620.77, which shows a considerable amount of variation in real estate values throughout the dataset. The standard deviation of 1.31 for the number of bedrooms shows a substantial variation in the number of bedrooms between homes. The standard deviation for garage spaces is 0.95, indicating that most homes have a similar number of garage spaces. The standard deviation of 0.65 km for distance indicates a moderate degree of variability in how far the properties lie from a secondary school. Finally, the land size has a standard deviation of 947.74 m², which is larger than its mean of 579.43 m² and shows that land size varies substantially between properties.
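The Standard Error row of the table is linked to the standard deviation by the usual formula for the standard error of the mean, SE = SD / √n. The sketch below checks that link using the reported figures; the dictionary keys are illustrative names.

```python
import math

# Standard deviations as reported in the table and in the text above.
std_dev = {"price": 2591620.77, "bedrooms": 1.31, "garage_spaces": 0.95,
           "distance_km": 0.65, "land_size_m2": 947.74}
n = 20  # Count row of the table

# Standard error of the mean: standard deviation divided by the square root of n.
standard_error = {name: sd / math.sqrt(n) for name, sd in std_dev.items()}

for name, value in standard_error.items():
    print(f"{name}: {value:.2f}")
```

The results match the Standard Error row up to rounding, which also confirms that the price standard deviation of $2,591,620.77 is consistent with the reported standard error of 579504.02.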
Interpretation of Kurtosis value of Houses Sold:
According to Corporate Finance Institute (2024a), kurtosis is a descriptive statistic that measures how much of a distribution's weight lies in its tails relative to its centre. A distribution with "heavy" tails contains more extreme observations than a normal distribution, so a high kurtosis value is a warning that outliers may be present. The values in the table are excess kurtosis, for which a normal distribution scores zero: positive values indicate a sharper peak and heavier tails, and negative values indicate a flatter shape with lighter tails. The sale price has a kurtosis of 6.74, which indicates a sharply peaked, heavy-tailed distribution. The number of bedrooms has a kurtosis of -0.76, which indicates a distribution slightly flatter than normal. The number of garage spaces (1.30) and the distance in kilometres from the secondary school (0.94) are moderately more peaked than normal. Land size in square metres has the highest kurtosis, 9.88, which points to a few very large lots in the sample.
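For reference, the sketch below computes sample excess kurtosis with the bias-corrected formula that common spreadsheet tools use. That the table was produced with this formula is an assumption, and the prices are hypothetical because the raw data are not listed in this report.

```python
from statistics import mean, stdev

def excess_kurtosis(values):
    """Bias-corrected sample excess kurtosis (a normal distribution scores about 0)."""
    n = len(values)
    if n < 4:
        raise ValueError("at least four observations are needed")
    m = mean(values)
    s = stdev(values)  # sample standard deviation, n - 1 in the denominator
    fourth_moment = sum(((x - m) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * fourth_moment - correction

# Hypothetical prices in thousands of dollars, with one extreme value.
heavy_tailed = [500, 520, 540, 560, 580, 600, 620, 640, 660, 9000]
evenly_spread = [1, 2, 3, 4, 5]

print(excess_kurtosis(heavy_tailed))   # positive: heavy tail
print(excess_kurtosis(evenly_spread))  # negative: flatter than normal
```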
Interpretation of Skewness of Houses Sold:
According to Geeks for Geeks (2024a), skewness measures the asymmetry of a distribution, that is, the degree to which the distribution of a random variable departs from a symmetric distribution such as the normal distribution. A normal distribution is symmetric about its mean and has zero skewness; a curve is skewed when its tail extends further to the left or to the right. All five variables have positive skewness, which means that most observations lie at the lower end and a longer tail extends towards higher values. The sale price has a skewness of 2.53 and land size has a skewness of 3.12, so both are strongly right-skewed. The number of garage spaces (1.43) and the distance to the secondary school (1.15) are moderately right-skewed, while the number of bedrooms (0.26) is close to symmetric.
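The sign of the skewness can be illustrated with the bias-corrected sample formula that common spreadsheet tools use. That the table was produced with this formula is an assumption, and the samples below are hypothetical.

```python
from statistics import mean, stdev

def sample_skewness(values):
    """Bias-corrected sample skewness (0 for a perfectly symmetric sample)."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are needed")
    m = mean(values)
    s = stdev(values)  # sample standard deviation, n - 1 in the denominator
    cubed = sum(((x - m) / s) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubed

symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 10]  # one large value stretches the right tail

print(sample_skewness(symmetric))
print(sample_skewness(right_skewed))
```

A positive result, as for every variable in the table, means the longer tail lies on the side of the higher values.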
Interpretation of Coefficient of variation of Houses Sold:
The coefficient of variation, a statistical measure of the dispersion of data points relative to the mean, is often referred to as the relative standard deviation. It is calculated as the standard deviation divided by the mean, so it has no unit. The metric is widely used to compare the dispersion of several data series, because the standard deviation on its own can only be interpreted in relation to the mean and unit of each series. For this dataset, the coefficient of variation is 0.97 for price, 0.40 for the number of bedrooms, 0.56 for the number of garage spaces, 0.65 for the distance to the secondary school, and 1.64 for land size. Land size is therefore the most variable series relative to its mean, and the number of bedrooms is the least variable.
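The coefficients of variation can be recomputed from the Mean and Standard Deviation rows. This is a minimal sketch using the reported figures; the dictionary keys are illustrative names.

```python
# Mean and Standard Deviation as reported above.
means = {"price": 2677775.00, "bedrooms": 3.30, "garage_spaces": 1.70,
         "distance_km": 1.00, "land_size_m2": 579.43}
std_dev = {"price": 2591620.77, "bedrooms": 1.31, "garage_spaces": 0.95,
           "distance_km": 0.65, "land_size_m2": 947.74}

# Coefficient of variation: standard deviation divided by the mean (unitless).
cv = {name: std_dev[name] / means[name] for name in means}

# Rank the variables from most to least variable relative to their mean.
for name in sorted(cv, key=cv.get, reverse=True):
    print(f"{name}: {cv[name]:.2f}")
```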
Graphical Representation of Data
Figure 1 – Histogram
Graphical Representation of Presenting Relationship Between Dependent Variable and Land Size
Figure 2 – Scatter Plot
To examine the relationship between the dependent and independent variables, a scatter plot was created with price as the dependent variable and land size as the independent variable. The data show that the price of the homes sold is negatively but very weakly related to land size (the correlation coefficient reported in the correlation matrix is -0.03). The trend line therefore slopes only slightly downward, and the points are widely dispersed around it rather than lying close to it.
Correlation and Regression Analysis
Regression Statistics

| Statistic | Value |
| --- | --- |
| Multiple R | 0.761726881 |
| R Square | 0.580227841 |
| Adjusted R Square | 0.468288599 |
| Standard Error | 1938865.289 |
| Observations | 20 |

ANOVA

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 4 | 7.7942E+13 | 1.94855E+13 | 5.183417615 | 0.00796333 |
| Residual | 15 | 5.6388E+13 | 3.7592E+12 | | |
| Total | 19 | 1.3433E+14 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Intercept | 638809.0738 | 1864026.305 | 0.342703894 | 0.736576459 | -3334268.947 | 4611887.095 | -3334268.947 | 4611887.095 |
| X Variable 1 (No. of bedrooms) | 307850.7762 | 366462.073 | 0.840061766 | 0.414063665 | -473244.6427 | 1088946.195 | -473244.6427 | 1088946.195 |
| X Variable 2 (No. of garage space) | 1933586.446 | 595384.2829 | 3.247627627 | 0.005409539 | 664554.8866 | 3202618.005 | 664554.8866 | 3202618.005 |
| X Variable 3 (Distance, Km) | -1997085.709 | 998614.2251 | -1.999857061 | 0.063961936 | -4125581.546 | 131410.1269 | -4125581.546 | 131410.1269 |
| X Variable 4 (Land size, m²) | -455.5503393 | 872.9054101 | -0.52187824 | 0.609375816 | -2316.104179 | 1405.0035 | -2316.104179 | 1405.0035 |
House price = 638809.0738 + 307850.7762 * Number of bedrooms + 1933586.446 * Number of garage spaces - 1997085.709 * Distance (km) - 455.5503393 * Land size (m²)
The intercept, 638809.0738, is the predicted price when every independent variable in the model equals zero. No house has zero bedrooms and zero land, so the intercept has no practical interpretation and serves only to anchor the regression line. The coefficient for the number of bedrooms is 307850.7762, which implies that each additional bedroom is associated with an increase of about $307,850.78 in price, holding the other variables constant. The coefficient for garage spaces is 1933586.446, so each additional garage space is associated with an increase of about $1,933,586.45. The coefficient for distance is -1997085.709, which shows that each additional kilometre of distance is associated with a decrease of roughly $1,997,085.71 in house price. The coefficient for land size is -455.5503393, so each additional square metre of land is associated with a decrease of about $455.55.
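The fitted equation can be turned into a small prediction function. The sketch below takes each coefficient and its sign from the coefficients table of the regression output; the function name, the argument names and the example house are hypothetical, and the mapping of X Variables 1 to 4 onto bedrooms, garage spaces, distance and land size is assumed from the column order of the data.

```python
# Coefficients as reported in the regression output table.
INTERCEPT = 638809.0738
COEFFICIENTS = {
    "bedrooms": 307850.7762,
    "garage_spaces": 1933586.446,
    "distance_km": -1997085.709,
    "land_size_m2": -455.5503393,
}

def predict_price(bedrooms, garage_spaces, distance_km, land_size_m2):
    """Predicted sale price in dollars from the fitted linear model."""
    features = {
        "bedrooms": bedrooms,
        "garage_spaces": garage_spaces,
        "distance_km": distance_km,
        "land_size_m2": land_size_m2,
    }
    return INTERCEPT + sum(COEFFICIENTS[name] * value
                           for name, value in features.items())

# Hypothetical house: 3 bedrooms, 2 garage spaces, 1 km away, 500 m2 of land.
example = predict_price(3, 2, 1.0, 500)
print(f"Predicted price: ${example:,.2f}")
```

With a residual standard error of about $1.94 million, any single prediction from this model carries a wide margin of error.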
In linear regression, the coefficient of determination, also written R², measures how well the model explains the outcome. More precisely, R² is the proportion of the variance in the dependent variable (Y) that is explained by the independent variables (X) through the fitted regression. Here R² is 0.58, so the four independent variables together explain about 58% of the variation in house prices and the remaining 42% is left unexplained. The adjusted R², which penalises the number of predictors, is lower at 0.47, which is expected with four predictors and only 20 observations.
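R Square and Adjusted R Square can both be recovered from the ANOVA table. The sketch below uses the rounded sums of squares shown there, so the results agree with the regression statistics to about four decimal places; the variable names are chosen here for illustration.

```python
# Sums of squares from the ANOVA table (rounded as reported).
ss_regression = 7.7942e13
ss_total = 1.3433e14

n = 20  # observations
k = 4   # independent variables

# R squared: share of the total variation explained by the regression.
r_squared = ss_regression / ss_total

# Adjusted R squared: penalises the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R squared:          {r_squared:.4f}")
print(f"Adjusted R squared: {adjusted_r_squared:.4f}")
```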
The F-test of overall significance evaluates whether the regression model as a whole fits the data better than a model with no independent variables. The null hypothesis is that all slope coefficients are zero. The F statistic is 5.18 and the Significance F is 0.0080. Because this p-value is less than 0.05, the null hypothesis is rejected: the regression is significant overall, and at least one of the independent variables is related to the price of the house.
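The F statistic is the ratio of the regression mean square to the residual mean square. The sketch below rebuilds it from the ANOVA table and applies the 0.05 decision rule to the reported Significance F; it does not recompute the p-value itself, and the variable names are illustrative.

```python
# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 7.7942e13, 4
ss_residual, df_residual = 5.6388e13, 15

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

f_statistic = ms_regression / ms_residual

# Significance F as reported in the regression output.
significance_f = 0.00796333
ALPHA = 0.05
model_is_significant = significance_f < ALPHA

print(f"F = {f_statistic:.2f}, significant at {ALPHA}: {model_is_significant}")
```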
The p-values of the individual coefficients show which variables are significant at the 0.05 level. The number of garage spaces is significant (p = 0.005), which implies that it has a statistically significant effect on house prices. The number of bedrooms (p = 0.414) and land size (p = 0.609) are not significant. Distance has a p-value of 0.064, so it is not significant at the 0.05 level, although it would be at the 0.10 level.
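Each t statistic is the coefficient divided by its standard error, and the decision compares the reported p-value with the chosen significance level. The sketch below reads all figures from the regression output; the variable labels are assumed from the column order of the data.

```python
ALPHA = 0.05

# (coefficient, standard error, reported p-value) from the regression output.
rows = {
    "bedrooms": (307850.7762, 366462.073, 0.414063665),
    "garage_spaces": (1933586.446, 595384.2829, 0.005409539),
    "distance_km": (-1997085.709, 998614.2251, 0.063961936),
    "land_size_m2": (-455.5503393, 872.9054101, 0.609375816),
}

t_stats = {}
significant = []
for name, (coefficient, std_error, p_value) in rows.items():
    # t statistic: how many standard errors the coefficient lies from zero.
    t_stats[name] = coefficient / std_error
    if p_value < ALPHA:
        significant.append(name)
    print(f"{name}: t = {t_stats[name]:.3f}, p = {p_value:.3f}")

print("Significant at 0.05:", significant)
```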
Examining the correlation between the explanatory variables and checking for the possibility of multicollinearity
| | Price ($) | No. of bedrooms | No. of garage space | Distance (Km) | Land size (m²) |
| --- | --- | --- | --- | --- | --- |
| Price ($) | 1 | | | | |
| No. of bedrooms | 0.17074218 | 1 | | | |
| No. of garage space | 0.493379783 | 0.192394428 | 1 | | |
| Distance (Km) | -0.393373561 | 0.120371295 | 0.291958821 | 1 | |
| Land size (m²) | -0.030400822 | 0.366719783 | 0.610209498 | 0.707261267 | 1 |
The correlation matrix shows the relationships between house prices and the independent variables: number of bedrooms, garage spaces, distance, and land size. There is a weak positive correlation (0.17) between the number of bedrooms and house prices, indicating that more bedrooms are only loosely associated with a higher price. The relationship between house prices and garage spaces is stronger (0.49), indicating that more garage spaces generally go with higher property values, which agrees with the positive garage coefficient in the regression model.
There is a negative correlation of -0.39 between distance and house prices, indicating that prices tend to decline as distance increases. This agrees with the negative distance coefficient in the regression and implies that properties located farther from a secondary school have a lower value. The very weak negative correlation between land size and price (-0.03) suggests that land size has little to no linear relationship with house prices, which may account for its lack of statistical significance in the regression.
Furthermore, land size is strongly and positively correlated with distance (0.71) and with garage spaces (0.61). This suggests that properties with larger lots typically have more garage spaces and are farther away, perhaps in more rural locations. Correlations of this size between explanatory variables point to possible multicollinearity involving land size, which can inflate the standard errors of the affected coefficients and make their individual effects harder to separate. The remaining correlations between explanatory variables are below 0.4 and raise no such concern.
Lastly, the correlation between land size and price is -0.0304. It is negative but so close to zero that sale prices neither rise nor fall systematically with land size in this sample.
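A simple screen for multicollinearity is to flag pairs of explanatory variables whose correlation exceeds a chosen cut-off. The sketch below uses the correlation matrix above; the cut-off of 0.6 is an assumption made for illustration (values around 0.7 to 0.8 are also common rules of thumb), and pairwise screening is only a first check, not a full diagnostic such as the variance inflation factor.

```python
# Correlations between explanatory variables, from the correlation matrix.
correlations = {
    ("bedrooms", "garage_spaces"): 0.192394428,
    ("bedrooms", "distance_km"): 0.120371295,
    ("garage_spaces", "distance_km"): 0.291958821,
    ("bedrooms", "land_size_m2"): 0.366719783,
    ("garage_spaces", "land_size_m2"): 0.610209498,
    ("distance_km", "land_size_m2"): 0.707261267,
}

THRESHOLD = 0.6  # assumed screening cut-off, not taken from the assignment

# Keep the pairs at or above the cut-off, strongest first.
flagged = sorted(
    ((pair, r) for pair, r in correlations.items() if abs(r) >= THRESHOLD),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for (first, second), r in flagged:
    print(f"{first} vs {second}: r = {r:.2f}")
```

Both flagged pairs involve land size, which is consistent with its large standard error relative to its coefficient in the regression output.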
Summary:
The multiple regression model predicts house prices from four characteristics: number of bedrooms, garage spaces, distance, and land size. The intercept, 638,809.07, is the predicted price when all variables are zero. Each additional bedroom adds $307,850.78 and each additional garage space adds $1,933,586.45 to the predicted price, while each additional kilometre of distance lowers it by $1,997,085.71. Land size has a very small negative effect of about $455.55 per square metre. The model explains 58% of the variance in house prices (R² = 0.58), and the regression is significant overall (F = 5.18, p = 0.0080). Garage spaces are the only individually significant predictor at the 0.05 level; bedrooms, distance, and land size are not statistically significant.
Summary
This statistical study evaluates the factors influencing property prices using four characteristics: distance, land size, number of bedrooms, and garage spaces. The mean house price of $2,677,775 is highly variable, as indicated by the standard deviation of $2,591,621. The skewness of 2.53 suggests that there are outliers in the form of very expensive properties, while the median price of $1,833,500 shows that half of the homes sold below this amount. In the regression analysis, the number of garage spaces is the only individually significant factor at the 0.05 level, with each additional space associated with an increase of roughly $1,933,586 in price. An additional bedroom is associated with an increase of roughly $307,851, and each additional kilometre of distance with a decrease of roughly $1,997,086, but neither effect is significant at the 0.05 level. Together the four factors account for 58% of the variation in house prices, according to the coefficient of determination (R² = 0.58). The correlation matrix agrees with the regression for garage spaces, showing a positive association with prices (0.49), while land size shows almost no association with price in either analysis.
References
BMJ (2024) 2. Mean and standard deviation. Available at: https://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/2-mean-and-standard-deviation (Accessed on: 10 October 2024).
Britannica (2024) The Britannica Dictionary. Available at: https://www.britannica.com/dictionary (Accessed on: 10 October 2024).
Corporate Finance Institute (2024) Range. Available at: https://corporatefinanceinstitute.com/resources/data-science/range/ (Accessed on: 9 October 2024).
Corporate Finance Institute (2024a) Kurtosis. Available at: https://corporatefinanceinstitute.com/resources/data-science/kurtosis/ (Accessed on: 9 October 2024).
Geeks for Geeks (2024) Average in Maths. Available at: https://www.geeksforgeeks.org/average/ (Accessed on: 9 October 2024).
Geeks for Geeks (2024a) Skewness – Measures and Interpretation. Available at: https://www.geeksforgeeks.org/skewness-measures-and-interpretation/ (Accessed on: 9 October 2024).


