HM6007 STATISTICS FOR MANAGERS Group Assignment

HM6007 STATISTICS FOR MANAGERS

T2 2024

Group Assignment





























Student Name –

Student ID –





Data Collection

 

| Statistic | Price ($) | No. of bedrooms | No. of garage space | Distance (Km) | Land size (m²) |
|---|---|---|---|---|---|
| Mean | 2677775.00 | 3.30 | 1.70 | 1.00 | 579.43 |
| Standard Error | 579504.02 | 0.29 | 0.21 | 0.15 | 211.92 |
| Median | 1833500.00 | 3.00 | 1.00 | 0.83 | 289.50 |
| Standard Deviation | 2591620.77 | 1.31 | 0.95 | 0.65 | 947.74 |
| Kurtosis | 6.74 | -0.76 | 1.30 | 0.94 | 9.88 |
| Skewness | 2.53 | 0.26 | 1.43 | 1.15 | 3.12 |
| Range | 11111000.00 | 5.00 | 3.00 | 2.47 | 4048.00 |
| Minimum | 490000.00 | 1.00 | 1.00 | 0.12 | 54.00 |
| Maximum | 11601000.00 | 6.00 | 4.00 | 2.59 | 4102.00 |
| Sum | 53555500.00 | 66.00 | 34.00 | 20.03 | 11588.50 |
| Coefficient of variation | 0.97 | 0.40 | 0.56 | 0.65 | 1.64 |
| Count | 20.00 | 20.00 | 20.00 | 20.00 | 20.00 |





Descriptive Statistics Analysis and Review

Interpretation of mean of Houses Sold:

The mean is a central concept in statistics: it is the average of the observed values of a variable, obtained by dividing their sum by the number of observations (Geeks for Geeks, 2024). Together with the median and the mode, the mean describes the central tendency of a distribution, and it is also the expected value of the variable. In a business setting, it gives a single summary figure for each variable collected. Based on the table above, the average house price is $2,677,775.00, the average number of bedrooms is 3.30, the average number of garage spaces is 1.70, the average distance between the houses and the secondary school is 1.00 km, and the average land size is 579.43 m². In other words, the typical house sold in this sample has about three bedrooms and between one and two garage spaces, and lies roughly one kilometre from a secondary school. These figures describe the 20 houses sold and should not be read as the preferences of Australian buyers in general.

Interpretation of median of Houses Sold:

According to Britannica (2024), the median is the middle value of a data set once it has been sorted in ascending or descending order. Along with the median, the other two measures of central tendency are the mean and the mode. The mean is the sum of all observations divided by the number of observations, and the mode is the value that appears most frequently in the data set. The median house price in the dataset is $1,833,500, which means that half of the homes in the sample sold for more than this amount and the other half for less.

When there are outliers in the data (such as an extremely expensive house), the median is a better measure of the centre than the mean because it is less sensitive to extreme values. Here the median price ($1,833,500) is well below the mean ($2,677,775), which indicates that a few very expensive houses pull the mean upwards. The median number of bedrooms is three, so at least half of the homes have three or fewer bedrooms and at least half have three or more. The median number of garage spaces is one, so at least half of the homes have only a single garage space. The median distance is 0.83 km: half of the homes are within 0.83 km of the secondary school and the other half are farther away, so most homes are reasonably close to it. The median land size is 289.5 m², meaning half of the properties have lots smaller than 289.5 m² and the other half have larger lots.
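The effect of an outlier on the two measures can be illustrated with a small sketch. The prices below are hypothetical values chosen only for illustration (they are not the assignment's dataset), and the code relies on Python's standard `statistics` module.

```python
import statistics

# Hypothetical sale prices in dollars (NOT the assignment's data).
prices = [650_000, 820_000, 900_000, 1_100_000, 1_250_000]

# Add one extremely expensive property to mimic an outlier.
prices_with_outlier = prices + [11_000_000]

mean_before = statistics.mean(prices)
median_before = statistics.median(prices)
mean_after = statistics.mean(prices_with_outlier)
median_after = statistics.median(prices_with_outlier)

# The mean moves far more than the median when the outlier is added.
print(f"Mean:   {mean_before:,.0f} -> {mean_after:,.0f}")
print(f"Median: {median_before:,.0f} -> {median_after:,.0f}")
```

In this illustration the mean almost triples while the median shifts only slightly, which is the same pattern seen in the gap between the mean and median house prices in the table.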

Interpretation of Range of Houses Sold:

According to Corporate Finance Institute (2024), the range is a statistical measure of dispersion, calculated as the difference between the maximum and minimum values of a dataset. From the table above, the range of house prices is $11,111,000 (from $490,000 to $11,601,000), while the ranges of the number of bedrooms and the number of garage spaces are 5 and 3 respectively. The range of distance is 2.47 km, which is the gap between the nearest property (0.12 km) and the farthest (2.59 km). Lastly, the range of land size is 4,048 m² (from 54 m² to 4,102 m²).

Interpretation of Standard deviation of Houses Sold:

According to BMJ (2024), the standard deviation is the positive square root of the variance and is one of the core measures in statistical analysis. Usually denoted by σ (or s for a sample), it describes how far data values typically deviate from the mean. The standard deviation of the sale prices is $2,591,620.77, almost as large as the mean itself, which shows considerable variation in property values across the dataset. The standard deviation of 1.31 for the number of bedrooms indicates moderate variation between homes. The standard deviation for garage spaces is 0.95, indicating that most homes have a similar number of garage spaces. The standard deviation of 0.65 km for distance indicates a moderate degree of variability in how far the properties are from the secondary school. Finally, the land size has a standard deviation of 947.74 square meters, which is larger than its mean of 579.43 square meters, so land size shows the greatest variability relative to its mean of all five variables.
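As a consistency check, the Standard Error row of the descriptive statistics table can be reproduced from the Standard Deviation and Count rows, since the standard error of the mean is the standard deviation divided by the square root of the sample size. The sketch below uses only figures reported in the table; the variable names are illustrative.

```python
import math

n = 20  # Count row of the descriptive statistics table

# Standard deviations as reported in the table.
std_dev = {
    "Price ($)": 2591620.77,
    "No. of bedrooms": 1.31,
    "No. of garage space": 0.95,
    "Distance (Km)": 0.65,
    "Land size (m²)": 947.74,
}

# Standard error of the mean = standard deviation / sqrt(n).
std_error = {name: sd / math.sqrt(n) for name, sd in std_dev.items()}

for name, se in std_error.items():
    print(f"{name}: {se:.2f}")
```

The results agree with the reported standard errors to within rounding.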

Interpretation of Kurtosis value of Houses Sold:

According to Corporate Finance Institute (2024a), kurtosis is a descriptive statistic that measures how much of a distribution's mass lies in its tails rather than around its centre. A distribution with heavy tails has more extreme observations than a normal distribution, so a high kurtosis value helps to flag possible outliers. The values in the table appear to be excess kurtosis, for which a normal distribution scores zero. On that basis, the price of the houses sold has a kurtosis of 6.74, indicating a sharply peaked distribution with heavy tails. The number of bedrooms has a kurtosis of -0.76, indicating a distribution flatter than normal. The number of garage spaces (1.30) and the distance in kilometres from the secondary school (0.94) are somewhat more peaked than normal. The land size in square meters has the highest kurtosis, 9.88, indicating a very peaked distribution with extreme values in the tail.

Interpretation of Skewness of Houses Sold:

According to Geeks for Geeks (2024a), skewness is a measure of the asymmetry of a distribution. It indicates how far the distribution of a variable departs from a symmetric distribution such as the normal distribution, which has zero skewness. A distribution is skewed if its longer tail extends to the left (negative skewness) or to the right (positive skewness). All five variables have positive skewness, meaning that most observations are concentrated at smaller values with a tail of larger values to the right. The price of houses sold has a skewness of 2.53 and the land size a skewness of 3.12, so both are strongly right-skewed. The number of garage spaces (1.43) and the distance to the secondary school (1.15) are moderately right-skewed, while the number of bedrooms (0.26) is close to symmetric.
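For readers who want to see how skewness and kurtosis are computed, the sketch below implements the bias-adjusted sample formulas that spreadsheet software commonly uses; it is an assumption that the table was produced this way. The two data sets are hypothetical and serve only to show a symmetric case and a right-skewed case.

```python
import math


def _mean_and_sample_sd(data):
    """Return the mean and the sample standard deviation (n - 1 divisor)."""
    n = len(data)
    mean = sum(data) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
    return mean, sd


def sample_skewness(data):
    """Bias-adjusted sample skewness; requires at least 3 observations."""
    n = len(data)
    mean, sd = _mean_and_sample_sd(data)
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in data)


def sample_excess_kurtosis(data):
    """Bias-adjusted excess kurtosis (normal = 0); requires at least 4 observations."""
    n = len(data)
    mean, sd = _mean_and_sample_sd(data)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * sum(((x - mean) / sd) ** 4 for x in data) - correction


# Hypothetical data, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 2, 2, 3, 20]

print(sample_skewness(symmetric), sample_excess_kurtosis(symmetric))
print(sample_skewness(right_skewed), sample_excess_kurtosis(right_skewed))
```

The symmetric sample has zero skewness and negative excess kurtosis, while the sample with one large value has positive skewness, mirroring the pattern seen for price and land size.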

Interpretation of Coefficient of variation of Houses Sold:

The coefficient of variation, often referred to as the relative standard deviation, measures the dispersion of data points around the mean and is calculated as the standard deviation divided by the mean. Because it is a unitless ratio, it offers a simple and quick way to compare dispersion across data series measured in different units, whereas the standard deviation can only be interpreted in relation to the mean of its own series. For this dataset, the coefficient of variation is 0.97 for price, 0.40 for the number of bedrooms, 0.56 for the number of garage spaces, 0.65 for the distance to the secondary school, and 1.64 for land size. Land size is therefore the most variable relative to its mean and the number of bedrooms the least.
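The coefficients of variation can be reproduced from the Mean and Standard Deviation rows of the descriptive statistics table. The following sketch uses only those reported figures; the dictionary names are illustrative.

```python
# Means and standard deviations as reported in the descriptive statistics table.
mean = {
    "Price ($)": 2677775.00,
    "No. of bedrooms": 3.30,
    "No. of garage space": 1.70,
    "Distance (Km)": 1.00,
    "Land size (m²)": 579.43,
}
std_dev = {
    "Price ($)": 2591620.77,
    "No. of bedrooms": 1.31,
    "No. of garage space": 0.95,
    "Distance (Km)": 0.65,
    "Land size (m²)": 947.74,
}

# Coefficient of variation = standard deviation / mean (unitless).
cv = {name: std_dev[name] / mean[name] for name in mean}

# Rank the variables from most to least relative dispersion.
ranked = sorted(cv, key=cv.get, reverse=True)

for name in ranked:
    print(f"{name}: {cv[name]:.2f}")
```

Ranking by this ratio puts land size first and the number of bedrooms last, a comparison that the raw standard deviations cannot give because the variables are measured in different units.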





Graphical Representation of Data

Figure 1 – Histogram

Graphical Representation of Presenting Relationship Between Dependent Variable and Land Size

Figure 2 - Scatter Plot

To examine the relationship between the dependent and independent variables, a scatter plot was created with price as the dependent variable and land size as the independent variable. The plot suggests a negative but very weak relationship between the price of the houses sold and the size of the land (correlation coefficient of about -0.03). The points are widely scattered around the trend line rather than lying close to it, so land size on its own explains very little of the variation in price.

Correlation and Regression Analysis

Regression Statistics

| Statistic | Value |
|---|---|
| Multiple R | 0.761726881 |
| R Square | 0.580227841 |
| Adjusted R Square | 0.468288599 |
| Standard Error | 1938865.289 |
| Observations | 20 |



ANOVA

| Source | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 4 | 7.7942E+13 | 1.94855E+13 | 5.183417615 | 0.00796333 |
| Residual | 15 | 5.6388E+13 | 3.7592E+12 | | |
| Total | 19 | 1.3433E+14 | | | |

 

 

 



 

| Variable | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% | Lower 95.0% | Upper 95.0% |
|---|---|---|---|---|---|---|---|---|
| Intercept | 638809.0738 | 1864026.305 | 0.342703894 | 0.736576459 | -3334268.947 | 4611887.095 | -3334268.947 | 4611887.095 |
| X Variable 1 (No. of bedrooms) | 307850.7762 | 366462.073 | 0.840061766 | 0.414063665 | -473244.6427 | 1088946.195 | -473244.6427 | 1088946.195 |
| X Variable 2 (No. of garage space) | 1933586.446 | 595384.2829 | 3.247627627 | 0.005409539 | 664554.8866 | 3202618.005 | 664554.8866 | 3202618.005 |
| X Variable 3 (Distance, Km) | -1997085.709 | 998614.2251 | -1.999857061 | 0.063961936 | -4125581.546 | 131410.1269 | -4125581.546 | 131410.1269 |
| X Variable 4 (Land size, m²) | -455.5503393 | 872.9054101 | -0.52187824 | 0.609375816 | -2316.104179 | 1405.0035 | -2316.104179 | 1405.0035 |



  1. Multiple Regression Equation

House price = 638809.0738 + 307850.7762 * Number of bedrooms + 1933586.446 * Number of garage spaces - 1997085.709 * Distance (km) - 455.5503393 * Land size (m²)
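To show how the regression equation is used for prediction, the sketch below takes the coefficients, with their signs, from the regression output table. The function name, the parameter names and the example house (3 bedrooms, 2 garage spaces, 1 km away, 500 m² of land) are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the regression output table.
INTERCEPT = 638809.0738
COEF = {
    "bedrooms": 307850.7762,
    "garage_spaces": 1933586.446,
    "distance_km": -1997085.709,
    "land_size_m2": -455.5503393,
}


def predict_price(bedrooms, garage_spaces, distance_km, land_size_m2):
    """Return the fitted house price for the given characteristics."""
    return (
        INTERCEPT
        + COEF["bedrooms"] * bedrooms
        + COEF["garage_spaces"] * garage_spaces
        + COEF["distance_km"] * distance_km
        + COEF["land_size_m2"] * land_size_m2
    )


# Hypothetical house used only as an example.
example_price = predict_price(bedrooms=3, garage_spaces=2, distance_km=1.0, land_size_m2=500)
print(f"Predicted price: ${example_price:,.2f}")
```

Predictions like this are only as reliable as the model: with 20 observations and a regression standard error of about $1.94 million, individual predictions carry wide uncertainty.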

  1. Interpretation of the coefficients

The intercept, estimated at 638809.0738, is the predicted price of a house when all of the independent variables equal zero. It anchors the regression line but has little practical meaning, since no house in the sample has zero bedrooms or zero land. The coefficient for the number of bedrooms is 307850.7762, implying that each extra bedroom is associated with an increase in price of about $307,850.78, holding the other variables constant. The coefficient for garage spaces is 1933586.446, so each additional garage space is associated with an increase of about $1,933,586.45. The negative coefficient for distance (-1997085.709) shows that for every additional kilometre of distance, the house price decreases by roughly $1,997,085.71. The coefficient for land size is -455.5503393, meaning that each additional square metre is associated with a decrease of about $455.55, although this effect is small.

  1. Interpretation of Coefficient of Determination

In linear regression, the coefficient of determination, written R², measures how well the model explains or predicts the outcome. More precisely, R² is the proportion of the variance in the dependent variable (Y) that is explained by the independent variables (X) through the fitted regression. Here R² is 0.58, so about 58% of the variation in house prices is explained by the number of bedrooms, garage spaces, distance and land size together, which indicates a moderately strong fit. The adjusted R² of 0.468, which penalises the number of predictors, is lower, and the remaining variation is due to factors not included in the model.
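The coefficient of determination and its adjusted version can be recomputed from the sums of squares in the ANOVA table. The sketch below uses the rounded values printed there, so small differences from the reported figures are expected; the variable names are illustrative.

```python
# Sums of squares as printed in the ANOVA table (rounded).
ss_regression = 7.7942e13
ss_residual = 5.6388e13

n = 20  # observations
k = 4   # independent variables

ss_total = ss_regression + ss_residual

# R² = explained variation / total variation.
r_squared = ss_regression / ss_total

# Adjusted R² penalises the number of predictors.
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"R squared:          {r_squared:.4f}")
print(f"Adjusted R squared: {adjusted_r_squared:.4f}")
```

Both values match the Regression Statistics output (0.5802 and 0.4683) to four decimal places.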

  1. Assessing the Overall Model of Significance

The F-test of overall significance evaluates whether the fitted regression model explains the data better than a model with no independent variables. It tests the null hypothesis that all slope coefficients are zero. The F statistic is 5.18 and the Significance F (the p-value of the test) is 0.0080. Because this p-value is less than 0.05, the null hypothesis is rejected: the regression is significant overall, meaning that at least one of the chosen independent variables has a significant relationship with the price of the house.
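The F statistic can be rebuilt from the ANOVA table as the ratio of the regression mean square to the residual mean square. The sketch below uses the rounded sums of squares and the reported Significance F; the 0.05 threshold is the significance level used in this report, and the variable names are illustrative.

```python
# Values as printed in the ANOVA table (sums of squares are rounded).
ss_regression = 7.7942e13
ss_residual = 5.6388e13
df_regression = 4
df_residual = 15
significance_f = 0.00796333  # reported p-value of the F-test

ALPHA = 0.05  # significance level used in the report

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F = explained variance per predictor / unexplained variance per residual df.
f_statistic = ms_regression / ms_residual

model_is_significant = significance_f < ALPHA

print(f"F statistic: {f_statistic:.3f}")
print(f"Model significant at {ALPHA}: {model_is_significant}")
```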

  1. Assessing the Significance of Independent Variables in the Model

The p-values show that only the number of garage spaces (p = 0.0054) is statistically significant at the 0.05 level, which implies that this factor significantly affects house prices. Distance (p = 0.064) falls just short of significance at the 0.05 level, although it would be significant at the 0.10 level. The number of bedrooms (p = 0.414) and land size (p = 0.609) have no statistically significant relationship with price.
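The individual tests can be checked against the coefficients table: each t statistic is the coefficient divided by its standard error, and each 95% confidence interval is the coefficient plus or minus a critical t value times the standard error. In the sketch below the critical value for 15 degrees of freedom is a hard-coded tabulated figure (an assumption, not part of the original output), and the p-values are copied from the table rather than recomputed.

```python
ALPHA = 0.05
T_CRIT = 2.13145  # tabulated two-sided 95% critical t value for 15 df (assumed)

# (coefficient, standard error, p-value) copied from the coefficients table.
rows = {
    "No. of bedrooms": (307850.7762, 366462.073, 0.414063665),
    "No. of garage space": (1933586.446, 595384.2829, 0.005409539),
    "Distance (Km)": (-1997085.709, 998614.2251, 0.063961936),
    "Land size (m²)": (-455.5503393, 872.9054101, 0.609375816),
}

results = {}
for name, (coef, se, p_value) in rows.items():
    margin = T_CRIT * se
    results[name] = {
        "t": coef / se,                       # t statistic
        "ci": (coef - margin, coef + margin),  # 95% confidence interval
        "significant": p_value < ALPHA,        # decision at the 5% level
    }

significant = [name for name, r in results.items() if r["significant"]]

for name, r in results.items():
    print(f"{name}: t = {r['t']:.3f}, significant = {r['significant']}")
```

A variable is significant at the 5% level exactly when its 95% confidence interval excludes zero, which here is true only for the number of garage spaces.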

  1. Examining the correlation between the explanatory variables and checking for the possibility of multicollinearity

 

| | Price ($) | No. of bedrooms | No. of garage space | Distance (Km) | Land size (m²) |
|---|---|---|---|---|---|
| Price ($) | 1 | | | | |
| No. of bedrooms | 0.17074218 | 1 | | | |
| No. of garage space | 0.493379783 | 0.192394428 | 1 | | |
| Distance (Km) | -0.393373561 | 0.120371295 | 0.291958821 | 1 | |
| Land size (m²) | -0.030400822 | 0.366719783 | 0.610209498 | 0.707261267 | 1 |



The relationships between house prices and the independent variables (number of bedrooms, garage spaces, distance, and land size) are shown in the correlation matrix. There is only a weak positive correlation (0.17) between the number of bedrooms and house prices, indicating that additional bedrooms are associated with only a slight increase in price. The relationship between house prices and garage spaces is considerably stronger (0.49), indicating that more garage spaces generally go with higher property values, which is consistent with the positive and significant garage coefficient in the regression model.

In line with the regression analysis, there is a negative correlation of -0.39 between distance and house prices, indicating that prices tend to decline as distance increases. This implies that properties located farther from the secondary school tend to sell for less. The very weak negative correlation between land size and price (-0.03) suggests that larger land sizes have little to no linear association with house prices, which is consistent with the lack of statistical significance of land size in the regression.

Furthermore, there are strong positive correlations between land size and distance (0.71) and between land size and garage spaces (0.61). This suggests that properties with larger lots typically have more garage spaces and are farther away, perhaps in less built-up locations. These are the highest correlations among the explanatory variables. They fall below the common rule-of-thumb level of about 0.8 for serious multicollinearity, but they are high enough that the separate effect of land size is difficult to isolate, which may contribute to its large standard error.

Lastly, land size and sale price are essentially uncorrelated in this sample: the coefficient of -0.0304 is marginally negative, so larger lots were not associated with higher sale prices.
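A simple screen for multicollinearity is to scan the correlations among the explanatory variables and flag any pair whose absolute correlation exceeds a chosen cut-off. The sketch below uses the values from the correlation matrix; the cut-off of 0.8 is a rule of thumb assumed here for illustration, not a figure taken from the assignment.

```python
# Pairwise correlations among the explanatory variables (from the correlation matrix).
predictor_corr = {
    ("No. of bedrooms", "No. of garage space"): 0.192394428,
    ("No. of bedrooms", "Distance (Km)"): 0.120371295,
    ("No. of bedrooms", "Land size (m²)"): 0.366719783,
    ("No. of garage space", "Distance (Km)"): 0.291958821,
    ("No. of garage space", "Land size (m²)"): 0.610209498,
    ("Distance (Km)", "Land size (m²)"): 0.707261267,
}

THRESHOLD = 0.8  # assumed rule-of-thumb cut-off for a problematic correlation

# Pairs whose absolute correlation reaches the cut-off.
flagged = [pair for pair, r in predictor_corr.items() if abs(r) >= THRESHOLD]

# The most strongly correlated pair, whether or not it is flagged.
strongest_pair = max(predictor_corr, key=lambda pair: abs(predictor_corr[pair]))

print("Flagged pairs:", flagged)
print("Strongest pair:", strongest_pair, predictor_corr[strongest_pair])
```

No pair is flagged at this cut-off, but distance and land size come closest. Pairwise correlations can miss collinearity that involves three or more variables, so a variance inflation factor check would be a natural follow-up if the raw data are available.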

Summary:

The multiple regression model predicts house values from four characteristics: number of bedrooms, garage spaces, distance, and land size. The intercept of 638,809.07 represents the expected price when all variables are zero. Each additional bedroom adds about $307,850.78 and each additional garage space about $1,933,586.45 to the price of a home, while each additional kilometre of distance reduces it by about $1,997,085.71. Land size has a very small negative effect (about $455.55 less per square metre). The model explains 58% of the variance in house prices (R² = 0.58), and the regression as a whole is significant (F-statistic = 5.18, p = 0.0080). Garage spaces are the only individually significant predictor (p = 0.0054); distance is marginal (p = 0.064), while bedrooms and land size are not statistically significant.



Summary

This statistical analysis evaluated four factors that may influence property prices: distance, land size, number of bedrooms, and garage spaces. The mean house price of $2,677,775 is highly variable, as indicated by the standard deviation of $2,591,620. The median price of $1,833,500 is well below the mean, and the skewness of 2.53 shows that a small number of very expensive properties act as outliers. Regression analysis indicates that the number of garage spaces significantly affects home values, with each additional space adding roughly $1,933,586. An additional bedroom is associated with an increase of roughly $307,850, but this effect is not statistically significant. The price decreases by roughly $1,997,085 per kilometre of additional distance, an effect that is marginal at the 5% level (p = 0.064). Together these factors account for 58% of the variation in house prices, according to the coefficient of determination (R² = 0.58). The regression shows no significant relationship between land size and price, which agrees with the near-zero correlation between them (-0.03), while the positive correlation between garage spaces and price (0.49) agrees with the significant garage coefficient.



References

BMJ (2024) 2. Mean and standard deviation. Available at: https://www.bmj.com/about-bmj/resources-readers/publications/statistics-square-one/2-mean-and-standard-deviation (Accessed on: 10 October 2024).

Britannica (2024) The Britannica Dictionary. Available at: https://www.britannica.com/dictionary (Accessed on: 10 October 2024).

Corporate Finance Institute (2024) Range. Available at: https://corporatefinanceinstitute.com/resources/data-science/range/ (Accessed on: 9 October 2024).

Corporate Finance Institute (2024a) Kurtosis. Available at: https://corporatefinanceinstitute.com/resources/data-science/kurtosis/ (Accessed on: 9 October 2024).

Geeks for Geeks (2024) Average in Maths. Available at: https://www.geeksforgeeks.org/average/ (Accessed on: 9 October 2024).

Geeks for Geeks (2024a) Skewness – Measures and Interpretation. Available at: https://www.geeksforgeeks.org/skewness-measures-and-interpretation/ (Accessed on: 9 October 2024).


