A Data Mining Study on House Price in Central Regions of Taiwan Using Education Categorical Data, Environmental Indicators, and House Features Data
Abstract
1. Research Background and Objectives
2. Research Literature: Real Price Registration
2.1. House Features
2.2. School District Features
2.3. Air Quality Features
2.4. Literature on Data Mining
3. Data Mining and Feature Engineering
3.1. Data Extraction, Consolidation, and Sampling
3.2. Data Mining Tools
4. Results
4.1. Results of Clustering by k-Means
4.2. Results of Decision Tree Rules
5. Conclusions and Discussion
5.1. Features of House Price District
5.2. Features of Education Category Data
5.3. Air Quality
5.4. Attribute Features
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Citibanker. 2016 Global Market Outlook Adapting to Local Conditions and Flexible Layout. Available online: https://www.citibank.com.tw/sim/citigold/pdf/citibanker-2016-spring.pdf (accessed on 17 March 2022).
- Tajani, F.; Di Liddo, F.; Ranieri, R.; Anelli, D. An automatic tool for the determination of housing rental prices: An analysis of the Italian context. Sustainability 2021, 14, 309.
- R.O.C. Environmental Protection Administration Executive Yuan and Taiwan. Environmental Protection Administration Environmental Information Open Platform. Available online: https://data.epa.gov.tw/ (accessed on 17 March 2022).
- Education Bureau of Taichung City Government. Available online: https://english.taichung.gov.tw/education (accessed on 17 March 2022).
- Hua, C.-C. The Importance of Real Price Registration Data to the Compilation and Release of House Price Affordability Indicators. Available online: https://blog.xuite.net/fullland/twblog/173123430-%E5%AF%A6%E5%83%B9%E7%99%BB%E9%8C%84%E8%B3%87%E6%96%99%E5%B0%8D%E6%88%BF%E5%83%B9%E8%B2%A0%E6%93%94%E8%83%BD%E5%8A%9B%E6%8C%87%E6%A8%99%E7%B7%A8%E8%A3%BD%E8%88%87%E7%99%BC%E5%B8%83%E7%9A%84%E9%87%8D%E8%A6%81%E6%80%A7 (accessed on 17 March 2022).
- Rosen, S. Hedonic prices and implicit markets: Product differentiation in pure competition. J. Pol. Econ. 1974, 82, 34–55.
- Gu, M.-F. The Effect of Characteristics of Elementary Schools on House Price—The Case of High-Price Districts of New Taipei City. Available online: https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22107SHU00389008%22.&searchmode=basic (accessed on 17 March 2022).
- Estes, E.A.; Smith, V.K. Price, quality, and pesticide related health risk considerations in fruit and vegetable purchases: An hedonic analysis of Tucson, Arizona supermarkets. J. Food Distrib. Res. 1996, 27, 59–76.
- Combris, P.; Lecocq, S.; Visser, M. Estimation of a hedonic price equation for Bordeaux wine: Does quality matter. In World Scientific Reference on Handbook of the Economics of Wine: Volume 1: Prices, Finance, and Expert Opinion; World Scientific: Singapore, 1997; pp. 167–183.
- Gibbs, J.P.; Halstead, J.M.; Boyle, K.J.; Huang, J.-C. An hedonic analysis of the effects of lake water clarity on New Hampshire lakefront properties. Agric. Resour. Econ. Rev. 2002, 31, 39–46.
- Freccia, D.M.; Jacobsen, J.P.; Kilby, P. Exploring the relationship between price and quality for the case of hand-rolled cigars. Q. Rev. Econ. Financ. 2003, 43, 169–189.
- Connell-Variy, T.; Berggren, B.; McGough, T. Housing markets and resource sector fluctuations: A cross-border comparative analysis. Sustainability 2021, 13, 8918.
- Lin, J.-J.; Chang, Y.-C. The Shop Rents Analysis of Underground Arcades in Taipei Metro System: Application of Hedonic Price Approach. Available online: https://www.airitilibrary.com/Publication/alDetailedMesh?docid=16068238-200606-7-1-47-69-a (accessed on 17 March 2022).
- González-Val, R. House prices and marriage in Spain. Sustainability 2022, 14, 2848.
- Tiebout, C.M. A pure theory of local expenditures. J. Pol. Econ. 1956, 64, 416–424.
- Oates, W.E. The effects of property taxes and local public spending on property values: An empirical study of tax capitalization and the Tiebout hypothesis. J. Pol. Econ. 1969, 77, 957–971.
- Reback, R. House prices and the provision of local public services: Capitalization under school choice programs. J. Urban Econ. 2005, 57, 275–301.
- Gravel, N.; Michelangeli, A.; Trannoy, A. Measuring the social value of local public goods: An empirical analysis within Paris metropolitan area. Appl. Econ. 2006, 38, 1945–1961.
- Lin, L.-W. Applying the Hedonic Price Method to Assess the Benefits of Air Quality Improvement in Taiwan’s Metropolitan Area. Available online: https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22091NTPU0399001%22.&searchmode=basic (accessed on 17 March 2022).
- Yeh, H.S. Estimating the Impact of Air Pollution on Housing Price—An Application of Hedonic Price Method. Available online: https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22081NCCU0303007%22.&searchmode=basic (accessed on 17 March 2022).
- Qiu, Z.H. A Study of Housing Imputed Rent in Taipei City and Taiwan. Available online: http://nccur.lib.nccu.edu.tw/handle/140.119/64366 (accessed on 17 March 2022).
- Sent-ian, W. Price Estimation of Air Pollution in Taipei Metropolitan Area—Application of Hedonic Price Method. Available online: https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/ccd=IP8Kq7/record?r1=1&h1=0 (accessed on 17 March 2022).
- Chiang, Y.-S.; Wang, S.-E.; Lin, Y.-L. The Direct Effect of the Air Pollution Control Fees on Air Quality Improvement. Available online: https://tpl.ncl.edu.tw/NclService/JournalContentDetail?SysId=A00018855&ji%5B0%5D=%E9%81%8B%E8%BC%B8%E8%A8%88%E5%8A%83&cn%5B0%5D=567&q%5B0%5D.f=KW&q%5B0%5D.i=%E7%A9%BA%E6%B0%A3%E6%B1%A1%E6%9F%93&page=1&pageSize=1&orderField=score&orderType=desc (accessed on 17 March 2022).
- Lin, C.-W. A Spatial Analysis of Land Price Based on the Hedonic Price Theory: With the Case Study of the Town House Real Estates in the Old CBD Area of Taichung City in 2008. Master’s Thesis, Graduate Institute of Earth Science, Chinese Culture University, Taipei, Taiwan, 2010. Available online: https://hdl.handle.net/11296/r343he (accessed on 19 May 2022).
- Chen, S.-M. The Effect of Central Taiwan Science Park on Local Housing Price Using Hedonic Price Method. Master’s Thesis, Tunghai University, Taichung, Taiwan, 2012. Available online: https://hdl.handle.net/11296/wjdrab (accessed on 20 April 2022).
- Tsai, M.-C. The Valuation of Climate and Air Quality in Taiwan—An Application of the Hedonic Price Method. Master’s Thesis, Institute of Natural Resources Management, National Taipei University, Taipei, Taiwan, 2015. Available online: https://hdl.handle.net/11296/r6fn54 (accessed on 20 April 2022).
- Wu, Y.-P. The Impact of Air Pollution on Housing Price—A Case Study of Taichung City. Master’s Thesis, Business Administration, National Chung Hsing University, Taichung, Taiwan, 2020. Available online: https://hdl.handle.net/11296/3z9jeb (accessed on 20 April 2022).
- Frawley, W.J.; Piatetsky-Shapiro, G.; Matheus, C.J. Knowledge discovery in databases: An overview. AI Mag. 1992, 13, 57.
- Hand, D.; Mannila, H.; Smyth, P. Principles of Data Mining; MIT Press: Cambridge, MA, USA, 2001.
- Yehuda, R.; Halligan, S.L.; Grossman, R. Childhood trauma and risk for PTSD: Relationship to intergenerational effects of trauma, parental PTSD, and cortisol excretion. Dev. Psychopathol. 2001, 13, 733–753.
- Guevara-Viejó, F.; Valenzuela-Cobos, J.D.; Grijalva-Endara, A.; Vicente-Galindo, P.; Galindo-Villardón, P. Data mining techniques: New method to identify the effects of aquaculture binder with sardine on diets of juvenile litopenaeus vannamei. Sustainability 2022, 14, 4203.
- Chen, Y.-S.; Lin, C.-K.; Lin, Y.-S.; Chen, S.-F.; Tsao, H.-H. Identification of potential valid clients for a sustainable insurance policy using an advanced mixed classification model. Sustainability 2022, 14, 3964.
- Li, M.-F. Analyzing the Learner’s Emotions and Color Relation Framework Uses Data Mining Models. Available online: https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dnclcdr&s=id=%22104NTCT0629001%22.&searchmode=basic (accessed on 17 March 2022).
- Chiang, M.-C. Can Luxury Tax Effectively Suppress Rising Housing Prices?—A Case Study in Taipei Residence. Available online: https://ndltd.ncl.edu.tw/cgi-bin/gs32/gsweb.cgi/login?o=dnclcdr&s=id=%22103YUNT0304007%22.&searchmode=basic (accessed on 17 March 2022).
- Tsao, H.-C.; Lu, C.-J. Assessing the impact of aviation noise on housing prices using new estimated noise value: The case of Taiwan Taoyuan International Airport. Sustainability 2022, 14, 1713.
| Target of Transaction | Target Information | Price Information |
|---|---|---|
| House number at a land section | Total area of land transfer | Total price of house transaction |
| House number at a building section | Total area of building transfer | Total price of real estate transaction |
| Immovable property mark | Total area of parking space transfer | Total price of building transaction |
| Number of buildings per transaction | Division of use area | Total price of parking space transaction |
| | Current layout of the building | Unit price per square meter |
| | Type of parking space | Year and month of the transaction |
| | Type of community management | |
Researcher | Region | Research Time | Research Subject | Research Results |
---|---|---|---|---|
Estes and Smith (1996) [8] | Arizona, United States | 1994 | Fruits and vegetables | The price of fruits and vegetables will be affected by “packaging, size, and organic product label”. |
Combris, Lecocq, and Visser (1997) [9] | Bordeaux, France | 1992 | Bordeaux wine | Wine price will be affected by the objective quality indicated on the bottle. |
Gibbs, Halstead, Boyle, and Huang (2002) [10] | New Hampshire, United States | 1990–1995 | Clarity of lake water | The clarity of lake water affects the price of nearby houses. |
Freccia, Jacobsen, and Kilby (2003) [11] | Cigar production places | 1992–1999 | Cigars | Being made in Cuba has the largest effect on cigar price among all features. |
Connell-Variy, Berggren, and McGough (2021) [12] | Queensland, Australia | 2000–2018 | Local mineral products | Compares how communities in two independent resource regions in different countries depend on the resource sector, and studies how house prices relate to that dependence. |
Author | Real Estate Property Price Features | Analysis Factor |
---|---|---|
Yeh (1993) [20] | Residential house transaction price in 1991 | PM10 |
Lin (1992) [21] | Survey data of the Directorate General of Budget, Accounting, and Statistics in 1989 | Air pollution and odor |
Wu (1995) [22] | Adjusted residential house price in 1994 | TSP |
Chiang, Wang, and Lin (2000) [23] | Investigation of residential house status in Kaohsiung region in 1994 | CO, PM10 |
Lin (2008) [24] | Town House Real Estates in the Old CBD Area of Taichung City | Housing prices and other housing features |
Chen (2012) [25] | Central Taiwan Science Park on Local Housing Price from 2003 to 2012 | Impact of Central Taiwan Science Park on Home Prices |
Tsai (2015) [26] | The value assessment of climatic conditions and air quality in the Taiwan metropolitan area from 2003 to 2012 | Temperature, Rainfall, Air Quality |
Wu (2020) [27] | The effect of air pollution on housing prices in Taichung City from 2016 to 2018 | Rainfall, Season, and Air Pollution Factors |
Scholar | Time | Definition |
---|---|---|
W. Frawley, et al. [28] | 1992 | The nontrivial extraction of implicit, previously unknown, and potentially useful information from data. |
D. Hand, et al. [29] | 2001 | Data mining is the science of searching for useful information in large databases. |
R. Grossman [30] | 2001 | Data mining semi-automatically extracts models from data to discover associations and statistically significant patterns. |
F. Guevara-Viejó, J. D. Valenzuela-Cobos, A. Grijalva-Endara, P. Vicente-Galindo, and P. Galindo-Villardón [31] | 2022 | Uses the k-means clustering algorithm and a PCA biplot to find results that remain stable across the observed values of different parameters. |
Y.-S. Chen, C.-K. Lin, Y.-S. Lin, S.-F. Chen, and H.-H. Tsao [32] | 2022 | Combines seven categories of data mining techniques (decision tree, Bayes, Function, Lazy, Meta, Misc, and Rule) covering 23 important algorithms (classifiers), and identifies the best classifier among them. |
Real Price Registration Item | Item Description |
---|---|
Administrative area | The administrative area where the building being transacted is located |
Year of the transfer | The year when the transaction takes place |
Quarter of the transfer | The quarter when the transaction takes place |
Parking space | Form of the parking space |
Total price in NT | Total transaction price |
Total square meters of land transfer | Total floor area of the house |
Floor of transfer | The floor where the house being transacted is located |
Main use | Division of land-use area |
Whether it has community management | Whether or not it has community management |
Total square meters of building transfer | Indoor area of the house |
Number of living rooms | Number of living rooms |
Number of bathrooms | Number of bathrooms and toilets |
Month of the transaction | The month when the transaction takes place |
Number of bedrooms | Number of bedrooms |
Unit price per square meter | Selling price per square meter of the building interior |
Whether it has partition | Whether it has partition |
House age | The gap between the year/month of the transaction and the year/month of completion |
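
Two of the items above are derived rather than recorded directly. The following is a minimal sketch of how they could be computed, assuming that house age is expressed in years and that unit price is the total price divided by the transferred building area. The function names and the sample transaction are hypothetical and are not taken from the study.

```python
def house_age_years(transaction_ym, completion_ym):
    """Gap between transaction and completion dates, in fractional years.

    Each argument is a (year, month) tuple. Reporting the gap in years is an
    assumption; the table only describes it as a gap between the two dates.
    """
    t_year, t_month = transaction_ym
    c_year, c_month = completion_ym
    months = (t_year - c_year) * 12 + (t_month - c_month)
    if months < 0:
        raise ValueError("transaction precedes completion")
    return months / 12


def unit_price_per_sqm(total_price_ntd, building_sqm):
    """Total transaction price divided by the transferred building area."""
    if building_sqm <= 0:
        raise ValueError("building area must be positive")
    return total_price_ntd / building_sqm


# Hypothetical transaction, used only to illustrate the two formulas.
age = house_age_years((2020, 6), (2010, 3))
price = unit_price_per_sqm(12_000_000, 100.0)
print(age, price)
```

Working in whole months before converting to years avoids off-by-one errors when the transaction month precedes the completion month within the calendar year.
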
Feature (R = 0.913) | Coefficient | Standard Error | t Value | p-Value | 95% CI Lower Bound (0.025) | 95% CI Upper Bound (0.975) |
---|---|---|---|---|---|---|
Administrative area | −0.0131 | 0.001 | −9.504 | 0 | −0.016 | −0.01 |
Total square meters of land transfer | 0.0691 | 0.006 | 12.076 | 0 | 0.058 | 0.08 |
Year of the transfer | 0.0078 | 0.002 | 3.351 | 0.001 | 0.003 | 0.012 |
Quarter of the transfer | −0.003 | 0.002 | −1.775 | 0.076 | −0.006 | 0 |
Floor of transfer | 0.0168 | 0.004 | 3.999 | 0 | 0.009 | 0.025 |
Total floors of the building | 0.0035 | 0.001 | 6.68 | 0 | 0.002 | 0.005 |
Main use | −0.0147 | 0.059 | −0.247 | 0.804 | −0.131 | 0.102 |
House age | −0.003 | 0 | −10.302 | 0 | −0.004 | −0.002 |
Total square meters of building transfer | 0.445 | 0.024 | 18.583 | 0 | 0.398 | 0.492 |
Number of bedrooms | 0.0162 | 0 | 113.126 | 0 | 0.016 | 0.016 |
Number of living rooms | 0.0081 | 0 | 26.198 | 0 | 0.008 | 0.009 |
Number of bathrooms | 0.0025 | 0.0000385 | 65.42 | 0 | 0.002 | 0.003 |
Whether it has partition | 0.0004 | 0 | 1.103 | 0.27 | 0 | 0.001 |
Whether it has community management | 0.0061 | 0.01 | 0.618 | 0.537 | −0.013 | 0.026 |
Unit price per square meter | 0.0966 | 0.003 | 30.977 | 0 | 0.09 | 0.103 |
Parking space | −0.0219 | 0.002 | −12.153 | 0 | −0.025 | −0.018 |
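
The columns of this table are related by simple arithmetic, which can be checked row by row. The sketch below assumes the last two columns are the bounds of a normal-approximation 95% interval (estimate plus or minus 1.96 standard errors) and that the t value is the estimate divided by its standard error. The function name `wald_interval` is illustrative, and the inputs are the rounded values from the "Total square meters of building transfer" row.

```python
def wald_interval(coef, std_err, z=1.96):
    """Return (t statistic, lower bound, upper bound) for one coefficient.

    z = 1.96 is the normal quantile for a two-sided 95% interval; using it
    here is an assumption about how the table was produced.
    """
    t_stat = coef / std_err
    return t_stat, coef - z * std_err, coef + z * std_err


# Rounded inputs from the "Total square meters of building transfer" row.
t_stat, lower, upper = wald_interval(0.445, 0.024)
print(round(t_stat, 2), round(lower, 3), round(upper, 3))
```

The bounds reproduce the tabulated 0.398 and 0.492. The recomputed t statistic is close to, but not exactly, the tabulated 18.583 because the table reports the estimate and standard error to only three or four decimals.
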
The Attributes for Clustering and Their Quantity | Number of Clusters | Within-Cluster Sum of Squares (WSS) |
---|---|---|
17 internal features | 2 | 9203 |
17 internal features | 3 | 8430 |
17 internal features | 4 | 8220 |
17 internal features | 5 | 7950 |
2 features, “total square meters of building transfer” and “total price in NT$” | 2 | 22 |
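
The comparison above rests on the within-cluster sum of squares, the total squared distance from each point to its cluster centroid. The following is a minimal pure-Python sketch of k-means (Lloyd's algorithm) that reports this quantity for several cluster counts. The six two-feature points are synthetic, and the farthest-first seeding is an assumption chosen to keep the run deterministic; neither reflects the study's data or tooling.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def farthest_first_centroids(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    return centroids


def kmeans_wss(points, k, max_iter=100):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centroids = farthest_first_centroids(points, k)
    for _ in range(max_iter):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return sum(
        squared_distance(p, centroids[i])
        for i, members in enumerate(clusters)
        for p in members
    )


# Synthetic (scaled area, scaled price) pairs forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]
wss = {k: kmeans_wss(points, k) for k in (1, 2, 3)}
for k, value in wss.items():
    print(k, round(value, 4))
```

On this toy data the WSS drops sharply from one cluster to two and only slightly from two to three, which is the pattern usually read as support for the smaller cluster count.
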
Administrative Area | Egg Yolk District | Egg White District | Total |
---|---|---|---|
Dadu District | 0 | 6 | 6 |
Daya District | 0 | 46 | 46 |
Taiping District | 0 | 18 | 18 |
Beitun District | 85 | 2131 | 2216 |
North District | 92 | 838 | 930 |
Xitun District | 615 | 2328 | 2943 |
West District | 35 | 289 | 324 |
Shalu District | 0 | 17 | 17 |
East District | 0 | 72 | 72 |
Nantun District | 469 | 1795 | 2264 |
South District | 1 | 561 | 562 |
Shengang District | 0 | 14 | 14 |
Tanzi District | 0 | 150 | 150 |
Longjing District | 0 | 14 | 14 |
Fengyuan District | 0 | 136 | 136 |
Qingshui District | 0 | 62 | 62 |
Wuqi District | 0 | 5 | 5 |
Dali District | 0 | 2 | 2 |
Dajia District | 0 | 4 | 4 |
Total | 1297 | 8488 | 9785 |
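
The counts above can be turned into shares to compare districts of different sizes. The sketch below copies the rows that contain at least one egg yolk transaction and computes the egg yolk share per district and overall. The variable names are illustrative.

```python
# (egg yolk count, district total) for districts with at least one
# egg yolk transaction, copied from the table above.
counts = {
    "Beitun District": (85, 2216),
    "North District": (92, 930),
    "Xitun District": (615, 2943),
    "West District": (35, 324),
    "Nantun District": (469, 2264),
    "South District": (1, 562),
}

# Share of each district's transactions that fall in the egg yolk cluster.
shares = {district: yolk / total for district, (yolk, total) in counts.items()}

# Overall share, using the table's grand totals.
overall_share = 1297 / 9785

for district, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{district}: {share:.1%}")
print(f"All districts: {overall_share:.1%}")
```

Xitun and Nantun have the highest shares, each with roughly one in five transactions in the egg yolk cluster, against about 13% across all districts.
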
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lee, M.-f.; Chen, G.-s.; Lin, S.-p.; Wang, W.-j. A Data Mining Study on House Price in Central Regions of Taiwan Using Education Categorical Data, Environmental Indicators, and House Features Data. Sustainability 2022, 14, 6433. https://doi.org/10.3390/su14116433