Analysis Comparison of K-Nearest Neighbor, Multi-Layer Perceptron, and Decision Tree Algorithms in Diamond Price Prediction

Authors

  • Ahya Radiatul Kamila, Data Science Study Program, Universitas Bunda Mulia
  • Johanes Fernandes Andry, Information Systems Study Program, Universitas Bunda Mulia
  • Adi Wahyu Candra Kusuma, Data Science Study Program, Universitas Bunda Mulia
  • Eko Wahyu Prasetyo, Data Science Study Program, Universitas Bunda Mulia
  • Gerry Hudera Derhass, Computer Science Study Program, Institut Pertanian Bogor

DOI:

https://doi.org/10.31154/cogito.v10i2.532.298-311

Keywords:

Decision Tree, Diamond Price Prediction, K-Nearest Neighbor, Machine Learning, Multi-Layer Perceptron

Abstract

Predicting diamond prices matters because demand for these gemstones, valued both as investments and as jewelry, is high. Diamonds are expensive because of their rarity and the cost of extraction. Their prices depend on primary factors, namely the diamond's inherent value, and on secondary factors such as marketing costs, brand names, and market trends. This variation often confuses customers and can lead to investment losses. This research aims to help investors determine the true price of a diamond based solely on its intrinsic value, excluding secondary factors. A machine learning approach was used to predict diamond prices from these primary determinants. Three models, Multi-Layer Perceptron (MLP), Decision Tree, and K-Nearest Neighbor (KNN), were compared, each with manual hyperparameter tuning, to identify the best-performing algorithm. Model performance was evaluated using Mean Absolute Percentage Error (MAPE), Mean Absolute Error (MAE), and Mean Squared Error (MSE). KNN gave the best results, with MAPE, MAE, and MSE values of 1.1%, 0.00038, and 2.687 × 10^-6, respectively. The study offers investors a way to estimate diamond prices from fundamental attributes, minimizing the impact of secondary factors.
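
For readers unfamiliar with the three evaluation metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions. It is not the authors' code. The function names and the sample values are hypothetical and chosen only for illustration, and MAPE as written here assumes that no actual value is zero.

```python
def mae(y_true, y_pred):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean Squared Error: average of (actual - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent.

    Assumes every actual value is non-zero (division by the actual value).
    """
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values for illustration only; these are not data from the study.
actual = [0.10, 0.20, 0.40]
predicted = [0.11, 0.19, 0.40]

print(f"MAE:  {mae(actual, predicted):.5f}")
print(f"MSE:  {mse(actual, predicted):.3e}")
print(f"MAPE: {mape(actual, predicted):.2f}%")
```

MAE and MSE are expressed in the units of the target (and its square), so their magnitude depends on how prices are scaled, whereas MAPE is scale-free. This is why the three figures reported for a single model can differ by several orders of magnitude.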

Published

2024-12-31

How to Cite

Kamila, A. R., Andry, J. F., Kusuma, A. W. C., Prasetyo, E. W., & Derhass, G. H. (2024). Analysis Comparison of K-Nearest Neighbor, Multi-Layer Perceptron, and Decision Tree Algorithms in Diamond Price Prediction. CogITo Smart Journal, 10(2), 298–311. https://doi.org/10.31154/cogito.v10i2.532.298-311