CLASSIFICATION OF IMBALANCED DATA USING THE SMOTE AND k-NEAREST NEIGHBOR ALGORITHMS

Rimbun Siringoringo

Author

Rimbun Siringoringo, Universitas Methodist Indonesia

Abstract

Imbalanced data classification is a crucial problem in machine learning and data mining. Class imbalance degrades classification results because minority-class instances are often misclassified as the majority class. k-Nearest Neighbor is one of the most popular and simplest classification methods, but it has no built-in mechanism for handling imbalanced datasets. In this study, the Synthetic Minority Over-sampling Technique (SMOTE) was applied to address the class imbalance problem on the Credit Card Fraud dataset. Under a 10-fold cross-validation evaluation scheme, SMOTE increased the mean G-Mean from 53.4% to 81.0% and the mean F-Measure from 38.7% to 81.8%.

Keywords: Class imbalance, Synthetic Minority Over-sampling Technique, k-Nearest Neighbor
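
The approach named in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it uses only the Python standard library, a tiny made-up 2-D dataset, and hypothetical helper names (`smote`, `knn_predict`, `g_mean_and_f_measure`). It assumes Euclidean distance, a binary problem with the minority class labeled 1, and at least two minority points. The 10-fold cross-validation loop is omitted for brevity.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of the base point, excluding itself.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda xy: math.dist(xy[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


def g_mean_and_f_measure(y_true, y_pred, positive=1):
    """G-Mean = sqrt(sensitivity * specificity); F-Measure = harmonic
    mean of precision and sensitivity, for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0
    return g_mean, f_measure


# Illustrative toy data (made up): 6 majority points, 2 minority points.
majority = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
            (0.3, 0.3), (0.2, 0.4), (0.4, 0.2)]
minority = [(1.0, 1.0), (1.2, 0.9)]

# Oversample the minority class until both classes have the same size.
new_points = smote(minority, n_new=len(majority) - len(minority), k=1)
train_x = majority + minority + new_points
train_y = [0] * len(majority) + [1] * (len(minority) + len(new_points))

prediction = knn_predict(train_x, train_y, (1.1, 0.95), k=3)
```

Because each synthetic point is drawn on the line segment between two existing minority points, oversampling fills in the minority region rather than duplicating records, which is what gives a distance-based classifier such as k-NN more minority neighbours to vote with. In practice, oversampling would be applied only to the training folds of each cross-validation split so that synthetic points never leak into the test fold.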

