Feature Selection and Dynamic Network Traffic Congestion Classification based on Machine Learning for Internet of Things



  • Ahmed A. Elngar, Faculty of Computers and Artificial Intelligence, Beni-Suef University, Beni-Suef City, Egypt
  • Adriana Burlea-Schiopoiu, Professor of Management, University of Craiova, Romania




Machine learning, Privacy-preserving classification, Data set composition, Network traffic classification


A network traffic congestion classifier is an essential component of network monitoring systems. Network traffic characterization is a methodology that classifies traffic into several classes according to its attributes. This paper proposes payload-based classification for network traffic characterization. The approach has a broad range of uses, including network security assessment, intrusion detection, and QoS provisioning, and it also helps in investigating suspicious activity in the network. Supervised classification techniques such as Support Vector Machines and unsupervised clustering methods such as K-Means are widely used in traffic classification. Under current network conditions, limited labeled data and unfamiliar applications degrade the performance of conventional classification procedures. This paper implements a methodology for network traffic classification for the Internet of Things (IoT) that combines clustering, feature extraction, and feature selection. K-Means is used to cluster the network traffic datasets, and feature extraction is performed on the clustered data. KNN, Naïve Bayes, and Decision Tree classifiers then classify the network traffic based on the extracted features, which allows the performance of these classification algorithms to be compared. The results identify the best-performing machine learning algorithm for network congestion classification: K-Means clustering combined with Decision Tree classification achieves a higher accuracy, 86.45%, than the other clustering and classification methods compared.
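The cluster-then-classify idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the two synthetic flow features, the class names, the helper names (make_flows, kmeans, cluster_features, knn_predict, run_pipeline), and the choice of "distance to each centroid" as the extracted feature. Only the K-Means and KNN stages are shown; Naïve Bayes and Decision Tree are omitted for brevity, so the sketch says nothing about the reported 86.45% figure.

```python
import math
import random
from collections import Counter


def make_flows(n_per_class, seed=0):
    """Synthetic flows with two hypothetical normalized features per flow."""
    rng = random.Random(seed)
    centers = {"normal": (0.3, 0.2), "congested": (0.8, 0.7)}  # assumed values
    flows = []
    for label, (a, b) in centers.items():
        for _ in range(n_per_class):
            flows.append(([rng.gauss(a, 0.08), rng.gauss(b, 0.08)], label))
    rng.shuffle(flows)
    return flows


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd-style K-Means; returns the k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, g in enumerate(groups):
            if g:  # keep the old centroid if a cluster ends up empty
                centroids[i] = [sum(col) / len(g) for col in zip(*g)]
    return centroids


def cluster_features(p, centroids):
    """Assumed feature extraction: raw features plus distance to each centroid."""
    return list(p) + [dist(p, c) for c in centroids]


def knn_predict(train, x, k=3):
    """Majority vote among the k training items closest to x."""
    neighbours = sorted(train, key=lambda item: dist(item[0], x))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


def run_pipeline(seed=0):
    """Cluster the training flows, derive features, classify held-out flows."""
    flows = make_flows(60, seed)
    split = int(0.7 * len(flows))
    train, test = flows[:split], flows[split:]
    centroids = kmeans([x for x, _ in train], k=2, seed=seed)
    train_f = [(cluster_features(x, centroids), y) for x, y in train]
    hits = sum(
        knn_predict(train_f, cluster_features(x, centroids)) == y for x, y in test
    )
    return hits / len(test)


if __name__ == "__main__":
    print(f"held-out accuracy on synthetic flows: {run_pipeline():.2f}")
```

The centroids are fitted on the training split only, so the derived features for held-out flows do not leak test information. On real traffic the features would come from captured flows rather than a random generator, and the classifier comparison would be run on the same derived features for each algorithm.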






How to Cite

Elngar, A., & Burlea-Schiopoiu, A. (2023). Feature Selection and Dynamic Network Traffic Congestion Classification based on Machine Learning for Internet of Things. Wasit Journal of Computer and Mathematics Science, 2(2), 76–91. https://doi.org/10.31185/wjcms.150