Early Identification of Abused Domains in TLD through Passive DNS Applying Machine Learning Techniques