Feature selection using a combination of Genetic-Whale-Ant colony algorithms for software fault prediction by machine learning
Journal: Electronic and Cyber Defense (پدافند الکترونیکی و سایبری)
Article 4, Volume 10, Issue 1 (Serial No. 37), Khordad 1401 (Solar Hijri), Pages 33-45
Article type: Research article
Authors
Ali Karimi* 1; Mohammadreza Irajimoghaddam 2; Esmaeil Bastami 3
1 Assistant Professor, Imam Hossein University, Tehran, Iran
2 Master's degree, Imam Hossein University, Tehran, Iran
3 Researcher, Imam Hossein University, Tehran, Iran
Received: 12 Ordibehesht 1400; Revised: 21 Tir 1400; Accepted: 05 Shahrivar 1400
Abstract
Software fault prediction methods are used to predict fault-prone modules in the early stages of software development. Machine learning techniques are currently the most widely used techniques in software fault prediction. High data dimensionality is one of the problems that affect the performance of machine learning algorithms. It implies the presence of irrelevant or redundant features that may mislead the learning algorithm and hence decrease its accuracy. Low prediction accuracy causes some faulty modules to be detected late, which abnormally increases the effort and cost of fixing faults. Solving the high-dimensionality problem is therefore necessary to increase the accuracy of software fault prediction. Researchers use feature selection algorithms for dimensionality reduction. These algorithms are divided into two types: filter-based and wrapper-based. Wrapper-based algorithms lead to prediction models with higher accuracy. They can use different methods to search for good solutions, the best of which is metaheuristic search. Each metaheuristic algorithm has strengths and weaknesses, so researchers combine these algorithms to address the weaknesses. In this research, a combination of the genetic, ant colony, and whale optimization algorithms is used for wrapper-based feature selection, in order to address the weaknesses of each individual algorithm. Applying early software fault prediction methods before actual testing is an effective passive defense technique for reducing software system development costs. The proposed method is evaluated on 19 software projects and compared with other methods. The results show that the proposed method outperforms the other methods.
Keywords
Software Fault Prediction; Feature Selection; Metaheuristic Algorithm; Genetic Algorithm; Whale Optimization Algorithm; Ant Colony Optimization Algorithm
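The idea of wrapper-based feature selection driven by a metaheuristic search, as described in the abstract, can be illustrated with a minimal sketch. This is not the authors' hybrid genetic-whale-ant colony method: it uses a plain genetic algorithm only, and the nearest-centroid classifier, the synthetic data, the fitness weight `alpha`, and all function names are assumptions made for illustration.

```python
import random


def make_data(n=120, n_informative=2, n_noise=6, seed=0):
    """Synthetic two-class data (assumption): the first columns carry signal, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


def nearest_centroid_accuracy(X_train, y_train, X_val, y_val, mask):
    """Wrapper step: fit and score a nearest-centroid classifier on the selected features only."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # an empty feature subset cannot classify anything
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for x, t in zip(X_val, y_val):
        pred = min(
            centroids,
            key=lambda c: sum((x[j] - m) ** 2 for j, m in zip(idx, centroids[c])),
        )
        correct += pred == t
    return correct / len(y_val)


def fitness(mask, split, alpha=0.99):
    """Weighted sum of validation accuracy and the fraction of features removed (alpha is assumed)."""
    acc = nearest_centroid_accuracy(*split, mask)
    return alpha * acc + (1 - alpha) * (1 - sum(mask) / len(mask))


def genetic_search(split, n_features, pop_size=20, generations=20, p_mut=0.1, seed=1):
    """Plain genetic algorithm over binary feature masks, with elitism."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    pop[0] = [1] * n_features  # seed with the all-features baseline
    best = max(pop, key=lambda m: fitness(m, split))
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: the best mask always survives
        while len(new_pop) < pop_size:
            # tournament selection of two parents
            p1 = max(rng.sample(pop, 3), key=lambda m: fitness(m, split))
            p2 = max(rng.sample(pop, 3), key=lambda m: fitness(m, split))
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]  # bit-flip mutation
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=lambda m: fitness(m, split))
    return best


X, y = make_data()
split = (X[:80], y[:80], X[80:], y[80:])  # train / validation
best_mask = genetic_search(split, n_features=len(X[0]))
print("selected mask:", best_mask)
print("fitness:", round(fitness(best_mask, split), 3))
```

Because the mask is chosen on the same validation split that scores it, the reported fitness is optimistic; a real study would normally use cross-validation inside the fitness function and a separate held-out test set. A hybrid scheme such as the one the abstract names would replace or augment `genetic_search` with additional search operators.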