Detecting Fake Users in Social Networks Using Principal Component Analysis and the Kernel Density Estimation Algorithm (Case Study: The Twitter Social Network)
Electronic and Cyber Defense
Volume 9, Issue 3 - Serial Number 35, Azar 1400 (November-December 2021), Pages 109-123
Article type: Research article
Author
Mohammadreza Mohammadrezaei*
Instructor, Department of Computer, Islamic Azad University, Ramhormoz Branch, Ramhormoz, Iran
Received: 15 Azar 1399; Revised: 28 Bahman 1399; Accepted: 29 Bahman 1399
Abstract
The use of social networks is growing rapidly, and people spend a large share of their time on them. Celebrities and companies use these networks to communicate with their fans and customers, and news agencies use them to distribute news. As online social networks have become more popular and widespread, security risks and threats have grown as well, and malicious activities and attacks such as phishing, fake-user creation, and spam have increased markedly. In a fake-user attack, malicious users create a fake user to pass themselves off as someone else and thereby exploit the reputation of individuals or companies. This paper presents a new method, based on machine learning algorithms, for detecting fake users in social networks. To train the model, the proposed method uses several similarity features, including cosine similarity, Jaccard similarity, friendship-network similarity, and centrality measures, all of which are extracted from the adjacency matrix of the social network graph. Principal component analysis is then applied to reduce the dimensionality of the data and to mitigate overfitting. Next, the data are classified with kernel density estimation classifiers and a self-organizing neural network, and the results are evaluated in terms of accuracy, sensitivity, and false-detection rate. The results show that the proposed method detects fake users with 99.6% accuracy, an improvement of about 5% over Cao's method; the false-detection rate for fake users also improved by 3% relative to the same method.
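The similarity features named in the abstract can be illustrated with a minimal sketch. It assumes an undirected graph stored as a 0/1 adjacency matrix and computes Jaccard and cosine similarity between the neighbor sets of two users. The toy matrix `ADJ` and the helper names are hypothetical and are not taken from the paper, which may define these features differently.

```python
import math

def neighbor_sets(adj):
    """Return the neighbor set of every node from a 0/1 adjacency matrix."""
    return [{j for j, v in enumerate(row) if v} for row in adj]

def jaccard(a, b):
    """Jaccard similarity: shared neighbors divided by all distinct neighbors."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine(a, b):
    """Cosine similarity of two binary adjacency rows.

    For 0/1 vectors the dot product equals the number of common neighbors
    and each vector norm equals the square root of the node degree.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

# Toy undirected graph with 4 users (hypothetical data, for illustration only).
ADJ = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
NBRS = neighbor_sets(ADJ)

print(jaccard(NBRS[0], NBRS[1]))  # 0.25
print(cosine(NBRS[0], NBRS[1]))   # 1 / sqrt(6), about 0.408
```

In a feature-extraction step, values like these would be computed for each user against its neighbors and collected into one feature vector per user.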
Keywords
Fake accounts; Social networks; Graph analysis; Kernel density estimation algorithm
Article Title [English]
Detecting Fake Accounts in Social Networks Using Principal Components Analysis and Kernel Density Estimation Algorithm (A Case Study on the Twitter Social Network)
Authors [English]
Mohammadreza Mohammadrezaei
Instructor, Department of Computer, Islamic Azad University, Ramhormoz Branch, Ramhormoz, Iran
Abstract [English]
Social network use continues to grow, and people spend much of their time on these networks. Celebrities and companies use them to connect with their fans and customers, and news agencies use them to publish news. In line with the growing popularity of online social networks, security risks and threats are also increasing, and malicious activities and attacks such as phishing, fake account creation, and spam have increased significantly on these networks. In a fake account attack, malicious users create a fake account to impersonate other people and so abuse the reputation of individuals or companies. This paper presents a new method for detecting fake accounts in social networks based on machine learning algorithms. For training, the proposed method uses various similarity features such as cosine similarity, Jaccard similarity, friendship network similarity, and centrality measures. All of these features are extracted from the adjacency matrix of the social network graph. Principal component analysis is then used to reduce the dimensionality of the data and to address overfitting. The data are then classified using kernel density estimation classification and a self-organizing map, and the results of the proposed method are evaluated using accuracy, sensitivity, and false-positive rate. The results show that the proposed method detects fake accounts with 99.6% accuracy, which is about 5% better than Cao's method. The false-detection rate for fake accounts also improved by 3% compared to the same method.
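The dimensionality-reduction and classification stages described above can be sketched as follows. This is a rough illustration under stated assumptions, not the author's implementation: the feature matrix is synthetic (two Gaussian clusters standing in for genuine and fake accounts), the bandwidth of 0.5 and the two retained components are arbitrary choices, and the self-organizing map mentioned in the abstract is not shown. All function names are hypothetical.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the feature mean and the top principal axes of X (rows are samples)."""
    mean = X.mean(axis=0)
    # Rows of vt are principal axes, ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, axes):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ axes.T

def kde_log_density(train, query, bandwidth):
    """Log of a Gaussian kernel density estimate built on `train`, evaluated at `query`."""
    d = train.shape[1]
    sq = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    log_k = -sq / (2 * bandwidth**2) - 0.5 * d * np.log(2 * np.pi * bandwidth**2)
    # Numerically stable log-mean-exp over the training points.
    peak = log_k.max(axis=1, keepdims=True)
    return (peak + np.log(np.exp(log_k - peak).mean(axis=1, keepdims=True))).ravel()

def kde_classify(X_train, y_train, X_test, bandwidth=0.5):
    """Assign each test sample to the class with the highest prior-weighted density."""
    classes = np.unique(y_train)
    scores = []
    for c in classes:
        Xc = X_train[y_train == c]
        log_prior = np.log(len(Xc) / len(X_train))
        scores.append(kde_log_density(Xc, X_test, bandwidth) + log_prior)
    return classes[np.argmax(np.stack(scores, axis=1), axis=1)]

def evaluate(y_true, y_pred):
    """Accuracy, sensitivity, and false-positive rate, with label 1 meaning fake."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return (tp + tn) / len(y_true), tp / (tp + fn), fp / (fp + tn)

# Synthetic stand-in for the graph-derived features (assumption, not real data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (200, 6)), rng.normal(4.0, 1.0, (200, 6))])
y = np.array([0] * 200 + [1] * 200)
test_mask = np.arange(len(y)) % 4 == 0  # every fourth sample is held out

MEAN, AXES = pca_fit(X[~test_mask], n_components=2)
Z_TRAIN = pca_transform(X[~test_mask], MEAN, AXES)
Z_TEST = pca_transform(X[test_mask], MEAN, AXES)
Y_PRED = kde_classify(Z_TRAIN, y[~test_mask], Z_TEST)
ACC, SENS, FPR = evaluate(y[test_mask], Y_PRED)
print(f"accuracy={ACC:.3f} sensitivity={SENS:.3f} false_positive_rate={FPR:.3f}")
```

Fitting the PCA on the training rows only keeps the held-out samples from influencing the projection, which matters when the reported accuracy is meant to reflect unseen accounts.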
Keywords [English]
Fake Accounts, Social Networks, Graph Analysis, Kernel Density Estimation Algorithm