Karafan Journal

Karafan Journal

Clustering based on Affinity Matrix First Search-Breadth Graph

Document Type : Original Article

Author
Assistant Professor, Department of Computer Engineering, National University of Skills (NUS), Tehran, Iran
Abstract
Density-based clustering algorithms are commonly used in machine learning and data mining due to their ability to identify clusters with different shapes and noisy objects. These algorithms are famous in data analysis and the use of their analysis output in industry and business. However, traditional clustering algorithms may have difficulty with datasets with different densities and overlapping neighboring clusters. To address these challenges, a new density-based clustering algorithm is proposed in this article. In this algorithm, the dependency matrix and the first -level search graph are used to find the dense points and the connection between the points, the concept of the relevant space is introduced to define the local and global density, and a central point identification method is used to identify the cluster structures. This algorithm also uses an allocation strategy based on the relevant space for the remaining objects to achieve accurate clustering results. Experimental results on real data sets show the effectiveness of the proposed method in clustering performance.
Keywords
Subjects

[1] Chen, Y., Hu, X., Fan, W., Shen, L., Zhang, Z., Liu, X., Du, J., Li, H., Chen, Y., & Li, H. (2020). Fast density peak clustering for large scale data based on kNN. Knowledge-Based Systems, 187, 104824, https://doi.org/10.1016/j.knosys.2019.06.032.
[2] Jan, Z., Ai-Ansari, N., Mousa, O., Abd-Alrazaq, A., Ahmed, A., Alam, T., & Househ, M. (2021). The role of machine learning in diagnosing bipolar disorder: scoping review. Journal of medical Internet research, 23(11), e29749,  https://preprints.jmir.org/preprint/29749.
[3] Wen, J., Xuan, S., Li, Y., Peng, Q., & Gao, Q. (2020). Image segmentation algorithm based on neutrosophic fuzzy clustering with non‐local information. IET Image Processing, 14(3), 576-584, https://doi.org/10.1049/iet-ipr.2018.5949.
[4] Zou, Q., Lin, G., Jiang, X., Liu, X., & Zeng, X. (2020). Sequence clustering in bioinformatics: an empirical study. Briefings in bioinformatics, 21(1), 1-10, https://doi.org/10.1093/bib/bby090.
[5] Tombros, A., Villa, R., & Van Rijsbergen, C. J. (2002). The effectiveness of query-specific hierarchic clustering in information retrieval. Information processing & management, 38(4), 559-582, https://doi.org/10.1016/S0306-4573(01)00048-6.
[6] Jain, A. K. (2008). Data clustering: 50 years beyond k-means. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, https://doi.org/10.1007/978-3-540-87479-9_3.
[7] Jin, H., Leung, K.-S., Wong, M.-L., & Xu, Z.-B. (2005). Scalable model-based cluster analysis using clustering features. Pattern Recognition, 38(5), 637-649, https://doi.org/10.1016/j.patcog.2004.07.012.
[8] Murtagh, F., & Contreras, P. (2012). Algorithms for hierarchical clustering: an overview. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2(1), 86-97,  https://doi.org/10.1002/widm.53.
[9] Kriegel, H. P., Kröger, P., Sander, J., & Zimek, A. (2011). Density‐based clustering. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 1(3), 231-240,  https://doi.org/10.1002/widm.30.
[10] Kumar, K. M., & Reddy, A. R. M. (2016). A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method. Pattern Recognition, 58, 39-48, https://doi.org/10.1016/j.patcog.2016.03.008.
[11] Tang, C., Wang, H., Wang, Z., Zeng, X., Yan, H., & Xiao, Y. (2021). An improved OPTICS clustering algorithm for discovering clusters with uneven densities. Intelligent Data Analysis, 25(6), 1453-1471, https://doi.org/10.3233/IDA-205497.
[12] Hu, L., & Chan, K. C. (2015). A density-based clustering approach for identifying overlapping protein complexes with functional preferences. BMC bioinformatics, 16, 1-16, https://doi.org/10.1186/s12859-015-0583-3.
[13] Rodriguez, A., & Laio, A. (2014). Clustering by fast search and find of density peaks. science, 344(6191), 1492-1496, DOI: 10.1126/science.1242072.
[14] Guo, W., Wang, W., Zhao, S., Niu, Y., Zhang, Z., & Liu, X. (2022). Density peak clustering with connectivity estimation. Knowledge-Based Systems, 243, 108501, https://doi.org/10.1016/j.knosys.2022.108501.
[15] Xu, X., Ding, S., & Shi, Z. (2018). An improved density peaks clustering algorithm with fast finding cluster centers. Knowledge-Based Systems, 158, 65-74, https://doi.org/10.1016/j.knosys.2018.05.034.
[16] Xie, J., Gao, H., Xie, W., Liu, X., & Grant, P. W. (2016). Robust clustering by detecting density peaks and assigning points based on fuzzy weighted K-nearest neighbors. Information Sciences, 354, 19-40, https://doi.org/10.1016/j.ins.2016.03.011.
[17] Xu, X., Ding, S., Wang, L., & Wang, Y. (2020). A robust density peaks clustering algorithm with density-sensitive similarity. Knowledge-Based Systems, 200, 106028, https://doi.org/10.1016/j.knosys.2020.106028.
[18] Hou, J., Zhang, A., & Qi, N. (2020). Density peak clustering based on relative density relationship. Pattern Recognition, 108, 107554, https://doi.org/10.1016/j.patcog.2020.107554.
[19] Tong, W., Liu, S., & Gao, X.-Z. (2021). A density-peak-based clustering algorithm of automatically determining the number of clusters. Neurocomputing, 458, 655-666, https://doi.org/10.1016/j.neucom.2020.03.125.
[20] Zhu, Y., Ting, K. M., Carman, M. J., & Angelova, M. (2021). CDF Transform-and-Shift: An effective way to deal with datasets of inhomogeneous cluster densities. Pattern Recognition, 117, 107977, https://doi.org/10.1016/j.patcog.2021.107977.
[21] Yaohui, L., Zhengming, M., & Fang, Y. (2017). Adaptive density peak clustering based on K-nearest neighbors with aggregating strategy. Knowledge-Based Systems, 133, 208-220, https://doi.org/10.1016/j.knosys.2017.07.010.
[22] Wang, Y., Wang, D., Zhang, X., Pang, W., Miao, C., Tan, A.-H., & Zhou, Y. (2020). McDPC: multi-center density peak clustering. Neural Computing and Applications, 32, 13465-13478, https://doi.org/10.1007/s00521-020-04754-5.
[23] Ding, S., Du, W., Xu, X., Shi, T., Wang, Y., & Li, C. (2023). An improved density peaks clustering algorithm based on natural neighbor with a merging strategy. Information Sciences, 624, 252-276, https://doi.org/10.1016/j.ins.2022.12.078.
[24] Pourbahrami, S., Balafar, M. A., & Khanli, L. M. (2023). ASVMK: A novel SVMs Kernel based on Apollonius function and density peak clustering. Engineering Applications of Artificial Intelligence, 126, 106704, https://doi.org/10.1016/j.engappai.2023.106704.
[25] Pourbahrami, S., & Hashemzadeh, M. (2022). A geometric-based clustering method using natural neighbors. Information Sciences, 610, 694-706, https://doi.org/10.1016/j.ins.2022.08.047.
[26] Pourbahrami, S., Balafar, M. A., Khanli, L. M., & Kakarash, Z. A. (2020). A survey of neighborhood construction algorithms for clustering and classifying data points. Computer Science Review, 38, 100315, https://doi.org/10.1016/j.cosrev.2020.100315.
[27] Pourbahrami, S., Khanli, L. M., & Azimpour, S. (2019). A novel and efficient data point neighborhood construction algorithm based on Apollonius circle. Expert Systems with Applications, 115, 57-67, https://doi.org/10.1016/j.eswa.2018.07.066.
[28] Abdolmaleki, N., Khanli, L. M., Hashemzadeh, M., & Pourbahrami, S. (2022). ACQC: Apollonius Circle‐based Quantum Clustering. Journal of Computational Science, 64, 101877, https://doi.org/10.1016/j.jocs.2022.101877.
[29] Pourbahrami, S., Khanli, L. M., & Azimpour, S. (2020). Improving neighborhood construction with Apollonius region algorithm based on density for clustering. Information Sciences, 522, 227-240, https://doi.org/10.1016/j.ins.2020.02.049.
[30] Sundqvist, M., Chiquet, J., & Rigaill, G. (2023). Adjusting the adjusted Rand Index: A multinomial story. Computational Statistics, 38(1), 327-347, https://doi.org/10.1007/s00180-022-01230-7.
[31] Amelio, A., & Pizzuti, C. (2017). Correction for closeness: Adjusting normalized mutual information measure for clustering comparison. Computational Intelligence, 33(3), 579-601,  https://doi.org/10.1111/coin.12100.
 
Volume 22, Issue 1
Technical and Engineering
Spring 2025
Pages 36-59

  • Receive Date 20 August 2024
  • Revise Date 03 November 2024
  • Accept Date 24 November 2024