dc.contributor.author | Fazlali, Mahmood | |
dc.contributor.author | Mirhosseini, Mina | |
dc.contributor.author | Shahsavari, Mahyar | |
dc.contributor.author | Shafarenko, Alex | |
dc.contributor.author | Mashinchi, Mashaallah | |
dc.date.accessioned | 2024-03-28T17:15:01Z | |
dc.date.available | 2024-03-28T17:15:01Z | |
dc.date.issued | 2024-03-04 | |
dc.identifier.citation | Fazlali, M, Mirhosseini, M, Shahsavari, M, Shafarenko, A & Mashinchi, M 2024, GPU-based Parallel Technique for Solving the N-Similarity Problem in Textual Data Mining. in 2024 Third International Conference on Distributed Computing and High Performance Computing (DCHPC). Institute of Electrical and Electronics Engineers (IEEE), pp. 1-6, 2024 Third International Conference on Distributed Computing and High Performance Computing (DCHPC), Tehran, Iran, Islamic Republic of, 14/04/24. https://doi.org/10.1109/DCHPC60845.2024.10454074 | |
dc.identifier.citation | conference | |
dc.identifier.isbn | 979-8-3503-8158-0 | |
dc.identifier.other | ORCID: /0000-0002-1701-5562/work/156578386 | |
dc.identifier.uri | http://hdl.handle.net/2299/27685 | |
dc.description | © 2024 IEEE. This is the accepted manuscript version of an article which has been published in final form at https://doi.org/10.1109/DCHPC60845.2024.10454074 | |
dc.description.abstract | An important issue in data mining and information retrieval is the problem of multiple similarity or n-similarity. This problem entails finding a group of n data points with the highest similarity within a large dataset. Exact methods to solve this problem exist but come with high time and space complexities. Additionally, various metaheuristic algorithms have been proposed, including genetic algorithms, gravitational search algorithms, particle swarm optimization, imperialist competitive algorithms, and fuzzy imperialist competitive algorithms. These metaheuristics are capable of finding near-optimal solutions within a reasonable timeframe, although there is no guarantee of achieving exact results. In this paper, we employ a parallelization technique using CUDA to expedite the exact method. We conduct experiments on textual datasets to identify a group of n textual documents with the highest similarity to each other. The experimental results demonstrate that the proposed parallel exact method significantly reduces execution time compared to the best sequential approach and CPU multi-core implementation. Furthermore, it is evident that the proposed method requires less memory space than the exact method. | en |
dc.format.extent | 6 | |
dc.format.extent | 1115797 | |
dc.language.iso | eng | |
dc.publisher | Institute of Electrical and Electronics Engineers (IEEE) | |
dc.relation.ispartof | 2024 Third International Conference on Distributed Computing and High Performance Computing (DCHPC) | |
dc.subject | multiple similarity | |
dc.subject | n-similarity | |
dc.subject | parallel programming | |
dc.subject | text document similarity | |
dc.subject | Artificial Intelligence | |
dc.subject | Decision Sciences (miscellaneous) | |
dc.subject | Control and Optimization | |
dc.subject | Safety, Risk, Reliability and Quality | |
dc.subject | Computer Networks and Communications | |
dc.subject | Modelling and Simulation | |
dc.title | GPU-based Parallel Technique for Solving the N-Similarity Problem in Textual Data Mining | en |
dc.contributor.institution | Department of Computer Science | |
dc.contributor.institution | School of Physics, Engineering & Computer Science | |
dc.contributor.institution | Centre for AI and Robotics Research | |
dc.contributor.institution | Cybersecurity and Computing Systems | |
dc.contributor.institution | Networks and Security Research Centre | |
dc.date.embargoedUntil | 2026-02-04 | |
dc.identifier.url | http://www.scopus.com/inward/record.url?scp=85187778299&partnerID=8YFLogxK | |
rioxxterms.versionofrecord | 10.1109/DCHPC60845.2024.10454074 | |
rioxxterms.type | Other | |
herts.preservation.rarelyaccessed | true | |