Plagiarism Detection Application for Computer Science Student Theses Using Cosine Similarity and Rabin-Karp
Abstract
Plagiarism detection is critical to maintaining academic integrity, particularly in higher education. This study develops a plagiarism detection application for Computer Science student theses that combines the Cosine Similarity and Rabin-Karp algorithms to detect textual similarities accurately and efficiently. The application is written in JavaScript, offers an intuitive interface, and lets users upload thesis documents, analyze their textual content, and measure plagiarism levels by comparing them against an existing dataset. Cosine Similarity measures the overall similarity between documents, while Rabin-Karp identifies exact matches in phrases and sentences. For titles, Cosine Similarity reported 100% similarity for identical documents and 5.86% for other documents, indicating minor plagiarism. For abstracts, it reported 100% similarity for the first document, 2.78% for the second, and 8.37% for the third. These results show that the algorithm detects both exact matches and partial overlaps in textual content. Rabin-Karp performed comparably, particularly on phrase-level similarities. For titles, it reported 100% similarity for identical documents, 11.42% for the second document, and 16.92% for the third. For abstracts, it reported 100% for the first document, 11.42% for the second, and 16.81% for the third.
The study confirms that the two algorithms complement each other in detecting different forms of plagiarism: Cosine Similarity excels at identifying global patterns of similarity, while Rabin-Karp is better suited to finding exact matches in specific phrases or sentences. This dual approach provides a comprehensive way to detect plagiarism in academic theses, and the results indicate that the application can serve educators, researchers, and students as a reliable tool for upholding academic integrity. Future improvements could include expanding the dataset, enhancing the user interface, and integrating additional algorithms for cross-language plagiarism detection.
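The abstract describes the two measures only in general terms, so the following JavaScript sketch is offered as an illustration rather than as the application's actual code. It assumes raw term-frequency vectors for Cosine Similarity, and character k-grams with a Rabin-Karp rolling hash scored by the Dice coefficient for the phrase-level measure. The function names, the k-gram length (5), the hash base (256), and the modulus (1000003) are all hypothetical choices that the study does not state.

```javascript
// Illustrative sketch only. Tokenization, k-gram length, hash base and modulus
// are assumptions, not settings reported by the study.

// Lowercase the text and split it into word tokens (letters and digits only).
function tokenize(text) {
  return text.toLowerCase().match(/[\p{L}\p{N}]+/gu) ?? [];
}

// Cosine similarity between raw term-frequency vectors, as a percentage.
function cosineSimilarity(textA, textB) {
  const count = (tokens) => {
    const tf = new Map();
    for (const t of tokens) tf.set(t, (tf.get(t) ?? 0) + 1);
    return tf;
  };
  const a = count(tokenize(textA));
  const b = count(tokenize(textB));
  let dot = 0, normA = 0, normB = 0;
  for (const [term, n] of a) {
    dot += n * (b.get(term) ?? 0);
    normA += n * n;
  }
  for (const n of b.values()) normB += n * n;
  if (normA === 0 || normB === 0) return 0; // an empty document matches nothing
  return (dot / Math.sqrt(normA * normB)) * 100;
}

// Hash every k-character window with a Rabin-Karp rolling hash.
// Returns Map<hash, Set<k-gram>> so that hash hits can be verified later.
function kgramFingerprints(text, k = 5, base = 256, mod = 1000003) {
  const s = text.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "");
  const table = new Map();
  if (s.length < k) return table;
  let high = 1; // base^(k-1) mod mod, the weight of the outgoing character
  for (let i = 1; i < k; i++) high = (high * base) % mod;
  let h = 0; // hash of the first window
  for (let i = 0; i < k; i++) h = (h * base + s.charCodeAt(i)) % mod;
  for (let start = 0; ; start++) {
    if (!table.has(h)) table.set(h, new Set());
    table.get(h).add(s.slice(start, start + k));
    const next = start + k;
    if (next >= s.length) break;
    // Roll the window: drop s[start], shift, then append s[next].
    h = (h - ((s.charCodeAt(start) * high) % mod) + mod) % mod;
    h = (h * base + s.charCodeAt(next)) % mod;
  }
  return table;
}

// Dice coefficient over distinct k-grams, as a percentage.
function rabinKarpSimilarity(textA, textB, k = 5) {
  const a = kgramFingerprints(textA, k);
  const b = kgramFingerprints(textB, k);
  const size = (table) => {
    let n = 0;
    for (const grams of table.values()) n += grams.size;
    return n;
  };
  const total = size(a) + size(b);
  if (total === 0) return 0;
  let common = 0;
  for (const [hash, gramsA] of a) {
    const gramsB = b.get(hash);
    if (!gramsB) continue; // no hash hit, so no k-gram here can match
    // Compare the strings themselves to rule out hash collisions.
    for (const gram of gramsA) if (gramsB.has(gram)) common++;
  }
  return ((2 * common) / total) * 100;
}

// Example usage with made-up titles.
const titleA = "Plagiarism detection using cosine similarity";
const titleB = "Detecting plagiarism with the Rabin-Karp algorithm";
console.log(cosineSimilarity(titleA, titleB).toFixed(2));
console.log(rabinKarpSimilarity(titleA, titleB).toFixed(2));
```

Under these assumptions the two scores respond to different things. Cosine Similarity ignores word order, so reordering the same words still scores 100%. The k-gram measure rewards only contiguous character runs that the two texts share, which matches the abstract's contrast between global similarity and exact phrase matches.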
DOI: https://doi.org/10.52088/ijesty.v5i1.686
Copyright (c) 2025 Taufik Habib Ansyari, Dahlan Abdullah, Lidya Rosnita