This project is a plagiarism detection system that uses three classic string matching algorithms (Knuth-Morris-Pratt (KMP), Boyer-Moore, and Rabin-Karp) to detect similarities between texts. It preprocesses the input to improve detection accuracy and includes an efficiency matrix that compares the running time of each algorithm.
The application is built using Streamlit for a user-friendly web interface, allowing users to input text and patterns to check for plagiarism.
- Preprocessing: Tokenization, stop-word removal, and stemming.
- Plagiarism Detection: Using KMP, Boyer-Moore, and Rabin-Karp algorithms.
- Efficiency Matrix: Measures and compares the time taken by each algorithm.
- Web Interface: Built with Streamlit for easy interaction.
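
The features above can be illustrated with a minimal sketch. It is not the project's actual code: the function names (`preprocess`, `kmp_search`, `rabin_karp_search`, `time_search`) and the tiny stop-word list are assumptions made for illustration. Stemming and Boyer-Moore are left out to keep the example short.

```python
import time

# Tiny illustrative stop-word list (an assumption, not the project's list).
STOP_WORDS = {"a", "an", "the", "is", "of", "and", "to", "in"}


def preprocess(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = [t.strip(".,;:!?\"'()") for t in text.lower().split()]
    return " ".join(t for t in tokens if t and t not in STOP_WORDS)


def kmp_search(text, pattern):
    """Return the start index of every occurrence of pattern in text (KMP)."""
    if not pattern:
        return []
    # fail[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    matches = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]  # keep going to find overlapping matches
    return matches


def rabin_karp_search(text, pattern, base=256, mod=1_000_000_007):
    """Return the start index of every occurrence, using a rolling hash."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, mod)  # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    matches = []
    for i in range(n - m + 1):
        # Compare the actual substring to rule out hash collisions.
        if p_hash == t_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Slide the window: drop text[i], append text[i + m].
            t_hash = ((t_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return matches


def time_search(search_fn, text, pattern):
    """Run one search and return (matches, elapsed seconds)."""
    start = time.perf_counter()
    matches = search_fn(text, pattern)
    return matches, time.perf_counter() - start
```

A comparison along the lines of the efficiency matrix could then call `time_search(kmp_search, preprocess(document), preprocess(passage))` and `time_search(rabin_karp_search, ...)` on the same inputs and tabulate the elapsed times. Single timings are noisy, so repeating each measurement and averaging is advisable.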