Plagiarism is a growing problem not only in schools and universities but also in the business world.
This is especially true if the business runs an active publishing program, such as blogs, white papers, special reports and the like.
A few months ago, I spent an afternoon researching plagiarism software for one of my clients, who publishes many documents written by subject matter experts around the world.
Plagiarism software: Comparing apples with apples
In order to compare “apples with apples,” I used two documents: a white paper that I wrote from scratch for my client plus a simple little article I wrote for a writing class that I took several years ago. The article is about a dog show I attended; it did not involve any research, and it has never been published anywhere. This means I know that every word in it is mine and that it has not appeared in any published format.
My first step was to conduct a Google search on plagiarism software. After a couple of hours, I came up with a list of frequently mentioned and reviewed companies. To begin with, I ran tests on my two documents using only free software. However, the results were confusing and inconclusive. I then started signing up for paid sites to see whether their results would be more reliable and consistent. They weren’t.
Some of the free software that I tried included:
This simple tool goes line-by-line through each phrase and highlights in red the areas of concern, making the results easy to read and understand. The red areas are also links, so you can check immediately where the system thinks the phrases came from. If you click on a link, however, you will see that the system is simply picking up random keywords from another text, not whole phrases/paragraphs/ideas that might actually be plagiarized.
White paper: 97% unique
Clarice’s story: 95% unique
After I uploaded the document, a tool came up that provided a final score—but it didn’t provide any explanations.
White paper: 93% non-unique content (content appears in 125 sources)
Clarice’s story: 16% non-unique content (content appears in 35 sources)
I also downloaded a program called “Viper” onto my computer. Unfortunately, when I tried to upload a document for testing, the software locked up completely, and the only way I could get rid of it was by shutting down my computer and starting over. After three attempts with the same result, I deleted the software.
Then I decided to try software that requires payment. This included:
This site provided no explanation of the results, only ratings.
White paper: 2.09
Clarice’s story: 2.50
The plagiarism ratings on this site are:
0-1% = good
1-5% = warning
5-100% = plagiarism
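To show how those bands apply to my two scores, here is a minimal Python sketch. It rests on two assumptions that the site did not confirm: that the ratings 2.09 and 2.50 are percentages, and that a score sitting exactly on a boundary (1 or 5) belongs to the more severe band, since the published ranges overlap. The function name `rate` is my own.

```python
def rate(percent: float) -> str:
    """Map a plagiarism percentage to the rating bands listed above."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Assumption: the published bands overlap at 1% and 5%, so a boundary
    # value is placed in the more severe band.
    if percent < 1:
        return "good"
    if percent < 5:
        return "warning"
    return "plagiarism"


# Assumption: the two ratings reported above are percentages.
for name, score in [("White paper", 2.09), ("Clarice's story", 2.50)]:
    print(f"{name}: {score}% -> {rate(score)}")
```

Read this way, both documents land in the “warning” band, including the dog show story that has never been published anywhere.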
This software found that my client’s document had 115 matches from 47 sources! Since I used only six sources to write the document, there’s no way I plagiarized material from that many places. The report takes you to sources on the Web that it believes you have plagiarized from, but it doesn’t show where in the actual document those passages occur, which makes it nearly impossible to find the passages it has flagged.
Furthermore, the system is simply picking up random keywords and phrases. Because my client belongs to a particular field (like almost all businesses), certain standard phrases are common; this software highlights those words again and again, which completely skews the results.
White paper: 7.3%
Clarice’s story: 1.9%
3. www.plagtracker.com (premium account)
The graphic above represents the results from this software. Instead of picking up ideas/paragraphs/chunks of plagiarized information, it simply highlights random fragments, including such phrases as:
- My client’s name
- financial and human resources
- the development and implementation of
- what is required of
- Appendices A and B provide
Clearly, these words have no relationship to actual plagiarized passages.
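The effect is easy to reproduce. The Python sketch below is not how any of these products works internally (I have no knowledge of that); it is a simple word-sequence comparison that I am assuming for illustration, and the two sentences are invented. It shows that two unrelated texts share several four-word stock phrases, while requiring a longer run of identical words removes those matches.

```python
import re


def shingles(text: str, n: int) -> set:
    """Return the set of n-word sequences in text, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def shared_phrases(doc: str, source: str, n: int) -> set:
    """Return the n-word phrases that appear in both texts."""
    return {" ".join(s) for s in shingles(doc, n) & shingles(source, n)}


# Hypothetical sentences, invented for this illustration only.
white_paper = (
    "This plan covers the development and implementation of training "
    "programs and the financial and human resources they need."
)
unrelated = (
    "The council reviewed the development and implementation of a "
    "recycling scheme, given limited financial and human resources."
)

# Short windows flag stock business phrases as matches.
print(sorted(shared_phrases(white_paper, unrelated, 4)))
# ['development and implementation of', 'financial and human resources',
#  'the development and implementation']

# A longer window finds nothing in common between these two sentences.
print(sorted(shared_phrases(white_paper, unrelated, 8)))
# []
```

A longer window is only one possible filter, and it would also miss plagiarism that has been lightly reworded, which is part of why a human reader remains useful.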
Client’s white paper: 16% plagiarized
Clarice’s story: 0% plagiarized
Every system produces results that differ widely from those of the others. Most, if not all, pick up individual words and phrases, not whole passages that might actually be plagiarized.
Another issue is that most of these companies are also trying to sell editing and plagiarism correction services, which makes me even more uncomfortable about the results of their detection systems.
When I was teaching undergraduate writing composition courses a few years ago, I received a paper from a student that was clearly much more sophisticated than anything the student had produced before. I simply typed some of the major keywords into Google and found the entire document online within seconds.
For all of these reasons, I think that human beings are still the best detectors of plagiarism.
Clarice Dankers is a freelance writer and copy editor in Portland, Oregon, who creates B2B content that helps businesses grow.