Ghostwriting Detection

GhostDetect® is a tool for analyzing stylometric differences between two writing samples. It flags possible ghostwriting by finding differences in writing style between texts attributed to the same author. Academic ghostwriting is a significant and underappreciated problem facing the academy. By helping to distinguish one author from another, GhostDetect® places a much-needed check on academic ghostwriting. Its results, however, must be interpreted with care.

No tool of this kind can completely rule out false positives (mistakenly ascribing distinct authorship) or false negatives (mistakenly ascribing identical authorship). For instance, the same author may write in very different styles depending on genre and context, while distinct authors may write in a similar style. As such, similarities or differences in style exposed by this tool cannot, on their own, prove identical or distinct authorship.

Applying GhostDetect® is, however, a crucial first step in uncovering ghostwriting. When this tool detects sharp divergences in style between texts purported to come from the same author, those divergences call for, at a minimum, an explanation.

Reference Text

Query Text

Flury-Riedwyl Faces
Flesch–Kincaid Grade Level

Flesch–Kincaid Reading Ease

1 is very difficult to read, 100 is very easy

Gunning Fog Index

Comparable to Grade Levels

Coleman-Liau Index

Comparable to Grade Levels

SMOG Grade

Comparable to Grade Levels

Automated Readability Index

1 is kindergarten, 2 is grade 1, etc.

Average Sentence Length

Average Syllables Per Word

The readability indexes take into account the length of words and sentences to estimate the reading difficulty of text.
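To illustrate how word and sentence lengths feed into these indexes, here is a minimal Python sketch using the commonly published formulas. It is not GhostDetect®'s own implementation. The sentence splitter, the word pattern, and the vowel-group syllable counter are simplifying assumptions, and every function name is hypothetical. Two further simplifications: the Gunning Fog value counts every word of three or more syllables as complex, and the Automated Readability Index counts letters only.

```python
import math
import re

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def split_words(text):
    """Runs of letters, allowing one internal apostrophe (e.g. "don't")."""
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text)

def count_syllables(word):
    """Heuristic (assumption): count vowel groups, drop a silent trailing 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if n > 1 and word.endswith("e") and not word.endswith(("le", "ee")):
        n -= 1
    return max(n, 1)

def readability(text):
    """Return the readability indexes for a text as a dict."""
    sentences = split_sentences(text)
    words = split_words(text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")

    syllables = [count_syllables(w) for w in words]
    letters = sum(ch.isalpha() for w in words for ch in w)
    polysyllables = sum(1 for s in syllables if s >= 3)

    n_words = len(words)
    n_sentences = len(sentences)
    asl = n_words / n_sentences        # average sentence length (words)
    asw = sum(syllables) / n_words     # average syllables per word

    return {
        "average_sentence_length": asl,
        "average_syllables_per_word": asw,
        "flesch_reading_ease": 206.835 - 1.015 * asl - 84.6 * asw,
        "flesch_kincaid_grade": 0.39 * asl + 11.8 * asw - 15.59,
        "gunning_fog": 0.4 * (asl + 100 * polysyllables / n_words),
        "coleman_liau": (0.0588 * (100 * letters / n_words)
                         - 0.296 * (100 * n_sentences / n_words) - 15.8),
        "smog": 1.043 * math.sqrt(polysyllables * 30 / n_sentences) + 3.1291,
        "automated_readability_index": 4.71 * (letters / n_words) + 0.5 * asl - 21.43,
    }
```

Calling `readability(reference_text)` and `readability(query_text)` and comparing the two dictionaries key by key gives a rough side-by-side view. Very short samples can produce scores outside the usual ranges, so the numbers are most meaningful on longer passages.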

Parts of Speech
  • reference
  • query
Function Words
  • reference
  • query

Function words signify grammatical relationships rather than semantic meaning. Because they tend to be independent of subject matter, their frequencies can help identify the same author writing in different genres.
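As a rough illustration of comparing function-word usage, the Python sketch below builds a relative-frequency profile for each text and compares the two profiles. The short word list, the tokenizer, and the choice of cosine similarity are all assumptions made for this example, not a description of how GhostDetect® works.

```python
import math
import re
from collections import Counter

# Small illustrative list (assumption); real stylometric work uses far longer lists.
FUNCTION_WORDS = (
    "the", "of", "and", "to", "a", "in", "that", "it", "is", "was",
    "for", "with", "as", "but", "on", "by", "not", "or", "which", "this",
)

def function_word_profile(text):
    """Relative frequency of each function word among all tokens in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[w] / total for w in FUNCTION_WORDS]

def cosine_similarity(p, q):
    """Cosine of the angle between two profiles; 1.0 means identical proportions."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0
```

A call such as `cosine_similarity(function_word_profile(reference_text), function_word_profile(query_text))` returns a value between 0 and 1. A low value suggests differing habits but, as with every measure here, does not prove distinct authorship.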

Syllables in Word
  • reference
  • query
Sentence Lengths
  • reference
  • query

Longest Reference Sentence

Longest Query Sentence
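The sentence-length distribution and the longest sentence of a text can be sketched in Python as follows. The sentence splitter and the whitespace-based word count are simplifying assumptions, and the function names are hypothetical.

```python
import re
from collections import Counter

def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' plus whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def sentence_length_histogram(text):
    """Map sentence length in words to the number of sentences of that length."""
    return Counter(len(s.split()) for s in split_sentences(text))

def longest_sentence(text):
    """Return the sentence with the most whitespace-separated words, or ''."""
    sentences = split_sentences(text)
    return max(sentences, key=lambda s: len(s.split())) if sentences else ""
```

Running both functions on the reference text and on the query text gives two histograms and two longest sentences to compare.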

Suspect Words and Phrases
  • reference
  • query
  • Weasel words such as "clearly", "mostly", or "very" tend to convey little information.
  • Repeated words such as "the people that that came" are usually accidental.
  • Passive sentences use the passive rather than the active voice: "The dog was walked by George" vs "George walked the dog".
  • Cliche phrases such as "against all odds" or "a loose cannon" are overused.
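
The four categories above can be approximated with simple pattern matching, as in the Python sketch below. The word and phrase lists contain only the examples given above, and the passive-voice pattern is a crude heuristic assumed for illustration (a form of "to be" followed by a word ending in "-ed" or one of a few irregular participles). It will both miss passives and flag adjectives such as "was tired". None of this is taken from GhostDetect® itself.

```python
import re

# Lists limited to the examples named in the text (assumption: real lists are longer).
WEASEL_WORDS = ("clearly", "mostly", "very")
CLICHES = ("against all odds", "a loose cannon")

WEASEL = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, WEASEL_WORDS)), re.IGNORECASE)
CLICHE = re.compile(r"\b(?:%s)\b" % "|".join(map(re.escape, CLICHES)), re.IGNORECASE)
# The same word twice in a row, separated only by whitespace.
REPEATED = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)
# Heuristic passive: a form of "to be" followed by a likely past participle.
PASSIVE = re.compile(
    r"\b(?:am|is|are|was|were|be|been|being)\s+"
    r"(?:\w+ed|written|taken|given|made|done|seen|known)\b",
    re.IGNORECASE,
)

def find_suspects(text):
    """Return the matched spans for each suspect category."""
    return {
        "weasel": [m.group(0) for m in WEASEL.finditer(text)],
        "repeated": [m.group(0) for m in REPEATED.finditer(text)],
        "passive": [m.group(0) for m in PASSIVE.finditer(text)],
        "cliche": [m.group(0) for m in CLICHE.finditer(text)],
    }
```

Comparing the counts in each category between the reference and query texts, ideally normalized by text length, gives one more stylistic signal to weigh alongside the others.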