How Originality.AI Operates
- Originality.AI uses machine learning (ML) to differentiate between AI-generated and human-written text.
- Our system employs a custom version of BERT built for classification; our engineers found that this more adaptable discriminative model architecture detects AI-generated text more effectively than a generative model does.
- The tool's underlying language model was built on a novel architecture and trained on 160GB of text data in a dual-phase regime involving both a generator and a discriminator model, with the discriminator proving pivotal for language modeling.
- Our training data was meticulously produced through various sampling techniques and human verification. To refine text generation and boost accuracy, our team applied methods such as temperature scaling, Top-K sampling, and nucleus (Top-p) sampling.
- For more insights into our technology and training processes, visit our blog.
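The sampling methods named above (temperature, Top-K, nucleus) are standard decoding controls for text generation. The following is a minimal sketch of how they combine, not Originality.AI's actual implementation; the function name and parameters are illustrative:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=None):
    """Sample a token id from raw logits using temperature scaling,
    Top-K filtering, and nucleus (Top-p) filtering. Illustrative only."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # Top-K: keep only the k highest-scoring tokens.
    if top_k > 0:
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)

    # Softmax over the surviving logits.
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Nucleus (Top-p): keep the smallest set of tokens whose
    # cumulative probability reaches top_p, then renormalize.
    if top_p < 1.0:
        order = np.argsort(probs)[::-1]
        cum = np.cumsum(probs[order])
        keep = order[: np.searchsorted(cum, top_p) + 1]
        mask = np.zeros_like(probs)
        mask[keep] = probs[keep]
        probs = mask / mask.sum()

    return int(rng.choice(len(probs), p=probs))
```

Lower temperatures and smaller K/p values make generated text more predictable, which is part of what makes such output statistically detectable.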
Interpretation of Originality.AI Scores
- Understanding the distinction between Plagiarism Detection and AI detection is crucial, as they require different approaches.
- Plagiarism detection is straightforward and has been practiced online for years: if a segment of text is copied, it is plagiarism.
- AI detection, however, is more complex and probabilistic. For example, a 5% AI score with a 95% Human score means there is a 95% likelihood the text was human-written, not that 5% of the text is AI-generated.
- When evaluating Originality.AI scores, consider:
- The AI versus Human score represents the likelihood, according to our AI, of whether the content was AI-generated or human-written.
- For instance, a 10% AI and 90% Human score implies a 90% probability of human origin and 10% of AI creation, not that 90% of the text is human and 10% AI-generated.
- Publishers often treat a high Human score as a mark of originality, even when evaluating human-authored content.
- While our tool achieves over 94% accuracy on GPT-3, GPT-3.5, and ChatGPT, it is not infallible, and both false positives and false negatives occur.
- It’s more reliable to evaluate a series of articles for assessing a writer or service than to judge based on a single piece.
- Article length also influences the accuracy of the scores.
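The "judge a series, not a single piece" advice above can be made concrete. In this sketch, the scores are hypothetical numbers standing in for a detector's per-article P(human) output, not real Originality.AI results:

```python
# Hypothetical Human scores for a writer's batch of articles,
# expressed as P(human) in [0, 1]. Illustrative values only.
human_scores = [0.92, 0.88, 0.97, 0.40, 0.95]

average = sum(human_scores) / len(human_scores)
minimum = min(human_scores)

# One low score (0.40) may be a false positive; the average and
# minimum together give a more reliable picture of the writer.
print(f"Human average: {average:.0%}")  # Human average: 82%
print(f"Human minimum: {minimum:.0%}")  # Human minimum: 40%
```

Note that each score is a probability for the whole article, so averaging scores across articles is a rough aggregate, not a measure of how much of any text is AI-generated.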
So, what should your threshold be?
Here are some suggested benchmarks for different users:
Strategies & Recommended Thresholds
- Recommended threshold for ensuring purely human-generated content:
- Human Average: Above 90%
- Human Minimum: 65%
- Recommended threshold for sites allowing AI-assisted research:
- Human Average: 75%
- Human Minimum: 50%
- Recommended threshold for those using AI to enhance efficiency but wary of AI detection by search engines:
- Human Average: 60%
- Human Minimum: 50%
- Recommended threshold for sites using AI for content creation with editorial oversight:
- No specific target
- A minimum of 0% Human is acceptable
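The threshold pairs above (an average target plus a per-article minimum) amount to a simple acceptance policy. A minimal sketch, with a hypothetical helper name and scores on a 0-1 scale:

```python
def passes_policy(human_scores, avg_threshold, min_threshold):
    """Check a batch of Human scores (0-1) against a policy that sets
    both an average threshold and a per-article minimum. Illustrative."""
    avg = sum(human_scores) / len(human_scores)
    return avg >= avg_threshold and min(human_scores) >= min_threshold

# "Purely human-generated" policy from above: average above 90%, minimum 65%.
scores = [0.95, 0.91, 0.88, 0.97]
print(passes_policy(scores, avg_threshold=0.90, min_threshold=0.65))  # True
```

The last tier (AI content with editorial oversight) corresponds to `avg_threshold=0.0, min_threshold=0.0`, i.e. every batch passes.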