AI Text Slop

AI-generated Japanese text has a detectable "style fingerprint". This repo quantifies it.

Try It

python scripts/analyze_patterns.py

# Output:
# ======================================================================
# AI Text Slop Score（複合スコア）
# ======================================================================
# claude-sonnet-4      Score:   21.5 ± 4.5
# gpt-4o               Score:   19.8 ± 6.3
# qwen3.5-4b           Score:   16.7 ± 5.8
# qwen3.5-9b           Score:   15.6 ± 4.2
# swallow-20b          Score:   15.2 ± 6.3
# llama3.2-1b          Score:   10.3 ± 8.8

The Surprising Finding

Human-written Qiita articles score MORE "AI-like" (28.5) than any LLM output (max 21.5).

This reveals that what we think of as "AI-like" writing — heavy headings, bullet lists, bold formatting — is actually standard practice in Japanese technical blog culture. The real AI fingerprints are in vocabulary patterns, not structure.

Source	Slop Score	AI Vocabulary	Lists	Headings
Human (Qiita)	28.5	2.70	31.8	22.4
Claude Sonnet 4	21.5	3.43	14.6	14.7
GPT-4o	19.8	3.33	7.6	9.9
Swallow-20B	15.2	0.80	10.1	6.3
Llama 3.2-1B	10.3	1.83	3.3	9.5

→ Structural patterns are culturally biased, not AI-specific.

Key Findings

🔥 RLHF makes text MORE "AI-like" — Commercial models (Claude, GPT) score significantly higher than open-source (Kruskal–Wallis p < 10⁻⁹, Cohen's d = 1.01)

🧪 The Swallow Paradox — Japanese-specialized Swallow-20B has the lowest AI vocabulary (0.80) but the highest boilerplate conclusions (1.17). Vocabulary and structure dissociate.

📏 Scale ≠ Convergence — 4B scores higher than 9B and 20B. Alignment intensity, not parameter count, drives stylistic convergence.

🤖 Each model has a fingerprint — Claude over-structures, GPT hedges, Swallow uses boilerplate endings, Llama can't follow length instructions.

How It Works

180 samples (6 models × 10 topics × 3 trials) + 10 human baseline articles, measured across 16 pattern indicators:

Vocabulary — AI-frequent phrases, hedging, sycophancy, "important" endings Structure — Boilerplate conclusions, three-set pattern, headings, bold, lists Rhythm — Sentence-length CV, average sentence length, sentence count Surface — Em-dashes, emoji, character count, paragraphs

These are combined into a weighted AI Text Slop Score (0–100).

Repository Structure

├── data/
│   ├── claude-sonnet-4/     # 30 samples
│   ├── gpt-4o/              # 30 samples
│   ├── qwen3.5-4b/          # 30 samples
│   ├── qwen3.5-9b/          # 30 samples
│   ├── swallow-20b/         # 30 samples
│   ├── llama3.2-1b/         # 30 samples
│   └── human-qiita/         # 10 human baseline articles
├── scripts/
│   ├── collect_samples.py       # API sample collection
│   ├── analyze_patterns.py      # 16-pattern analysis + Slop Score
│   ├── sensitivity_analysis.py  # Weight sensitivity + feature-level effects
│   └── visualize.py             # Chart generation (matplotlib)
├── results/
│   ├── analysis_results.json
│   ├── human_baseline.json
│   └── *.png                # Visualization charts
├── paper/
│   ├── main.tex             # LaTeX source
│   ├── main.pdf             # Compiled paper (14 pages, v4)
│   └── figures/
└── README.md

Quick Start

git clone https://github.com/kenimo49/ai-text-slop.git
cd ai-text-slop

# Run analysis
python scripts/analyze_patterns.py

# Run sensitivity analysis (weight robustness + feature-level effects)
python scripts/sensitivity_analysis.py

# Generate visualizations
pip install matplotlib
python scripts/visualize.py

Statistical Validation

Test	Statistic	p-value
Kruskal–Wallis (all 6 models)	H = 49.87	p = 1.47 × 10⁻⁹
Mann–Whitney U (commercial vs. OSS)	U = 5530	p = 2.38 × 10⁻⁹
Cohen's d (effect size)	d = 1.01	Large effect

Experiment Configuration

Prompt: [Topic]についての技術ブログ記事を800字程度で書いてください。
Topics: React, Docker, REST API, GitHub Actions, TypeScript, DB Index, WebSocket, JWT, Microservices, AI Code Review
Trials: 3 per model-topic pair (fresh session each)
Commercial: Claude Sonnet 4 (API), GPT-4o (API)
Open-source: Qwen 3.5-4B/9B, Swallow-20B, Llama 3.2-1B (Ollama, RTX 4070)

Citation

@article{imoto2026aitextslop,
  title={AI Text Slop: A Quantitative Study of Stylistic Convergence
         Across Six Language Models in Japanese Technical Writing},
  author={Imoto, Ken},
  year={2026},
  doi={10.5281/zenodo.19173035},
  url={https://doi.org/10.5281/zenodo.19173035}
}

Related Work

AI Blue: Color Recognition Bias in VLMs — Zenodo DOI: 10.5281/zenodo.19159702
AI Slop Escape Guide — Practical guide to avoiding AI text patterns (Japanese, Zenn Book)

License

MIT

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

AI Text Slop

Try It

The Surprising Finding

Key Findings

How It Works

Repository Structure

Quick Start

Statistical Validation

Experiment Configuration

Citation

Related Work

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
data		data
paper		paper
results		results
scripts		scripts
LICENSE		LICENSE
README.md		README.md

Folders and files

Latest commit

History

Repository files navigation

AI Text Slop

Try It

The Surprising Finding

Key Findings

How It Works

Repository Structure

Quick Start

Statistical Validation

Experiment Configuration

Citation

Related Work

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages