Benchmarks

AI Token Efficiency

How jscpd's AI reporter saves tokens compared to other copy/paste detection (CPD) tools: an 8× reduction for LLM workflows.

When CPD output is fed to an LLM (for automated refactoring, code review, or deduplication workflows), output size directly affects cost, latency, and context window usage. jscpd's --reporters ai flag is purpose-built for this case, so agents and LLM workflows should use it by default.
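As a minimal sketch of how an agent might wire this in, the Python below builds a jscpd command line with the --reporters ai flag and estimates the size of whatever text comes back. Several parts are assumptions rather than documented behavior: that jscpd is launched through npx, that the AI report is written to standard output, and that four characters per token is a usable approximation. The helper names are hypothetical.

```python
import subprocess


def build_jscpd_command(path: str) -> list[str]:
    """Build a jscpd invocation that selects the AI reporter.

    Launching through npx is an assumption; a globally installed
    jscpd binary would drop the first element.
    """
    return ["npx", "jscpd", "--reporters", "ai", path]


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate (assumed heuristic; real tokenizers differ)."""
    return round(len(text) / chars_per_token)


def run_report(path: str) -> str:
    """Run jscpd and return its standard output.

    Assumes the AI reporter prints its report to stdout; adjust this
    if the report is written to a file instead.
    """
    result = subprocess.run(
        build_jscpd_command(path),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling estimate_tokens(run_report("./src")) before handing the report to a model gives a quick check that it fits the intended context budget.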

LLM-Ready Output

Compact formats designed for LLM context windows

jscpd@5 (AI reporter): 2.8k tokens, 212 clones, 13 tok/clone
jscpd@4 (AI reporter): 2.7k tokens, 211 clones, 12 tok/clone
jscpd-rs (AI reporter): 3.0k tokens, 222 clones, 13 tok/clone

Partially Usable

Structured but limited coverage or verbose output

Fallow (plain text): 400 tokens, 10 clones, 40 tok/clone. Only processes JS/TS; low token count reflects limited coverage.
Simian (plain text): 15k tokens, 424 clones, 35 tok/clone. No structured metadata; aggregates multi-format files.

Not LLM-Ready

Verbose or unstructured output; not suitable for LLM context windows

Duplo (JSON): 158k tokens, 518 clones, 305 tok/clone. Large JSON output; includes false positives from text-matching.
PMD CPD (plain text, 34 files): 21k tokens, 56 clones, 375 tok/clone. Output spread across 34 separate files; only 56 clones found.

Tokens/clone is not directly comparable across tools. Each tool defines "clone" differently and covers a different share of the codebase: jscpd uses token-based code blocks, Duplo reports text matches (many false positives), Simian reports aggregate blocks, PMD CPD processed only 34 of 547 files, and Fallow handles only JS/TS. Compare within the same category for meaningful results.
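The per-clone figures and the size ratios between tools follow from simple division, which the sketch below reproduces from the numbers reported above. It uses the rounded token counts as printed (for example 2.8k as 2800), so results can differ by a token or so from the published per-clone values, which were presumably computed from exact counts. The function names are illustrative only.

```python
# (total tokens, clones) per tool, copied from the benchmark figures above.
# Token totals are the rounded values shown on the page.
RESULTS = {
    "jscpd@5": (2800, 212),
    "jscpd@4": (2700, 211),
    "jscpd-rs": (3000, 222),
    "Fallow": (400, 10),
    "Simian": (15000, 424),
    "Duplo": (158000, 518),
    "PMD CPD": (21000, 56),
}


def tokens_per_clone(tokens: int, clones: int) -> float:
    """Average output tokens spent describing one reported clone."""
    return tokens / clones


def size_ratio(tool: str, baseline: str = "jscpd@5") -> float:
    """How many times larger a tool's total output is than the baseline's."""
    return RESULTS[tool][0] / RESULTS[baseline][0]


for name, (tokens, clones) in RESULTS.items():
    print(
        f"{name:9} {tokens:>7} tokens "
        f"{tokens_per_clone(tokens, clones):7.1f} tok/clone "
        f"{size_ratio(name):6.1f}x vs jscpd@5"
    )
```

Because of the caveats above, the ratio of total output size says how much context a report consumes, not how efficiently each tool describes an equivalent set of duplicates.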