From manuscript to editorial decision in eight minutes.
An eight-stage pipeline that mirrors a journal's editorial committee. Here is exactly what happens to your paper, in order.
Eight stages, end to end.
No black boxes. Every stage logs structured output that the next stage reads. You can trace any finding from the final report back to the page in your manuscript.
PDF extraction
~10 sCustom multi-column parser. Recovers spacing from character positions. Strips running headers and page numbers.
Security scan
~5 sDetects invisible Unicode, prompt injection phrases, hidden role markers. Strips them before any agent sees the text.
Section annotation
~15 sHaiku 4.5 inserts inline section markers so each agent focuses on the parts of the paper relevant to its specialty.
Citation verification
~30 sEach reference checked against CrossRef and Semantic Scholar. DOIs verified. Retracted papers and title mismatches flagged.
Independent review
~3 minFive specialist agents review the paper in parallel. Each produces structured findings with page numbers and exact quotes.
Adversarial deliberation
~2 minEach agent reads the others' findings. Must REBUT with paper evidence or ESCALATE missed findings. Silent disagreement is not allowed.
Editorial synthesis
~1 minAn independent editor weighs all 10 outputs (5 reviews × 2 rounds). Classifies findings as consensus, contested, or unique. Computes the score.
Report generation
~10 sInteractive split-view with click-to-highlight. PDF export. Every finding traceable back to its source quote in your manuscript.
Same foundation model. Five different mandates.
All five agents run on the same large language model. The specialisation comes from structured system prompts that define each agent's scope and, critically, its refusal scope. Dr. Stats does not comment on writing. Dr. Writing does not judge methodology. This constraint is what produces focused, non-overlapping feedback. Single-agent review cannot do this because a single prompt cannot simultaneously specialise in five directions.
The methodologist.
Audits experimental design, controls, sample selection, replicability. Surfaces unstated assumptions and missing comparison groups.
The statistician.
Checks 18 named failure modes: confidence intervals, multiple-comparison correction, VIF omission, distribution assumptions, power, effect-size reporting.
The reader.
Checks novelty claims against the literature. Flags missing seminal works, mis-cited papers, and overstated contributions relative to prior art.
The editor.
Checks clarity, structure, terminology consistency, claim-evidence alignment. Flags hedge inflation and unsupported “we show” assertions.
The skeptic.
Asks the uncomfortable question: did you compare against the right baseline? Flags weak comparators, ablations missing, and “no comparison” claims of state-of-the-art. Frequently the deciding voice in deliberation.
What makes PeerPanel adversarial.
Most multi-agent AI reviews concatenate output. We do not. After the independent round, every agent reads every other agent's findings and must take a structured action on each: rebut, escalate, agree, or retract. Silent disagreement is forbidden.
The protocol is enforced at the schema level: an agent's deliberation output cannot be accepted unless it accounts for every cross-finding. This is what produces the “shifted in deliberation” indicators you see in the dashboard: an agent who started with seven findings and ended with five, because two were retracted under cross-examination.
| Action | When used | Requirement |
|---|---|---|
| REBUT | Another agent's finding contradicts evidence in the paper. | Must include an exact quote and page number that disproves the original finding. |
| ESCALATE | Another agent missed something within your specialty. | Must include a quote, page number, and explanation of severity. |
| AGREE | Another agent's finding is consistent with your reading. | Adds confidence weight. Does not duplicate the finding. |
| RETRACT | You are convinced your own finding was wrong by paper evidence. | Must reference the contradicting passage. Editor logs the retraction. |
“A peer disagreement is not evidence. Only a paper passage is. Do not retract unless another agent quotes a passage that contradicts your finding.”
The editor is not a vote-counter.
After deliberation, an independent editor agent reads all ten outputs (five initial reviews and five deliberation rounds), then re-reads the paper. The editor's job is not to tally agreement. It is to weigh each finding against the manuscript itself.
The editor has explicit override power. A finding can be rejected even if all five specialists agreed on it, if the editor's re-reading does not support it. Conversely, a finding raised by a single agent can be promoted to “major” if the paper evidence warrants it. Consensus is a useful prior, not a verdict.
Every override is logged. Every finding in the final report carries a confidence weight, a classification (consensus, contested, or unique), and a link to the exact source passage.
We show our work.
Scores start at 100 and are reduced by weighted findings. Confidence is a 0 to 1 multiplier set by the editor based on evidence strength. The formula is fixed and published: no hidden weights, no opaque adjustments.
Table 1: Finding type to point deduction
| Finding type | Deduction |
|---|---|
| Major consensus | −8 × confidence |
| Major unique | −4 × confidence |
| Minor consensus | −3 × confidence |
| Minor unique | −1 × confidence |
| Retracted finding | +2 |
| Editor-rejected consensus | no deduction |
Table 2: Score to decision mapping
| Score | Decision |
|---|---|
| ≥ 85 | ACCEPT |
| 60 to 84 | MINOR REVISIONS |
| 30 to 59 | MAJOR REVISIONS |
| < 30 | REJECT |
Vibes-based scoring is not science. We show our work.
See it on your own paper.
Three free reviews per researcher during Research Preview. No credit card.
Start your first review →