
The Noise Floor: When to Ignore AI Feedback on Your Fiction

Most posts about AI in fiction assume the feedback is worth acting on. The working novelist's discipline is knowing when it is not, and the cost of getting that wrong is your voice.

A draft sits in your manuscript file. You hand it to a frontier model for feedback. The model returns three pages of notes. The opening could be stronger. The pacing of the middle act drags. The villain’s motivation feels underdeveloped. The protagonist’s emotional arc would land harder with a moment of vulnerability in chapter eight.

Some of this is useful. Most of it is not. Distinguishing the signal from the noise is the single most consequential craft skill a working novelist develops in the first year of serious AI use. Writers who do not develop it produce manuscripts that have been quietly sanded down by a series of plausible-sounding suggestions, each one of which seemed reasonable in isolation, and which together have stripped out the specificity that made the work worth reading.

This post is the taxonomy. The categories of feedback to ignore. The small set of cases where the model is reliable. The triage discipline that protects your voice from the slow accumulation of corrections that were never actually corrections.

Why the Model Has Bad Opinions About Your Fiction

The model’s opinions about your draft are not random. They are systematically biased in directions that follow from how the model was trained. Knowing the directions lets you predict the feedback before you receive it, which lets you discount it before it influences you.

Frontier language models are trained on a corpus that is overwhelmingly weighted toward general commercial fiction, mass-market nonfiction, journalism, and web text. The taste of that corpus is the taste the model carries into your draft. Smoother. Less specific. Less local. More therapeutic. More resolved. More legible to a reader who reads broadly rather than to a reader who lives inside a single genre’s register.

The post-training process compounds the bias. The model has been further trained to be helpful, safe, and broadly agreeable. Helpfulness in feedback means producing suggestions. Models that returned “the draft is fine, ship it” got penalized in training. Models that produced three pages of suggestions got rewarded. The result is a generator of always-present, always-plausible feedback regardless of whether the feedback is warranted.

Safety training shapes the bias further. Dark content gets flagged. Specific violence gets softer alternatives suggested. Morally ambiguous protagonists get pushed toward redemption. The model is not being prudish on purpose. It is following the gradient its training installed, and the gradient runs away from the specific work grimdark and cosmic horror writers do.

Agreeableness training closes the loop. Models trained to please users will reframe a user’s own ideas back to them, lightly polished, as “feedback” that the user then feels validated by. The user mistakes the echo for analysis. The draft does not improve. The user’s belief in the model’s judgment does.

These four pressures are the noise floor. The feedback you receive sits above this floor, but the floor itself is shaping the signal in predictable ways.

Six Categories of Feedback to Ignore

The specific patterns of bad feedback that recur across models, prompts, and drafts.

Smoothing notes. The model suggests softer alternatives for prose that is doing the work the register requires. The fight scene is “graphic.” The dark romance moment is “uncomfortable.” The villain’s monologue is “off-putting.” The grimdark protagonist’s interior is “bleak.” These are not feedback. These are the model registering that your work exceeds its trained comfort zone, and the suggested fix would erode the specificity that justifies the discomfort. Ignore them. The work was supposed to be uncomfortable. That is the contract.

Convention bias notes. The model suggests bringing an unconventional structural choice into line with general fiction conventions. The opening does not have a clear inciting incident in the first ten pages, so move it earlier. The chapter ends without resolution, so add a beat that signals where the story is going. The point of view shifts in the middle of the scene, so commit to a single perspective. These notes treat conventions as defaults. A working novelist treats conventions as choices, and the unconventional choice is the choice, not a mistake to be corrected.

Pacing illusion notes. The model claims a section drags or feels rushed. The judgment is largely an artifact of how language models perceive prose density. Dense atmospheric paragraphs read as slow to the model because the per-token information rate is high. Sparse action sequences read as rushed for the inverse reason. The reader’s experience does not track the model’s perception of density. Pacing notes from a model are unreliable in both directions, and acting on them tends to produce prose that is uniform in rhythm, which is its own failure mode.

Modernization notes. The model suggests vocabulary fixes that import contemporary registers into settings that should not carry them. The pre-modern character should “process” her grief. The medieval mercenary should “set boundaries” with the warband. The dark-fantasy protagonist should have “agency” in her choices. These suggestions are anachronisms wearing the clothes of clarity. The fix is to recognize the modernization as itself the failure mode and to keep the original prose. Therapy vocabulary in a pre-modern setting is a register violation. The model does not see this because the corpus does not punish it.

Genre frame slippage. The model evaluates your draft against the conventions of a genre adjacent to but not identical with yours. Your grimdark fantasy gets evaluated as if it were heroic fantasy. Your cosmic horror gets evaluated as if it were psychological thriller. Your dark romance gets evaluated as if it were contemporary romance. The notes that result are formally coherent but materially wrong. Your job is to recognize the frame slippage and discount accordingly. Better prompt engineering reduces but does not eliminate this. Even with the right genre name explicit in the prompt, the model’s defaults pull toward the adjacent commercial genre’s conventions.

Echo feedback. You wrote a paragraph. You suspect it is not working. You ask the model what is wrong with it. The model produces a polite restatement of your own paragraph’s ideas, lightly polished, framed as analysis. You read the response and feel like you have been understood. Nothing in the response was actually feedback. The model has reflected your own concern back at you in slightly different language, and your sense of having been helped is unrelated to whether the paragraph improved. This is the most dangerous category of bad feedback because it is the hardest to detect from the inside.

These six categories cover the vast majority of feedback that working novelists should learn to dismiss. The remaining feedback is worth considering. The discipline is making the triage explicit rather than reading every note as equally weighted.

The Small Set of Cases Where the Model Is Reliable

The inverse of the noise list. The cases where AI feedback is reliable enough to act on with confidence.

Constraint adherence. The model is reliable at counting. Word count, paragraph count, chapter count, scene count, named characters. If you ask whether a chapter exceeds a budget, the answer is trustworthy. If you ask whether a character appears in fewer than three scenes after their introduction, the answer is trustworthy. Constraint checking is where the model’s pattern recognition has nothing to do with taste, and the noise floor falls away.

Internal consistency checking. The model is reliable at catching contradictions across a manuscript. Character X had blue eyes in chapter two and brown eyes in chapter eight. The geographical distance between two cities was a three-day ride in book one and a two-week ride in book three. The protagonist’s wound was a knife slash in the second scene and a sword cut in the fifth. These checks are mechanical and the model performs them better than most human first readers.

Continuity tracking across long documents. Tied to consistency but worth listing separately. The model can hold a long manuscript in context and trace the movement of a named object, a piece of information, a relationship status, or a piece of magical lore through the document. The trace is reliable. The judgment about whether the trace is correct is yours.

Explicit anchor comparison. If you supply the model with anchor passages that exemplify what you want and what you do not want, the model can compare a new passage to the anchors and tell you which side of the line it falls on. The reliability comes from the anchors. Without them, the model is making the judgment against its trained taste, which is exactly the noise floor.

Generative drafting against locked specs. When you give the model a tight spec and ask it to produce a draft of a specific component, the output is verifiable. A scene transition. A character introduction in 200 words. A worldbuilding paragraph that hits three specific points. You can verify the output meets the spec. The model is reliable here because you have made the success criterion external rather than relying on the model’s taste.

Mechanical surface checks. Spelling, grammar, formatting consistency, frontmatter validation. The model is at least as good as a careful human at these and faster. There is no taste judgment involved. The notes are correct or incorrect, and you can verify which.

The pattern across these reliable cases is that the success criterion is external to the model’s taste. The model is being asked to perform a verifiable operation against an explicit standard. Where the standard is explicit, the model performs. Where the standard is the model’s own taste, the model produces noise dressed as analysis.
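
Where the criterion is simple enough, it can also be computed outside the model entirely, which makes the standard as explicit as it can get. The following is a minimal sketch of a chapter word-budget check in Python. It assumes words are whitespace-delimited and that chapters are already split into a title-to-text mapping; the names word_count and over_budget are illustrative, not taken from any particular tool.

```python
def word_count(text: str) -> int:
    """Count whitespace-delimited words (an assumed counting rule)."""
    return len(text.split())


def over_budget(chapters: dict[str, str], budget: int) -> dict[str, int]:
    """Map each chapter that exceeds the budget to its overage in words."""
    overages = {}
    for title, body in chapters.items():
        count = word_count(body)
        if count > budget:
            overages[title] = count - budget
    return overages


if __name__ == "__main__":
    # Hypothetical two-chapter manuscript with a four-word budget.
    manuscript = {"One": "a b c", "Two": "a b c d e f"}
    print(over_budget(manuscript, budget=4))
```

The answer is either right or wrong against the budget, and taste never enters into it. That is the property every reliable case shares.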

Triage Discipline

The operational practice that keeps the noise floor from corrupting your draft.

Read all feedback through the bias filter first. Before considering whether a note is correct, classify it. Is it a smoothing note? A convention bias note? A pacing illusion? A modernization? A genre frame slippage? An echo? If the note matches one of these categories, discard it without further analysis. The classification itself is the triage. Note that “ignore” is a positive action, not a default. You read the note. You classified it. You decided.

Reserve attention for the verifiable categories. When the model returns notes that fall into the reliable categories above, give them real consideration. The continuity error the model flagged is probably a real continuity error. The constraint violation is probably real. The anchor comparison is probably calibrated. Spend your attention here.

Time-box feedback sessions. A long feedback session with a model produces increasing amounts of noise as the conversation accumulates context. The model’s first three notes on a draft are usually higher quality than the next twelve. Take the first batch, close the session, and act on what was valuable. Returning for a longer session usually generates more noise without proportionally more signal.

Cross-check against another reader. When a model note feels persuasive but cannot be classified into a reliable category, the best discipline is to ask a human reader the same question. The human reader either confirms or contradicts. If the human contradicts, the model was likely surfacing noise that happened to be plausible. If the human confirms, the note was a true positive that survived the filter. Either way, your calibration improves.

Keep a noise log. A running document of model feedback you accepted that you later regretted. Specific paragraph, specific note, specific change made, specific reason the change was wrong. Reviewing the log every few months teaches you which of your own susceptibilities the model exploits. Most writers find that the same two or three categories of bad feedback are the ones they keep falling for. Knowing which ones lets you flag them faster.
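
For writers who keep notes in a structured form, the triage rules and the noise log can be sketched as a small data model. This is one possible shape, not a prescribed tool. The category names come from this post; the class names, field names, and the triage and top_susceptibilities functions are hypothetical, and classifying a note still has to be done by you.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class Noise(Enum):
    SMOOTHING = "smoothing"
    CONVENTION_BIAS = "convention bias"
    PACING_ILLUSION = "pacing illusion"
    MODERNIZATION = "modernization"
    GENRE_FRAME_SLIPPAGE = "genre frame slippage"
    ECHO = "echo"


class Reliable(Enum):
    CONSTRAINT = "constraint adherence"
    CONSISTENCY = "internal consistency"
    CONTINUITY = "continuity tracking"
    ANCHOR = "explicit anchor comparison"
    LOCKED_SPEC = "drafting against locked specs"
    SURFACE = "mechanical surface checks"


def triage(category: Optional[Union[Noise, Reliable]]) -> str:
    """Map a note's category to an action; None means unclassifiable."""
    if isinstance(category, Noise):
        return "discard"
    if isinstance(category, Reliable):
        return "consider"
    return "cross-check"  # ask a human reader the same question


@dataclass(frozen=True)
class NoiseLogEntry:
    paragraph: str   # the specific paragraph that was changed
    note: str        # the model note that was accepted
    change: str      # the change that was made
    regret: str      # why the change was wrong
    category: Noise  # which noise category the note belonged to


def top_susceptibilities(log: list[NoiseLogEntry], n: int = 3):
    """Return the n noise categories that appear most often in the log."""
    return Counter(entry.category for entry in log).most_common(n)
```

Calling top_susceptibilities on the log at each review surfaces the two or three categories that keep recurring, which are the ones to flag fastest in the next session.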

The Cost of Getting This Wrong

Many working novelists who have used AI for a couple of years tell the same story. A series of small acceptances. The model suggested smoothing a line. The line was sharper in your draft, but the smoothed version was defensible. You accepted. Twelve such acceptances later, the chapter has lost the texture that distinguished it from a thousand other chapters. The book ships. It reviews well in the trade press, where the reviewers do not have your specific taste. It does not connect with the readers who would have loved the sharper version, because the sharper version is gone.

The damage is cumulative and largely invisible at each step. No single accepted note ruined the book. The aggregate of fifty accepted notes did. This is why the triage discipline matters more than any single feedback session. The cost of being agreeable to the model compounds across the project. The cost of being disagreeable is one note’s worth of friction at a time.

A second cost. Writers who do not develop the triage discipline tend to converge with each other. The model is one model, applied to many drafts, producing similar feedback patterns. Accepting that feedback uncritically produces manuscripts that converge on the model’s trained taste rather than diverging into the writer’s own. Your competitive advantage as a working novelist is the specificity of your voice. The noise floor will sand that specificity off if you let it.

When to Stop Asking

A final discipline. There are drafts where you should not ask the model for feedback at all.

Drafts where you have been working long enough to know what they need. Drafts where the choices you are making are deliberate violations of conventions you know the model will flag. Drafts where you are deep enough into the work that any external feedback risks knocking you off the trajectory you have been building. Drafts in registers that the model handles poorly enough that the signal-to-noise ratio falls below the threshold that justifies the effort of triage.

Knowing when to stop asking is its own craft skill. The model is a tool. Tools have appropriate uses and inappropriate uses, and the working novelist gets better with experience at recognizing which is which. The fact that the model will give you feedback is not a reason to ask for it. The reason to ask is that you have a specific question the model is well-positioned to answer, and you have the discipline to triage the response.

If you do not have the question, do not ask. If you have the question but the noise will overwhelm the answer, do not ask. If you have the question and the answer is verifiable, ask, triage, and act.

Closing

The current culture around AI in fiction tends toward two failure modes. Writers who refuse to use the tools and lose the productivity gains. Writers who use the tools without discipline and lose the voice that justified their work.

The middle path is triage. Use the tools where they pay. Ignore the tools where they would harm. Keep a noise log. Develop the categorical instinct that lets you classify a note in two seconds rather than weighing it for an hour.

The model’s feedback is not your editor. It is a generator of plausible-sounding suggestions, some of which are correct and most of which are noise shaped like correctness. Your job is to keep the signal and discard the rest.

The draft you finish is the draft that survived your judgment. Make sure the judgment was actually yours.