Visual Sorcery 12 min

Sigil Engines: Training Custom LoRAs for Dark Fiction Visual Identity

Stop wrestling generic image models into your aesthetic. Train a lightweight adapter that bakes your characters, worlds, and mood into every generation.

You’ve fought this fight. You prompt Midjourney for the twentieth time, trying to coax it into producing the cover art you can see clearly in your head. “Gothic portrait of a woman with black sclera and silver hair, Victorian mourning dress, oil painting, shadow-choked parlor.” Twenty variants later you have twenty different women, none of them your character, each rendered in a slightly different house style that is not yours.

Prompt engineering has ceilings. Past a certain point, no amount of adjective stacking makes a generic model render the specific face of your recurring protagonist or the specific architecture of your fictional city. You’re asking the model to guess at something it has never seen.

LoRA training solves that problem. A LoRA, short for Low-Rank Adaptation, is a small weight adjustment you train on top of an existing diffusion model. Instead of fine-tuning the entire multi-billion-parameter model, which is expensive, slow, and risky, you train a compact adapter that shifts the model’s behavior toward your target concept. Your character, your world, your aesthetic. Twenty to a hundred megabytes of learned preference that you can load on demand.

This isn’t exotic infrastructure anymore. The tooling has matured considerably. A laptop with a decent GPU, or a few dollars of cloud compute, plus a weekend of careful work, gets you a LoRA that renders your character consistently across a hundred book covers, promotional pieces, and reader magnet assets. Here’s how to do it without wasting that weekend.

What a LoRA Actually Is (and Isn’t)

A LoRA is not a new model. It’s a small set of adjustments that steer an existing model. The base model, whether Stable Diffusion XL, Flux, or whatever you’re using, still does most of the heavy lifting. The LoRA whispers corrections in its ear: when the prompt says Maren, pull the output toward this specific face. When it says the Ash Court, pull toward this architectural vocabulary.

Two practical consequences follow.

First, LoRAs compose. You can stack a character LoRA on top of a style LoRA on top of the base model. Your protagonist’s face rendered in your aesthetic register, with a single prompt. Traditional fine-tuning fuses everything together; LoRAs stay modular.

Second, LoRAs are portable. The file is small enough to share, to back up, to hand to a cover designer who can then produce art featuring your character without you sitting at the keyboard. Your visual identity becomes a file you own.

What a LoRA won’t do: invent capabilities the base model lacks. If SDXL cannot render believable hands, your hand-focused LoRA will not fix that. If Flux has never seen convincing Victorian industrial machinery, a small training set won’t teach it from scratch. LoRAs adjust; they don’t rebuild.
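To make "adjust, don't rebuild" concrete, here is a toy sketch of the usual LoRA formulation: the base weights stay frozen, and each adapter adds a low-rank product scaled by alpha divided by rank. The matrices are tiny plain-Python lists, and the `strength` key is an assumed stand-in for the per-LoRA weight slider most interfaces expose. None of this is a real model layer; it only shows why adapters stack without touching the base.

```python
# Illustrative only: plain-Python matrices, not a real diffusion model layer.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def apply_loras(base, adapters):
    """Return base + sum(strength * (alpha / rank) * B @ A) over all adapters.

    Each adapter is a dict with keys:
      "A": rank x d_in matrix, "B": d_out x rank matrix,
      "alpha": scaling constant, "strength": user-chosen blend weight (assumed name).
    """
    merged = [row[:] for row in base]  # copy so the base weights stay untouched
    for ad in adapters:
        rank = len(ad["A"])
        scale = ad["strength"] * ad["alpha"] / rank
        delta = matmul(ad["B"], ad["A"])
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += scale * value
    return merged

base = [[1.0, 0.0], [0.0, 1.0]]
character = {"A": [[1.0, 1.0]], "B": [[1.0], [0.0]], "alpha": 1.0, "strength": 1.0}
style = {"A": [[0.0, 2.0]], "B": [[0.0], [1.0]], "alpha": 1.0, "strength": 0.5}

# Stack a character adapter and a half-strength style adapter on the same base.
merged = apply_loras(base, [character, style])
print(merged)
```

Because the base is never modified, you can drop either adapter, or change its strength, and recompute. That is the modularity traditional fine-tuning gives up.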

The Three LoRAs Every Dark Fiction Writer Should Train

Start with three, in this order. Each solves a specific problem, and together they cover most of your visual production needs.

The Character LoRA. Your recurring protagonist, your antagonist, any figure who appears across multiple covers or promotional assets. The goal is consistency: the same face, the same proportions, the same distinguishing features across every generation. This is the LoRA that ends the “is this the same character?” discontinuity that plagues series cover art.

The World LoRA. The architectural and environmental vocabulary of your setting. Your fictional city’s skyline. The specific register of your gothic cathedral interior. The particular flora of your cursed wetland. This LoRA lets you generate establishing shots, background plates, and atmospheric pieces that feel native to your world rather than pulled from a generic fantasy mood board.

The Style LoRA. Your aesthetic fingerprint. The palette, the lighting philosophy, the texture quality you want across your visual identity. This is the LoRA that makes a cover portrait, an establishing shot, and a promotional banner all feel like they belong in the same book, even when they were generated months apart.

Advanced writers eventually train more, including specific monsters, recurring locations, or particular magical effects, but these three remain the foundation everything else builds from.

Dataset Construction: The Unglamorous Heart of the Work

Everything depends on your training dataset. Model choice matters less than people assume. Training parameters matter less than people assume. Dataset quality matters more than everything else combined. A pristine dataset trained with average settings beats a sloppy dataset trained with perfect settings every time.

For a character LoRA, you need twenty to fifty images of the character rendered in consistent fashion. This creates a chicken-and-egg problem: how do you get fifty consistent images of a character who doesn’t exist yet? Three approaches work.

The commission path. Hire a single illustrator to produce twenty to thirty reference images of your character from varied angles, expressions, and lighting conditions. Expensive but produces the cleanest dataset. The images are consistent because the same human rendered all of them.

The seed-and-curate path. Generate hundreds of images from a generic model using detailed prompts. Ruthlessly curate down to the twenty or thirty most consistent ones. You’re essentially using the base model as a random image generator and your eye as the consistency filter. This path is cheap but requires painful discipline, because the temptation to keep almost-right images will corrupt your dataset.

The hybrid path. Start with three to five commissioned reference images. Use those as image-to-image seeds to generate variations. Curate the variations. This combines commissioned precision with scalable iteration.

For world and style LoRAs, curate from existing sources: your own photography, licensed reference imagery, public-domain art that captures your target aesthetic. Forty to eighty images per LoRA is a reasonable target. More data does not always mean better results. Past a certain point, you’re just adding noise.

A clean dataset obeys a few non-negotiable rules. The first is consistency of target with variation in everything else. Every character image must feature the same character, but angle, lighting, expression, and clothing should vary. You’re teaching the model what is essential (the face) by showing the non-essential varying around it.

Crop and resolution discipline matters almost as much. Training typically happens at 1024x1024 for SDXL-class models or 512x512 for older Stable Diffusion 1.5-class models, so your source images should meet or exceed the resolution of the base you chose. Crop aggressively. If you’re training a face LoRA, most of the frame should be face.

Captions are the third pillar. Each training image needs a caption describing what’s in it, using a consistent trigger word for your concept. If your character is named Maren and her trigger token is marn_v1, every caption starts with marn_v1 and then describes the rest of the image. This teaches the model to associate your trigger word with the invariant features while ignoring the variable ones.
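The captioning rule is easy to automate. The sketch below assumes your trainer reads a same-named `.txt` file next to each image, which is a common convention but worth confirming for your tool. The helper names `build_caption` and `write_captions` are made up for this example; `marn_v1` is the trigger token from above.

```python
from pathlib import Path

TRIGGER = "marn_v1"  # trigger token from the article's example

def build_caption(description, trigger=TRIGGER):
    """Prefix a free-text description with the trigger token."""
    description = description.strip()
    if description.startswith(trigger):
        return description  # already prefixed, avoid doubling the trigger
    return f"{trigger}, {description}"

def write_captions(dataset_dir, descriptions):
    """Write one .txt caption next to each image named in `descriptions`.

    `descriptions` maps image filename -> description of the variable parts
    (angle, lighting, clothing). Returns the list of caption paths written.
    """
    dataset_dir = Path(dataset_dir)
    written = []
    for image_name, description in sorted(descriptions.items()):
        caption_path = (dataset_dir / image_name).with_suffix(".txt")
        caption_path.write_text(build_caption(description) + "\n", encoding="utf-8")
        written.append(caption_path)
    return written

# Example call (the folder and file names are placeholders):
# write_captions("dataset/maren", {"maren_01.png": "three-quarter view, candlelight, mourning dress"})
```

Describe what varies (angle, lighting, outfit) and leave the invariant features to the trigger token.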

Finally, no duplicates and no near-duplicates. Twenty genuinely different images of your character beat fifty nearly-identical variants. Near-duplicates cause the model to overfit on whatever coincidental details the duplicates share.
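Your eye is the final judge of near-duplicates, but a crude perceptual hash can flag candidates first. This sketch uses a simple average hash: shrink each image to a tiny grayscale thumbnail, mark each pixel as above or below the mean, and compare the bit patterns. It assumes you have already produced the thumbnails with an imaging library of your choice; the 2x2 grids and the distance threshold here are toy values, not recommendations.

```python
from itertools import combinations

def average_hash(gray):
    """Hash a small grayscale thumbnail (rows of 0-255 ints) into a bit tuple.

    Each bit records whether a pixel is brighter than the thumbnail's mean.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    return tuple(1 if p > mean else 0 for p in pixels)

def hamming(a, b):
    """Count the positions where two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))

def near_duplicates(thumbnails, max_distance=1):
    """Return sorted name pairs whose hashes differ by at most max_distance bits."""
    hashes = {name: average_hash(gray) for name, gray in thumbnails.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(hashes), 2)
        if hamming(hashes[a], hashes[b]) <= max_distance
    ]

# Toy 2x2 thumbnails; real ones would be larger (8x8 is a common choice).
thumbs = {
    "maren_01": [[200, 10], [10, 200]],
    "maren_02": [[190, 20], [15, 210]],  # same layout, slightly different values
    "maren_03": [[10, 200], [200, 10]],  # inverted layout
}
pairs = near_duplicates(thumbs)
print(pairs)
```

Treat every flagged pair as a prompt to look, not an automatic delete. Keep the stronger image of each pair and drop the other.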

The Training Run

Once your dataset is clean, the training itself is mostly waiting. The tools have matured to the point where you don’t need to understand the underlying math, but you do need to understand a handful of parameters.

Base model. Choose your base carefully. An SDXL-trained LoRA won’t work on Flux, and vice versa. Pick the model you’ll use for actual production and train against it. For most dark fiction use cases, SDXL and Flux are both currently strong choices. SDXL offers finer control and broader community tooling, while Flux tends toward more photorealistic rendering.

Steps and learning rate. The two parameters that matter most. Too few steps and the LoRA undertrains, so your character only vaguely resembles the target. Too many steps and it overfits, so every output looks like one specific image from the training set, with no flexibility. Reasonable starting points sit around 1500 to 3000 steps for a character LoRA, with a learning rate near 1e-4. Every run is different. Expect to iterate.
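Step counts are easier to reason about once you see where they come from. Many trainers expose image count, per-image repeats, epochs, and batch size rather than a raw step number. The arithmetic below assumes one optimizer step per batch and no gradient accumulation; your trainer may count differently, so treat it as a planning aid. The function names are made up for this example.

```python
import math

def total_steps(num_images, repeats, epochs, batch_size=1):
    """Optimizer steps for a run, assuming one step per batch."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return steps_per_epoch * epochs

def epochs_for_target(num_images, repeats, target_steps, batch_size=1):
    """Smallest whole number of epochs that reaches at least target_steps."""
    steps_per_epoch = math.ceil(num_images * repeats / batch_size)
    return math.ceil(target_steps / steps_per_epoch)

# 30 images, each seen 10 times per epoch, batch size 2 -> 150 steps per epoch.
steps = total_steps(num_images=30, repeats=10, epochs=10, batch_size=2)
needed = epochs_for_target(num_images=30, repeats=10, target_steps=2000, batch_size=2)
print(steps, needed)
```

With these numbers, ten epochs lands at the low end of the 1500 to 3000 range, and reaching 2000 steps takes fourteen.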

Network dimension and alpha. Network dimension (the rank) sets how expressive your LoRA can be, while alpha scales how strongly the learned adjustment is applied, usually as alpha divided by dimension. Higher dimensions allow more dramatic adjustments but also increase overfitting risk and file size. A network dimension of 32 with an alpha of 16 works well for character LoRAs. Style LoRAs can often work at dimension 8 or 16 since the target adjustment is subtler.
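A quick back-of-envelope calculation shows why dimension drives file size. Each adapted layer stores two thin matrices, so the parameter count grows linearly with dimension. The layer list below is hypothetical (100 square layers of 1024 x 1024, not any real model's layout), and the size estimate assumes 16-bit weights with no metadata.

```python
def lora_param_count(layer_shapes, dim):
    """Trainable parameters for rank-`dim` adapters on the given layers.

    Each layer of shape (d_out, d_in) gets two factors:
    A with dim * d_in values and B with d_out * dim values.
    """
    return sum(dim * (d_in + d_out) for d_out, d_in in layer_shapes)

def approx_size_mb(param_count, bytes_per_param=2):
    """Rough file size in megabytes, assuming 16-bit weights and no metadata."""
    return param_count * bytes_per_param / 1_000_000

def update_scale(dim, alpha):
    """Multiplier applied to the learned update in the usual LoRA formulation."""
    return alpha / dim

# Hypothetical stack of 100 square layers, 1024 x 1024 each.
layers = [(1024, 1024)] * 100

params_32 = lora_param_count(layers, dim=32)
params_8 = lora_param_count(layers, dim=8)
print(params_32, approx_size_mb(params_32))
print(params_8, approx_size_mb(params_8))
print(update_scale(dim=32, alpha=16))
```

In this toy layout, dropping from dimension 32 to 8 cuts the adapter to a quarter of the size. The 32/16 pairing applies the learned update at half strength.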

Validation images. Generate sample images every few hundred steps using a fixed prompt and fixed seed. This is your early-warning system. You’ll see the character emerge from noise, become recognizable, stabilize, then begin to collapse into overfitting. Stop training when the sample images look best, not when the step counter hits some round number.
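One lightweight way to keep yourself honest is to write the sampling plan down before training and score each sample afterward. In this sketch the prompt, seed, ratings, and helper names are all illustrative assumptions. The ratings are your own 1 to 5 judgments of each sample image, not anything a trainer produces.

```python
def sample_steps(total_steps, every):
    """Steps at which to render a validation image, always including the last step."""
    steps = list(range(every, total_steps + 1, every))
    if not steps or steps[-1] != total_steps:
        steps.append(total_steps)
    return steps

def best_checkpoint(ratings):
    """Pick the step with the highest rating; ties go to the earlier step.

    `ratings` maps step -> your own 1-5 score for that step's sample image.
    """
    return min(ratings, key=lambda step: (-ratings[step], step))

VALIDATION_PROMPT = "marn_v1, portrait, candlelit parlor"  # same prompt every time
VALIDATION_SEED = 1234  # same seed every time, so only the LoRA changes

plan = sample_steps(total_steps=2000, every=250)
ratings = {250: 1, 500: 2, 750: 3, 1000: 4, 1250: 5, 1500: 5, 1750: 3, 2000: 2}
chosen = best_checkpoint(ratings)
print(plan, chosen)
```

Breaking ties toward the earlier checkpoint is a judgment call. The reasoning is that a less-trained adapter tends to stay more flexible.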

The training itself takes thirty minutes to a few hours depending on your hardware and parameters. You can rent a cloud GPU for a few dollars if local hardware isn’t sufficient.

Testing and Iteration

A LoRA is not done when training finishes. It’s done when you’ve tested it against your actual use cases and confirmed it works across the range of generations you need.

Test matrix. Generate images across a grid of variables: different prompts, different styles, different compositions. Your character should remain recognizably your character whether she’s in a portrait, a wide shot, a combat scene, or a quiet domestic moment. If she only works for portraits, your dataset was too narrow.
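A test matrix is just a cross product, so it is worth generating rather than typing by hand. The sketch below builds one job per combination of shot type, setting, style, and seed. The category lists and the `build_test_matrix` helper are placeholders to swap for your own, and the jobs are plain dictionaries you would feed to whatever generation tool you use.

```python
from itertools import product

TRIGGER = "marn_v1"  # trigger token from the article's example

def build_test_matrix(shots, settings, styles, seeds):
    """Return one job dict per combination of shot, setting, style, and seed."""
    jobs = []
    for shot, setting, style, seed in product(shots, settings, styles, seeds):
        jobs.append({
            "prompt": f"{TRIGGER}, {shot}, {setting}, {style}",
            "seed": seed,
        })
    return jobs

jobs = build_test_matrix(
    shots=["close portrait", "wide shot", "combat scene", "quiet domestic moment"],
    settings=["shadow-choked parlor", "rain-slick street"],
    styles=["oil painting", "ink wash"],
    seeds=[11, 22],
)
print(len(jobs), jobs[0])
```

Four shots, two settings, two styles, and two seeds give 32 images: enough to see whether she survives outside portraits, small enough to review in one sitting.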

Stress tests. Prompt for things outside the training distribution. If you only trained on images where the character faces the camera, try generating a profile view. If your style LoRA was trained on night scenes, try a daylight prompt. The failure modes reveal what the LoRA actually learned versus what you assumed it learned.

Composition with other LoRAs. Load your character LoRA alongside your style LoRA. Do they cooperate or fight? Sometimes two LoRAs interact badly, amplifying each other’s biases. You’ll discover this only by testing.

The honest re-training decision. First LoRAs are rarely the final version. Plan for v2. Your first training run teaches you what’s missing from the dataset: the angles you didn’t capture, the expressions that don’t render, the outfits that confuse the model. Collect those edge cases, augment your dataset, train again. A mature character LoRA typically reaches its useful form on iteration three or four.

Integration Into Your Visual Production Stack

A trained LoRA is a file. Its real value emerges when you integrate it into the workflows it enables.

Cover art iteration. Once your character and style LoRAs are stable, cover iteration collapses from weeks to hours. Draft twenty cover compositions in a single session, all featuring your actual character rendered in your actual aesthetic. Pick the strongest three. Send those to your cover designer as reference, or refine them yourself.

Character art for marketing assets. Social content, reader magnets, merchandise mockups, reader-facing “meet the cast” posts. Each of these used to require either expensive commissions or off-brand stock generations. With a character LoRA, you produce them in minutes and they all feel native to your brand.

World-building reference. A world LoRA lets you generate visual references for scenes you’re writing. Stuck describing a specific corner of your fictional city? Generate five renderings, pick the one that matches your mental image, describe that. The LoRA becomes a bidirectional tool, shaping your writing as well as your marketing.

Reader engagement. Share your LoRA’s output with a framing that treats the custom model as part of the book’s making-of. Reveal how the protagonist was rendered, what the training process looked like, which references informed the aesthetic. This gives readers an entry point into your world before the book arrives and signals technical sophistication that differentiates you from authors relying on stock imagery.

The Ethical Terrain

LoRA training sits in genuinely contested ethical territory, and dark fiction writers should approach it with clear eyes.

Training on images you created or legitimately licensed is uncontroversial. Training on commissioned reference images, where the contract grants you the right to do so, is uncontroversial. Training on public-domain art is legally permitted but worth thinking about, since you’re still using specific artists’ labor even if copyright has expired.

Training on images you pulled from the internet without permission is not uncontroversial. It’s the default on many community LoRA-training platforms, but “everyone does it” isn’t an ethical argument. If your LoRA produces output that clearly derives from a specific living artist’s style, you have a problem that is both ethical and increasingly legal.

The defensible path is to commission your source material, license it properly, or use your own photography and existing artwork. Document your dataset sources. Treat your LoRA like any other piece of your creative infrastructure, with a clean provenance trail. This is both the right thing to do and the only thing that will hold up as platforms tighten rules and courts produce rulings.
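A provenance trail can be as simple as a JSON file that lives beside the dataset. The sketch below records, for each image, a content hash plus where it came from and under what terms. The field names and helper names are assumptions for illustration, not a standard, so adapt them to whatever your contracts and licenses actually say.

```python
import hashlib
import json
from pathlib import Path

def manifest_entry(image_path, source, license_name, notes=""):
    """Describe one training image: where it came from and under what terms."""
    data = Path(image_path).read_bytes()
    return {
        "file": Path(image_path).name,
        "sha256": hashlib.sha256(data).hexdigest(),  # ties the record to exact bytes
        "source": source,
        "license": license_name,
        "notes": notes,
    }

def write_manifest(entries, manifest_path):
    """Save entries as pretty-printed JSON, sorted by file name."""
    ordered = sorted(entries, key=lambda e: e["file"])
    Path(manifest_path).write_text(json.dumps(ordered, indent=2) + "\n", encoding="utf-8")
    return ordered

# Example call (paths and wording are placeholders):
# entry = manifest_entry("dataset/maren/maren_01.png",
#                        source="commissioned illustration",
#                        license_name="contract grants model-training rights")
# write_manifest([entry], "dataset/maren/manifest.json")
```

Update the manifest whenever you add or drop an image, and keep it with every version of the LoRA you ship.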

Where This Fits in the Longer Arc

LoRA training is a specific, powerful point in a larger trajectory. Image models are getting better. Consistency tooling is getting better. Character turnaround tooling that works from a single reference image is improving fast. There’s a reasonable chance that, in two years, the specific workflow described above is obsolete, replaced by a single reference image and a prompt.

That’s fine. The discipline you build training LoRAs today carries forward regardless of the tool. Learning to construct clean datasets, to diagnose overfitting, to test across a stress matrix: these are transferable skills. When the next generation of tooling arrives, you’ll use it fluently while writers who skipped this generation are still guessing.

The goal isn’t LoRA expertise for its own sake. It’s a visual identity you control, rendered with a consistency your readers recognize across every piece of art that touches your work. The sigil is the same, even as the engines that inscribe it change.

Train the first one this month. Start with your most recurring character. Forty reference images, a weekend of compute, and the patience to iterate. The version of your visual identity that emerges will be yours in a way no prompt-engineered generation has ever been.