Model Settings

What the sliders actually do

Model parameters change how the AI chooses the next words. They do not rewrite your scenario, repair weak memory, or make every model behave the same. They shape pacing, variation, repetition, and how adventurous the prose feels.

Start with the preset. Adjust only one setting at a time, then test for a few replies. A great setting for one model can make another model ramble, repeat itself, or become too cautious.

The main controls

These settings work together. Think of them as steering the model's word choice, not as quality switches.

Temperature

Controls how bold or predictable the writing feels.

Lower: Cleaner, steadier, safer. Good for continuity, investigation scenes, and characters who must stay precise.
Higher: More dramatic, surprising, and emotional. Good for romance, chaos, dreams, horror, and messy conflict.
  • Too low can feel dry or stuck.
  • Too high can invent facts or lose who is in the scene.
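As a rough illustration of the Temperature notes above, here is a minimal Python sketch of one common way temperature is applied: the model's raw scores are divided by the temperature before they are turned into probabilities. The function name and the three example scores are hypothetical, and individual models or providers may apply it differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three candidate next words.
logits = [2.0, 1.0, 0.1]
low = softmax_with_temperature(logits, 0.5)   # sharper distribution
high = softmax_with_temperature(logits, 1.5)  # flatter distribution
```

With these made-up scores, the lower temperature concentrates probability on the top word, while the higher one shifts probability toward the less likely words.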

Top P

Limits the pool of possible words before the model chooses one. It keeps only the most likely words whose combined probability reaches the Top P value.

Lower: Narrows choices. Useful when a model gets scattered or over-decorates every sentence.
Higher: Leaves more options open. Useful for expressive models that need room for voice and style.
  • Top P and Temperature both affect variety.
  • If both are high, replies may drift off topic or turn incoherent.
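The Top P filter described above can be sketched in a few lines of Python. This is a simplified illustration, assuming the usual "nucleus" definition (keep the most likely words until their combined probability reaches Top P, then rescale). The function name and the example probabilities are hypothetical.

```python
def top_p_filter(probs, top_p):
    """Keep the most likely words until their combined probability reaches top_p."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for word, p in ranked:
        kept[word] = p
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(kept.values())
    # Rescale so the surviving probabilities sum to 1 again.
    return {word: p / total for word, p in kept.items()}

# Hypothetical next-word probabilities.
probs = {"door": 0.5, "window": 0.3, "dragon": 0.15, "spreadsheet": 0.05}
narrow = top_p_filter(probs, 0.7)  # keeps "door" and "window"
wide = top_p_filter(probs, 0.9)    # also keeps "dragon"
```

A lower value trims the long tail of unusual words; a higher value lets more of them stay in play.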

Max Tokens

Sets the maximum length of one reply, measured in tokens (pieces of words). It is a ceiling, not a target.

Shorter: Snappier back-and-forth, better for dialogue, fights, and quick decisions.
Longer: More room for description, inner conflict, multi-character scenes, and cinematic pacing.
  • Long replies can stall if the model narrates around the point.
  • Short replies can feel abrupt if the scene needs atmosphere.

Presence Penalty

Encourages the model to introduce new details instead of staying on the same idea.

Lower: Better for intimate scenes where the same emotional beat should deepen.
Higher: Better when a scene is looping and needs a new action, clue, interruption, or shift.
  • Too high can make the AI throw in random new elements.
  • Use gently for continuity-heavy stories.

Frequency Penalty

Discourages repeated words and phrases.

Lower: Preserves verbal tics, catchphrases, and repeated emotional motifs.
Higher: Helps when a model repeats sentence shapes, pet phrases, or the same description.
  • Too high can make prose sound strained.
  • Better for style cleanup than story direction.
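To make the difference between Presence Penalty and Frequency Penalty concrete, here is a small Python sketch. It assumes the formulation documented for OpenAI-style APIs: a word that has already appeared loses a flat presence amount once, plus a frequency amount for each time it appeared. Whether a particular model uses exactly this rule is an assumption, and the function name and example values are hypothetical.

```python
from collections import Counter

def apply_penalties(logits, generated_words, presence_penalty, frequency_penalty):
    """Lower the scores of words that already appeared in the reply."""
    counts = Counter(generated_words)
    adjusted = {}
    for word, score in logits.items():
        count = counts[word]
        penalty = frequency_penalty * count  # grows with every repeat
        if count > 0:
            penalty += presence_penalty      # flat, applied once if seen at all
        adjusted[word] = score - penalty
    return adjusted

# Hypothetical scores and reply history.
logits = {"sighs": 2.0, "smiles": 2.0, "runs": 2.0}
history = ["sighs", "sighs", "sighs", "smiles"]
adjusted = apply_penalties(logits, history, presence_penalty=0.5, frequency_penalty=0.2)
```

In this example "runs" keeps its score because it has not been used, "smiles" drops a little, and "sighs" drops the most because it was repeated three times.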

Repetition Penalty

A stronger anti-loop control used by some NanoGPT models.

Lower: Leaves the model's natural rhythm intact.
Higher: Pushes harder against loops, repeated phrases, and circular replies.
  • Some models need it. Others get worse with it.
  • Raise slowly, especially on prose-heavy models.
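For comparison with the additive penalties, here is a minimal Python sketch of one widely used form of Repetition Penalty, which scales the score of any word that has already appeared instead of subtracting from it. That NanoGPT models use this exact rule is an assumption; the function name and values are hypothetical.

```python
def apply_repetition_penalty(logits, generated_words, penalty):
    """Scale down the scores of words that already appeared (penalty > 1)."""
    seen = set(generated_words)
    adjusted = {}
    for word, score in logits.items():
        if word in seen:
            # Positive scores shrink; negative scores are pushed further down.
            adjusted[word] = score / penalty if score > 0 else score * penalty
        else:
            adjusted[word] = score
    return adjusted

# Hypothetical scores and reply history.
logits = {"again": 3.0, "never": -1.0, "fresh": 3.0}
history = ["again", "never"]
adjusted = apply_repetition_penalty(logits, history, penalty=1.2)
```

Because the change is multiplicative, even a small increase above 1.0 has a noticeable effect, which is one reason to raise it slowly.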

Top K

Caps how many word options are considered.

Lower: More controlled and direct. Can help weaker models stay on track.
Higher: More freedom. Better for models that already follow instructions well.
  • Leave it alone unless a model feels noisy.
  • Too low can make every reply feel similar.
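The Top K cap can be sketched as follows. This is a simplified Python illustration, assuming the standard definition (keep the K most likely words and rescale); the function name and probabilities are hypothetical.

```python
def top_k_filter(probs, k):
    """Keep only the k most likely words, then rescale to sum to 1."""
    if k < 1:
        raise ValueError("k must be at least 1")
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    total = sum(p for _, p in ranked)
    return {word: p / total for word, p in ranked}

# Hypothetical next-word probabilities.
probs = {"door": 0.5, "window": 0.3, "dragon": 0.15, "spreadsheet": 0.05}
top_two = top_k_filter(probs, 2)
```

Unlike Top P, the cutoff here is a fixed count, so it keeps the same number of options whether the model is confident or unsure.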

Min P

Filters out word choices that are very unlikely compared with the model's top choice.

Lower: More flexible, more permissive.
Higher: Removes odd choices and can reduce low-quality loops.
  • Useful when a model collapses into strange wording.
  • Too high can flatten creativity.
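A minimal Python sketch of the Min P filter, assuming the common definition in which a word survives only if its probability is at least Min P times the probability of the most likely word. The function name and probabilities are hypothetical.

```python
def min_p_filter(probs, min_p):
    """Drop words whose probability is below min_p times the top probability."""
    threshold = min_p * max(probs.values())
    kept = {word: p for word, p in probs.items() if p >= threshold}
    total = sum(kept.values())
    # Rescale so the surviving probabilities sum to 1 again.
    return {word: p / total for word, p in kept.items()}

# Hypothetical next-word probabilities.
probs = {"door": 0.5, "window": 0.3, "dragon": 0.15, "spreadsheet": 0.05}
permissive = min_p_filter(probs, 0.05)  # threshold 0.025, keeps everything
strict = min_p_filter(probs, 0.5)       # threshold 0.25, keeps two words
```

Because the threshold moves with the top word's probability, the filter is stricter when the model is confident and looser when many words are plausible.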

Why one model hates another model's settings

Each model has its own training style, default sampling behavior, context handling, and tendency toward repetition. Parameters amplify those traits. They do not affect every model evenly.

Expressive dialogue model

A model built for banter may work beautifully at higher Temperature, but the same setting on a lore-heavy model can make it invent family history or jump to the wrong character.

Long-context model

A model that handles huge context may need lower Temperature and fewer novelty penalties. It already has a lot to track, so pushing it to add more can scatter the scene.

Loop-prone model

A model that repeats itself may improve with Repetition Penalty or Frequency Penalty. A cleaner model can become awkward if those same penalties are too high.

Cinematic prose model

A prose-heavy model often benefits from more tokens and moderate variety. If replies become all atmosphere and no movement, reduce length or raise Presence Penalty slightly.

Roleplay tuning examples

Slow-burn romance

  • Temperature: moderate to high
  • Top P: high enough for emotional nuance
  • Presence Penalty: low to moderate
  • Max Tokens: medium or long

Investigation or mystery

  • Temperature: lower to moderate
  • Top P: moderate
  • Presence Penalty: low
  • Max Tokens: medium

Combat or chase scene

  • Temperature: moderate
  • Top P: moderate
  • Presence Penalty: moderate
  • Max Tokens: short or medium

Loop repair

  • Lower Temperature slightly
  • Raise Frequency Penalty gently
  • Raise Repetition Penalty gently
  • Tell the model to cut the scene to a new action
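The numeric steps of the loop repair list can be combined with the earlier advice to change one setting at a time. The Python sketch below produces one candidate per tweak, each differing from the baseline in a single setting, so they can be tested separately. The setting names, the step size, and the starting values are all hypothetical placeholders, not recommendations.

```python
def loop_repair_candidates(settings, step=0.05):
    """Return single-setting tweaks to try one at a time, in order."""
    changes = [
        ("temperature", -step),         # lower slightly
        ("frequency_penalty", +step),   # raise gently
        ("repetition_penalty", +step),  # raise gently
    ]
    candidates = []
    for name, delta in changes:
        tweaked = dict(settings)  # copy so the baseline stays untouched
        tweaked[name] = round(settings[name] + delta, 2)
        candidates.append((name, tweaked))
    return candidates

# Hypothetical starting values, not recommendations.
baseline = {"temperature": 0.9, "frequency_penalty": 0.1, "repetition_penalty": 1.05}
candidates = loop_repair_candidates(baseline)
```

Try each candidate for a few replies before moving on to the next, and return to the baseline if a tweak makes the prose worse.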