
Mago Video Models

Mago video models are built on Mago’s own architecture, leveraging open-source foundations. They run on Mago infrastructure and can be deployed in specific territories or in a client’s own environment under Enterprise terms. All Mago video models share these properties:
  • Frame-perfect — N input frames produce N output frames, each corresponding to a specific source frame.
  • Descriptive prompts — describe the desired result; do not write instructions.
  • Settings-rich — more controls than closed-source alternatives. Harder to learn, more powerful once learned.
  • Available in Relaxed mode — unmetered usage subject to plan limits.
  • Confidential — not used to train models. Project data stays private.
| Model | Type | Use it when… |
| --- | --- | --- |
| Mago Transform | Full transformation | You want to substantially change a scene |
| Mago Style Transfer | Conforming stylization | You must preserve performance / lip sync while restyling |
| Mago Character | Character replacement | You’re swapping the character in a shot |
| Mago Inpaint | Localized edits | You’re editing one masked region |

Mago Transform

Type: full transformation. Use when the goal is to substantially change a scene — turn a city into a forest, day into night with new geography, an actor into a creature. Driven mostly by an initial keyframe or reference frame, with the prompt playing a major role.

ControlNets for Mago Transform

ControlNets are conditioning signals that constrain how far the model can deviate from the source. They’re one of the most important Mago Transform settings.
| ControlNet | Use when |
| --- | --- |
| Depth | You want to preserve the original composition and volumes as closely as possible. Hardest to transform under: the model is locked to the original 3D structure. |
| SoftEdge | You want more interpretive freedom, especially for scenes with movement. The model can reshape outlines. |
| Pose | You want to track a humanoid character’s pose closely. For human-centric shots. |
| Depth + Pose | You want both character tracking and environmental fidelity. Most controlled, hardest to transform under. |
💡 ControlNet decision — Start with Depth if the camera is static and structure matters. Switch to SoftEdge for creative freedom or significant camera movement. Use Pose for character-focused shots.
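The decision rule above can be written down as a small helper. This is only an illustrative sketch: `choose_controlnet` and its parameter names are hypothetical and not part of any Mago API; the returned strings simply mirror the ControlNet names in the table.

```python
def choose_controlnet(static_camera: bool,
                      structure_matters: bool,
                      character_focused: bool,
                      needs_environment_fidelity: bool = False) -> str:
    """Suggest a starting ControlNet (hypothetical helper, not a Mago API)."""
    if character_focused:
        # Pose tracks a humanoid character; adding Depth also pins the environment.
        return "Depth + Pose" if needs_environment_fidelity else "Pose"
    if static_camera and structure_matters:
        # Depth locks the output to the original 3D structure.
        return "Depth"
    # Creative freedom or significant camera movement.
    return "SoftEdge"


print(choose_controlnet(static_camera=True, structure_matters=True,
                        character_focused=False))
```

Treat the result as a starting point; the settings below still need tuning per shot.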

Key settings

| Setting | Recommended | Notes |
| --- | --- | --- |
| ControlNet resolution | Default | Higher = more detail, but can inject artifacts. |
| ControlNet strength | Default to start | Higher = closer to source. Lower for more transformation. |
| Use input as control map | Off | Enable only when the source video is a depth/pose map. |
| Output size | 1280 (sweet spot) | Max 1620. Higher costs more, may not improve quality. |
| Steps | Default | More = sharper. Lower for rougher styles. |
| Interpolation | Off for detail-critical | On for cheaper, faster renders. |
| Image sequence export | EXR 16-bit for VFX | PNG otherwise. |
| Prompt influence | Default | How closely the model follows the prompt. |
| Color consistency | Default | Reduces color drift across frames. |
| Seed | Any | Same seed + settings reproduces output. |
| Shift / low step accelerator | Default | Faster, slight color drift. |
| Context size | 150 | Frames per chunk. |
| Context overlap | Default | Lower for high movement, higher for static. |
| Start frame buffer | Default | Warm-up frames at render start. |
| Dynamic reference | On | Disable only for very static shots. |
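To build intuition for Context size and Context overlap, the sketch below shows one common way a long shot can be split into overlapping chunks. It is an assumption for illustration only: this page does not document how Mago actually schedules chunks, `chunk_windows` is a hypothetical name, and the overlap value of 30 is made up (the real default is not stated here).

```python
def chunk_windows(total_frames: int, context_size: int = 150, overlap: int = 30):
    """Return (start, end) frame ranges, end exclusive, for overlapping chunks.

    Hypothetical illustration; not Mago's documented chunking behaviour.
    """
    if not 0 <= overlap < context_size:
        raise ValueError("overlap must be >= 0 and smaller than context_size")
    if total_frames <= 0:
        return []
    stride = context_size - overlap  # new frames contributed by each later chunk
    windows = []
    start = 0
    while True:
        end = min(start + context_size, total_frames)
        windows.append((start, end))
        if end >= total_frames:
            break
        start += stride
    return windows


for window in chunk_windows(400):
    print(window)
```

Under these assumptions a 400-frame shot is rendered in four chunks, and each chunk shares 30 frames with the previous one.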

Prompting

Mago Transform expects descriptive prompts. Describe what the output looks like, not what to do.
Example:
  • Instruction (avoid): “Turn this scene into a wooden castle.”
  • Description (use): “A medieval wooden castle at dusk with overcast sky, rough-hewn timber walls, slate roof, torches mounted on the gatehouse.”
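A rough way to catch instruction-style prompts before rendering is to check whether the prompt opens with an imperative verb. The helper below is a hypothetical heuristic, not a Mago feature; the verb list is an assumption and deliberately short.

```python
# Assumed list of verbs that usually signal an instruction rather than a description.
INSTRUCTION_VERBS = {"turn", "make", "replace", "change", "convert",
                     "add", "remove", "transform"}


def looks_like_instruction(prompt: str) -> bool:
    """Return True if the prompt starts with a known imperative verb."""
    words = prompt.strip().lstrip("\"'“”‘’").split()
    if not words:
        return False
    first = words[0].lower().strip(".,:;!?")
    return first in INSTRUCTION_VERBS


print(looks_like_instruction("Turn this scene into a wooden castle."))
print(looks_like_instruction("A medieval wooden castle at dusk with overcast sky."))
```

A check like this only flags the most obvious cases; a prompt can pass it and still read as an instruction.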

Mago Style Transfer

Type: stylization that closely conforms to the source. Excels at preserving micro-expressions, lip sync, and fine motion while changing visual style. The strongest fit for AI rendering of acted footage where performance must survive the restyle. Also good for relighting and transformations that stay close to the original geometry.

Key settings

| Setting | Default | Notes |
| --- | --- | --- |
| Output size | 1280 | Same logic as Transform; max 1920. |
| Steps | Default | More for sharper output. |
| Video input strength | 0.5 | Below 0.5 often gives better results for shots over 80 frames. Higher locks closer to the source. |
| Interpolation | Off for fine detail | |
| Image sequence export | EXR 16-bit for VFX | |
| Shift / low step accelerator | Default | |
| Context size | 150 | Frames per chunk. |
| Context overlap | Default | Lower for high movement. |
| Start frame buffer | 1 | Max 5. Increase if flicker appears at render start. |
| Dynamic reference | On | Disable for very static shots. |

Use cases

  • AI character rendering — preserve actor performance while replacing the visual treatment.
  • Animation restyling — re-render an existing animation in a different style.
  • Relighting — change lighting without changing composition.
  • Look development — explore visual treatments quickly without rebuilding motion.

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| Flicker at render start | Warm-up too short | Increase Start frame buffer toward 5. |
| Output too close to source | Video input strength too high | Lower below 0.5, especially for shots over 80 frames. |
| Lip sync drifts on long shots | Chunking issue | Reduce context overlap; ensure Dynamic reference is on. |
| Style doesn’t apply strongly | Reference too generic / input strength too high | Use a more distinctive reference; lower video input strength. |
More: Troubleshooting.

Mago Character

Type: character replacement. Replaces the character in any scene with a character from a reference image, with or without the reference’s background. The reference doesn’t need to match the source framing — a full-body reference can replace a close-up character.
🧪 Recommended pre-step — Go through Modify Frame first. Generate a reference closely matching the source pose using GPT Image 2 or Nano Banana Pro, then use that frame as the Mago Character reference. Dramatically more controllable than a generic reference.

Settings

| Setting | Default | Notes |
| --- | --- | --- |
| Use Image Background | Off | On = use the reference’s background. Off = preserve the original. |
| Output size | 1280 | Same logic as Transform. |
| Steps | Default | More for sharper output. |
| Interpolation | Off for detail | |
| Pose strength | Default | Lower for exact positioning. Higher for stylistic deviation. |
| Face strength | Default | Lower for precise facial replication. Higher for stylized expressions. |
| Face crop resolution | 512 | Increase to 768/1024 for high-res output or if eyes appear closed. |
| Face crop padding | 0–5 | Increase to 10–20 if facial features are cut off. |
| Masking threshold | 0.3 | Lower (0.1–0.2) if detection fails. Higher (0.4–0.6) if too much is detected. |
| Masking prompt | Descriptive | One or two words: “character”, “person”, “robot”. Multi-subject: “the person in red”. |
| Grow mask | 10–15 | 15–25 for spiky hair, flowing clothes, large accessories, horns. |
| Relight | Off | Improves lighting consistency between reference and source. Values above 1 strengthen it. |
| Prompt influence | Default | |
| Seed | Any | Reproducibility. |
| Shift accelerator | Default | Faster, slight color drift. |
| Context size / overlap / dynamic reference | Defaults | Chunked rendering controls. |

Troubleshooting

| Symptom | Fix |
| --- | --- |
| Eyes appear closed | Increase Face crop resolution to 768/1024. Increase Face crop padding to 10–20. |
| Eyebrows/forehead cut off | Increase Face crop padding. |
| Features (horns, spiky hair) clipped | Increase Grow mask to 15–25. |
| Background detected as the character | Raise Masking threshold to 0.4–0.6. Use a more specific Masking prompt. |
| Character not detected at all | Lower Masking threshold to 0.1–0.2. Use a more general prompt like “person”. |
| Lighting doesn’t match the scene | Enable Relight; raise the value if too weak. |
| Identity drifts across long shots | Reduce Pose strength variation. Use a more distinctive reference. |
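The mask and face-crop fixes above can be kept as a small lookup for a pipeline or checklist. This is a sketch only: the symptom keys, `Adjustment`, and `suggest_adjustment` are hypothetical names, while the setting names and ranges are taken from the tables on this page.

```python
from typing import NamedTuple


class Adjustment(NamedTuple):
    setting: str   # setting name as shown in the Mago Character settings table
    low: float     # lower end of the suggested range
    high: float    # upper end of the suggested range


# Hypothetical symptom keys mapped to the ranges documented above.
MASK_FIXES = {
    "background_detected": Adjustment("Masking threshold", 0.4, 0.6),
    "character_not_detected": Adjustment("Masking threshold", 0.1, 0.2),
    "features_clipped": Adjustment("Grow mask", 15, 25),
    "face_cut_off": Adjustment("Face crop padding", 10, 20),
}


def suggest_adjustment(symptom: str) -> Adjustment:
    """Look up the suggested range for a known symptom key."""
    try:
        return MASK_FIXES[symptom]
    except KeyError:
        raise ValueError(f"unknown symptom: {symptom!r}") from None


fix = suggest_adjustment("character_not_detected")
print(f"{fix.setting}: try {fix.low} to {fix.high}")
```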

Mago Inpaint

Type: precise localized edits. Edits a masked region while leaving the rest intact. Inputs: a source video, a mask, an image reference showing what the masked region should look like, and a descriptive prompt of the post-edit scene.

Settings

| Setting | Notes |
| --- | --- |
| Image reference | Shows the desired result for the masked area. |
| Mask input | From the Mask workspace. |
| Invert mask | Swap masked and unmasked regions. |
| Expand mask | Grow the mask outward. |
| Blur mask | Soften mask edges. |
| Prompt | Descriptive: write the scene as it should look after the edit. |
| Steps | More for sharper output. |
| Level of detail | Controls fine detail in the generation. |
| CFG | Prompt-following strength. |
| Shift | Controls the frame shift amount for temporal alignment. |
| Context size / overlap | Chunked rendering controls. |
| Image sequence export | PNG or EXR. |

Prompting

Example:
  • Instruction (avoid): “Replace the teacup with a water bottle.”
  • Description (use): “A water bottle is on the table.”
Mago models don’t take instructions; describe what the result should look like.

Pixel-perfect preservation

Mago Inpaint (like most video models) doesn’t operate in pure pixel space — unmasked regions can shift slightly (brightness, color, minor texture) due to compression. For pixel-perfect preservation:
  1. Download the mask from the Mask workspace.
  2. Render the Mago Inpaint result.
  3. In Nuke, After Effects, or Fusion, composite the inpainted result onto the original source using the downloaded mask.
This guarantees unmasked pixels are identical to the source. See Export & compositing.
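The compositing step amounts to a per-pixel blend: where the mask is 1 the inpainted pixel is used, and where it is 0 the source pixel passes through unchanged. The sketch below shows that arithmetic on a tiny single-channel frame in plain Python. It is an illustration, not a replacement for Nuke, After Effects, or Fusion; `composite_frame` is a hypothetical name, and frames are assumed to be lists of rows of float values with a mask in the range 0.0 to 1.0.

```python
def composite_frame(source, inpainted, mask):
    """Blend one frame: mask 1.0 takes the inpainted pixel, 0.0 keeps the source pixel."""
    if not (len(source) == len(inpainted) == len(mask)):
        raise ValueError("source, inpainted, and mask must have the same height")
    out = []
    for src_row, inp_row, mask_row in zip(source, inpainted, mask):
        if not (len(src_row) == len(inp_row) == len(mask_row)):
            raise ValueError("rows must have the same width")
        out.append([m * inp + (1.0 - m) * src
                    for src, inp, m in zip(src_row, inp_row, mask_row)])
    return out


source = [[0.1, 0.2], [0.3, 0.4]]
inpainted = [[0.9, 0.8], [0.7, 0.6]]
mask = [[0.0, 1.0], [0.0, 0.5]]  # 0.5 models a softened (blurred) mask edge
print(composite_frame(source, inpainted, mask))
```

Pixels with a mask value of exactly 0 come out identical to the source; pixels on a softened mask edge are a mix of both, so keep the mask hard wherever exact preservation matters.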