Mago Video Models

← Model catalog · Closed-source video models →

Mago video models are built on Mago’s own architecture, leveraging open-source foundations. They run on Mago infrastructure and can be deployed in specific territories or in a client’s own environment under Enterprise terms. All Mago video models share these properties:

Frame-perfect — N input frames produce N output frames, each corresponding to a specific source frame.
Descriptive prompts — describe the desired result; do not write instructions. (Why?)
Settings-rich — more controls than closed-source alternatives. Harder to learn, more powerful once learned.
Available in Relaxed mode — unmetered usage subject to plan limits.
Confidential — not used to train models. Project data stays private.

Model	Type	Use it when…
Mago Transform	Full transformation	You want to substantially change a scene
Mago Style Transfer	Conforming stylization	You must preserve performance / lip sync while restyling
Mago Character	Character replacement	You’re swapping the character in a shot
Mago Inpaint	Localized edits	You’re editing one masked region

Mago Transform

Type: full transformation. Use when the goal is to substantially change a scene — turn a city into a forest, day into night with new geography, an actor into a creature. Driven mostly by an initial keyframe or reference frame, with the prompt playing a major role.

ControlNets for Mago Transform

ControlNets are conditioning signals that constrain how far the model can deviate from the source. They’re one of the most important Mago Transform settings.

ControlNet	Use when
Depth	You want to preserve the original composition and volumes as closely as possible. Hardest to transform under — the model is locked to the original 3D structure.
SoftEdge	You want more interpretive freedom, especially for scenes with movement. The model can reshape outlines.
Pose	You want to track a humanoid character’s pose closely. For human-centric shots.
Depth + Pose	You want both character tracking and environmental fidelity. Most controlled, hardest to transform under.

💡 ControlNet decision — Start with Depth if the camera is static and structure matters. Switch to SoftEdge for creative freedom or significant camera movement. Use Pose for character-focused shots.

Key settings

Setting	Recommended	Notes
ControlNet resolution	Default	Higher = more detail, but can inject artifacts.
ControlNet strength	Default to start	Higher = closer to source. Lower for more transformation.
Use input as control map	Off	Enable only when the source video is a depth/pose map.
Output size	1280 (sweet spot)	Max 1620. Higher costs more, may not improve quality.
Steps	Default	More = sharper. Lower for rougher styles.
Interpolation	Off for detail-critical	On for cheaper, faster renders.
Image sequence export	EXR 16-bit for VFX	PNG otherwise.
Prompt influence	Default	How closely the model follows the prompt.
Color consistency	Default	Reduces color drift across frames.
Seed	Any	Same seed + settings reproduces output.
Shift / low step accelerator	Default	Faster, slight color drift.
Context size	150	Frames per chunk.
Context overlap	Default	Lower for high movement, higher for static.
Start frame buffer	Default	Warm-up frames at render start.
Dynamic reference	On	Disable only for very static shots.

Prompting

Mago Transform expects descriptive prompts. Describe what the output looks like, not what to do.

Example ❌ “Turn this scene into a wooden castle.” ✅ “A medieval wooden castle at dusk with overcast sky, rough-hewn timber walls, slate roof, torches mounted on the gatehouse.”

Mago Style Transfer

Type: stylization that closely conforms to the source. Excels at preserving micro-expressions, lip sync, and fine motion while changing visual style. The strongest fit for AI rendering of acted footage where performance must survive the restyle. Also good for relighting and transformations that stay close to the original geometry.

Key settings

Setting	Default	Notes
Output size	1280	Same logic as Transform; max 1920.
Steps	Default	More for sharper output.
Video input strength	0.5	Below 0.5 often gives better results for shots over 80 frames. Higher locks closer to the source.
Interpolation	Off for fine detail	—
Image sequence export	EXR 16-bit for VFX	—
Shift / low step accelerator	Default	—
Context size	150	Frames per chunk.
Context overlap	Default	Lower for high movement.
Start frame buffer	1	Max 5. Increase if flicker appears at render start.
Dynamic reference	On	Disable for very static shots.

Use cases

AI character rendering — preserve actor performance while replacing the visual treatment.
Animation restyling — re-render an existing animation in a different style.
Relighting — change lighting without changing composition.
Look development — explore visual treatments quickly without rebuilding motion.

Troubleshooting

Symptom	Likely cause	Fix
Flicker at render start	Warm-up too short	Increase Start frame buffer toward 5.
Output too close to source	Video input strength too high	Lower below 0.5, especially for shots over 80 frames.
Lip sync drifts on long shots	Chunking issue	Reduce context overlap; ensure Dynamic reference is on.
Style doesn’t apply strongly	Reference too generic / input strength too high	Use a more distinctive reference; lower video input strength.

More: Troubleshooting.

Mago Character

Type: character replacement. Replaces the character in any scene with a character from a reference image, with or without the reference’s background. The reference doesn’t need to match the source framing — a full-body reference can replace a close-up character.

🧪 Recommended pre-step — Go through Modify Frame first. Generate a reference closely matching the source pose using GPT Image 2 or Nano Banana Pro, then use that frame as the Mago Character reference. Dramatically more controllable than a generic reference.

Settings

Setting	Default	Notes
Use Image Background	Off	On = use the reference’s background. Off = preserve the original.
Output size	1280	Same logic as Transform.
Steps	Default	More for sharper output.
Interpolation	Off for detail	—
Pose strength	Default	Lower for exact positioning. Higher for stylistic deviation.
Face strength	Default	Lower for precise facial replication. Higher for stylized expressions.
Face crop resolution	512	Increase to 768/1024 for high-res output or if eyes appear closed.
Face crop padding	0–5	Increase to 10–20 if facial features are cut off.
Masking threshold	0.3	Lower (0.1–0.2) if detection fails. Higher (0.4–0.6) if too much is detected.
Masking prompt	Descriptive	One or two words: “character”, “person”, “robot”. Multi-subject: “the person in red”.
Grow mask	10–15	15–25 for spiky hair, flowing clothes, large accessories, horns.
Relight	Off	Improves lighting consistency between reference and source. Values above 1 strengthen it.
Prompt influence	Default	—
Seed	Any	Reproducibility.
Shift accelerator	Default	Faster, slight color drift.
Context size / overlap / dynamic reference	Defaults	Chunked rendering controls.

Troubleshooting

Symptom	Fix
Eyes appear closed	Increase Face crop resolution to 768/1024. Increase Face crop padding to 10–20.
Eyebrows/forehead cut off	Increase Face crop padding.
Features (horns, spiky hair) clipped	Increase Grow mask to 15–25.
Background detected as the character	Raise Masking threshold to 0.4–0.6. Use a more specific Masking prompt.
Character not detected at all	Lower Masking threshold to 0.1–0.2. Use a more general prompt like “person”.
Lighting doesn’t match the scene	Enable Relight; raise the value if too weak.
Identity drifts across long shots	Reduce Pose strength variation. Use a more distinctive reference.

Mago Inpaint

Type: precise localized edits. Edits a masked region while leaving the rest intact. Inputs: a source video, a mask, an image reference showing what the masked region should look like, and a descriptive prompt of the post-edit scene.

Settings

Setting	Notes
Image reference	Shows the desired result for the masked area.
Mask input	From the Mask workspace.
Invert mask	Swap masked and unmasked regions.
Expand mask	Grow the mask outward.
Blur mask	Soften mask edges.
Prompt	Descriptive — write the scene as it should look after the edit.
Steps	More for sharper output.
Level of detail	Controls fine detail in the generation.
CFG	Prompt-following strength.
Shift	Controls the frame shift amount for temporal alignment.
Context size / overlap	Chunked rendering controls.
Image sequence export	PNG or EXR.

Prompting

Example ❌ “Replace the teacup with a water bottle.” ✅ “A water bottle is on the table.” Mago models don’t take instructions — describe what the result should look like.

Pixel-perfect preservation

Mago Inpaint (like most video models) doesn’t operate in pure pixel space — unmasked regions can shift slightly (brightness, color, minor texture) due to compression. For pixel-perfect preservation:

Download the mask from the Mask workspace.
Render the Mago Inpaint result.
In Nuke, After Effects, or Fusion, composite the inpainted result onto the original source using the downloaded mask.

This guarantees unmasked pixels are identical to the source. See Export & compositing.

← Model catalog · Closed-source video models →

​Mago Video Models

​Mago Transform

​ControlNets for Mago Transform

​Key settings

​Prompting

​Mago Style Transfer

​Key settings

​Use cases

​Troubleshooting

​Mago Character

​Settings

​Troubleshooting

​Mago Inpaint

​Settings

​Prompting

​Pixel-perfect preservation

Mago Video Models

Mago Transform

ControlNets for Mago Transform

Key settings

Prompting

Mago Style Transfer

Key settings

Use cases

Troubleshooting

Mago Character

Settings

Troubleshooting

Mago Inpaint

Settings

Prompting

Pixel-perfect preservation