Editing Vertical Dramas: Pacing and Cut-Rhythm
An editor with a background in hour-long television drama sat down with episode one and delivered a cut running three minutes forty seconds. The script was written for 90 seconds. She had held every reaction, softened every cut, and let the performance breathe exactly the way her training had taught her.
The director watched it once and said: cut it in half.
She cut it in half and it was still too slow.
Editing vertical drama requires unlearning most of what long-form editing trains editors to do. The instincts toward breathing room, toward holding a performance, toward transitions that smooth the cut, are not just wrong for this format. They actively undermine the commercial mechanics the format depends on. An episode that feels comfortable in the edit suite converts at 2% at the paywall. An episode that feels slightly aggressive to the editor converts at 10%.
This is the complete guide to editing vertical drama: the structural framework, the cut-rhythm decisions, the cliffhanger mechanics, and the specific calibration that separates an edit that works on a phone from one that works in an edit suite.
Why Long-Form Editing Instincts Are Wrong for This Format
Long-form editing is trained on a premise of earned patience. The viewer has committed to an hour or more. They expect setup, atmosphere, character establishment, and breathing room between emotional beats. The editor's job is to serve that expectation without losing the story.
Vertical drama operates on a different premise entirely. The editorial tempo of the format is relentless: given the pace of these shows and the style the format calls for, editors cut very frequently. (WebCraft)
The viewer has given 90 seconds conditionally. Every second that does not advance the story or deliver an emotional beat is a second where the viewer's thumb considers swiping. The editor's job is not to serve patience. It is to eliminate every moment that does not earn its place in 90 seconds of conditional attention.
The specific instincts that editors have to override:
Holding reactions. Long-form editing uses extended reaction shots to let the audience process emotional information. A held reaction in vertical drama consumes 3 to 5 seconds of a 90-second episode and delivers diminishing emotional return after the first second. Cut the reaction at its peak, not after it.
Atmospheric establishment. Opening a scene with an establishing shot or environmental beat before the characters speak is conventional in long-form. In vertical drama, the episode opens in the conflict. There is no time for establishment. If the scene location is not immediately clear from the first frame of dialogue, it becomes clear from context within two cuts.
Transition management. Dissolves, crossfades, and transition effects that smooth scene changes in long-form content distract and delay in a 90-second episode. The vertical drama edit is cuts only. Hard cuts. Every time.
The Episode Structure from the Editing Chair
Every episode of vertical drama follows the same four-part timestamp skeleton. The hook from 0 to 15 seconds opens in the conflict. The escalation from 15 to 60 seconds delivers one forward move. The spike from 60 to 80 seconds builds to the tension peak. The button from 80 to 90 seconds ends before the tension releases. (WebCraft)
The editor's job is to protect this structure against everything the footage does not provide and eliminate everything the footage provides that does not serve it.
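The four-part skeleton can be written down as a small lookup table, which makes it easy to ask which section any timestamp in a cut belongs to. This is only a sketch: the names `Section`, `SKELETON`, and `section_at` are hypothetical, and the boundaries are the ones given above for a 90-second episode.

```python
from typing import NamedTuple


class Section(NamedTuple):
    name: str
    start: float  # seconds, inclusive
    end: float    # seconds, exclusive


# Boundaries taken from the four-part skeleton for a 90-second episode.
SKELETON = [
    Section("hook", 0, 15),
    Section("escalation", 15, 60),
    Section("spike", 60, 80),
    Section("button", 80, 90),
]

# Length of each section in seconds.
SECTION_LENGTHS = {s.name: s.end - s.start for s in SKELETON}


def section_at(t: float) -> str:
    """Return the name of the section that contains timestamp t (seconds)."""
    for s in SKELETON:
        if s.start <= t < s.end:
            return s.name
    raise ValueError(f"{t}s falls outside the 90-second container")
```

For example, `section_at(62)` returns `"spike"`, and `SECTION_LENGTHS["escalation"]` is the 45 seconds the middle section has to carry its single forward move.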
The Hook: 0 to 15 Seconds
The hook has to be in the footage. If the hook is not in the first 15 seconds of the rough cut, it is either in the wrong place in the assembly or it does not exist in the episode. The editor cannot create a hook from footage that does not contain one.
What the editor can do: find the earliest moment in the footage where the conflict is present and established, cut everything before it, and open the episode there. The script may have written a pre-conflict beat before the hook arrives. Cut that beat. The episode opens where the conflict is visible, not where the scene begins.
The cut that starts the episode is the most important cut in the episode. It sets the register, establishes the stakes, and determines whether the viewer's attention is engaged in the first three seconds. If the first frame is not charged, the episode is fighting its own opening.
The Escalation: 15 to 60 Seconds
The middle 45 seconds of the episode carry one forward move. The editor's job here is to find the single clearest path through the scene's dramatic material and cut everything that is not on that path.
The most common editorial failure in the middle section: including two forward moves instead of one. Two revelations, two confrontations, two decisions. The episode feels busy rather than escalating, because the viewer cannot hold two simultaneous tension axes across 45 seconds without losing narrative clarity.
Find the single most important forward move in the scene and build the 45 seconds around it. Everything else gets cut, regardless of how well-performed it is.
The cut rhythm in the escalation section runs faster than the hook section. The hook needs enough time for the viewer to register the conflict. The escalation is driving toward the spike and can afford to compress.
The Spike: 60 to 80 Seconds
The spike is the setup for the cliffhanger. The scene reaches its highest tension point: the confrontation peaks, the revelation is one beat away from landing, the decision is about to be made.
Editors cut very frequently at this stage, so coverage is shot with cameras ready to reframe quickly and pick up a reaction or a key prop moment, giving the editor as many options to cut to as possible. (Medium)
The spike section cuts are the tightest in the episode. Action cuts, emotional peak cuts, forward motion at maximum compression. The viewer should feel the episode accelerating toward something.
What the editor has to protect against in the spike: cutting away from the tension peak before it reaches maximum. A director or actor note that asks to "give the moment more room" in the spike section is asking the editor to dissipate the pressure the episode has been building. Hold the acceleration. Let the spike run to its highest point.
The Button: 80 to 90 Seconds
The episode ends before the tension releases. This is the single most important editorial decision in the episode and the one most frequently compromised by editors trained in long-form.
The button cut is the moment immediately before the resolution: before the confrontation's final word, before the revelation's consequence, before the decision's outcome. The episode cuts there. Not after. There.
How to Tame a Silver Fox works precisely because every episode ends at exactly the right moment: just before the kiss, just before the confrontation, just as the secret is about to surface. That precision is not accidental. It is scripted to the second. (Medium)
In the edit, the button is often two to four seconds earlier than the editor's first instinct. The long-form trained instinct is to deliver a moment of completion before the cut. The vertical drama button cuts at the setup for the completion, not at the completion itself.
Test the button by watching the episode end on a phone. Does stopping feel uncomfortable? Does the viewer's next action feel like it should be the unlock button? If the episode end feels complete, it is cutting too late.
Cut-Rhythm: What It Is and How to Calibrate It
Rhythm and pacing define a successful edit: they shape time and sound to manage the overall narrative arc. The decision of when to cut, and more importantly when to hold a shot, determines the emotional resonance of a scene. (Ambitions AI)
In vertical drama, the cut-rhythm is a specific pattern rather than a continuous flow. The episode does not cut at a uniform pace throughout. It has three distinct rhythm modes that map onto the four sections of the episode structure, with the spike and the button sharing the fastest mode.
Establishing rhythm (hook): Slightly longer shot durations, 2 to 4 seconds per cut, enough for the viewer to orient to the conflict and the characters. The rhythm here is deliberate without being slow.
Escalation rhythm (middle): Shorter shot durations, 1 to 2 seconds per cut, forward-driving, each cut advancing the single forward move of the episode. No shot held past its emotional peak.
Peak rhythm (spike and button): Shortest shot durations, sometimes under 1 second at the highest tension moments, with cuts on action rather than after action. The rhythm here is the most aggressive in the episode and should feel slightly too fast to an editor who has not calibrated to the format.
The pattern of these three modes within a single 90-second episode is the fundamental cut-rhythm of vertical drama. An episode that uses establishing rhythm throughout feels slow. An episode that uses peak rhythm throughout feels chaotic and exhausting. The pattern is the calibration.
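One way to check a cut against this pattern is to list its cut timestamps and flag any shot that runs longer than the ceiling for its rhythm mode. The sketch below is an illustration under stated assumptions, not a tool the guide prescribes: the names `MAX_SHOT`, `SECTIONS`, `mode_at`, and `flag_long_shots` are hypothetical, and the 1.0-second ceiling for peak rhythm is a placeholder, since the text only says peak shots are "sometimes under 1 second".

```python
# Longest acceptable shot, in seconds, for each rhythm mode.
# Establishing and escalation ceilings come from the ranges above.
# The peak ceiling is an ASSUMPTION; adjust it to taste.
MAX_SHOT = {
    "establishing": 4.0,
    "escalation": 2.0,
    "peak": 1.0,
}

# (start, end, mode) for a 90-second episode.
SECTIONS = [
    (0.0, 15.0, "establishing"),  # hook
    (15.0, 60.0, "escalation"),   # middle
    (60.0, 90.0, "peak"),         # spike and button
]


def mode_at(t):
    """Return the rhythm mode in force at timestamp t (seconds)."""
    for start, end, mode in SECTIONS:
        if start <= t < end:
            return mode
    raise ValueError(f"{t}s falls outside the 90-second container")


def flag_long_shots(cut_times, duration=90.0):
    """Return (shot_start, shot_length, mode) for every shot held too long.

    cut_times is a sorted list of cut timestamps in seconds. A shot is
    judged by the mode in force at the moment it starts.
    """
    edges = [0.0, *cut_times, duration]
    flagged = []
    for start, end in zip(edges, edges[1:]):
        mode = mode_at(start)
        length = end - start
        if length > MAX_SHOT[mode]:
            flagged.append((start, round(length, 2), mode))
    return flagged
```

Calling `flag_long_shots([3.0, 8.0, 15.0], duration=16.0)` flags the two hook shots that run five and seven seconds and passes the others. A clean report does not mean the rhythm is right; the phone test remains the reference.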
The Cliffhanger: Editorial Mechanics
The cliffhanger is not an event that happens at the end of the episode. It is an editorial construction built from the footage's highest tension moment and the cut that prevents its resolution.
Four cliffhanger types have distinct editorial requirements:
The partial revelation cut. The viewer sees enough to understand what is being revealed but the episode cuts before the full consequence lands. The editor has to find the exact frame where the revelation is established but the response has not yet arrived. One frame too early and the revelation is unclear. One frame too late and the tension is released.
The reversal setup cut. The power dynamic is about to flip. The episode cuts at the last moment before the flip happens. The editor finds the frame where the setup is complete and the reversal is inevitable but has not yet occurred. The viewer's next action has to feel like the only way to see the flip land.
The intrusion cut. A new element enters the scene that reframes everything. The episode cuts immediately after the intrusion is established, before any response to it occurs. The cut should feel like the scene has been interrupted, because it has.
The deadline cut. A timer, ultimatum, or consequence is established in the final seconds. The episode cuts at maximum deadline pressure. The viewer's next action is the only way to know whether the deadline is met.
All four cliffhanger types share the same editorial principle: the cut comes at maximum unresolved tension, not after any form of release. The editor's job is to find that exact frame and cut there, regardless of whether the footage continues past it.
What to Cut and What to Keep
The vertical drama editing room decision framework is simple in principle and difficult in practice: keep everything that advances the episode's single forward move, cut everything that does not.
Cut these without negotiation:
Pre-conflict establishment beats. Walking into a room, sitting down, preparing to speak. The scene starts in the conflict.
Post-peak reactions held past their first beat. The reaction registers in one second. Everything after that is release.
Dialogue that explains rather than advances. A character explaining what the viewer already knows from the scene is consuming seconds that could advance the conflict.
Transition shots between locations. A vertical drama episode does not have time for transition coverage. Cut directly to the next location.
Any shot that is visually interesting but emotionally static. Beautiful composition does not earn its place if it does not carry the episode forward.
Keep these even when the episode feels short:
The full delivery of any line that carries the episode's central emotional weight. Cutting into a performance at its most important line to save two seconds is not efficient editing. It is destroying the material the episode is built around.
The reaction at its peak. One beat of reaction, held to its highest emotional point, then cut. Not held past it.
The button at its maximum tension frame. The episode ends where it ends. Do not extend the episode to reach a more comfortable ending point.
Editing for Episode Volume
A vertical drama series runs 50 to 90 episodes. The editing pipeline for that volume requires a different operational approach than feature or television editing.
The most effective pipeline for vertical drama episode volume:
Build a template assembly from the script's timestamp structure before the footage is reviewed. Episode structure: hook, escalation, spike, button. Lock the episode duration at 75 to 90 seconds before the assembly begins. Work within the container rather than assembling and then trimming.
Review footage against the structure rather than chronologically. Find the hook first, then the button, then the escalation material that connects them. Chronological review on a 90-second episode generates a 3-minute assembly that has to be cut back by 50%. Template assembly generates a 90-second first cut that requires refinement rather than reduction.
Process episodes in batches by scene location. Episodes shot in the same location share the same color profile, audio environment, and lighting continuity. Processing them as a batch catches continuity errors that would be missed if episodes were processed individually in chronological order.
Establish a device test protocol as part of the episode approval process, not as a final delivery check. Watch every approved episode on a phone in a lit room before it is locked. The monitor and the phone are different delivery environments.
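The batching step in the pipeline above amounts to grouping episodes by where they were shot. A minimal sketch follows, assuming each episode has exactly one primary location; the function name, the dictionary fields, and the sample locations are all hypothetical.

```python
from collections import defaultdict


def batch_by_location(episodes):
    """Group episode numbers by location, in episode order within each batch."""
    batches = defaultdict(list)
    for ep in sorted(episodes, key=lambda e: e["number"]):
        batches[ep["location"]].append(ep["number"])
    return dict(batches)


# Illustrative data only; a real series would load this from its shot log.
episodes = [
    {"number": 1, "location": "penthouse"},
    {"number": 2, "location": "office"},
    {"number": 3, "location": "penthouse"},
    {"number": 4, "location": "office"},
    {"number": 5, "location": "hospital"},
]

batches = batch_by_location(episodes)
```

Here `batches["penthouse"]` holds episodes 1 and 3, so both are graded and checked for continuity in the same session. An episode that spans several locations would need a list of locations per episode, which this sketch does not handle.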
Axis AI Studios Perspective
The editing room is the last creative decision in the production. Everything the script, the performance, and the cinematography built toward either survives the edit or does not.
The specific failure that produces the most commercial damage in vertical drama post-production is an editor who applies long-form instincts to a short-form format. Not because the editor is incompetent. Because they are competent at the wrong discipline. An editor who delivers beautiful, carefully calibrated long-form cuts is producing the wrong product for a format that requires aggressive, commercially precise 90-second units.
The calibration is learnable. Editors who have cut one or two vertical drama series have made the adjustment. The discomfort of cutting faster than their instincts recommend is the recalibration process. An episode that feels slightly too aggressive to the editor is probably correct for the format. An episode that feels comfortable is probably too slow for the viewer on a phone.
At Axis AI Studios, the edit is reviewed on device as a standard workflow step, not as a final delivery check. The monitor is a tool for color and technical review. The phone is the reference for pacing and rhythm. Those two instruments are calibrated to different standards, and the one that matters for commercial performance is the one the viewer is holding.
For productions looking for an editing pipeline built around the format's actual commercial requirements, reach out at business@axisaistudios.com.
Common Editing Mistakes in Vertical Drama
Cutting at the Comfortable Moment
The comfortable cut is the cut that comes after the emotional beat completes, after the reaction lands, after the line delivers its full weight. In vertical drama, the comfortable cut is consistently two to four seconds too late. Cut at the peak, not after it.
Holding Establishing Shots
An establishing shot in a 90-second episode consumes between 2% and 5% of the episode runtime, roughly 2 to 4.5 seconds, to deliver information the dialogue will convey in the first line of the scene. Cut the establishing shot.
Building to the Cliffhanger Instead of Cutting to It
The cliffhanger is not a sequence that builds to itself. It is a cut. The episode is in the spike, the tension is at maximum, and the episode cuts. If the editor is building a sequence to arrive at the cliffhanger moment, they are adding time the episode does not have.
Treating Every Episode Equally
Not every episode in a 70-episode series carries the same structural weight. Hook episodes, midpoint reversals, and paywall episodes require the tightest editing and the most precise cliffhanger execution. Connective tissue episodes between structural markers can afford slightly softer buttons. The editor should know which type each episode is before cutting it.
Approving on the Monitor
The episode that clears the monitor does not necessarily clear the phone. The two environments render differently. Phone approval is not optional.
FAQ
What Is the Correct Episode Length for a Vertical Drama Edit?
60 to 90 seconds is the industry standard, with 75 seconds as the most common duration for well-performing series. Under 60 seconds feels truncated and does not give the escalation section enough room to establish the forward move. Over 90 seconds loses the format's rhythm and requires the editor to justify every second past the standard duration. If an episode runs to 110 seconds, the most likely cause is a middle section that contains two forward moves instead of one.
How Many Cuts Per Episode Is Normal in Vertical Drama?
It varies by episode type and genre, but a standard 75-second vertical drama episode typically contains 15 to 25 cuts. Spike sections may contain shots shorter than one second. Hook sections typically run 2 to 4 seconds per shot. The average shot length across a well-edited episode runs approximately 3 to 5 seconds, significantly faster than long-form television drama at 8 to 12 seconds average.
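The relationship between cut count and average shot length in this answer is simple arithmetic, sketched below. The function name is hypothetical, and the only assumption is that N cuts divide an episode into N + 1 shots.

```python
def average_shot_length(duration_s, cuts):
    """Average shot length in seconds for an episode with the given cut count."""
    if cuts < 0:
        raise ValueError("cut count cannot be negative")
    # N cuts split the runtime into N + 1 shots.
    return duration_s / (cuts + 1)


densest = average_shot_length(75, 25)   # about 2.9 seconds per shot
sparsest = average_shot_length(75, 15)  # about 4.7 seconds per shot
```

Running the same function on a finished cut is a quick sanity check before the phone test: a 75-second episode with only eight cuts averages more than eight seconds per shot, which is long-form territory.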
How Do You Know If the Button Is in the Right Place?
Watch the episode end on a phone. If the ending feels complete, the button is in the wrong place. If stopping feels genuinely uncomfortable, the button is correct. The calibration test is viewer discomfort at the cut point: the viewer should feel that they are being interrupted at the worst possible moment, because they are. That discomfort is the conversion mechanism.
Further Reading
For the script structure that determines what footage the editor has to work with in each episode, the script structure guide for vertical dramas covers the four-part episode framework that editing is completing.
For the full post-production pipeline that the edit sits inside, the vertical drama post-production guide covers sound design, color grading, VFX, and delivery specifications calibrated for phone playback.
For the paywall conversion mechanics that the editing decisions in this guide directly affect, the guide to why some vertical dramas convert at 12% and others at 2% covers what drives the difference between high and low conversion rates.

