Sora vs Veo vs Kling for narrative

Sora 2.0, Veo 3.1, and Kling are the three models I actually use in 2026 production. Not the three I read about. The three I open every working day. Each one has one clear strength and at least one clear weakness, and the trick of directing with them is knowing which is which before you spend a credit.

Sora 2.0, for camera moves and cinematic composition

Sora is the model that understands lenses. Ask it for a 35mm anamorphic dolly in on a face and it gives you a 35mm anamorphic dolly in on a face. Ask it for a Steadicam track at hip height through a hallway and it gives you that. The camera grammar is in the training. It feels like talking to a competent DP who has shot more than they have read.

What Sora is still weak at is character consistency across shots. Generate the same actor in shot one and shot four and the face drifts. Hair color drifts. Wardrobe drifts. It is the right tool for the establishing shot, the insert, the empty room, the cutaway. It is the wrong tool, on its own, for a full scene with the same human in it more than twice.

I use Sora for tone reels, for location proxies, for camera pre-viz of a move I want to negotiate with the real camera operator the morning of the shoot. I do not use Sora to lock a performance.

Veo 3.1, for character continuity

Veo is the inverse. Composition is competent but not specialized. Camera moves are simpler. But character lock is real. Train Veo on a reference of a face and it returns that face across twenty shots without you having to fight for it. Wardrobe holds. Eye line holds. The hair gets close enough that a colorist can finish the rest in post.

This is the model I use for any narrative pre-viz where the same character has to recur. Neverenders pre-viz lives almost entirely in Veo for this reason. The script has a small cast. The scenes loop back to the same faces over and over. Veo holds the faces. Sora would not.

Veo is also stronger at dialogue lip sync. Not good enough to use as the final track, never as the final track, but good enough to read in an editorial assembly. I cut a rough scene in Veo, then I redo every shot for camera and texture in Sora, then I composite. That is the working pattern.

Kling, for stylized motion

Kling is the third tool and the most specific. It does motion that the other two will not do. Hair physics. Cloth in wind. Crowd dynamics. The painterly version of a frame, not the photographic one. It is also weak on text, weak on hands, and the camera language is closer to an animator than a director.

I reach for Kling on music videos. On commercial work it appears less often. On Neverenders pre-viz it shows up in exactly two sequences, both of them dream logic, neither of them literal. If I want a frame that looks like the inside of someone's head rather than the inside of someone's living room, Kling is the model.

The hybrid pipeline is the truth

Nobody who works at the directing level is using one model. Anyone who tells you they are is selling something. The real pipeline in my work goes like this. Veo for the character reference and the rough scene assembly because the faces have to hold. Sora for the establishing geography and the camera move passes because the camera grammar is in the model. Kling for the dream sequences and the textural inserts because nothing else does that motion. Then a final composite pass where the three tracks live next to each other in the timeline and the cut decides which model wins which shot.
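Written down as a first-pass rule, that routing might look like the sketch below. The Shot fields and the pick_model function are names invented here for illustration, not part of any model's API, and the rule only picks the starting model. The cut still decides which take wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shot:
    """One entry in a shot list. All field names are hypothetical."""
    shot_id: str
    recurring_character: bool = False  # the same face has to hold across shots
    stylized_motion: bool = False      # dream logic, hair, cloth, crowds


def pick_model(shot: Shot) -> str:
    """Return the model to try first for a shot.

    Assumed priority: stylized motion goes to Kling, a recurring face
    goes to Veo, and everything else (establishing geography, camera
    move passes) goes to Sora.
    """
    if shot.stylized_motion:
        return "kling"
    if shot.recurring_character:
        return "veo"
    return "sora"


shot_list = [
    Shot("1A", recurring_character=True),
    Shot("1B"),
    Shot("7C", stylized_motion=True),
]
for shot in shot_list:
    print(shot.shot_id, pick_model(shot))
```

Putting stylized motion ahead of the character check is a judgment call in this sketch: a dream sequence that also features a recurring face starts in Kling here, and a different order may suit a different script.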

The cost of this pipeline in 2026 is roughly four hundred dollars a day in inference credits for an active pre-viz week on a feature. The cost of the equivalent storyboard and animatic process in 2018 was forty thousand dollars and six weeks. The numbers are not subtle.
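To make the comparison concrete, here is the arithmetic as a small sketch. The two dollar figures and the six weeks come from the paragraph above. Treating every calendar day of those six weeks as an active generation day is an assumption made here to get a deliberately high ceiling, not a claim about any real schedule.

```python
DAILY_INFERENCE_USD = 400       # 2026 figure quoted above
LEGACY_TOTAL_USD = 40_000       # 2018 storyboard and animatic figure quoted above
LEGACY_WEEKS = 6                # 2018 schedule quoted above

# Days of active generation you could buy for the old budget.
break_even_days = LEGACY_TOTAL_USD // DAILY_INFERENCE_USD

# Assumption: all seven days of all six weeks are active days.
# This overstates the spend on purpose, to give an upper bound.
ceiling_days = LEGACY_WEEKS * 7
ceiling_spend_usd = ceiling_days * DAILY_INFERENCE_USD

print(f"break-even after {break_even_days} active days")
print(f"six weeks at full burn: ${ceiling_spend_usd:,}")
```

Under those assumptions the old budget covers 100 active days, and six weeks of nonstop generation comes to 16,800 dollars, well under half of the 2018 figure.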

The other piece that matters is the seed discipline. Every shot I generate is logged with a seed, a prompt, and a model version. If a frame works, I can rebuild it next week from the metadata. If a frame does not work, I can show the producer the exact reason and ask for a small variation. The hidden craft of an AI director, in 2026, is record keeping. The same record keeping that a 1970s camera assistant did with camera reports and a slate, applied to a model output instead of a roll of stock.
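A minimal sketch of that kind of log, assuming one JSON line per generated shot in a plain text file. The seed, prompt, and model version fields come from the paragraph above. The shot_id and keep fields, the function names, and the file layout are illustrative assumptions, not a description of the actual tracker.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ShotRecord:
    """Metadata needed to regenerate one shot."""
    shot_id: str          # hypothetical label, e.g. scene and setup
    model: str            # "sora", "veo", or "kling"
    model_version: str
    seed: int
    prompt: str
    keep: bool = False    # hypothetical flag: did the frame work?


def to_line(record: ShotRecord) -> str:
    """Serialize a record as one JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


def from_line(line: str) -> ShotRecord:
    """Rebuild a record from a line written by to_line."""
    return ShotRecord(**json.loads(line))


def append_record(path: str, record: ShotRecord) -> None:
    """Append one record to the log file, creating it if needed."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(to_line(record) + "\n")


def load_records(path: str) -> list[ShotRecord]:
    """Read every non-empty line of the log back into records."""
    with open(path, encoding="utf-8") as log:
        return [from_line(line) for line in log if line.strip()]
```

An append-only text file is chosen here because it diffs cleanly and survives a crash mid-session. A spreadsheet or a database would hold the same fields equally well.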

How Amos uses these in Neverenders

The Neverenders pre-viz cycle, on the latest internal pass, ran Veo for sixty percent of the shots, Sora for thirty, Kling for ten. That ratio is going to shift as the script tightens. The A24 development team has seen the Veo character lock. The shot list now has the faces baked in. The camera pre-viz in Sora is now the document we use to talk to the prospective DP.

The campaign tracker for all of this is at amosleblanc.com/ai. Amos LeBlanc, occasionally spelled that way in older bylines, has been working in this hybrid pipeline since the back half of 2024 when the models first crossed the threshold of usable. That threshold is well behind us now. The work in 2026 is no longer about whether the tools work. It is about which tool does which shot. That is the new conversation. That is the only conversation worth having.