Can I create a 1-minute ad on my own with zero video editing skills?

How I used AI tools to turn an idea into a cinematic short film in just a weekend.

It all started with this Kalshi AI ad that went viral.

When I saw this AI-generated ad, my first thought was: “Wtf. There’s no way this was solely done using AI.”

Then I saw a lot of people on X sharing their Veo 3 AI-generated clips, and I started to question whether it was possible.

The script

Between dog walks and daily routines, I found myself sketching out ideas for what would become an AI-generated video to announce OpenBB’s position as a counterpoint to legacy workspace infrastructure.

This would be a cinematic short film about the transformation of financial infrastructure and the rise of OpenBB.

I iterated on this script on my own at a very high level, focusing on what I wanted to happen in order to evoke emotion in the audience. To do that, I followed the classic Hero’s Journey framework.

Here’s a brief description of what I ended up with.

  • The ordinary world: The legacy financial office with tired analysts in gray suits. Repetitive, robotic work on outdated CRT terminals. This establishes the mundane, oppressive status quo.

  • The call to adventure (opening): The young analyst pauses and closes their screen. A moment of questioning: “What if we started over?” The frosted-glass door labeled “UNLOCKED” literally represents the call.

  • Crossing the threshold (transition): Walking through the “UNLOCKED” door, a flood of light emerges as they enter the futuristic workspace. It symbolizes leaving the old world behind. Launching OpenBB is the point of no return.

  • Tests and allies: Other analysts join the journey (allies). Each person transforms into their unique self, rejecting uniformity. The collaborative work represents the tests: learning new ways to work. The AI agent appears as a supernatural aid, a mentor figure.

  • The Ordeal (system collapse): The legacy system crashes (“INTERNAL ERROR”). This is the death and destruction of the old world, the moment of greatest crisis for those still trapped.

  • The revelation/reward (digital dissolution): The walls dissolve, revealing the truth. The massive Hub represents the reward, the treasure; “This isn’t an upgrade. This is a migration.” is the wisdom gained.

  • The return: The black void and logo emerge. Sharing the message with the world: “WE DON’T SELL DATA. WE SELL FREEDOM.” and “The financial industry isn’t rebooting.” This knowledge returns to transform society.

Then I had ChatGPT help me decompose this into different scenes.

Also, bear in mind that these AI video models are trained to output 8-second clips, so I aimed to keep each scene, or at least its core, within that limit; it’s difficult to preserve a consistent style across clips longer than 8 seconds.

That said, the “extend” feature on Google’s Flow worked better than I expected. More on this later.

Flow - Veo 3

I first went to Google AI Studio, into the Generate Media section, and tried pasting in one of the scene texts I had.

Although it was free, the output was pretty disappointing. That’s when I realized I was using Veo 2, not Veo 3, the model everyone was raving about on X.

I then tried to upgrade to Veo 3, but in typical Google fashion, I couldn’t. Eventually, I found it on the Google DeepMind platform, which pointed me toward trying it in Flow.

Since I didn’t have the right subscription, I signed up for Google AI Ultra for Business at $125/month (with a 50% discount). That’s estimated to cover about 1,250 clips, which is plenty, especially given the model’s extremely impressive quality and how much more performance careful prompt engineering can squeeze out of it.

Copy-pasting the scenes from my document produced impressive results, but the outputs varied a lot from run to run.

I realized it needed more structure. Several creators had mentioned that these models perform significantly better with JSON-structured inputs than with plain-text prompts.

So, I took the narrative text for each scene and converted it into JSON.

Example:

{
  "scene and action": "A slow dolly shot glides across a rigid financial office. Identical gray cubicles house analysts in matching suits, typing robotically at CRT-style terminals with pixelated dashboards. Close-ups reveal tired faces, blinking cursors, and a mechanical monotony. Voiceover: 'For decades, the tools of finance have remained the same. Expensive. Opaque. Inflexible. You weren’t meant to build with them. You were meant to follow.'",
  "camera angle": "center-aligned symmetry with slow dolly pans",
  "lighting": "harsh fluorescent with blue-gray tint",
  "room": "legacy financial office",
  "ratio": "16:9",
  "character": "analysts in identical gray suits, robotic behavior",
  "voice": "calm and authoritative male voice",
  "furniture": [
    "CRT monitors",
    "repetitive gray cubicles",
    "fluorescent ceiling lights"
  ],
  "action and motion": "minimal movement, robotic typing, blinking screens, fluorescent flickering",
  "keywords": [
    "legacy finance",
    "monotony",
    "rigid systems",
    "pixelated UI",
    "inflexibility"
  ]
}

This level of detail gave us precise control over every aspect of the scene, and then we just had to paste it into Flow and let the model cook 🧑‍🍳.

That improved the quality and reproducibility significantly.

But I knew we could still get more out of it, so I iterated on these prompts with ChatGPT, asking it to add more relevant key-value pairs to the JSON (there’s an illustrative sketch after this list), such as:

  • Scene and action descriptions

  • Camera angles and movement

  • Lighting and color grading

  • Character direction and emotion

  • Environmental details

  • Keywords for style consistency
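
To make that concrete, here’s a sketch of what one of those expanded prompts could look like, for the “call to adventure” scene. The exact keys and values are my illustration, not a schema Flow requires; the model simply tends to honor whatever structured detail it’s given.

{
  "scene and action": "The young analyst pauses mid-keystroke, closes the terminal, and turns toward a frosted-glass door labeled 'UNLOCKED'. Voiceover: 'What if we started over?'",
  "camera angle": "slow push-in from behind the cubicle, ending in a close-up on the analyst's face",
  "camera movement": "steady dolly, no handheld shake",
  "lighting": "harsh fluorescent, with warm light leaking through the frosted door",
  "color grading": "desaturated blue-gray shifting toward warm amber near the door",
  "character": "young analyst in a gray suit",
  "character emotion": "hesitation giving way to quiet resolve",
  "environment": "legacy financial office, rows of identical gray cubicles, CRT monitors",
  "ratio": "16:9",
  "voice": "calm and authoritative male voice",
  "keywords": [
    "call to adventure",
    "threshold moment",
    "cinematic",
    "style consistency"
  ]
}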

This JSON structure forced me to think about every element, which was awesome.

Iteration

At this point, I knew there was something here.

Once again, I was mind-blown by AI and what it could enable.

There I was, someone who had never done any video editing or film concept work, able to create something on my own in very little time.

This was the point at which I asked for help.

While I’m good at the technical side of prompt crafting, my wife has an eye for what actually looks good on screen. She handled all the iteration after that initial work. Flow’s feature of generating four variations per prompt was a game-changer: instead of creating one video and hoping for the best, she could compare options and identify what worked.

She also experimented with Google’s Voiceover AI to generate the narration. This made the video feel far more polished and cinematic than anything we could have recorded ourselves, and it gave us the flexibility to quickly test different tones and deliveries until it clicked.

Another key feature was Flow’s Extend, which allows a scene to build on the previous ones so characters and environments aren’t lost between clips. This was particularly important since we wanted to follow the Hero’s Journey, where the audience needs to develop a relationship and affinity with the hero.

Our marketing lead, Rita, then helped refine the copy to give it the perfect OpenBB style.

Finally, my wife stitched everything together and synced the audio with the video, leading to the final result, which you can see here:

Final thoughts

What started as a weekend curiosity turned into a legitimate short film.

The tools are there, the quality is impressive, and the barrier to entry is lower than ever.

But perhaps most importantly, AI video generation doesn’t replace human creativity; it amplifies it.

This represents more than just a technical experiment.

It’s proof that individual creators can now produce content that would have required significant budgets and teams just a few years ago.

We are definitely in a golden age. Enjoy.