02 Creative Field Notes: Nano Banana, a bit of Veo3 and a deep dive into my September process.
The second entry of JDOTF Field Notes: a dance across my creative world, with fresh news, new tools, and a few gems from my prompt archive and process.
September is here. We’re neck-deep in projects, trying to keep up with new tools, and squeezing out the last sip of summer. We know we have to push for a strong Q3, but between Veo3, Nano Banana, and whatever else, it’s hard not to feel distracted – like everything everywhere all at once.
Let’s kick off with the shiny new toy: Nano Banana. If you don’t know, it’s Google’s code name for Gemini 2.5 Flash Image, its new image generation and editing model. Basically, it’s an upgrade specializing in product, character, and image consistency. Imagine throwing in a collage of fashion items or models and just saying, “Swap that color, tweak that detail.” Well, that’s this.
Some say it might eat Photoshop’s lunch. Maybe it will, but Photoshop is still the full toolbox, so I don’t think it’s going to kill Adobe. Creatives have even started building Nano Banana plugins for Photoshop. The hype is real, but at the end of the day it’s just another tool.
Other than going bananas, things have been quite busy around the studio. I snuck into a few fun projects that have let me test out the new tools, and I even managed to cut a spec clip for BRABUS using only Gemini 2.5 and Veo3. New York Fashion Week is coming up, new 2026 plans for JDOTF are on the table, and there may even be a guest intern joining in October. So, Happy September, and enjoy the read.
So, why the cost?
Normally the Field Notes are for paid members only, but this second edition is a sponsored taste of what you get. A mix of creativity and business, unfiltered.
An exclusive look behind the curtain: how I work, what tools I’m using, what I’m reading, watching, and saving. How all of that shapes the work I do. So you can do it too.





Newsfeed
On August 26, 2025, Google launched Gemini 2.5 Flash Image, an advanced AI image generation and editing model that provides precise natural language-based image editing while maintaining consistency in faces and objects. TechCrunch
Meta’s new Advantage+ and Andromeda AI ad tools boosted Q2 2025 ad revenue to $46.6 billion, not far behind Google’s $54.2 billion in search revenue. The platform increased ad conversions by 5–8% and raised average ad prices by 9%, fueling predictions that Meta may overtake Google in U.S. ad revenue as early as 2026. Ainvest
Vogue’s August 2025 issue ran a Guess ad using fully AI-generated models and visuals, sparking widespread debate about authenticity and ethics in creative advertising. Fashion Network
A niche is forming where brands and clients are searching for designers to retouch botched AI material. The work includes logo adjustment, product retouching, and other post-production tasks. NBC
Fresh Prince is Fakin’ it.
Now let’s shift gears to a bit of cultural cringe. Will Smith recently got some flak for allegedly using AI-generated crowds. After painful TikTok clips of Will trying to hype street crowds surfaced, it was hard to believe that people would be holding up signs in a packed stadium. The internet called bullshit.
It was kind of painful to watch, but the real question I want to ask is, does this fake crowd actually make us like the music more? Or is it just strengthening the awkward truth that maybe the vibe is gone? Would we as humans not trust it as much?
What really happened?
Will Smith’s “AI crowd” scandal is less black-and-white than the headlines made it sound. Apparently the team posted a concert video mixing authentic live footage with short clips that were actually AI-animated still photos from real shows.
In other words, the fans do exist and the moments happened, but some shots were turned into moving images with AI tools. The internet ran with the story as proof that Will was faking hype, but in reality it looks more like sloppy editing and bad AI than an attempt to fake fans.
Process & Perspective
In the last week I have focused on testing the power of Nano Banana, choosing to work with a BRABUS 6x6 that I had created for a past project. It was my first time using the tool, so I decided to limit myself to a 72-hour creation framework. The goal was to see what I could create within this timeframe.
The following 30s spec spot was created using the JDOTF Prompt Builder©, Gemini 2.5 (Nano Banana), and Veo 3. I have to say the same rules still apply. These tools are great, but you still need the film editing skills and taste.
Tools: JDOTF Prompt Builder ©, Gemini 2.5 (Nano Banana), Veo3.
Lesson: The tools are powerful, but editing and creative direction still carry the work.
What was my process?
I have to admit there wasn’t a special process behind this work. Other than a bit of variation in how I created the material, it was very similar to the process I use for most of my work. As I created this short clip for the BRABUS 6x6 – Mercedes-AMG G 63, I noted down the major milestones in the process to share:
Find the subject: Finding an idea could be a book of its own, but to start you need a protagonist: the key images you have in your head, the vague message you want to convey.
Find moods / Word cloud / Refine Idea: Next you refine the idea to identify the mood of the world you are building. This should contain not only images but also a curated list of key descriptors, words, and phrases.
Generate First Set: From the moods and terms, prompts are crafted, reference images are tested, and a look is defined.
Define Shotlist / Storyboard: The exploration of the first set allows us to craft our script and expectations according to what is currently possible with AI.
Generate Final Set: Next step, the final assets and frames are generated.
Retouch & Grading: The final stills are then retouched, adding blur effects, grain, and grading to ensure a consistent look.
Generate Video Assets: From the final stills the video assets are then generated.
Edit Rough/Final Cut: The last step of the process is to finalize the cut, adding needed transitions to give it impact.
Any last tip?
Well, first, any Google product is expensive, so be sure you know what you want before you begin generating. Not only does it help make your idea better, it also saves you cash-money. One last thing: be sure to use other tools too. Create an image with Sora or Midjourney, use ChatGPT to describe it, plop that into Gemini, and bring it to life with Veo3. (Don’t stick to one gun.)
To vary your inspiration, consider varying your inputs. Break habits. Look for differences. Notice connections.
– Rick Rubin
New Month. New tools.
The last month was dedicated to prompting and refining my command, but with the latest hype around Gemini 2.5 and Veo3, I feel like I should commit to these tools for a while. The plan for the next 60 days: lean on Veo3, Gemini 2.5, and Leonardo.ai. Nothing else.
Leonardo.ai
An AI image generation platform that allows for high-quality image and video generation. It’s web-based, with tools for training custom models, batch generation, and fine-tuned control over styles and prompts. Users can choose from mainstream models such as ChatGPT, Veo, and Flux, or from Leonardo’s own models.
Pros:
High image quality with maximum control
Fast generation speed
Useful features (custom models, upscaling, background removal)
Access to wide variety of models
Cons:
Limited free tier
Inconsistent results with video characters
Expensive credits
Gemini 2.5
Google’s new update for text-based image generation and editing. It allows natural-language edits to existing images, supports multiple input images, and keeps faces and objects consistent across outputs.
Pros:
Strong consistency across edits and variations
Handles multiple images in one workflow
Low latency and fast generation times
Produces realistic character and product imagery
Cons:
Some outputs look over-processed or synthetic
Lacks basic editing tools (cropping, expanding, upscaling)
Using a single chat too long produces visual errors
Expensive compared to other generators
Veo3
Veo 3 is a video-generating model built by Google DeepMind. It takes text or an image and produces short, high-quality cinematic clips, complete with synchronized audio. This includes dialogue, ambient sounds, and effects.
Pros:
High-quality video compared to earlier models
Syncs lip movement and generates fitting audio
Smooth camera motion and cinematic aesthetic
Cons:
Long render times
Incoherent or off-prompt results are common
Limited video length and control
Expensive credit costs and membership
Mood Dump
I have found that a good mood, found or generated, can really make or break an idea.
Better mood boards = better understanding = better work.
Some Sources (Cause you asked.)
The moment I discovered how much a mood can shape an entire concept, I started obsessively curating sources. Here are some favorites:
Boooooooom: An art blog gone rabbit hole. Expect weird public domain finds, experimental projects, and a steady stream of visual oddness. *Link*
OpenProcessing: A playground for creative coders. Interactive sketches, generative art, and little browser-based experiments. *Link*
Awwwards: The glossy showroom of web design, with monthly recognition of 20+ websites and a strong community. *Link*
Designspiration: Kind of like Pinterest for designers. A fast way to build a moodboard or stumble across new color combinations, type samples, or fresh moods. *Link*
Ignant: Berlin-based culture mag with a good eye for art, design, architecture, and photography. Curated, minimalistic, creative lifestyle. *Link*
Prompt Dump (Nano Banana)
I have collected the words and phrases that get the best results to share with you. This week I tested a few older prompts with new tools. Here are a few prompts for retro-realistic results with Google’s Nano Banana. They make a good base for you to tweak. All from the JDOTF archive.
*Be sure to also check out my Prompt Builder© for more.*
A 16:9 photograph of a unique-featured model with vitiligo skin, standing in a vast desert, wearing an oversized neon green jumpsuit. The image looks like a hyper-realistic photograph, captured on a 1990s Nikon F4 film camera with flash. Dusty horizon, soft golden light, cinematic detail, accidental candid pose.
A 16:9 photograph of a vintage Porsche 911 wrapped tightly in clear plastic, parked inside a spacious empty garage under harsh overhead lights. The image looks like a hyper-realistic photograph, captured on a 1990s Nikon F4 film camera with flash. Glossy reflections on the plastic, concrete floor, moody industrial atmosphere, cinematic detail, accidental candid framing.
A 16:9 photograph, 1990s-style analog flash photo, female runner sprinting down a deserted New York street at night. She is wearing all black, has a t-shirt balaclava around her face, blank clothing, no text, and dark sports glasses hiding her eyes. The perspective exaggerates her stride with fisheye distortion, camera tilted in a strong dutch angle. The photograph is hyper-realistic but grainy, evoking amateur night photography. Harsh flash casts sharp shadows on cracked pavement, neon signs blur in the periphery. Feels urgent, stolen, documentary-like, shot with a GoPro Hero.
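The first two prompts above share a skeleton: format, subject, camera look, then a trail of mood descriptors. If you want to reuse that pattern, a minimal Python sketch could look like the one below. The names `PromptSpec` and `build_prompt` are hypothetical and made up for this illustration; this is not the JDOTF Prompt Builder and not part of any Google tool, just one way to keep the fixed parts fixed while you swap subjects and moods.

```python
from dataclasses import dataclass, field


# Hypothetical helper for illustration only; not the JDOTF Prompt Builder.
@dataclass
class PromptSpec:
    subject: str  # what the photograph shows
    details: list[str] = field(default_factory=list)  # mood and lighting descriptors
    aspect_ratio: str = "16:9"
    camera: str = "1990s Nikon F4 film camera with flash"


def build_prompt(spec: PromptSpec) -> str:
    """Assemble a prompt in the order: format, subject, camera look, details."""
    parts = [
        f"A {spec.aspect_ratio} photograph of {spec.subject}.",
        f"The image looks like a hyper-realistic photograph, captured on a {spec.camera}.",
    ]
    if spec.details:
        # Join the descriptors and capitalize the first one so it reads as a sentence.
        joined = ", ".join(spec.details)
        parts.append(joined[0].upper() + joined[1:] + ".")
    return " ".join(parts)


# Rebuilds the plastic-wrapped Porsche prompt from its pieces.
porsche = PromptSpec(
    subject=(
        "a vintage Porsche 911 wrapped tightly in clear plastic, parked inside "
        "a spacious empty garage under harsh overhead lights"
    ),
    details=[
        "glossy reflections on the plastic",
        "concrete floor",
        "moody industrial atmosphere",
        "cinematic detail",
        "accidental candid framing",
    ],
)
print(build_prompt(porsche))
```

Swap the `subject`, the `details`, or the `camera` string and you get a new variation with the same retro-realistic base. The third prompt breaks the skeleton on purpose, which is a reminder that a template is a starting point, not a rule.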
Thank you once again for your support.
See you next time.
Jake