Teach-one runbook · YouTube pipeline · 2026-06-01

YouTube Shorts Pipeline Runbook

A self-contained recovery document for turning reference material and generated stills into a verified public YouTube Short.

Reference Image to Published Short

This runbook records the first successful "see one, do one, teach one" YouTube Shorts pipeline run. If OpenClaw wakes up without chat context, this page should be enough to recreate the workflow: inspect Christopher's pipeline folder, learn from the TSX files and the example WebM, generate fresh still images, assemble a vertical video, upload to YouTube only with approval, verify processing, and log the result.

Successful reference run:

  • Title: OpenClaw Pipeline Wakes #Shorts
  • Local file: tmp/youtube-pipeline-2026-06-01/openclaw-pipeline-wakes-short.mp4
  • Local verification: 720x1280, 20.000s, 30 fps
  • YouTube processing: succeeded, public, HD

Source folder

Christopher's reference material lives at:

/mnt/shared/MyFiles/Downloads/share/youtube_pipeline

Expected files:

  • Introduction to OpenClaw AI Persona.webm
  • APP_reference_image.tsx
  • APP_multiple_image.tsx
  • APP_storyboard_movie.tsx
  • APP_text_overlay.tsx
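
The file list above can be checked mechanically before any other step. A minimal sketch, assuming a hypothetical helper named missing_pipeline_files that takes the folder path and prints each expected file that is absent:

```shell
# Sketch: print every expected reference file that is missing from a folder.
# missing_pipeline_files is a hypothetical helper name; the file names are
# the five listed in this runbook.
missing_pipeline_files() {
  local dir="$1" name
  for name in \
    "Introduction to OpenClaw AI Persona.webm" \
    "APP_reference_image.tsx" \
    "APP_multiple_image.tsx" \
    "APP_storyboard_movie.tsx" \
    "APP_text_overlay.tsx"
  do
    # Only names that are not regular files in the folder are printed.
    [ -f "$dir/$name" ] || printf '%s\n' "$name"
  done
}
```

Running missing_pipeline_files /mnt/shared/MyFiles/Downloads/share/youtube_pipeline prints nothing when all five files are present.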

What the files teach

  • Reference image: Use the uploaded/reference image to preserve character and style continuity, then generate a fresh scene from a specific prompt.
  • Multiple images: Generate a controlled sequence of fresh stills, not one isolated image. Treat the gallery as the scene inventory.
  • Movie assembly: Each still becomes a timed scene with motion such as zoom-in, zoom-out, pan-left, or pan-right on a 720x1280 canvas.
  • Text overlays: Use short captions, white readable type, optional dark rounded backing, wrapping, and position controls for phone viewing.
  • Example WebM: The model output is a complete vertical video with OpenClaw imagery, baked captions, scene pacing, and upload-ready encoding.
  • Actual upload: The process is not finished at rendering. It ends after YouTube processing and privacy status are verified.

Retained lessons from the older still-shot pipeline

The older still-shot movie pipeline is now folded into this runbook instead of remaining a separate Projects entry point. Keep these lessons:

  • Native ffmpeg is still the default renderer. The useful near-term path is local native rendering, not a full browser-native editor.
  • Scripted composition beats hand-built repetition. Represent Shorts as a small timeline: scene image, duration, motion type, caption, optional audio, and output format.
  • Motion can stay simple. Push-in, pull-back, pan, hold, and reveal movements are enough for early Shorts if the image and caption are strong.
  • Generate both review and platform shapes only when needed. Earlier work proved horizontal 16:9 and vertical 9:16 outputs are possible, but the current YouTube Shorts lane should default to vertical unless Christopher asks for a widescreen review artifact.
  • Inspect renders before publication. Use video-frames, contact sheets, or extracted frames to confirm the video is nonblank, readable, correctly ordered, and not hiding key text under likely Shorts UI.
  • Avoid heavy video infrastructure until friction proves it is needed. Browser preview, WebCodecs, Revideo, MoviePy, and AI image-to-video APIs remain candidates, but the first practical product is reliable storyboarding, motion, captions, voiceover/audio when useful, and repeatable publishing.
  • Uploads stay approval-gated. Public YouTube uploads, metadata changes, comments, and channel operations require Christopher's approval unless a narrower future routine is explicitly defined.
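
One way to hold the "small timeline" from the scripted-composition lesson is a plain text file with one scene per line. The sketch below assumes a hypothetical pipe-delimited layout (image|seconds|motion|caption) and a hypothetical timeline_summary helper; neither comes from the original pipeline files.

```shell
# Sketch: summarize a timeline read from standard input.
# Assumed line format (hypothetical): image|seconds|motion|caption
timeline_summary() {
  LC_ALL=C awk -F'|' '
    # Count only lines that carry all four fields and add up their durations.
    NF >= 4 { scenes++; total += $2 }
    END { printf "%d scenes, %.1f s\n", scenes, total }
  '
}
```

A two-line timeline such as "scene-01.png|4|zoom-in|First, feed the reference." followed by "scene-02.png|4|zoom-out|Then make the variations." piped into timeline_summary reports 2 scenes and 8.0 s.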

Reproduction checklist

  1. Locate and inspect /mnt/shared/MyFiles/Downloads/share/youtube_pipeline.
  2. Probe the example WebM with ffprobe.
  3. Extract a contact sheet so the visual pattern is visible.
  4. Read the TSX files for workflow rules: reference, variations, storyboard, motion, captions, export.
  5. Define a short concept with five scenes and short captions.
  6. Generate five fresh scene images. Do not recycle prior images unless Christopher asks.
  7. Copy the selected stills into a dated scratch folder.
  8. Create caption timing with ASS subtitles or equivalent.
  9. Render a vertical MP4 with local motion, soft fades, baked captions, and silent or voiceover audio.
  10. Verify dimensions, duration, codec, and frame rate with ffprobe.
  11. Create a contact sheet from the finished MP4 and visually inspect it.
  12. Upload to YouTube only when Christopher has asked for upload or approved it.
  13. Verify YouTube processing status, privacy status, duration, definition, ID, and URL.
  14. Log the result in memory/youtube-daily-shorts-log.md.
  15. Report the URL, title, local render path, verification status, and caveats.

Inspection commands

find /mnt/shared/MyFiles/Downloads/share -maxdepth 2 -type d -iname '*youtube*' -print

find /mnt/shared/MyFiles/Downloads/share/youtube_pipeline -maxdepth 1 -type f -printf '%f\n' | sort

ffprobe -v error \
  -select_streams v:0 \
  -show_entries stream=codec_name,width,height,avg_frame_rate,duration \
  -show_entries format=duration \
  -of default=noprint_wrappers=1 \
  "/mnt/shared/MyFiles/Downloads/share/youtube_pipeline/Introduction to OpenClaw AI Persona.webm"

mkdir -p tmp/youtube-pipeline-inspection
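# One 3x3 sheet covers 27 s at one frame per 3 s; -frames:v 1 keeps only the first sheet,
# so a longer example does not try to write several images to one file name.
ffmpeg -y \
  -i "/mnt/shared/MyFiles/Downloads/share/youtube_pipeline/Introduction to OpenClaw AI Persona.webm" \
  -vf "fps=1/3,scale=240:-1,tile=3x3" \
  -frames:v 1 \
  tmp/youtube-pipeline-inspection/example-contact-sheet.jpg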

rg -n "gemini|generate|canvas|MediaRecorder|zoomIn|zoomOut|panRight|panLeft|textOverlay|fontFamily|fps|720|1280|duration|download" \
  /mnt/shared/MyFiles/Downloads/share/youtube_pipeline/*.tsx

Scene plan template

Scene | Job | Caption
1 | Show the reference/style entering the pipeline. | First, feed the reference.
2 | Show generated variations. | Then make the variations.
3 | Show storyboard/timeline assembly. | Arrange the stills into time.
4 | Show editing and baked captions. | Bake in motion and captions.
5 | Show upload and learning signal. | Upload. Watch. Learn.

Prompt pattern

Create a vertical 9:16 cinematic YouTube Shorts still image.
Subject: OpenClaw, a distinctive AI agent persona with subtle claw-like robotic hands, glowing teal eyes, shell-like/mechanical markings, automation symbols, and a mythic but practical workshop presence.
Scene: [specific scene job].
Composition: keep the main face/body in the center-left safe zone; leave the rightmost 20% and bottom 25% visually simple for YouTube UI and captions.
Style: dark cinematic workshop lighting, teal/amber accents, high contrast, detailed but readable at phone size.
Avoid: generic robots, cluttered unreadable text, tiny UI details, text baked into the image, corporate stock-art feeling.
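
Filling the scene slot by hand five times invites drift between prompts. A minimal sketch, assuming a hypothetical helper named build_scene_prompt that takes one scene job (with its own closing period) and prints the full prompt; the wording is copied from the pattern above:

```shell
# Sketch: print the prompt pattern with the scene job filled in.
# build_scene_prompt is a hypothetical helper name.
build_scene_prompt() {
  local job="$1"
  cat <<EOF
Create a vertical 9:16 cinematic YouTube Shorts still image.
Subject: OpenClaw, a distinctive AI agent persona with subtle claw-like robotic hands, glowing teal eyes, shell-like/mechanical markings, automation symbols, and a mythic but practical workshop presence.
Scene: ${job}
Composition: keep the main face/body in the center-left safe zone; leave the rightmost 20% and bottom 25% visually simple for YouTube UI and captions.
Style: dark cinematic workshop lighting, teal/amber accents, high contrast, detailed but readable at phone size.
Avoid: generic robots, cluttered unreadable text, tiny UI details, text baked into the image, corporate stock-art feeling.
EOF
}
```

For example, build_scene_prompt "Show generated variations." yields the same six lines with only the Scene line changed.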

Scratch folder

Use a dated scratch folder. Save or copy chosen stills with deterministic names:

tmp/youtube-pipeline-YYYY-MM-DD/
  scene-01.png
  scene-02.png
  scene-03.png
  scene-04.png
  scene-05.png
  captions.ass
  contact-sheet.jpg
  openclaw-pipeline-short.mp4
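
Copying the chosen stills under deterministic names can be scripted so that argument order decides scene order. This is a sketch with a hypothetical helper name, stage_stills; it assumes the selected images are already PNG files.

```shell
# Sketch: copy stills into a scratch folder as scene-01.png, scene-02.png, ...
# stage_stills is a hypothetical helper; the first argument is the scratch
# folder and the remaining arguments are the source images in scene order.
stage_stills() {
  local workdir="$1" n=0 src
  shift
  mkdir -p "$workdir" || return 1
  for src in "$@"; do
    n=$(( n + 1 ))
    # Zero-padded numbering keeps the files sorted in scene order.
    cp -- "$src" "$(printf '%s/scene-%02d.png' "$workdir" "$n")" || return 1
  done
}
```

A call such as stage_stills "tmp/youtube-pipeline-$(date +%F)" a.png b.png c.png d.png e.png creates the dated folder and fills scene-01.png through scene-05.png.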

Caption pattern

[Script Info]
ScriptType: v4.00+
PlayResX: 720
PlayResY: 1280

[V4+ Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,OutlineColour,BackColour,Bold,Italic,Underline,StrikeOut,ScaleX,ScaleY,Spacing,Angle,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,Encoding
Style: Default,Georgia,44,&H00FFFFFF,&H000000FF,&H9A000000,&HCC000000,-1,0,0,0,100,100,0,0,3,1,0,2,52,52,118,1

[Events]
Format: Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text
Dialogue: 0,0:00:00.00,0:00:04.20,Default,,0,0,0,,First, feed the reference.
Dialogue: 0,0:00:04.00,0:00:08.20,Default,,0,0,0,,Then make the variations.
Dialogue: 0,0:00:08.00,0:00:12.20,Default,,0,0,0,,Arrange the stills into time.
Dialogue: 0,0:00:12.00,0:00:16.20,Default,,0,0,0,,Bake in motion and captions.
Dialogue: 0,0:00:16.00,0:00:20.00,Default,,0,0,0,,Upload. Watch. Learn.
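
The Dialogue lines above follow a fixed rhythm: a caption starts every 4.00 s, stays up for 4.20 s, and the last one is clipped to the end of the video. A sketch that regenerates them from a caption list, using hypothetical helper names (ass_time, ass_dialogues) and integer centiseconds:

```shell
# Sketch: format centiseconds as an ASS timestamp (H:MM:SS.cc).
ass_time() {
  printf '%d:%02d:%02d.%02d' \
    $(( $1 / 360000 )) $(( $1 / 6000 % 60 )) $(( $1 / 100 % 60 )) $(( $1 % 100 ))
}

# Sketch: print one Dialogue line per caption argument.
# step and hold mirror the 4.00 s spacing and 4.20 s display time used above.
ass_dialogues() {
  local step=400 hold=420 i=0 total start end caption
  total=$(( $# * step ))
  for caption in "$@"; do
    start=$(( i * step ))
    end=$(( start + hold ))
    # The final caption must not run past the end of the video.
    [ "$end" -gt "$total" ] && end="$total"
    printf 'Dialogue: 0,%s,%s,Default,,0,0,0,,%s\n' \
      "$(ass_time "$start")" "$(ass_time "$end")" "$caption"
    i=$(( i + 1 ))
  done
}
```

Called with the five captions from the scene plan, ass_dialogues reproduces the five Dialogue lines of the caption pattern; the output can be appended after the [Events] Format line of captions.ass.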

Render command

This is the local equivalent of the TSX canvas exporter: images scaled and cropped to fill the frame (the CSS object-fit: cover behavior), per-scene motion, 30 fps, 720x1280, crossfades, and baked text.
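
The timing values in this section's render command follow from simple arithmetic: each clip lasts 4.4 s, each crossfade lasts 0.4 s, so a new scene starts every 4.0 s. The sketch below uses hypothetical helper names (ms_to_s, xfade_plan) and integer milliseconds to avoid floating point in the shell.

```shell
# Sketch: format milliseconds as seconds with three decimals.
ms_to_s() {
  printf '%d.%03d' $(( $1 / 1000 )) $(( $1 % 1000 ))
}

# Sketch: derive per-clip frame count, xfade offsets, and total length.
# Arguments: scene count, clip length (ms), fade length (ms), frames per second.
xfade_plan() {
  local scenes="$1" clip_ms="$2" fade_ms="$3" fps="$4" k=1
  local step_ms=$(( clip_ms - fade_ms ))
  # Frames per clip is the value used for the zoompan d option.
  echo "frames_per_clip=$(( clip_ms * fps / 1000 ))"
  # Crossfade k begins after k clips, less the k fades already overlapped.
  while [ "$k" -lt "$scenes" ]; do
    echo "offset_${k}=$(ms_to_s $(( k * step_ms )))"
    k=$(( k + 1 ))
  done
  echo "total=$(ms_to_s $(( scenes * clip_ms - (scenes - 1) * fade_ms )))"
}
```

Running xfade_plan 5 4400 400 30 prints frames_per_clip=132, offsets of 4.000, 8.000, 12.000, and 16.000, and total=20.400; the -t 20 output option then trims the final 0.4 s.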

cd /home/augmentedthinker/.openclaw/workspace
WORKDIR="tmp/youtube-pipeline-YYYY-MM-DD"

ffmpeg -y \
  -loop 1 -t 4.4 -i "$WORKDIR/scene-01.png" \
  -loop 1 -t 4.4 -i "$WORKDIR/scene-02.png" \
  -loop 1 -t 4.4 -i "$WORKDIR/scene-03.png" \
  -loop 1 -t 4.4 -i "$WORKDIR/scene-04.png" \
  -loop 1 -t 4.4 -i "$WORKDIR/scene-05.png" \
  -f lavfi -t 20 -i anullsrc=channel_layout=stereo:sample_rate=48000 \
  -filter_complex "\
[0:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1+0.035*on/132':x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v0];\
[1:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1.035-0.035*on/132':x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v1];\
[2:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1+0.025*on/132':x='(iw-iw/zoom)*on/132':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v2];\
[3:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1.03':x='(iw-iw/zoom)*(1-on/132)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v3];\
[4:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1+0.04*on/132':x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v4];\
[v0][v1]xfade=transition=fade:duration=0.4:offset=4[v01];\
[v01][v2]xfade=transition=fade:duration=0.4:offset=8[v012];\
[v012][v3]xfade=transition=fade:duration=0.4:offset=12[v0123];\
[v0123][v4]xfade=transition=fade:duration=0.4:offset=16,subtitles=$WORKDIR/captions.ass[v]" \
  -map "[v]" -map 5:a \
  -t 20 \
  -c:v libx264 -pix_fmt yuv420p -r 30 -crf 18 -preset medium \
  -c:a aac -b:a 128k -movflags +faststart \
  "$WORKDIR/openclaw-pipeline-short.mp4"

Verification

ffprobe -v error \
  -select_streams v:0 \
  -show_entries stream=width,height,avg_frame_rate,codec_name \
  -show_entries format=duration \
  -of default=noprint_wrappers=1 \
  "$WORKDIR/openclaw-pipeline-short.mp4"

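# -frames:v 1 writes a single 5x2 sheet (ten frames, one every 2 s) even if an extra frame is sampled at the end.
ffmpeg -y \
  -i "$WORKDIR/openclaw-pipeline-short.mp4" \
  -vf "fps=1/2,scale=180:-1,tile=5x2" \
  -frames:v 1 \
  "$WORKDIR/contact-sheet.jpg"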

Expected local video shape: h264, 720x1280, 30/1, 20.000000. Inspect the contact sheet before upload, and reject the render if it shows blank scenes, recycled frames, unreadable captions, or hidden focal points.
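
The expected shape can also be checked by script instead of by eye. A minimal sketch, assuming a hypothetical helper named check_short_shape that reads the key=value lines printed by the ffprobe command above from standard input; the 0.05 s duration tolerance is an assumption, not a value from the original run.

```shell
# Sketch: succeed only if ffprobe key=value output matches the expected Short.
# check_short_shape is a hypothetical helper; the tolerance is an assumption.
check_short_shape() {
  LC_ALL=C awk -F= '
    { value[$1] = $2 }
    END {
      drift = value["duration"] - 20
      if (drift < 0) drift = -drift
      ok = value["codec_name"] == "h264" &&
           value["width"] + 0 == 720 &&
           value["height"] + 0 == 1280 &&
           value["avg_frame_rate"] == "30/1" &&
           drift <= 0.05
      # Exit status 0 means every check passed.
      exit !ok
    }
  '
}
```

Piping the ffprobe output into check_short_shape gives a zero exit status for a matching render and a non-zero status otherwise, which makes it usable as a gate before the upload step.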

Upload protocol

YouTube upload is a public action. It is allowed only when Christopher asks for upload or approves it. Credentials stay private:

.secrets/google-oauth-client.json
.secrets/youtube-oauth-token.json

The upload script should read those local secrets, refresh OAuth if needed, upload the verified MP4 through the YouTube Data API, then poll the uploaded video. Use metadata like:

  • Title ending in #Shorts when appropriate.
  • Short description explaining what the video demonstrates.
  • Tags: OpenClaw, AugmentedThinker, AI agents, AI workflow, generative AI, AI video, YouTube Shorts, ffmpeg.
  • categoryId: 28 (Science & Technology).
  • selfDeclaredMadeForKids: false.
  • Privacy status as requested; set it to public only with explicit approval.

After upload, verify status.privacyStatus, processingDetails.processingStatus, contentDetails.duration, contentDetails.definition, video ID, and watch URL. YouTube may report PT21S for a locally verified 20.000s render because of processing/rounding.
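
Because contentDetails.duration arrives as an ISO 8601 string, comparing it with the local duration needs a small conversion. The sketch below uses hypothetical helper names (iso_duration_seconds, duration_matches), handles only the PT#M#S forms a Short can produce, and treats "up to one second longer than local" as acceptable; that tolerance is an assumption based on the PT21S observation above.

```shell
# Sketch: convert PT#M#S, PT#M, or PT#S to whole seconds (hours are not handled).
iso_duration_seconds() {
  local rest="${1#PT}" min=0 sec=0
  case "$rest" in
    *M*) min="${rest%%M*}"; rest="${rest#*M}" ;;
  esac
  case "$rest" in
    *S) sec="${rest%S}" ;;
  esac
  echo $(( min * 60 + sec ))
}

# Sketch: succeed if the reported duration is 0 to 1 s above the local one.
# Arguments: reported ISO duration, local duration in whole seconds.
duration_matches() {
  local reported diff local_s="$2"
  reported="$(iso_duration_seconds "$1")"
  diff=$(( reported - local_s ))
  [ "$diff" -ge 0 ] && [ "$diff" -le 1 ]
}
```

With these helpers, duration_matches PT21S 20 succeeds, matching the reference run, while a clearly truncated or padded upload fails the check.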

Log format

Append a private operational entry to memory/youtube-daily-shorts-log.md:

## 2026-06-01T19:03:48.403Z - pipeline-wakes
- URL: https://youtu.be/BoAXzFXtDnY
- Local video: /home/augmentedthinker/.openclaw/workspace/tmp/youtube-pipeline-2026-06-01/openclaw-pipeline-wakes-short.mp4
- Title: OpenClaw Pipeline Wakes #Shorts
- Privacy: public
- Upload status: processed
- Processing status: succeeded
- Duration: PT21S
- Definition: hd
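
Appending the entry by script keeps the field order stable across runs. A sketch with a hypothetical helper name, append_short_log; it stamps the entry with the current UTC time at one-second resolution, whereas the entry above carries milliseconds.

```shell
# Sketch: append one entry in the log format shown above.
# append_short_log is a hypothetical helper. Arguments: log path, slug, URL,
# local video path, title, privacy, upload status, processing status,
# duration, definition.
append_short_log() {
  local log="$1"
  shift
  mkdir -p "$(dirname "$log")" || return 1
  {
    printf '\n## %s - %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1"
    # One "- key: value" line per field, in the same order as the example.
    printf '%s\n' \
      "- URL: $2" \
      "- Local video: $3" \
      "- Title: $4" \
      "- Privacy: $5" \
      "- Upload status: $6" \
      "- Processing status: $7" \
      "- Duration: $8" \
      "- Definition: $9"
  } >> "$log"
}
```

The first argument would normally be memory/youtube-daily-shorts-log.md; existing entries are left untouched because the helper only appends.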

Interruption recovery

Image generation can be interrupted by routing handoffs. Do not restart blindly. Check disk, count completed images, continue only from the missing scene, and do not claim completion until render, verification, upload, processing check, and logging are all done.

find /home/augmentedthinker/.openclaw/agents/main/agent/codex-home/generated_images -type f -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort | tail -30
find tmp/youtube-pipeline-YYYY-MM-DD -maxdepth 1 -type f -printf '%f\n' | sort
git status --short
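
To continue only from the missing scene, the scratch folder can be scanned for the first absent or empty still. A minimal sketch, assuming a hypothetical helper named first_missing_scene and the scene-NN.png naming from the scratch folder section:

```shell
# Sketch: print the number of the first missing or empty scene still.
# first_missing_scene is a hypothetical helper. Arguments: scratch folder,
# scene count (defaults to 5). Returns non-zero when every scene exists.
first_missing_scene() {
  local workdir="$1" count="${2:-5}" n=1 file
  while [ "$n" -le "$count" ]; do
    file="$(printf '%s/scene-%02d.png' "$workdir" "$n")"
    # -s also catches zero-byte files left behind by an interrupted write.
    if [ ! -s "$file" ]; then
      echo "$n"
      return 0
    fi
    n=$(( n + 1 ))
  done
  return 1
}
```

If the helper prints 3, scenes 1 and 2 are already on disk and generation should resume at scene 3; a non-zero exit with no output means all stills are present and the next step is rendering.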

Recall rule

If Christopher says "use the June 1 YouTube pipeline runbook," future OpenClaw should read this page, inspect the current source material, generate fresh images from the current concept, render a short vertical video with motion and baked captions, verify it locally, upload only with explicit approval, verify YouTube processing, log the result, and report the URL. The goal is not just to make a video. The goal is to make a loop that remembers how it learned.