Reference Image to Published Short
This runbook records the first successful "see one, do one, teach one" YouTube Shorts pipeline run. If OpenClaw wakes up without chat context, this page should be enough to recreate the workflow: inspect Christopher's pipeline folder, learn from the TSX and WebM example, generate fresh still images, assemble a vertical video, upload to YouTube, verify processing, and log the result.
Successful reference run: OpenClaw Pipeline Wakes #Shorts. Local file: tmp/youtube-pipeline-2026-06-01/openclaw-pipeline-wakes-short.mp4. Local verification: 720x1280, 20.000s, 30 fps. YouTube processing: succeeded, public, HD.
Source folder
Christopher's reference material lives at:
/mnt/shared/MyFiles/Downloads/share/youtube_pipeline
Expected files:
- Introduction to OpenClaw AI Persona.webm
- APP_reference_image.tsx
- APP_multiple_image.tsx
- APP_storyboard_movie.tsx
- APP_text_overlay.tsx
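Before relying on the folder, it may help to confirm that all five expected files are present. The following is a minimal sketch; the function name check_pipeline_files is hypothetical and was not part of the original run.

```shell
# Sketch: report which expected reference files are missing from the source folder.
# check_pipeline_files is a hypothetical helper name.
# Usage: check_pipeline_files [folder]   (defaults to Christopher's pipeline folder)
check_pipeline_files() {
  cpf_dir="${1:-/mnt/shared/MyFiles/Downloads/share/youtube_pipeline}"
  cpf_missing=0
  for cpf_name in \
    "Introduction to OpenClaw AI Persona.webm" \
    "APP_reference_image.tsx" \
    "APP_multiple_image.tsx" \
    "APP_storyboard_movie.tsx" \
    "APP_text_overlay.tsx"
  do
    if [ ! -f "$cpf_dir/$cpf_name" ]; then
      printf 'missing: %s\n' "$cpf_name"
      cpf_missing=1
    fi
  done
  # Exit status 0 means every expected file was found.
  return "$cpf_missing"
}
```

A non-zero status is a signal to stop and ask Christopher about the folder rather than guess at substitutes.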
What the files teach
Retained lessons from the older still-shot pipeline
The older still-shot movie pipeline is now folded into this runbook instead of remaining a separate Projects entry point. Keep these lessons:
- Native ffmpeg is still the default renderer. The useful near-term path is local native rendering, not a full browser-native editor.
- Scripted composition beats hand-built repetition. Represent Shorts as a small timeline: scene image, duration, motion type, caption, optional audio, and output format.
- Motion can stay simple. Push-in, pull-back, pan, hold, and reveal movements are enough for early Shorts if the image and caption are strong.
- Generate both review and platform shapes only when needed. Earlier work proved horizontal 16:9 and vertical 9:16 outputs are possible, but the current YouTube Shorts lane should default to vertical unless Christopher asks for a widescreen review artifact.
- Inspect renders before publication. Use video-frames, contact sheets, or extracted frames to confirm the video is nonblank, readable, correctly ordered, and not hiding key text under likely Shorts UI.
- Avoid heavy video infrastructure until friction proves it is needed. Browser preview, WebCodecs, Revideo, MoviePy, and AI image-to-video APIs remain candidates, but the first practical product is reliable storyboarding, motion, captions, voiceover/audio when useful, and repeatable publishing.
- Uploads stay approval-gated. Public YouTube uploads, metadata changes, comments, and channel operations require Christopher's approval unless a narrower future routine is explicitly defined.
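One way to picture the "small timeline" idea is a plain text table with one scene per line. The sketch below assumes a hypothetical pipe-delimited layout (image|seconds|motion|caption) and a hypothetical helper, timeline_summary; neither comes from the original pipeline. The 4-second values are nominal per-scene durations, and the audio and output-format fields are omitted for brevity.

```shell
# Hypothetical timeline: image|seconds|motion|caption, one scene per line.
# A pipe delimiter is used because captions may contain commas.
TIMELINE='scene-01.png|4|push-in|First, feed the reference.
scene-02.png|4|pull-back|Then make the variations.
scene-03.png|4|pan|Arrange the stills into time.
scene-04.png|4|pan|Bake in motion and captions.
scene-05.png|4|push-in|Upload. Watch. Learn.'

# Print one line per scene, then the total running time in seconds.
# Usage: printf '%s\n' "$TIMELINE" | timeline_summary
timeline_summary() {
  awk -F'|' '
    NF >= 4 { total += $2; printf "%d %s %ss %s\n", NR, $1, $2, $3 }
    END     { printf "total %ss\n", total }
  '
}
```

Keeping the timeline as data means captions, durations, and motion choices can change without hand-editing the render command.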
Reproduction checklist
- Locate and inspect /mnt/shared/MyFiles/Downloads/share/youtube_pipeline.
- Probe the example WebM with ffprobe.
- Extract a contact sheet so the visual pattern is visible.
- Read the TSX files for workflow rules: reference, variations, storyboard, motion, captions, export.
- Define a short concept with five scenes and short captions.
- Generate five fresh scene images. Do not recycle prior images unless Christopher asks.
- Copy the selected stills into a dated scratch folder.
- Create caption timing with ASS subtitles or equivalent.
- Render a vertical MP4 with local motion, soft fades, baked captions, and silent or voiceover audio.
- Verify dimensions, duration, codec, and frame rate with ffprobe.
- Create a contact sheet from the finished MP4 and visually inspect it.
- Upload to YouTube only when Christopher has asked for upload or approved it.
- Verify YouTube processing status, privacy status, duration, definition, ID, and URL.
- Log the result in memory/youtube-daily-shorts-log.md.
- Report the URL, title, local render path, verification status, and caveats.
Inspection commands
find /mnt/shared/MyFiles/Downloads/share -maxdepth 2 -type d -iname '*youtube*' -print
find /mnt/shared/MyFiles/Downloads/share/youtube_pipeline -maxdepth 1 -type f -printf '%f\n' | sort
ffprobe -v error \
-select_streams v:0 \
-show_entries stream=codec_name,width,height,avg_frame_rate,duration \
-show_entries format=duration \
-of default=noprint_wrappers=1 \
"/mnt/shared/MyFiles/Downloads/share/youtube_pipeline/Introduction to OpenClaw AI Persona.webm"
mkdir -p tmp/youtube-pipeline-inspection
ffmpeg -y \
-i "/mnt/shared/MyFiles/Downloads/share/youtube_pipeline/Introduction to OpenClaw AI Persona.webm" \
-vf "fps=1/3,scale=240:-1,tile=3x3" \
-frames:v 1 \
tmp/youtube-pipeline-inspection/example-contact-sheet.jpg
rg -n "gemini|generate|canvas|MediaRecorder|zoomIn|zoomOut|panRight|panLeft|textOverlay|fontFamily|fps|720|1280|duration|download" \
/mnt/shared/MyFiles/Downloads/share/youtube_pipeline/*.tsx
Scene plan template
| Scene | Job | Caption |
|---|---|---|
| 1 | Show the reference/style entering the pipeline. | First, feed the reference. |
| 2 | Show generated variations. | Then make the variations. |
| 3 | Show storyboard/timeline assembly. | Arrange the stills into time. |
| 4 | Show editing and baked captions. | Bake in motion and captions. |
| 5 | Show upload and learning signal. | Upload. Watch. Learn. |
Prompt pattern
Create a vertical 9:16 cinematic YouTube Shorts still image.
Subject: OpenClaw, a distinctive AI agent persona with subtle claw-like robotic hands, glowing teal eyes, shell-like/mechanical markings, automation symbols, and a mythic but practical workshop presence.
Scene: [specific scene job].
Composition: keep the main face/body in the center-left safe zone; leave the rightmost 20% and bottom 25% visually simple for YouTube UI and captions.
Style: dark cinematic workshop lighting, teal/amber accents, high contrast, detailed but readable at phone size.
Avoid: generic robots, cluttered unreadable text, tiny UI details, text baked into the image, corporate stock-art feeling.
Scratch folder
Use a dated scratch folder. Save or copy chosen stills with deterministic names:
tmp/youtube-pipeline-YYYY-MM-DD/
scene-01.png
scene-02.png
scene-03.png
scene-04.png
scene-05.png
captions.ass
contact-sheet.jpg
openclaw-pipeline-short.mp4
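Staging the chosen stills under deterministic names could be scripted roughly as follows. This is a sketch only: stage_stills is a hypothetical helper, and it copies files as-is, so it assumes the chosen stills are already PNG.

```shell
# Sketch: copy chosen stills into the scratch folder as scene-01.png, scene-02.png, ...
# stage_stills is a hypothetical helper; files are copied unchanged (no format conversion).
# Usage: stage_stills "tmp/youtube-pipeline-$(date +%F)" first.png second.png ...
stage_stills() {
  ss_dir="$1"
  shift
  mkdir -p "$ss_dir" || return 1
  ss_n=1
  for ss_src in "$@"; do
    # Arguments are numbered in the order given, zero-padded to two digits.
    cp -- "$ss_src" "$ss_dir/$(printf 'scene-%02d.png' "$ss_n")" || return 1
    ss_n=$((ss_n + 1))
  done
}
```

Passing the stills in scene order keeps the storyboard order and the file names in agreement.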
Caption pattern
[Script Info]
ScriptType: v4.00+
PlayResX: 720
PlayResY: 1280
[V4+ Styles]
Format: Name,Fontname,Fontsize,PrimaryColour,SecondaryColour,OutlineColour,BackColour,Bold,Italic,Underline,StrikeOut,ScaleX,ScaleY,Spacing,Angle,BorderStyle,Outline,Shadow,Alignment,MarginL,MarginR,MarginV,Encoding
Style: Default,Georgia,44,&H00FFFFFF,&H000000FF,&H9A000000,&HCC000000,-1,0,0,0,100,100,0,0,3,1,0,2,52,52,118,1
[Events]
Format: Layer,Start,End,Style,Name,MarginL,MarginR,MarginV,Effect,Text
Dialogue: 0,0:00:00.00,0:00:04.20,Default,,0,0,0,,First, feed the reference.
Dialogue: 0,0:00:04.00,0:00:08.20,Default,,0,0,0,,Then make the variations.
Dialogue: 0,0:00:08.00,0:00:12.20,Default,,0,0,0,,Arrange the stills into time.
Dialogue: 0,0:00:12.00,0:00:16.20,Default,,0,0,0,,Bake in motion and captions.
Dialogue: 0,0:00:16.00,0:00:20.00,Default,,0,0,0,,Upload. Watch. Learn.
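The Dialogue lines above follow a regular pattern that can be generated rather than typed. The sketch below infers the rule from the sample (each caption starts every 4 seconds, stays up for 4.2 seconds, and the last one is clamped to the 20-second total); ass_events is a hypothetical helper, and the three timing values are assumptions taken from this one example.

```shell
# Sketch: turn captions (one per line on stdin) into ASS Dialogue lines.
# ass_events is a hypothetical helper. Timing is inferred from the sample above:
# start every 4 s, hold 4.2 s, clamp the final end time to 20 s.
# Usage: printf '%s\n' "First, feed the reference." "Then make the variations." | ass_events
ass_events() {
  awk -v step=4 -v hold=4.2 -v total=20 '
    # Format seconds as H:MM:SS.cc, the timestamp form used by ASS events.
    function ts(t,   h, m, s) {
      h = int(t / 3600)
      m = int((t - h * 3600) / 60)
      s = t - h * 3600 - m * 60
      return sprintf("%d:%02d:%05.2f", h, m, s)
    }
    {
      start = (NR - 1) * step
      end = start + hold
      if (end > total) end = total
      printf "Dialogue: 0,%s,%s,Default,,0,0,0,,%s\n", ts(start), ts(end), $0
    }
  '
}
```

The output can be appended under the [Events] Format line of captions.ass.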
Render command
This is the local equivalent of the TSX canvas exporter: object-cover images, per-scene motion, 30 fps, 720x1280, crossfades, and baked text.
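The timing numbers in the command that follows are linked. Treating each scene as a nominal 4.4-second clip with 0.4-second fades, d=132 is 4.4 seconds at 30 fps, each xfade offset is a multiple of 4.4 minus 0.4, and the nominal total is 20.4 seconds, which -t 20 caps at 20. The sketch below recomputes those values so they can be re-derived if the scene count or durations change; xfade_plan is a hypothetical helper.

```shell
# Sketch: derive zoompan frame count, xfade offsets, and nominal total duration.
# xfade_plan is a hypothetical helper.
# Usage: xfade_plan SCENES CLIP_SECONDS FADE_SECONDS FPS
xfade_plan() {
  awk -v n="$1" -v clip="$2" -v fade="$3" -v fps="$4" 'BEGIN {
    # Frames per scene, rounded to the nearest whole frame (the zoompan d value).
    printf "frames_per_clip=%d\n", clip * fps + 0.5
    # The k-th crossfade starts after k clips minus k overlapping fades.
    for (k = 1; k < n; k++) printf "offset_%d=%g\n", k, k * (clip - fade)
    # n clips laid end to end, minus the time shared by the n-1 crossfades.
    printf "total=%g\n", n * clip - (n - 1) * fade
  }'
}
```

Running xfade_plan 5 4.4 0.4 30 reproduces the values used in the reference render: 132 frames per clip, offsets 4, 8, 12, and 16, and a nominal total of 20.4 seconds.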
cd /home/augmentedthinker/.openclaw/workspace
WORKDIR="tmp/youtube-pipeline-YYYY-MM-DD"
ffmpeg -y \
-loop 1 -t 4.4 -i "$WORKDIR/scene-01.png" \
-loop 1 -t 4.4 -i "$WORKDIR/scene-02.png" \
-loop 1 -t 4.4 -i "$WORKDIR/scene-03.png" \
-loop 1 -t 4.4 -i "$WORKDIR/scene-04.png" \
-loop 1 -t 4.4 -i "$WORKDIR/scene-05.png" \
-f lavfi -t 20 -i anullsrc=channel_layout=stereo:sample_rate=48000 \
-filter_complex "\
[0:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1+0.035*on/132':x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v0];\
[1:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1.035-0.035*on/132':x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v1];\
[2:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1+0.025*on/132':x='(iw-iw/zoom)*on/132':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v2];\
[3:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1.03':x='(iw-iw/zoom)*(1-on/132)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v3];\
[4:v]scale=720:1280:force_original_aspect_ratio=increase,crop=720:1280,setsar=1,zoompan=z='1+0.04*on/132':x='iw/2-(iw/zoom/2)':y='ih/2-(ih/zoom/2)':d=132:s=720x1280:fps=30[v4];\
[v0][v1]xfade=transition=fade:duration=0.4:offset=4[v01];\
[v01][v2]xfade=transition=fade:duration=0.4:offset=8[v012];\
[v012][v3]xfade=transition=fade:duration=0.4:offset=12[v0123];\
[v0123][v4]xfade=transition=fade:duration=0.4:offset=16,subtitles=$WORKDIR/captions.ass[v]" \
-map "[v]" -map 5:a \
-t 20 \
-c:v libx264 -pix_fmt yuv420p -r 30 -crf 18 -preset medium \
-c:a aac -b:a 128k -movflags +faststart \
"$WORKDIR/openclaw-pipeline-short.mp4"
Verification
ffprobe -v error \
-select_streams v:0 \
-show_entries stream=width,height,avg_frame_rate,codec_name \
-show_entries format=duration \
-of default=noprint_wrappers=1 \
"$WORKDIR/openclaw-pipeline-short.mp4"
ffmpeg -y \
-i "$WORKDIR/openclaw-pipeline-short.mp4" \
-vf "fps=1/2,scale=180:-1,tile=5x2" \
-frames:v 1 \
"$WORKDIR/contact-sheet.jpg"
Expected local video shape: h264, 720x1280, 30/1, 20.000000. Inspect the contact sheet before upload and confirm there are no blank scenes, recycled frames, unreadable captions, or hidden focal points.
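The expected shape can also be checked mechanically by piping the ffprobe key=value output into a small filter. This is a sketch: check_video_shape is a hypothetical helper, and the 0.05-second duration tolerance is an assumption, not a value from the original run.

```shell
# Sketch: validate ffprobe "key=value" lines (from -of default=noprint_wrappers=1) on stdin.
# check_video_shape is a hypothetical helper; the 0.05 s tolerance is an assumption.
# Usage: ffprobe ... "$WORKDIR/openclaw-pipeline-short.mp4" | check_video_shape
check_video_shape() {
  awk -F= '
    NF == 2 { seen[$1] = $2 }
    END {
      bad = 0
      if (seen["codec_name"] != "h264")     { print "codec_name: got " seen["codec_name"]; bad = 1 }
      if (seen["width"] + 0 != 720)         { print "width: got " seen["width"]; bad = 1 }
      if (seen["height"] + 0 != 1280)       { print "height: got " seen["height"]; bad = 1 }
      if (seen["avg_frame_rate"] != "30/1") { print "avg_frame_rate: got " seen["avg_frame_rate"]; bad = 1 }
      drift = seen["duration"] - 20
      if (drift < 0) drift = -drift
      if (drift > 0.05)                     { print "duration: got " seen["duration"]; bad = 1 }
      # Exit status 0 only when every field matched the expected shape.
      exit bad
    }
  '
}
```

This check covers only the numeric shape. The contact sheet still needs a visual pass.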
Upload protocol
YouTube upload is a public action. It is allowed only when Christopher asks for upload or approves it. Credentials stay private:
.secrets/google-oauth-client.json
.secrets/youtube-oauth-token.json
The upload script should read those local secrets, refresh OAuth if needed, upload the verified MP4 through the YouTube Data API, then poll the uploaded video. Use metadata like:
- Title ending in #Shorts when appropriate.
- Short description explaining what the video demonstrates.
- Tags: OpenClaw, AugmentedThinker, AI agents, AI workflow, generative AI, AI video, YouTube Shorts, ffmpeg.
- categoryId: 28.
- selfDeclaredMadeForKids: false.
- The requested privacy status: usually public, and public only with explicit approval.
After upload, verify status.privacyStatus, processingDetails.processingStatus, contentDetails.duration, contentDetails.definition, video ID, and watch URL. YouTube may report PT21S for a locally verified 20.000s render because of processing/rounding.
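Comparing the reported contentDetails.duration with the local render needs the ISO 8601 value converted to seconds first. The sketch below handles only the hour, minute, and second fields that a Short would plausibly use; both helper names are hypothetical, and the default one-second tolerance is an assumption based on the PT21S example.

```shell
# Sketch: convert an ISO 8601 duration such as PT21S or PT1M5S to whole seconds.
# iso8601_seconds and duration_close are hypothetical helpers.
# Only H, M, and S fields after "PT" are handled; day fields are not.
iso8601_seconds() {
  printf '%s\n' "$1" | awk '
    {
      s = $0
      sub(/^PT/, "", s)
      total = 0
      # Consume one "<number><unit>" group at a time from the front of the string.
      while (match(s, /^[0-9]+[HMS]/)) {
        n = substr(s, 1, RLENGTH - 1) + 0
        u = substr(s, RLENGTH, 1)
        total += n * (u == "H" ? 3600 : (u == "M" ? 60 : 1))
        s = substr(s, RLENGTH + 1)
      }
      print total
    }
  '
}

# Succeed when the YouTube duration is within a tolerance of the local duration.
# Usage: duration_close PT21S 20.000 [tolerance_seconds]
duration_close() {
  awk -v yt="$(iso8601_seconds "$1")" -v loc="$2" -v tol="${3:-1}" 'BEGIN {
    d = yt - loc
    if (d < 0) d = -d
    if (d <= tol) exit 0
    exit 1
  }'
}
```

With these helpers, duration_close PT21S 20.000 succeeds, which matches the rounding behavior seen in the reference run.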
Log format
Append a private operational entry to memory/youtube-daily-shorts-log.md:
## 2026-06-01T19:03:48.403Z - pipeline-wakes
- URL: https://youtu.be/BoAXzFXtDnY
- Local video: /home/augmentedthinker/.openclaw/workspace/tmp/youtube-pipeline-2026-06-01/openclaw-pipeline-wakes-short.mp4
- Title: OpenClaw Pipeline Wakes #Shorts
- Privacy: public
- Upload status: processed
- Processing status: succeeded
- Duration: PT21S
- Definition: hd
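An entry in this shape could be appended with a small helper. This is a sketch: append_short_log is a hypothetical function, the argument order is an assumption, the timestamp uses second precision rather than the milliseconds shown above, and the log's parent folder is assumed to exist.

```shell
# Sketch: append one log entry in the format shown above.
# append_short_log is a hypothetical helper; the timestamp is UTC with second precision.
# Usage: append_short_log LOG_FILE SLUG URL LOCAL_VIDEO TITLE PRIVACY UPLOAD_STATUS PROCESSING_STATUS DURATION DEFINITION
append_short_log() {
  asl_file="$1"
  asl_slug="$2"
  shift 2
  {
    printf '## %s - %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$asl_slug"
    printf '%s\n' \
      "- URL: $1" \
      "- Local video: $2" \
      "- Title: $3" \
      "- Privacy: $4" \
      "- Upload status: $5" \
      "- Processing status: $6" \
      "- Duration: $7" \
      "- Definition: $8"
  } >> "$asl_file"
}
```

Appending rather than rewriting keeps earlier entries intact if a run is interrupted partway through logging.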
Interruption recovery
Image generation can be interrupted by routing handoffs. Do not restart blindly. Check disk, count completed images, continue only from the missing scene, and do not claim completion until render, verification, upload, processing check, and logging are all done.
find /home/augmentedthinker/.openclaw/agents/main/agent/codex-home/generated_images -type f -printf '%TY-%Tm-%Td %TH:%TM %p\n' | sort | tail -30
find tmp/youtube-pipeline-YYYY-MM-DD -maxdepth 1 -type f -printf '%f\n' | sort
git status --short
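To continue only from the missing scene, the scratch folder can be scanned for the first absent still. The sketch below is one possible approach; first_missing_scene is a hypothetical helper, and treating an empty file as missing is an assumption about how an interrupted write might look on disk.

```shell
# Sketch: print the first scene still that is absent (or empty) in the scratch folder.
# first_missing_scene is a hypothetical helper. It prints nothing when all five exist.
# Usage: first_missing_scene tmp/youtube-pipeline-YYYY-MM-DD
first_missing_scene() {
  fms_dir="$1"
  for fms_i in 01 02 03 04 05; do
    # -s is true only for files that exist and are non-empty.
    if [ ! -s "$fms_dir/scene-$fms_i.png" ]; then
      printf '%s\n' "scene-$fms_i.png"
      return 0
    fi
  done
  return 0
}
```

Empty output means image generation is complete and the run can move on to captions and rendering. It does not mean the whole pipeline is done.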
Recall rule
If Christopher says "use the June 1 YouTube pipeline runbook," future OpenClaw should read this page, inspect the current source material, generate fresh images from the current concept, render a short vertical video with motion and baked captions, verify it locally, upload only with explicit approval, verify YouTube processing, log the result, and report the URL. The goal is not just to make a video. The goal is to make a loop that remembers how it learned.