Convert articles to click-driven 16:9 web video presentations - Hermes skill
  • CSS 48.1%
  • TypeScript 32.4%
  • Shell 12%
  • JavaScript 7.3%
  • HTML 0.2%
Find a file
Gordon Luk d40a85fd0f fix: continuous CDP screencast for smooth animation
Complete rewrite of render-video.cjs. Previous approaches (burst+hold,
continuous JPEG screenshots) failed because CDP captureScreenshot tops
out at ~10fps, making animation look jerky.

NEW APPROACH: CDP Page.startScreencast — streams browser compositor
frames at full framerate (~25-40fps) via CDP events, piped directly
to a single ffmpeg process across all 21 steps. No stop/start between
steps — ONE screencast, ONE ffmpeg, ONE CDP session.

Results on 21-step test presentation:
  - 24.7fps continuous capture (vs 10fps before)
  - 100% frame pairs differ at 1s intervals (animation confirmed)
  - 331.8s duration, 9.8MB, H.264+AAC, 1920x1080, 30fps
  - Audio concatenated separately and muxed at the end

Key changes:
- Uses Page.startScreencast + frame events (not captureScreenshot)
- Single continuous capture across all steps (no per-step restart)
- Single ffmpeg process for entire video (no per-segment encoding)
- Audio muxed after capture as separate step (concat + mux)
- Removed per-step CDP session management (state leakage fix)
- No SETTLE_MS, ANIM_WINDOW, CAPTURE_FPS — none needed
2026-05-20 01:07:07 +01:00
references fix: CSS animation capture in render-video.cjs + sync SKILL.md 2026-05-19 22:10:10 +01:00
scripts Initial commit: article-to-webvideo-presentation — English port of ConardLi's web-video-presentation skill 2026-05-15 01:21:39 +01:00
templates fix: continuous CDP screencast for smooth animation 2026-05-20 01:07:07 +01:00
themes Initial commit: article-to-webvideo-presentation — English port of ConardLi's web-video-presentation skill 2026-05-15 01:21:39 +01:00
manifest.json Initial commit: article-to-webvideo-presentation — English port of ConardLi's web-video-presentation skill 2026-05-15 01:21:39 +01:00
README.md Initial commit: article-to-webvideo-presentation — English port of ConardLi's web-video-presentation skill 2026-05-15 01:21:39 +01:00
RESEARCH_REPORT.md Add deep research analysis report from study of ConardLi's web-video-presentation skill 2026-05-15 01:28:05 +01:00
SKILL.md fix: continuous frame capture for full step duration (no hold frames) 2026-05-19 22:49:25 +01:00

Article to Web Video Presentation

A method-driven agent skill for turning scripts and articles into click-driven 16:9 web presentations that can be screen-recorded as cinematic videos.

Back to collection root

Article to Web Video Presentation


What Is This?

article-to-webvideo-presentation helps an agent build a Vite + React + TypeScript presentation that behaves like a video production surface rather than a slide deck. Each click advances one narration beat, each step owns the whole 1920×1080 stage, and the progress UI stays hidden unless hovered so the output is clean for screen recording.

It is designed for:

  • Turning a written article into a Bilibili / YouTube / video-channel narration script
  • Turning an existing voiceover script into a cinematic web presentation
  • Building product demos, tutorials, keynote-style explainers, and visual talks
  • Creating "dynamic PPT, but not PPT" experiences with strong motion and pacing
  • Optionally synthesizing narration audio after the visual outline is approved

The skill is primarily a methodology and collaboration workflow. The scaffold supplies reusable tokens, stage primitives, themes, and examples, but each project should still choose a visual language that fits the topic.


Core Ideas

  • Fixed 16:9 stage — content is authored in a stable 1920×1080 coordinate system and scaled to the viewport.
  • One global step cursor — click or keyboard advances (chapter, step), with the cursor persisted locally.
  • One step, one idea — every beat gets a focused full-screen scene instead of accumulating slide bullets.
  • Script beats drive structure — narration rhythm maps directly to visual steps.
  • Hidden chrome — progress controls are hover-only, keeping recordings clean.
  • Motion first — each scene needs a moving visual anchor; static paragraphs are treated as a smell.
  • Theme tokens — visual decisions flow through semantic tokens so themes can change the whole feel.
  • Hard checkpoints — the agent pauses after script/theme alignment, after outline approval, and before optional audio synthesis.

Workflow

Phase 1.1  Identify input
Phase 1.2  Article -> narration script
   |
Checkpoint A1  Script, theme, and rough asset plan
   |
Phase 1.3  Script + article -> outline.md
   |
Checkpoint A2  Outline approval + development mode
   |
Phase 2    Build the Vite / React / TS presentation
   |
Checkpoint B   Ask whether to synthesize audio
   |
Phase 3    Optional audio synthesis
Phase 4    Recording and post-production

The checkpoints are part of the skill contract: the agent should not silently rush from raw article to finished code. Theme choice influences motion design, and outline approval keeps chapter pacing from drifting.


What It Ships

skills/article-to-webvideo-presentation/
├── SKILL.md
├── README.md
├── references/
│   ├── CHAPTER-CRAFT.md
│   ├── OUTLINE-FORMAT.md
│   ├── SCRIPT-STYLE.md
│   ├── THEMES.md
│   ├── AUDIO.md
│   └── RECORDING.md
├── scripts/
│   └── scaffold.sh
├── templates/
│   ├── index.html
│   ├── vite.config.ts
│   └── src/
└── themes/
    ├── paper-press/
    ├── warm-keynote/
    ├── midnight-press/
    ├── blueprint/
    └── ...

Quick Start

Copy the skill into the directory your agent scans, then ask it to turn a script or article into a web-video presentation.

To scaffold manually from inside a project:

bash skills/article-to-webvideo-presentation/scripts/scaffold.sh ./presentation --theme=paper-press

List available themes:

bash skills/article-to-webvideo-presentation/scripts/scaffold.sh --list-themes

The generated presentation/ project is a normal Vite + React + TypeScript app. Run it like any other Vite project, then record the 16:9 stage with your screen recorder.


Built-In Theme Directions

The skill includes multiple theme families, each with its own visual DNA rather than a simple color swap:

  • paper-press — editorial paper, warm print texture
  • warm-keynote — modern talk / keynote energy
  • midnight-press — dark editorial presentation
  • blueprint — technical drawing / planning surface
  • chalk-garden — classroom / chalkboard style
  • terminal-green — phosphor terminal atmosphere
  • bauhaus-bold — sharp geometric manifesto
  • sunset-zine — indie zine / expressive collage
  • newsroom — newspaper / media desk
  • monochrome-print — restrained typographic print

See THEMES.md for the full token contract and theme guidance.


Reference Map