A batch content generation tool that uses NotebookLM to automatically create podcasts, slides, reports, quizzes, and more from YouTube videos and website URLs.
Batch-generates content from YouTube videos, web pages, and other sources using NotebookLM. Write a YAML instruction file, run it, and walk away.
- Automatically creates a notebook per source → generates content → downloads → deletes the notebook
- NotebookLM is used as a generation engine; no notebooks are left behind on the service
- Output files are the source of truth: if a file exists locally, it is skipped (supports rerun and resume)
- Auto-resume: re-running the same YAML resumes from where it left off
- Idempotent: same source → same output directory and filename (stable hashing)
The .claude/skills/notebooklm-batch/ directory includes a Claude Code Skill.
With Claude Code, you can skip writing YAML entirely — just say what you want:
"Make a podcast from this YouTube video: https://..."
Claude will handle YAML creation, dry-run verification, and background execution automatically.
- Batch-convert multiple YouTube videos into podcasts, slides, or reports
- Bulk-process news articles or tech blogs into reports or flashcards
- "Fire and forget" with background execution + GitHub Issue notifications
| IN (source) | OUT (content) |
|---|---|
| YouTube URL | Podcast (MP3) |
| Website URL | Infographic (PNG) |
| Text file | Slide Deck (PDF) |
| Inline text | Video (MP4) |
| Quiz (JSON) | |
| Flashcards (JSON) | |
| Report (Markdown) | |
| Data Table (CSV) |
- Python 3.11+
- pipx
- A Google account with access to NotebookLM
See INSTALL.md for full setup instructions.
git clone https://github.com/KunihiroS/notebooklm-batch.git
cd notebooklm-batch
pip install -r requirements.txt
pipx install "notebooklm-py[browser]"
notebooklm loginPrerequisites: Complete installation and authentication first, you need some works at the first time such as NotebookLM login and some more. — check otu the details in INSTALL.md.
- Create a YAML instruction file in
instructions/:
# instructions/my_task.yaml
settings:
language: en
tasks:
- source: "https://www.youtube.com/watch?v=YOUR_VIDEO_ID"
title: "My Notebook"
contents:
- type: podcast
prompt: "Summarize the video in an engaging podcast format."- Run a dry-run to verify:
python3 run_batch.py ./instructions/my_task.yaml --dry-run- Execute in the background:
nohup python3 run_batch.py ./instructions/my_task.yaml > log/nohup_output.log 2>&1 &Generated files are saved to ./files/<title>__<hash>/.
| Source | Example |
|---|---|
| YouTube URL | https://www.youtube.com/watch?v=... |
| Website URL | https://example.com/article |
| Text file | ./path/to/document.txt |
| Inline text | Plain text string |
type |
Output | Format |
|---|---|---|
podcast |
Audio podcast | MP3 |
slide |
Slide deck | |
image |
Infographic | PNG |
video |
Explainer video | MP4 |
report |
Written report | Markdown |
quiz |
Quiz questions | JSON |
flashcards |
Flashcards | JSON |
data-table |
Data table | CSV |
Contributions are welcome. Please open an issue or pull request on GitHub.
notebooklm-batch/
├── README.md # This document
├── AGENTS.md # Operational guide for AI agents
├── run_batch.py # Batch execution script
├── instructions/ # YAML instruction files
│ └── <any-name>.yaml
├── files/ # Generated outputs
│ └── <safe_title>__<source_hash>/
│ ├── <type>_<hash>.png
│ ├── <type>_<hash>.mp3
│ └── ...
└── log/ # Run logs and progress files
└── run_YYYYMMDDHHmmss.json
| Directory / File | Description |
|---|---|
instructions/ |
Where users place YAML instruction files |
files/ |
Where generated outputs (images, audio, video, PDF) are saved |
log/ |
Execution logs and progress JSON (used for auto-resume) |
run_batch.py |
Batch processing script |
AGENTS.md |
Operational guide for AI agents running the batch |
Place a YAML instruction file under ./instructions/ and ask an AI agent (or run directly) to execute it with run_batch.py. Initial authentication requires browser-based login by the user.
- The AI follows
AGENTS.mdto execute processing based on the YAML file. - The user can also ask the AI to create the instruction YAML itself.
Items the AI will confirm with the user:
| Item | Required | Description | Example |
|---|---|---|---|
| source | ✅ | Target URL(s) (multiple allowed) | https://www.youtube.com/watch?v=XXXXX |
| content type | ✅ | What to generate | podcast / image / slide / video |
| title | Optional | Notebook name (required for non-YouTube sources) | AI Weekly Digest |
| prompt | Recommended | Generation instructions | Summarize in a clear, engaging tone |
| options | Optional | See "Options" section below | format: deep-dive, length: long |
| language | Optional | Default: ja |
en, ja |
If the user gives a brief request such as "make a podcast from this video", the AI will confirm the desired prompt.
Current behavior of run_batch.py:
- The notebook title on NotebookLM uses
tasks[].titlefrom the YAML as-is - If
tasks[].titleis omitted,video_idis used as the title (YouTube only) - No suffix is appended
Instruction files are written in YAML.
- Location:
./instructions/ - Naming:
<any-name>.ymlor<any-name>.yaml- e.g.
instruction.yaml,ai_digest_20260207.yaml - Use a name that identifies the purpose or date
- e.g.
Note: Instruction files are YAML (
.yml/.yaml), not Markdown (.md).
# NotebookLM batch instruction
settings:
language: en # Output language
notify: # GitHub Issue notification (omit to disable)
github_issue: 1 # Issue number to post to
tasks:
- source: "https://www.youtube.com/watch?v=XXXXX"
title: "Notebook Title"
contents:
- type: podcast
prompt: "Summarize the video in an engaging way."
options:
format: deep-dive
length: default
- type: image
prompt: "Visually summarize the key points."
options:
orientation: landscape
detail: standard
- source: "https://example.com/article"
title: "Article Summary"
contents:
- type: report
prompt: "Summarize the key points of the article."| Field | Required | Default | Description |
|---|---|---|---|
language |
No | ja |
Language for generation (passed to notebooklm generate --language) |
notify |
No | (disabled) | GitHub Issue notification settings (see below) |
Posts batch progress, completion, and errors as comments on a GitHub Issue. Omit to disable notification (default behavior).
settings:
notify:
github_issue: 1 # Issue number to post to (required)
github_repo: "YourName/notebooklm-batch" # Omit to auto-resolve from git origin| Field | Required | Description |
|---|---|---|
github_issue |
Yes (when notify is used) | Target Issue number |
github_repo |
No | Target repository (owner/repo). Omit to auto-resolve via gh CLI from origin remote |
Prerequisites:
ghCLI must be installed and authenticated (gh auth status)- The target Issue must exist in the repository
Notification events:
| Icon | Timing | Example |
|---|---|---|
| 🔄 | Batch start | 🔄 Batch started: 3 tasks / 7 contents |
| ⏭️ | Skipped (output already exists) | ⏭️ [1/3] image skipped (exists) |
| 📦 | Content completed | 📦 [1/3] image generated |
| 🚫 | RATE_LIMITED / AUTH_REQUIRED | 🚫 Stopped: RATE_LIMITED |
| ❌ | Error | ❌ Error: GENERATE_FAILED |
| ✅ | Batch completed | ✅ Done: NEW:3 SKIP:2 / 05:30 elapsed |
| ⏹️ | Ctrl+C interrupted | ⏹️ Interrupted: 2/5 completed |
Note: Notifications are best-effort. Notification failures do not affect batch processing.
Note on
language: corresponds to the--languageoption of the notebooklm CLI.
- If not specified in YAML,
run_batch.pydefaults toja- The CLI's own default is
en, so specifyjaexplicitly or omit it (run_batch.py fills inja)
| Field | Required | Description |
|---|---|---|
source |
Yes | Source (YouTube URL / Website URL / text file path / inline text) |
title |
Yes* | Notebook name on NotebookLM & base for output directory name. *YouTube URLs may omit (falls back to video_id) |
contents |
Yes (1+) | List of content to generate |
Backward compatibility: the
url:field still works but emits a warning. Usesource:in new files.
| Field | Required | Description |
|---|---|---|
type |
Yes | podcast / image / slide / video / quiz / flashcards / report / data-table |
prompt |
No | Custom instructions for generation |
options |
No | Per-type options (see below) |
Output filename is automatically determined as
<type>_<hash>.<ext>(stable hash)
| Type | CLI command | Output format | Key options |
|---|---|---|---|
| Podcast | generate audio |
MP3 | format, length |
| Infographic | generate infographic |
PNG | orientation, detail |
| Slide Deck | generate slide-deck |
format, length | |
| Video | generate video |
MP4 | format, style |
| Quiz | generate quiz |
JSON | quantity, difficulty, download_format |
| Flashcards | generate flashcards |
JSON | quantity, difficulty, download_format |
| Report | generate report |
Markdown | format |
| Data Table | generate data-table |
CSV | (none) |
When options are omitted, NotebookLM's defaults are used. Specifying options explicitly is recommended for predictable results.
| Option | Values |
|---|---|
--format |
deep-dive / brief / critique / debate |
--length |
short / default / long |
| Option | Values |
|---|---|
--orientation |
landscape / portrait / square |
--detail |
concise / standard / detailed |
| Option | Values |
|---|---|
--format |
detailed / presenter |
--length |
default / short |
download_format |
pdf (default) / pptx (editable PowerPoint) |
| Option | Values |
|---|---|
--format |
explainer / brief |
--style |
auto-select / classic / whiteboard / kawaii / anime / watercolor / retro-print / heritage / paper-craft |
| Option | Values |
|---|---|
--quantity |
fewer / standard / more |
--difficulty |
easy / medium / hard |
download_format |
json / markdown / html (default: json) |
Note:
--quantity moreis equivalent tostandardinternally (known limitation).--languageis not supported. Changingdownload_formatchanges the output file extension accordingly (markdown→.md,html→.html).
| Option | Values |
|---|---|
--quantity |
fewer / standard / more |
--difficulty |
easy / medium / hard |
download_format |
json / markdown / html (default: json) |
Note:
--languageis not supported. Changingdownload_formatchanges the output file extension accordingly.
| Option | Values |
|---|---|
--format |
briefing-doc / study-guide / blog-post / custom |
Note: Specifying a prompt automatically switches to
customformat. Use--appendto add custom instructions to an existing format (briefing-doc / study-guide / blog-post).
No generate options. Prompt is required (empty prompt causes a CLI error). Output is CSV (UTF-8-BOM).
tasks:
- source: "https://www.youtube.com/watch?v=XXXXX"
title: "Video Title"
contents:
- type: podcast
prompt: "Summarize the video in an engaging way."
titleis optional for YouTube URLs only (falls back tovideo_id).promptis also optional but recommended.
settings:
language: en
notify:
github_issue: 1
tasks:
- source: "https://www.youtube.com/watch?v=XXXXX"
title: "AI Trends Digest"
contents:
- type: podcast
prompt: "Cover the overall picture first, then dive into individual topics."
options:
format: deep-dive
length: long
- type: image
prompt: "Summarize the 3 key points visually."
options:
orientation: portrait
detail: concise
- type: slide
prompt: "Create a presentation suitable for internal sharing."
options:
format: presenter
length: default
- source: "https://example.com/tech-article"
title: "Tech Article Summary"
contents:
- type: report
prompt: "Summarize from a technology selection perspective."
options:
format: briefing-doc
- type: quiz
prompt: "Quiz on the key points of the article."
options:
quantity: standard
difficulty: medium
- type: data-table
prompt: "Create a comparison table of the tools mentioned."run_id- Identifier assigned to each batch execution
- Format:
YYYYMMDDHHmmss(e.g.20260208200022) - Used in: progress filename (
./log/run_<run_id>.json) and temp file collision avoidance (.__tmp__<run_id>)
source_hash(output directory name)- Output directory is automatically determined as
{safe_title}__{sha256(source)[:8]} - Same
sourcealways maps to the same directory (idempotency)
- Output directory is automatically determined as
type_<hash>(filename)- Output filename is automatically determined as
<type>_<hash>.<ext> - Hash is computed from
source/type/prompt/optionsusing a stable hash (same input → same filename)
- Output filename is automatically determined as
- The execution engine wraps the
notebooklmCLI in Python, parsing--jsonoutput to update progress - Resume/completion decisions are based on local output files, not NotebookLM server state
- If
./files/<dir>/<type>_<hash>.<ext>exists, that content is always skipped - Filenames are determined by stable hashing (
type_<hash>)- e.g.
image_7f3a2c1b - Hash is computed by JSON-serializing
url/type/prompt/options(keys sorted) and taking the first 8 characters of SHA1 - Purpose: resilient to YAML reordering, ensuring stable skip/resume behavior
- e.g.
- If
- On
RATE_LIMITEDand similar failures, the content is recorded asblocked; re-running with a different account skips already-completed outputs - Output filenames are deterministic and writes go through a temp file (
<type>_<hash>.<ext>.__tmp__<run_id>) before a rename commit (resilient to interruption and parallelism)
- The user specifies one YAML file per execution (processing multiple YAMLs simultaneously is not supported)
- A single YAML may contain multiple sources in
tasks[]
- A single YAML may contain multiple sources in
- Invocation:
python3 ./run_batch.py ./instructions/<INSTRUCTION>.yaml
- Dry-run mode (pre-execution verification):
python3 ./run_batch.py ./instructions/<INSTRUCTION>.yaml --dry-run- Shows targets, output paths, and skip decisions without generating anything
-
Base directory is the directory containing this README (and
run_batch.py)- Outputs are saved under
./files/ - Progress is saved to
./log/run_YYYYMMDDHHmmss.json
- Outputs are saved under
-
Re-running the same YAML loads the most recent
run_*.jsonand resumes from where it left off- Resume condition:
instruction_file(canonical path) inrun_*.jsonmatches the canonical path of the specified YAML- Purpose: prevents resume failures due to path representation differences (e.g.
./instructions/x.yamlvs/abs/path/.../instructions/x.yaml)
- Purpose: prevents resume failures due to path representation differences (e.g.
- Resume condition:
-
Progress is displayed with a spinner and progress bar:
⠹ 🔄 [████████░░░░░░░░░░░░] 40% (2/5) | NEW:1 SKIP:1 | TIME:01:23 | LOG:run_20260211150030.json- Status icons:
🔄running /✅completed /🚫blocked /⏹️aborted /❌error (2/5)completed / totalNEW:nnewly generated /SKIP:nskipped (file already exists)TIME:MM:SSelapsed (switches toHH:MM:SSafter one hour)LOG:run_*.jsonprogress filename- On block, reason is appended:
[AUTH_REQUIRED]/[RATE_LIMITED]
Examples:
# Running ⠹ 🔄 [████████░░░░░░░░░░░░] 40% (2/5) | NEW:1 SKIP:1 | TIME:01:23 | LOG:run_xxx.json # Completed normally ✅ [████████████████████] 100% (5/5) | NEW:3 SKIP:2 | TIME:05:30 | LOG:run_xxx.json # All skipped (outputs already exist) ✅ [████████████████████] 100% (3/3) | NEW:0 SKIP:3 | TIME:00:04 | LOG:run_xxx.json # Blocked (auth error) 🚫 [██████████░░░░░░░░░░] 50% (2/4) | NEW:1 SKIP:1 | TIME:02:15 | LOG:run_xxx.json [AUTH_REQUIRED] # Blocked (rate limit) 🚫 [██████████████░░░░░░] 66% (2/3) | NEW:2 SKIP:0 | TIME:03:45 | LOG:run_xxx.json [RATE_LIMITED] - Status icons:
-
Ctrl+C records
status=aborted(next run auto-resumes) -
RATE_LIMITED/AUTH_REQUIREDrecordsstatus=blockedand stops -
Exit codes:
0: completed3: blocked130: aborted (Ctrl+C)1: error (source wait failure / generation failure / download failure)2: usage error (YAML not found / invalid format)
This README is the authoritative specification. Divergences between the spec and implementation are recorded here.
- Target spec (this README):
instruction_fileis stored and compared as a canonical path (realpathequivalent) for representation-agnostic auto-resume- Filenames are determined by stable hashing (
type_<hash>) for resilience to reordering
- Current implementation (
run_batch.py):- Both of the above are implemented (as of 2026-02-08)
This section documents batch execution behavior as steps, states, and decision criteria so that README serves as the authoritative operational reference.
- Instruction: the YAML file specified by the user (e.g.
./instructions/test_news.yaml) - Task: one element of
tasks[]in the YAML (one source) - Content: one element of
tasks[].contents[](podcast/image/slide/ etc.) - Output:
./files/<dir>/<type>_<hash>.<ext> - Run state:
./log/run_*.json
- Output files are the source of truth (for skip/resume decisions)
- If an output file exists, that Content is always skipped (never regenerated)
- Notebooks are disposable (aggressive deletion policy)
- Notebooks are automatically deleted on task completion/failure/interruption
- Re-runs always start fresh notebook creation (no
notebook_id/source_idreuse)
- Never depend on NotebookLM server state (minimize external dependency)
Filenames are automatically determined as <type>_<hash>.<ext>.
- Hash = first 8 characters of SHA1 of
url/type/prompt/optionsJSON-serialized (keys sorted) - Same input always produces the same filename (deterministic)
- YAML reordering, addition, or deletion does not change filenames (stable skip/resume)
On re-running the same YAML, run_batch.py searches log/run_*.json for the most recent file matching the same instruction_file and resumes under these conditions:
- If that run is not
completed:- Load the existing run, set
status=running/resumed_at, and continue
- Load the existing run, set
- Otherwise:
- Create a new
run_*.jsonand start fresh
- Create a new
Note: instruction_file is stored and compared as a canonical path, so representation differences do not break auto-resume.
- Canonical path is resolved via Python's
Path.resolve()(symlinks are resolved)
- Tasks are processed top to bottom
- If all outputs already exist: skip the entire task (no notebook creation)
notebooklm create --json -- <title>creates the notebooknotebooklm source add --json <url>→source waitadds the source
- Contents are processed top to bottom
- (A) Output file exists:
skipped - (B) Otherwise:
generate→ identifyartifact_id→artifact wait→download→completed
- (A) Output file exists:
- Notebook is deleted at task end (success/failure/interruption)
-
AUTH_REQUIRED(session expired) orRATE_LIMITED→ blocked- Records
status=blockedanderrorin progress, stops processing - After re-authentication (
notebooklm login) or account switch, re-running the same YAML auto-resumes
- Records
-
Ctrl+C → aborted
- Records
status=abortedand exits - Re-running the same YAML auto-resumes
- Records
-
Non-Ctrl+C exits:
SIGTERM(e.g. terminal/app force-close, host shutdown) is also treated as aborted (status=aborted)SIGKILL(e.g.kill -9) and power loss cannot be caught —run_*.jsonmay remain withstatus=running- Even then, re-running the same YAML picks up the existing run and continues (output files are the truth; existing files are skipped)
-
source waitfailure (timeout/failure/not ready) or download failure → error- Records
status=erroranderror.codein progress, exits (exit code 1) - Re-running restarts from notebook creation; already-completed outputs are skipped
- Records
If an output file already exists, re-running the same instruction file skips that content. To regenerate with the same settings:
| Method | Operation | Notes |
|---|---|---|
| Back up the file | mv ./files/<dir>/<type>_<hash>.<ext> ./files/<dir>/<type>_<hash>.<ext>.bak |
Keeps the original for comparison |
| Delete the file | rm ./files/<dir>/<type>_<hash>.<ext> |
Simple regeneration |
| Back up the directory | mv ./files/<dir> ./files/<dir>_old |
Regenerates all content for that source |
Note: Changing prompt or options produces a different hash, creating a new file (the existing file remains).
# Example: force regeneration of an image
mv ./files/dQw4w9WgXcQ/image_a1b2c3d4.png ./files/dQw4w9WgXcQ/image_a1b2c3d4.png.bak
# Re-run (the image will be regenerated instead of skipped)
python3 run_batch.py ./instructions/my_instruction.yamlsequenceDiagram
autonumber
participant U as User
participant RB as run_batch.py
participant LOG as ./log/run_*.json
participant NLM as notebooklm CLI
participant FS as ./files/
U->>RB: run_batch.py <instruction.yaml>
RB->>LOG: Search for existing run (canonical instruction_file match)
alt Existing run found & status != completed
RB->>LOG: Load existing run (status=running, resumed_at)
else Otherwise
RB->>LOG: Create new run (status=running)
end
loop tasks[] (per source)
RB->>FS: All outputs exist?
alt yes
RB->>LOG: task.status=skipped
else no
RB->>NLM: create --json -- <title>
NLM-->>RB: notebook_id
RB->>LOG: Save notebook_id
RB->>NLM: source add --json <url>
NLM-->>RB: source_id
RB->>NLM: source wait --timeout ... --json
alt ready
RB->>LOG: Save source_id
else auth required
RB->>NLM: delete -n <notebook_id> -y
RB->>LOG: status=blocked (AUTH_REQUIRED)
RB-->>U: exit 3
else timeout/fail/not_ready
RB->>NLM: delete -n <notebook_id> -y
RB->>LOG: status=error (SOURCE_*)
RB-->>U: exit 1
end
loop contents[] (per content)
RB->>FS: Output exists?
alt yes
RB->>LOG: content.status=skipped
else no
RB->>NLM: generate <type> --json -- <prompt>
alt RATE_LIMITED/AUTH_REQUIRED
RB->>NLM: delete -n <notebook_id> -y
RB->>LOG: status=blocked
RB-->>U: exit 3
else ok
RB->>NLM: artifact wait --json
RB->>NLM: download <type>
RB->>FS: tmp → rename commit
RB->>LOG: content.status=completed
end
end
end
RB->>NLM: delete -n <notebook_id> -y
RB->>LOG: task.status=completed
end
end
RB->>LOG: status=completed, finished_at
RB-->>U: exit 0
stateDiagram-v2
[*] --> running
[*] --> skipped: All outputs already exist
running --> completed
running --> skipped: All outputs already exist
running --> blocked
running --> aborted
running --> error
blocked --> running: (re-auth / account switch, then re-run)
aborted --> running: (re-run)
error --> running: (re-run)
| Level | Status | Description |
|---|---|---|
| run / task | running |
In progress |
| run / task | completed |
Completed successfully |
| run / task | skipped |
Skipped because all outputs already exist |
| run / task | blocked |
Stopped due to auth expiry / rate limit |
| run / task | aborted |
Interrupted by Ctrl+C / SIGTERM |
| run / task | error |
Processing failed (source wait / generate / download failure) |
| content | pending |
Not yet processed |
| content | running |
Being processed |
| content | completed |
Completed successfully |
| content | skipped |
Skipped because output file already exists |
| content | blocked |
Auth expiry / rate limit |
| content | error |
Processing failed |
Minimum steps for the user to run "specify one YAML and execute."
If the session has expired or you want to run with a different account:
notebooklm login- Log in to Google in the browser, press ENTER in the terminal once NotebookLM's home is visible
Generation takes minutes to tens of minutes — background execution is recommended. Pair with settings.notify for "fire and forget → GitHub notification" UX.
# Background execution (recommended)
nohup python3 ./run_batch.py ./instructions/<INSTRUCTION>.yaml > log/nohup_output.log 2>&1 &
# Foreground execution (shows progress spinner, Ctrl+C to abort)
python3 ./run_batch.py ./instructions/<INSTRUCTION>.yaml- Progress file (run state):
./log/run_YYYYMMDDHHmmss.json - Outputs:
./files/<dir>/<type>_<hash>.<ext>
- Press Ctrl+C to interrupt
run_*.jsonrecordsstatus=aborted
- To resume, re-run the same YAML
- The most recent
run_*.jsonwith a matchinginstruction_file(canonical path) is automatically loaded and processing continues
- The most recent
If run_*.json shows status=blocked, address the cause and re-run:
AUTH_REQUIREDnotebooklm loginto re-authenticate → re-run the same YAML
RATE_LIMITED- Wait and retry, or switch to a different account
- To switch accounts: back up the existing
storage_state.json, thennotebooklm login
On re-run, any content whose output file already exists is always skipped.
- Filenames are determined as
<type>_<hash>.<ext> - Same URL/type/prompt/options → same filename
- Changing the prompt produces a different filename (treated as a different output)
settings.language is a global setting only (per-content language override is not currently supported by run_batch.py).
Do not run the same YAML with multiple processes simultaneously (no mutual exclusion is implemented).
- Progress JSON temp file:
run_*.json.tmp(for atomic update) - Output temp file:
<type>_<hash>.<ext>.__tmp__<run_id>(renamed to final path after successful download)
Temp files may remain after interruption or abnormal exit. They are safe to delete (they are not final output files).
0: completed3: blocked130: aborted (Ctrl+C)
Use nohup to run batch processing without occupying the terminal.
Combine with settings.notify for "fire and forget → GitHub notification" UX.
nohup python3 ./run_batch.py ./instructions/<INSTRUCTION>.yaml > log/nohup_output.log 2>&1 &> log/nohup_output.log 2>&1— redirect stdout and stderr to a log file- Trailing
&— run in background without blocking the shell - Spinner output is recorded in the log file (use GitHub Issue notifications for progress monitoring)
Check and stop the process:
# Check if running
jobs # within the same shell session
ps aux | grep run_batch # from another terminal
# Stop
kill %1 # job number from jobs
kill <PID> # PID from psNote: Closing the terminal does not stop a
nohupprocess. Shutting down the host machine does.
- User creates an instruction file (YAML)
- List of sources (YouTube URLs, website URLs, etc.)
- Per-source content types, options, and prompts
- AI reads the instruction file and runs the batch automatically
- Creates a new notebook on NotebookLM per source
- Adds the source
- Generates and saves the specified content sequentially
- AI reports results after processing
- Summary of results per source (success/failure)
- Notebooks are automatically deleted after processing (nothing remains on NotebookLM)
The user only needs to place an instruction YAML and request execution. The AI handles execution, progress management, and recovery.
- Place instruction files (
.yml/.yaml) under./instructions/ - Request execution
- Complete browser-based
notebooklm loginauthentication on first use
- Read the instruction file and execute: create notebook → add source → generate → download → save per source
- Write execution logs to
./log/ - Persist progress for long-running jobs and abnormal exits; on restart, auto-skip completed outputs and regenerate only remaining ones
Sequential execution is used for stability.
tasksare processed top to bottomcontentswithin a task are also processed top to bottom
Since notebooks are automatically deleted after processing, recovery after interruption is based on output file existence. On re-run, notebook creation starts fresh, but Contents whose outputs already exist are automatically skipped.
Note: YouTube source ingestion (transcription) is very fast (seconds to tens of seconds), so re-creation overhead is negligible.
Progress is saved under ./log/ per batch (= per instruction file execution).
- e.g.
./log/run_YYYYMMDDHHmmss.json - At minimum, the following are recorded per content generation:
sourcenotebook_idcontent typeoutput_pathartifact_id(when available)
Since the same type may be generated multiple times within a notebook, recovery uses artifact_id rather than type alone (regenerate if unavailable).
{
"run_id": "20260208200022",
"instruction_file": "/abs/path/to/.../instructions/example.yaml",
"started_at": "2026-02-08T20:00:22+09:00",
"resumed_at": "2026-02-08T21:30:00+09:00",
"status": "running",
"tasks": [
{
"url": "https://www.youtube.com/watch?v=BDl3X9GqhVc",
"video_id": "BDl3X9GqhVc",
"title": "Test News",
"notebook_id": "2ae0f4ee-0164-4ad4-b2aa-48ac7986a72b",
"source_id": "ac317bde-4746-4e70-998f-e05a924a8163",
"status": "running",
"contents": [
{
"content_id": "image_7f3a2c1b",
"type": "image",
"prompt": "Visually summarize the key points.",
"options": {
"orientation": "landscape",
"detail": "standard"
},
"task_id": "<generation job ID (obtained at generate time)>",
"artifact_id": "<artifact ID (obtained after generation completes)>",
"status": "running",
"output_path": "files/BDl3X9GqhVc/image_7f3a2c1b.png",
"error": null
}
]
}
],
"finished_at": "<set when completed/blocked/aborted/error>"
}{
"error": {
"code": "GENERATE_FAILED",
"at": "2026-02-08T20:15:30+09:00",
"stdout": "<CLI stdout>",
"stderr": "<CLI stderr>",
"detail": { "<CLI JSON output if available>" }
}
}Common error codes: CREATE_FAILED, SOURCE_ADD_FAILED, SOURCE_WAIT_FAILED, SOURCE_NOT_READY, GENERATE_FAILED, ARTIFACT_WAIT_FAILED, ARTIFACT_FAILED, ARTIFACT_NOT_FOUND, DOWNLOAD_FAILED, AUTH_REQUIRED, RATE_LIMITED
Notes:
instruction_fileis stored and compared as a canonical path (equivalent torealpath)resumed_atis set only on resume runstask_idis stored only when available fromgenerateartifact_idis stored only when available after generation (regenerate if unavailable)content_idis an internal identifier (auto-determined astype_<hash>)
- Wait for source ingestion to complete:
notebooklm source wait <SOURCE_ID> --timeout <sec> [--json]
- Check generation job status:
notebooklm artifact poll <TASK_ID> -n <NOTEBOOK_ID>- Note:
artifact polldoes not support--json - If
TASK_IDis unavailable, useartifact wait --jsonafterartifact_idis confirmed (run_batch.pysavesartifact_idto progress)
- Wait for generation to complete (blocking):
notebooklm artifact wait <ARTIFACT_ID> -n <NOTEBOOK_ID> --timeout <sec> --interval <sec> [--json]run_batch.pydefaults to--timeout 86400 --interval 5(24-hour timeout, effectively unlimited)
Use --json with generate to get machine-readable output and reliably capture identifiers such as artifact_id.
Aggressive deletion policy: notebooks are automatically deleted after processing (success/failure/interruption). Manual notebook deletion is not normally required.
If a notebook remains for any reason, delete it from the Web UI or via CLI:
- e.g.
notebooklm delete -n <NOTEBOOK_ID> -y(partial ID is accepted)
Guide for automatically generating podcasts, images, slides, and videos from YouTube videos.
# Install via pipx
pipx install "notebooklm-py[browser]"
# Install Playwright's Chromium
pipx run --spec "notebooklm-py[browser]" python -m playwright install chromium
# Initial authentication
notebooklm login
# → Browser opens for Google login → press ENTER to save
# Verify
notebooklm --versionCredentials are saved to ~/.notebooklm/storage_state.json.
- To try a different account, back up
~/.notebooklm/storage_state.jsonand re-runnotebooklm login - If a notebook title starts with
-, the CLI interprets it as an option — use--to separate it (e.g.notebooklm create --json -- "-LpMZyDZI8k") - Infographic (image) generation can take several minutes (source ingestion wait + generation and download)
- Generation may hit
RATE_LIMITED(Google-side rate limiting). If recovery is difficult, consider re-logging in with a different account - Running multiple test cycles can accumulate notebooks on NotebookLM; delete unneeded ones with
notebooklm delete -n <NOTEBOOK_ID> -y
Background: Batch processing takes a long time (image ~3 min, slide ~6 min, podcast/video 10+ min), blocking the terminal UX.
Implementation:
- Added
settings.notifyoption (github_issue,github_repo) - Posts GitHub Issue comments at each batch step to notify progress
- Notifications are best-effort (failures do not affect batch processing)
- Combines with
nohupfor "fire and forget → notification arrives" UX
Notification events: start, skip, content complete, RATE_LIMITED, AUTH_REQUIRED, error, batch complete, interrupted
Background: Output directories were fixed to video_id, which was hard to read.
Implementation:
- Added
settings.output_dir_modeoption - Three modes:
video_id: video_id only (default, backward compatible)title: title only (readability focused)title_with_video_id: title + video_id (recommended)
slugify()function converts to filesystem-safe strings- Handles Windows reserved names, unsafe characters, and length limits
Output examples:
video_id: files/dQw4w9WgXcQ/
title: files/Rick_Astley_-_Never_Gonna_Give_You_Up/
title_with_video_id: files/Rick_Astley_-_Never_Gonna_Give_You_Up__dQw4w9WgXcQ/
Background: Even when output files already existed, notebooks and sources were created on NotebookLM, causing large numbers of duplicate notebooks to accumulate during test runs.
Decision:
- Option A (keep notebooks and reuse): complex state management
- Option B (generate → download → delete): simple implementation, eliminates duplication problem → adopted
Implementation:
- Notebooks are automatically deleted on task completion/failure/interruption
- If all outputs already exist, skip notebook creation entirely
- Enforce "output files are the source of truth" principle (no dependency on NotebookLM server state)
try/finallypattern prevents deletion leaks
Accepted trade-offs:
- Cannot inspect results in NotebookLM Web UI → deemed unnecessary
- Notebook creation on every regeneration → source addition is fast (seconds to tens of seconds), so acceptable
