download-webpage-as-pdf


Download a webpage as a PDF (agent-browser recipe)

The naive approaches fail on modern sites:

  • chrome --headless --print-to-pdf captures only the images in the initial viewport. Images below the fold render as blank rectangles.
  • Running agent-browser pdf immediately after open has the same problem: lazy-loaded images have not decoded yet.
  • Scrolling via JS and then waiting a fixed time is also unreliable: a fixed delay cannot tell you when each image has actually finished loading.

The fix is a single async script that strips lazy-load attributes, scrolls the page to trigger any IntersectionObserver-based loaders, and waits for every <img> to decode. agent-browser's eval waits for the returned promise to resolve before exiting, so the subsequent pdf command sees a fully loaded page.
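The three steps described above can be sketched as a standalone function. This is a minimal illustration rather than the skill's actual script: the name loadAllImages, the doc and win parameters, and the stepDelayMs and maxSteps options are all hypothetical, and it assumes lazy loading is driven either by the native loading attribute or by scroll-triggered observers.

```javascript
// Sketch only: loadAllImages, stepDelayMs and maxSteps are illustrative names,
// not part of agent-browser. `doc` and `win` stand in for document and window.
async function loadAllImages(doc, win, { stepDelayMs = 100, maxSteps = 200 } = {}) {
  const sleep = ms => new Promise(r => setTimeout(r, ms));

  // 1. Drop the native lazy-loading hint so the browser fetches images eagerly.
  for (const img of Array.from(doc.querySelectorAll('img'))) {
    img.removeAttribute('loading');
  }

  // 2. Scroll one viewport at a time so IntersectionObserver-based loaders fire.
  //    scrollHeight is re-read each step because the page may grow while scrolling;
  //    maxSteps guards against pages that grow without bound.
  const step = win.innerHeight || 800;
  for (let y = 0, n = 0; y < doc.documentElement.scrollHeight && n < maxSteps; y += step, n++) {
    win.scrollTo(0, y);
    await sleep(stepDelayMs);
  }
  win.scrollTo(0, 0);

  // 3. Wait for every image to decode. Images are re-queried because loaders
  //    may have inserted new <img> elements; failed decodes are ignored so a
  //    single broken image cannot block the PDF.
  const imgs = Array.from(doc.querySelectorAll('img'));
  await Promise.all(imgs.map(img => img.decode().catch(() => {})));
  return imgs.length;
}
```

Inside the page this would be invoked as loadAllImages(document, window) and its promise returned from the eval script. Waiting on decode() instead of a fixed delay is what ties completion to the images themselves.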

The recipe

agent-browser open <URL>
agent-browser wait --load networkidle

agent-browser eval "(async () => {
  const sleep = ms => new Promise(r => setTimeout(r, ms));
  ['#onetrust-banner-sdk','#onetrust-consent-sdk','.ot-sdk-container','#ot-sdk-btn-floating','[id*=cookie]','[id*=consent]','[id*=onetrust]'].forEach(s => document.querySelectorAll(s).forEach(e => e.remove()));
Repository: tenequm/skills