{"slug":"pdf-generation","meta":{"title":"Generate PDFs (headless Chrome) without wedging your app","slug":"pdf-generation","category":"Files","summary":"A naive `grover.to_pdf` in a controller can freeze your entire app (blank screen / infinite spinner). Funnel every PDF render through one safe helper — single-flight, hard timeout, self-contained HTML — so a slow or hung render can never take the app down.","tags":["pdf","grover","headless-chrome","performance","puma","files"],"status":"stable","visibility":"public","source_project":"leo-note.leo.llamapress.ai","layers":["controller","view","stimulus_js"]},"body":"# Generate PDFs (headless Chrome) without wedging your app\n\n\u003e ⚠️ **Cookbook example — not live code.** (KEEP THIS CALLOUT.) Every code block below\n\u003e is an **example snippet**, **not part of the llamapress.ai codebase**, and **not\n\u003e running on this server**. This is a reference recipe for a **Leo instance (an AI coding\n\u003e agent) to implement in its own app** — read it to understand the pattern, then recreate\n\u003e it there.\n\nGenerating a PDF from HTML with [Grover](https://github.com/Studiosity/grover) (which\ndrives a headless Chromium under the hood) is easy to ship — and easy to take your whole\napp down with. A plain `grover.to_pdf` inside a controller action is **synchronous, slow\n(seconds), and memory-heavy**, and on a small server it can freeze *every* page for *every*\nuser: blank screen, infinite loading spinner, no error. 
This guide is the safe pattern —\nthe same shape works for `wkhtmltopdf`, `weasyprint`, or any headless-browser render.\n\n\u003e **When to use:** any \"Export / Download / Print PDF\" button backed by HTML → PDF.\n\u003e **When not to:** if a library can build the PDF in-process without a browser (e.g.\n\u003e Prawn for simple tabular docs), prefer that — it has none of these failure modes.\n\n---\n\n## Why the naive version takes your whole app down\n\nYour web server (Puma) runs a **small, fixed pool of threads** — often just 3. Each thread\nserves one request at a time. A headless-Chrome PDF render is uniquely hostile to that pool:\n\n1. **No timeout by default.** `grover.to_pdf` will wait *forever* if Chrome hangs. That\n   thread is now gone — permanently.\n2. **It's heavy.** Each render launches a Chromium worth a few hundred MB. Two or three at\n   once can exhaust RAM and get OOM-killed.\n3. **It calls back into your own server.** With `display_url` set, Chrome fetches the\n   page's CSS/JS/images *over HTTP from your app* — consuming *more* of the same tiny\n   thread pool. If the pool is busy, Chrome's asset requests can't be served, the render\n   stalls, and you deadlock.\n\nSo a user clicks \"Export PDF\" a few times (because it feels stuck), the slow/hung renders\neat all 3 threads, and **the entire app wedges** until someone restarts it.\n\n\u003e **Diagnosing a wedge (so you recognize it):** every page hangs with no error; the Rails\n\u003e log *stops* — no new `Started GET` lines even though requests are arriving; `curl\n\u003e localhost:3000/` from inside the container hangs (times out) instead of refusing;\n\u003e `Started` log lines far outnumber `Completed`. Immediate unstick: restart the web\n\u003e process. Durable fix: this guide.\n\n---\n\n## The 80/20 in one breath\n\n1. Render your HTML to a string as usual.\n2. 
Funnel **every** PDF render through one helper, `safe_pdf` / `pdf_with_lock`, that:\n   (a) lets only **one** render run at a time (reject extras — don't queue them),\n   (b) hard-caps total time, (c) gives Chrome its own timeouts, (d) turns any failure into\n   a friendly redirect instead of a 500 or a frozen thread.\n3. Make the print HTML **self-contained** (inline CSS, inline/`data:` images) so Chrome\n   needs **zero** subrequests back to your server.\n4. Put a **loading state** on the export button so one click can't become five.\n5. Only if you truly need concurrent PDFs: move generation to a **background job** (needs\n   a real queue). Most apps never need this.\n\n---\n\n## Layer 1 — The controller helper (this is the whole fix)\n\nAdd one private helper and route every PDF action through it. This is plain app code — no\ngems to add, no server config to change.\n\n```ruby\n# app/controllers/\u003cyour\u003e_controller.rb  (or a concern included by ApplicationController)\n\n# How long a single render may take before we give up (seconds).\nPDF_RENDER_TIMEOUT = (ENV[\"PDF_RENDER_TIMEOUT_S\"] || 25).to_i\n# Process-wide guard: only ONE headless Chrome runs at a time.\nPDF_RENDER_LOCK = Mutex.new\n\n# Render a Grover PDF safely. Returns the PDF bytes, or raises Timeout::Error\n# (which the caller rescues into a friendly redirect). 
Two failure modes are\n# deadly on a small Puma pool, and this closes both:\n#   * a slow/hung render holding a web thread forever -\u003e hard time cap.\n#   * several concurrent renders each launching their own Chrome (RAM blow-up)\n#     AND starving each other's asset fetches -\u003e single-flight (one at a time).\n# We REJECT extra renders rather than queue them: queueing would hold the other\n# Puma threads while waiting and, because Chrome fetches assets back from this\n# same server, starve those fetches and deadlock the pool anyway.\ndef pdf_with_lock(grover)\n  unless PDF_RENDER_LOCK.try_lock\n    raise Timeout::Error, \"another PDF is already rendering\"\n  end\n  begin\n    Timeout.timeout(PDF_RENDER_TIMEOUT) { grover.to_pdf }\n  rescue Timeout::Error\n    raise\n  rescue =\u003e e\n    Rails.logger.error(\"[PDF] render failed (action=#{action_name}): #{e.class}: #{e.message}\")\n    raise Timeout::Error, \"pdf render failed (#{e.class})\"  # normalize -\u003e one rescue path\n  ensure\n    PDF_RENDER_LOCK.unlock if PDF_RENDER_LOCK.owned?\n  end\nend\nprivate :pdf_with_lock\n```\n\nEvery PDF action then looks like this — note Grover's **own** timeout options, which make\nChrome abort cleanly (no orphaned browser process) before the Ruby backstop fires:\n\n```ruby\n# app/controllers/\u003cyour\u003e_controller.rb\ndef export_report_pdf\n  @report = current_user.reports.find(params[:id])\n  html = render_to_string(layout: false, template: \"reports/pdf\", formats: [:html])\n\n  grover = Grover.new(\n    html,\n    format: \"A4\",\n    margin: { top: \"10mm\", bottom: \"10mm\", left: \"12mm\", right: \"12mm\" },\n    print_background: true,\n    prefer_css_page_size: true,\n    # Chrome-level timeouts (ms) -\u003e a stuck render tears Chrome down cleanly.\n    # Keep them a few seconds UNDER PDF_RENDER_TIMEOUT so Chrome aborts first.\n    launch_timeout:  (PDF_RENDER_TIMEOUT - 5) * 1000,\n    convert_timeout: (PDF_RENDER_TIMEOUT - 5) * 1000,\n    timeout:  
       (PDF_RENDER_TIMEOUT - 5) * 1000\n  )\n\n  begin\n    pdf_data = pdf_with_lock(grover)\n  rescue Timeout::Error\n    redirect_back(fallback_location: root_path,\n                  alert: \"PDF generation is busy or timed out — please try again.\")\n    return\n  end\n\n  send_data pdf_data, filename: \"report-#{Date.today}.pdf\",\n                      type: \"application/pdf\", disposition: \"inline\"\nend\n```\n\nThat's it. Every render is now bounded in time, never runs concurrently, and fails as a\nredirect instead of a frozen page.\n\n## Layer 2 — Make the print HTML self-contained (kills the deadlock)\n\nThe sneakiest failure is Chrome fetching your assets back over HTTP. Avoid it: the PDF\ntemplate should not depend on your server serving anything.\n\n```erb\n\u003c%# app/views/reports/pdf.html.erb %\u003e\n\u003c!DOCTYPE html\u003e\n\u003chtml\u003e\n  \u003chead\u003e\n    \u003cmeta charset=\"utf-8\"\u003e\n    \u003cstyle\u003e\n      /* Inline ALL styling here. Do NOT \u003c%%= stylesheet_link_tag %\u003e a same-origin\n         asset — that makes Chrome call back into your app to fetch it. */\n      body { font-family: Arial, sans-serif; color: #111; }\n      .title { font-size: 20px; font-weight: 700; }\n      table { width: 100%; border-collapse: collapse; }\n      td, th { border: 1px solid #ddd; padding: 6px; }\n    \u003c/style\u003e\n  \u003c/head\u003e\n  \u003cbody\u003e\n    \u003cdiv class=\"title\"\u003e\u003c%= @report.name %\u003e\u003c/div\u003e\n    \u003c%# Images: prefer a data: URI or an absolute CDN URL, not a relative /assets path. %\u003e\n    \u003c!-- ... --\u003e\n  \u003c/body\u003e\n\u003c/html\u003e\n```\n\nIf you genuinely must load a same-origin asset, you can, but only because the single-flight\nguard leaves the other threads free to serve Chrome's fetches — keep the asset count tiny.\n\n## Layer 3 — A loading state on the button (so one click ≠ five)\n\nA slow export with no feedback gets spam-clicked. 
Disable the trigger and show a spinner\nthe moment it's clicked. (See the companion guide **async-action-feedback** for the full\npattern.)\n\n```erb\n\u003c%# app/views/reports/show.html.erb %\u003e\n\u003ca href=\"\u003c%= export_report_pdf_path(@report) %\u003e\"\n   data-controller=\"busy\" data-action=\"busy#go\"\n   class=\"btn\"\u003e\n  \u003cspan data-busy-target=\"idle\"\u003e⬇️ Export PDF\u003c/span\u003e\n  \u003cspan data-busy-target=\"working\" class=\"hidden\"\u003eGenerating…\u003c/span\u003e\n\u003c/a\u003e\n```\n\n```javascript\n// app/javascript/controllers/busy_controller.js\nimport { Controller } from \"@hotwired/stimulus\"\nexport default class extends Controller {\n  static targets = [\"idle\", \"working\"]\n  go() {\n    this.idleTarget.classList.add(\"hidden\")\n    this.workingTarget.classList.remove(\"hidden\")\n    this.element.classList.add(\"pointer-events-none\", \"opacity-60\")\n    // The browser navigates to the PDF; reset shortly after in case it streams inline.\n    setTimeout(() =\u003e {\n      this.idleTarget.classList.remove(\"hidden\")\n      this.workingTarget.classList.add(\"hidden\")\n      this.element.classList.remove(\"pointer-events-none\", \"opacity-60\")\n    }, 8000)\n  }\n}\n```\n\n---\n\n## Advanced — move it to a background job (only if you need concurrency)\n\nThe synchronous helper above is enough for the vast majority of apps (a handful of PDF\nexports a day). If you need *many concurrent* exports, generate in a background job so web\nthreads are never spent on rendering at all:\n\n1. Controller enqueues `GeneratePdfJob.perform_later(record.id)` and renders a \"preparing\n   your PDF…\" page.\n2. The job renders with Grover and attaches the result via **Active Storage**\n   (`record.pdf.attach(io: StringIO.new(bytes), filename: ...)`).\n3. 
The page polls a small status endpoint (or uses Turbo Streams) and, when ready, shows\n   a download link to the Active Storage blob.\n\n\u003e **Caveat — this needs a real out-of-process queue.** With the default in-process job\n\u003e adapter (`:async`), jobs run in the *same* process and still compete for memory/CPU, so\n\u003e you gain little. Only adopt the job pattern if your app has a proper worker (e.g.\n\u003e Sidekiq) running. If it doesn't, stick with the synchronous safe helper above — it's\n\u003e fully sufficient and needs zero infrastructure.\n\n---\n\n## Gotchas (the hard-won stuff)\n\n- **`grover.to_pdf` has no timeout by default.** A hung Chrome holds a web thread *forever*\n  and, on a small thread pool, a few of those freeze the entire app. Always wrap it.\n- **Your thread pool is tiny.** A stock Rails `config/puma.rb` runs only a handful of\n  threads per process in every environment (`RAILS_MAX_THREADS` defaults to 3 on Rails 7.2+,\n  5 on older versions). With 3 threads, three stuck renders = a dead app. **Don't \"fix\" this\n  by cranking `RAILS_MAX_THREADS`** — on a small box more threads just mean more concurrent\n  Chrome and an OOM kill. Bound the work, don't widen the door.\n- **Reject, don't queue.** Use `Mutex#try_lock` (reject the extra render) — **not**\n  `Mutex#synchronize` (queue it). Queued requests sit holding their own threads while they\n  wait, and because Chrome fetches assets back from the same server, that re-creates the\n  deadlock. Rejecting frees the thread instantly.\n- **`display_url` makes Chrome call back into your app.** It fetches the page's assets over\n  HTTP using your Puma threads. 
Inline your CSS/images so it needs none.\n- **Give Chrome its own timeouts** (`launch_timeout` / `convert_timeout` / `timeout`) a few\n  seconds *below* your Ruby timeout, so Chrome aborts and cleans itself up first — otherwise\n  a Ruby-level timeout can leave an orphaned Chrome process eating RAM.\n- **Avoid `wait_until: \"networkidle0\"` for print HTML.** A page with a persistent\n  connection (ActionCable, long-poll, analytics beacons) never goes \"network idle\", so the\n  render waits until it times out. Render static, self-contained HTML.\n- **Normalize errors to one path.** Convert any Grover/Chrome exception into the same\n  timeout/redirect branch, so a render failure is a friendly \"try again\" — never a 500 or a\n  spinner that never resolves.\n\n---\n\n## Files this pattern touches\n\n```\napp/controllers/\u003cyour\u003e_controller.rb     # PDF_RENDER_TIMEOUT + PDF_RENDER_LOCK + pdf_with_lock; each action calls it\napp/views/\u003cresource\u003e/pdf.html.erb         # self-contained print HTML (inline CSS/images)\napp/views/\u003cresource\u003e/show.html.erb        # export button with a loading state\napp/javascript/controllers/busy_controller.js  # (optional) Stimulus loading state\n```\n\n## How to adapt to your app\n\n1. Drop `PDF_RENDER_TIMEOUT`, `PDF_RENDER_LOCK`, and `pdf_with_lock` into the controller\n   that has your PDF actions (or a concern mixed into `ApplicationController` if several\n   controllers render PDFs — one lock for the whole app is correct).\n2. In **each** PDF action, add the three Grover timeout options and replace\n   `pdf_data = grover.to_pdf` with `pdf_data = pdf_with_lock(grover)` wrapped in the\n   `rescue Timeout::Error -\u003e redirect_back` shown above.\n3. Make each print template self-contained (inline styles, no same-origin asset links).\n4. Add the button loading state.\n5. 
Leave `PDF_RENDER_TIMEOUT` at 25s unless your documents are unusually large; raise it a\n   little if legitimate renders approach the cap, but keep the Chrome timeouts a few seconds\n   under it.\n"}