How to Download Online Web Pages as PDF with Percollate

(October 15, 2018)

Have you ever wondered how to download web pages as PDF files from your Linux terminal? This guide shows how to use the Percollate command-line tool to download web pages as beautifully formatted PDF files.

How Percollate works

Here is how Percollate works:

  1. Fetch the page(s) using got
  2. Enhance the DOM using jsdom
  3. Pass the DOM through Mozilla/readability to strip unnecessary elements
  4. Apply the HTML template and the print stylesheet to the resulting HTML
  5. Use puppeteer to generate a PDF from the page

How to install Percollate on Linux

Percollate needs Node.js version 8 or later installed on your local system, as it uses new(ish) JavaScript syntax. Install Node.js using our guide:

How to run multiple versions of Node.js on Linux
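
Before installing Percollate, it can help to confirm that the Node.js already on the system meets the version 8 requirement. The following is a minimal sketch; the function name node_version_ok is made up for this example, and it assumes that node --version prints a string of the form v10.15.3.

```shell
#!/bin/sh
# Succeed if a Node.js version string (e.g. "v10.15.3") has a major version
# of at least 8, the minimum this guide states Percollate needs.
# node_version_ok is a hypothetical helper, not part of Node.js or Percollate.
node_version_ok() {
    version=${1#v}          # strip the leading "v", if any
    major=${version%%.*}    # keep everything before the first dot
    case $major in
        ''|*[!0-9]*) return 1 ;;   # empty or non-numeric: treat as unsupported
    esac
    [ "$major" -ge 8 ]
}

# Check the locally installed Node.js, if there is one.
if command -v node >/dev/null 2>&1; then
    if node_version_ok "$(node --version)"; then
        echo "Node.js $(node --version) is recent enough"
    else
        echo "Node.js $(node --version) is too old; version 8 or later is needed"
    fi
else
    echo "Node.js is not installed"
fi
```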

Once Node.js is installed, you can proceed to install Percollate globally using either yarn or npm.

For npm use:

npm install -g percollate

For yarn, use:

yarn global add percollate

Check the installed version by running:

$ percollate --version

To view the help page, use:

$ percollate --help
Usage: percollate [options] [command]

  -V, --version             output the version number
  -h, --help                output usage information

  pdf [options] [urls...]   Bundle web pages as a PDF file
  epub [options] [urls...]  Bundle web pages as an EPUB file
  html [options] [urls...]  Bundle web pages as a HTML file

Updating Percollate

To keep the package up to date, run the command that matches the package manager you installed it with:

$ npm install -g percollate
$ yarn global upgrade --latest percollate

Using Percollate

The basic commands available are:

  • percollate pdf: Bundles one or more web pages into a PDF
  • percollate epub: Bundles one or more web pages into an epub
  • percollate html: Bundles one or more web pages into an HTML file

Available options are:

  • -o, --output: The path of the resulting bundle; when omitted, the output file name is derived from the title of the web page.
  • --individual: Export each web page as an individual file.
  • --template: Path to a custom HTML template.
  • --style: Path to a custom CSS stylesheet.
  • --css: Additional CSS styles you can pass from the command line to override the default/custom stylesheet styles.

See the examples below.

Transform a single web page to PDF:

percollate pdf --output filename.pdf https://example.com

To bundle several web pages into a single PDF, specify them as separate arguments to the command:

percollate pdf --output filename.pdf https://example.com/page1 https://example.com/page2

You can use common Unix commands and keep the list of URLs in a newline-delimited text file:

cat urls.txt | xargs percollate pdf --output filename.pdf
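
A list file kept by hand tends to collect blank lines and notes, and xargs would pass those on as arguments. The sketch below filters them out first. The function name clean_urls and the convention that lines starting with # are comments are assumptions made for this example, not Percollate features.

```shell
#!/bin/sh
# Print the usable URLs from a list file: drop blank lines and lines that
# start with "#" (treated as comments here; that convention is an assumption).
# clean_urls is a hypothetical helper, not part of Percollate.
clean_urls() {
    grep -v -E '^[[:space:]]*(#|$)' "$1"
}

# Demonstration with a throwaway list file.
list=$(mktemp)
cat > "$list" <<'EOF'
# reading list
https://example.com/page1

https://example.com/page2
EOF

clean_urls "$list"   # prints only the two URL lines
rm -f "$list"
```

With the function defined, the filtered list can be piped to Percollate in the same way as before:

clean_urls urls.txt | xargs percollate pdf --output filename.pdf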

To transform several web pages into individual PDF files at once, use the --individual flag:

percollate pdf --individual --output some.pdf https://example.com/page1 https://example.com/page2

Set Custom page size / margins

The default page size is A5 (portrait), but you can use the --css option to override it with any supported CSS size:

percollate pdf --output some.pdf --css "@page { size: A3 landscape }" https://example.com

Similarly, you can set:

Custom margins: @page { margin: 0 }
The base font size: html { font-size: 10pt }

Or any other style defined in the default/custom stylesheet.
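
Several of these overrides can be combined into one CSS string. The sketch below builds such a string in the shell; build_css and its argument order are invented for this example, and it assumes Percollate accepts more than one rule in a single --css value, which is worth verifying against your installed version.

```shell
#!/bin/sh
# Build one CSS string from a page size, a page margin and a base font size.
# build_css and its argument order are illustrative, not part of Percollate.
build_css() {
    printf '@page { size: %s; margin: %s } html { font-size: %s }' "$1" "$2" "$3"
}

css=$(build_css "A4 portrait" "1cm" "10pt")
echo "$css"
```

The resulting string can then be passed to the --css option:

percollate pdf --output some.pdf --css "$css" https://example.com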

Thanks for using our guide to download web pages as PDF files.
