Thursday, January 27, 2022

Reading science papers on a Remarkable

A while ago I damaged my old tablet and got a Remarkable 2 as a replacement. One of the use-cases for my old tablet was reading science papers. The Remarkable can read PDF and EPUB files, and with a 10.3” screen it’s one of the larger e-ink devices on the market.

Computer science papers often use a 2-column layout, a small font and large margins. They are readable on the Remarkable, but despite the screen size it wasn’t the most pleasant experience. It would be more convenient if we could re-arrange and re-flow the text.

Turns out we can.

k2pdfopt

k2pdfopt is a tool that optimizes PDF or DJVU files for e-readers like a Kindle. It detects multi-column layouts, re-arranges them and re-flows text to fit onto an e-reader. It does exactly what I wanted. After some fiddling with the settings it turned this:

original PDF

Into this:

processed PDF

k2pdfopt settings

k2pdfopt comes with some presets for devices like the Kindle, but it doesn’t contain any for the Remarkable. I spent some time testing various width, height and margin settings and ended up with the following:

k2pdfopt \
  -h 1940 \
  -w 1402 \
  -omt 0.25cm \
  -omb 0.5cm \
  -fc- \
  -y \
  -x \
  -n \
  -o /tmp/memex/output.pdf \
  /tmp/memex/input.pdf

You can find an explanation of the options in the Command-line options page.

Non-interactive mode

I wanted to create a small script around k2pdfopt that indexes the text contents of the PDF into CrateDB and sends the processed PDF to the Remarkable.

By default, k2pdfopt starts in an interactive mode. It took me a while to figure out, but to prevent that, all you have to do is send it something on its standard input:

echo "" | k2pdfopt [...]
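For use in a script, the settings and the standard-input trick can be combined into one function. This is only a minimal sketch: the function name convert_for_remarkable is made up for illustration, and the options are copied unchanged from the command above.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption, not part of k2pdfopt):
# convert one PDF using the settings from this post, without a prompt.
convert_for_remarkable() {
  input=$1
  output=$2
  # Sending a line on standard input keeps k2pdfopt out of interactive mode.
  echo "" | k2pdfopt \
    -h 1940 \
    -w 1402 \
    -omt 0.25cm \
    -omb 0.5cm \
    -fc- \
    -y \
    -x \
    -n \
    -o "$output" \
    "$input"
}
```

It would be called as convert_for_remarkable /tmp/memex/input.pdf /tmp/memex/output.pdf. Quoting the two paths keeps file names with spaces intact.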

Swallowed pages

With some PDFs I ran into odd behavior: k2pdfopt trimmed a couple of pages off the end of the document. A seven-page document would end after only five. At first I thought this might be an issue with the processing logic, but even a 1:1 copy operation trimmed the document. Based on the interactive output of the command, it seemed that Ghostscript didn’t deliver the full input.

I discovered in the settings that k2pdfopt can use either MuPDF or Ghostscript as the backend for PDF processing. I had first installed k2pdfopt from the AUR. That package applies custom patches to disable MuPDF and links against Ghostscript instead. Suspecting that Ghostscript might be the culprit, I replaced the AUR package with a pre-built binary from the k2pdfopt download page. And indeed, using the pre-built binary solved the problem.
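One way to catch swallowed pages in a script is to compare page counts before and after processing. The sketch below assumes pdfinfo from poppler-utils is installed; the function names are made up for illustration. It is only a heuristic: re-flowing a 2-column paper onto a smaller page normally produces more pages, so an output with fewer pages than the input is treated as suspicious.

```shell
#!/bin/sh
# Extract the page count from `pdfinfo` output read on standard input.
pages_from_pdfinfo() {
  awk '/^Pages:/ { print $2 }'
}

# Page count of a PDF file. Assumes pdfinfo (poppler-utils) is on the PATH.
pdf_pages() {
  pdfinfo "$1" | pages_from_pdfinfo
}

# Hypothetical check: fail if the output has fewer pages than the input.
check_not_truncated() {
  in_pages=$(pdf_pages "$1")
  out_pages=$(pdf_pages "$2")
  # Give up with a distinct status if either count could not be read.
  if [ -z "$in_pages" ] || [ -z "$out_pages" ]; then
    echo "error: could not read page counts" >&2
    return 2
  fi
  if [ "$out_pages" -lt "$in_pages" ]; then
    echo "warning: $2 has $out_pages pages, input had $in_pages" >&2
    return 1
  fi
}
```

For example, check_not_truncated /tmp/memex/input.pdf /tmp/memex/output.pdf returns a non-zero status when the output looks cut short.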

Sending the PDF to Remarkable

How do we get the processed PDF onto the Remarkable? Some helpful people worked out how parts of the Remarkable cloud API work and created rmapi, a command-line tool that allows us to send files to the device. The first time you use it, it prompts for login credentials. After that you can send files like this:

rmapi put /tmp/memex/output.pdf
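In a script it can be worth guarding the upload so that a failed conversion doesn’t push an empty file to the device. A minimal sketch, assuming rmapi is already logged in; the function name send_to_remarkable is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: upload a processed PDF with rmapi,
# but refuse files that are missing or empty.
send_to_remarkable() {
  file=$1
  if [ ! -s "$file" ]; then
    echo "error: $file is missing or empty, not uploading" >&2
    return 1
  fi
  rmapi put "$file"
}
```

It would be called as send_to_remarkable /tmp/memex/output.pdf.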

Wrap Up

Happy reading.