Showing posts with label pdf. Show all posts

Thursday, January 24, 2013

How to go paperless the future-proof way (without using Evernote)

The short version:

  • Get a Doxie One scanner. For (optional) convenience, get an Eye-Fi SD card for $35, which will send documents from the scanner to your computer wirelessly and automatically.

    Update: If you want to cut your scanning time by more than half, spend a few bucks more and get the Canon P-215 scanner (as recommended by The Wirecutter). It scans twice as fast, it scans both sides of the page at the same time, it has a document feeder, and it has built-in OCR.
  • Scan your documents to your hard drive. Put them all in one big folder. If you want, use the OCR option in Doxie’s software to make your PDFs searchable.
  • Date and tag the documents: use this format for filenames:
    [year]-[month] name of document #tag1 #tag2 #tag3.pdf
  • Get an expanding file jacket. Put all your important papers in there and out of the way, and scan them all once every quarter or so (I usually do my scans once a year just before I do my taxes).
  • Shred the original documents.
  • Keep backups.

When you need to find something, open your archive folder and run a search. Need all tax-related documents for last year? Search for 2012 #taxes.
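If you ever want to script that kind of search instead of using the operating system's search box, here is a minimal Python sketch. It assumes every file follows the filename format above and lives in one flat folder; the function names `parse_name` and `search` are made up for illustration.

```python
from pathlib import Path


def parse_name(filename):
    """Split '[year]-[month] name of document #tag1 #tag2.pdf' into parts.

    Returns (date, title, tags): the first word is taken as the date,
    words starting with '#' are tags, everything else is the title.
    """
    words = Path(filename).stem.split()
    date = words[0] if words else ""
    rest = words[1:]
    tags = {w[1:].lower() for w in rest if w.startswith("#") and len(w) > 1}
    title = " ".join(w for w in rest if not w.startswith("#"))
    return date, title, tags


def search(folder, year=None, tags=()):
    """Return names of PDFs in `folder` matching the year and ALL given tags."""
    wanted = {t.lstrip("#").lower() for t in tags}
    hits = []
    for path in sorted(Path(folder).glob("*.pdf")):
        date, _title, file_tags = parse_name(path.name)
        if year is not None and not date.startswith(str(year)):
            continue
        if wanted <= file_tags:  # every wanted tag is present on the file
            hits.append(path.name)
    return hits
```

For example, `search("archive", year=2012, tags=["#taxes"])` would list last year's tax documents, where `archive` stands in for whatever your folder is called. Because a file can carry several tags, the same receipt turns up under both #medical and #taxes.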

No subscriptions. No extra software. Guaranteed to stay portable and useful for the next 20 years.

Why the hashtags? What about folders?

Using #tags #like #this in the filename is a universal tagging mechanism. It works on Mac OS and on Windows, and the searches on those systems will have no problem finding your files.

Sticking files in a folder hierarchy is a poor way of filing that is on its way out. For example, what do you do if a document is both medical and tax-related in nature? Do you create a Taxes → Medical folder structure, or Medical → Taxes?

Tags allow a document to live in more than one box at a time, are easy to add, and are easy to search for.

Why Not Evernote? Reasons, that’s why.

There have been a couple of great posts by others lately about going paperless, and they’re definitely worth reading. But they all assume you need some kind of fancy software setup, including (most commonly) a subscription plan to Evernote.

I agree it must be nice to have someone else run OCR on my documents and host them for me. But let’s look at the drawbacks:

  • I have to pay someone else $60 a year in case I need to perform occasional full-text searches on my own documents.
  • Long-term uncertainty. Will Evernote be around in 10 years? 20 years? How do I know I’ll be able to get my massive archive back out again when they go out of business?
  • Handing off responsibility for your sensitive documents to someone else’s computers — this is just asking for trouble. Data corruption, security breaches, warrantless searches. Over the next 10 years, it’s almost a given that your hosted service of choice will be hit by at least one of them.

Monday, April 23, 2012

How to Easily Use OpenType Fonts in LaTeX

I became interested in LaTeX out of a desire to be able to produce high-quality PDFs for self-published books. Someday I hope to be able to produce books of comparable quality to these humanities books typeset in TeX. This idea became even more feasible when I discovered the text content could be written in Markdown and converted to LaTeX with pandoc (More information in this article).

Typographically, the example books I linked to above are more the exception than the rule: the vast majority of LaTeX documents use the same boring default font, Computer Modern, that was originally packaged with the software in the 1980s. Using Computer Modern in a self-published book would be almost as bad as using Times New Roman or Arial.

If you try to figure out whether and how you might be able to use your computer’s normal fonts with LaTeX, you will soon come across a lot of extremely complicated and incomplete documentation about how to convert TrueType or OpenType fonts into a format LaTeX can use.

The happy truth is that these instructions are now obsolete: you now have easy access to OpenType fonts on Windows and Mac platforms, thanks to a newer TeX engine called XeTeX. Used together with a package called fontspec, XeTeX gives full access to all system fonts, as well as advanced features for OpenType fonts, such as ligatures and small caps. XeTeX is available for Mac, but what most people don’t say is that this font-accessing goodness can also be used on Windows, since XeTeX is included with distributions available for Windows such as TeX Live and MiKTeX.

That being understood, here’s how to use your system fonts in your TeX documents (source):

  1. Use the xelatex command in place of pdflatex
  2. Add \usepackage{xltxtra} at the beginning of your preamble (some XeTeX goodies, in particular it also loads fontspec, which is needed for font selection).
  3. Add \setmainfont{Name of OTF font} in the preamble.
  4. No step 4.

Note: If you are using the aforementioned pandoc to generate your TeX documents, you do not need to do step 2 — pandoc already includes the fontspec package in its default template. Also, you can set the main font by adding the option --variable=mainfont:"font name" when calling the pandoc command.
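To make the three steps concrete, here is a small Python sketch that writes out a minimal document using them. The function name `minimal_xelatex_source` is invented for this example, and "Linux Libertine O" is only a stand-in: substitute the name of an OpenType font that is actually installed on your system.

```python
def minimal_xelatex_source(font_name, body="Hello, world."):
    """Return the text of a minimal .tex file that uses a system font.

    The result is meant to be compiled with `xelatex` (step 1),
    not `pdflatex`.
    """
    lines = [
        r"\documentclass{article}",
        r"\usepackage{xltxtra}  % step 2: also loads fontspec",
        r"\setmainfont{" + font_name + "}  % step 3: pick the system font",
        r"\begin{document}",
        body,
        r"\end{document}",
        "",  # trailing newline at end of file
    ]
    return "\n".join(lines)


def write_example(path, font_name="Linux Libertine O"):
    """Write the minimal document to `path` (font name is an assumption)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(minimal_xelatex_source(font_name))
```

After calling `write_example("test.tex")`, running `xelatex test.tex` should produce a PDF set in the chosen font, provided the font name matches what your system reports.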

Friday, July 2, 2010

Quickly apply “Inherit Zoom” to all bookmarks in a PDF file

I often compile PDF reports totaling 1,000 to 2,000 pages, and am always creating hierarchical bookmarks for these files. One of my beefs is that often, clicking a bookmark will change the magnification setting to “fit width” from whatever you had it set at. Ideally, clicking a bookmark should default to just leaving the zoom level at whatever the reader is currently set to use (“Inherit zoom”). I’ve found no way in Acrobat to set the default behaviour for newly-created bookmarks, and changing these links one at a time in Acrobat is extremely tedious.

Thankfully, there is a solution provided by Martin Backschat:

To modify this annoyance you can directly modify the PDF document in your Text editor.

With UltraEdit, for example, I load the PDF document and open the “Search and Replace” box, enable “Regular Expressions” and replace all occurrences of “R/XYZ*]” with “R/XYZ]”, and then also all occurrences of “R/Fit*]” with “R/XYZ]”. Now save the document.

With the Perl scripting language, this hack is applied with

perl -pe 's#R/(XYZ.*?|Fit.*?)\]#R/XYZ\]#g' in.pdf >out.pdf

The next time you open the modified document with Acrobat you will get a message that the document is being repaired. Just save it again with Acrobat and everything is fine.

I don’t have UltraEdit but I was able to make this work using Notepad++. The exact search/replace text was slightly different for my PDF files (note the added space) but the principle is the same:

  1. In Notepad++, click the TextFX menu, then TextFX QuickFind/Replace (the standard find/replace tool will not help you here). Make sure you check the “Regular Expr” check box on the right.
  2. Use "R /XYZ(.*)]" (without quotes) as your search string and "R /XYZ]" as your replace string and replace all instances.
  3. Do it again using "R /Fit(.*)]" as your search string and the same replace string as above.
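
If you have neither editor handy, the same search-and-replace can be sketched in Python. This is only a sketch of the approach quoted above, not a proper PDF tool: it treats the file as raw bytes, accepts the destination with or without a space after the "R", and will also rewrite any matching byte sequence that happens to occur elsewhere in the file, so work on a copy. The function names `inherit_zoom` and `rewrite_file` are made up for this example.

```python
import re

# "R" (end of a page reference), optional whitespace, then an /XYZ or
# /Fit... destination with any parameters, up to the closing bracket.
DEST_PATTERN = re.compile(rb"(R\s*)/(?:XYZ|Fit)[^\]\r\n]*\]")


def inherit_zoom(data):
    """Replace every /XYZ... or /Fit... destination with a bare /XYZ."""
    # \1 keeps the original "R" plus whatever spacing followed it.
    return DEST_PATTERN.sub(rb"\1/XYZ]", data)


def rewrite_file(src, dst):
    """Read `src` as bytes, apply the replacement, write the result to `dst`."""
    with open(src, "rb") as handle:
        data = handle.read()
    with open(dst, "wb") as handle:
        handle.write(inherit_zoom(data))
```

Calling `rewrite_file("in.pdf", "out.pdf")` mirrors the Perl one-liner. As with the other methods, expect Acrobat to report that it is repairing the file the first time you open it, and save it again from there.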

Friday, September 9, 2005

Inserting graphics into a PDF

If you have Adobe Acrobat (the real deal, not just the reader) you can insert graphics into PDF docs, but it is not intuitive - mainly because Acrobat can only cut and paste graphics within itself, not to/from other programs.

  1. Click Document menu → Insert Page
  2. In the Browse window, change the file type to your graphics format (GIF, JPG, etc) and select your graphics file.
  3. The image will be inserted on its own page.
  4. Click the TouchUp Object tool on the toolbar. Right-click the image and select Cut.
  5. Go to the page where you want the image and Paste it in. Drag & drop the image to its correct location.
  6. Click Document menu → Delete page and delete the (now-empty) page you just inserted.
Resizing the graphics: I haven't yet figured that out. I am using 5.0 (an old version) so things may be different in the newer versions of Acrobat.