Eric Niquette UI, UX, Accessibility

The PDF format is still bad for accessibility and you probably shouldn't use it

Despite considerable progress over the years, the PDF format is still a substantial accessibility barrier and usability nightmare, and should generally not be used to disseminate information.

Preface

I don't like PDF as a format. While omnipresent on the web, the format is, justifiably, criticized for its poor usability and accessibility. Its shortcomings are many but invisible to most, hidden in tags and options that can only be accessed with specialized software. This can make issues difficult to explain and demonstrate to peers who may be stuck in their ways.

About the PDF format

Portable Document Format (PDF) files are self-contained, explicitly designed documents; they do not need separate fonts, images, or other external files to be distributed. It all gets bundled in one neat package that's cross-platform and cross-device compatible, making it an attractive solution when the goal is to display a document exactly as it was originally designed.

The format has been around since the early 1990s. In 2007, Adobe placed it in the hands of the International Organization for Standardization (ISO), which published it as ISO 32000-1 in 2008.

Modern PDF documents can now be produced to a variety of standards: PDF/UA for accessibility, PDF/A for archival, and a handful of others for design and print. The standard PDF format is, by a wide margin, the most popular and the default output of most word processors.

Accessibility concerns

Difficult to remediate

Remediating an inaccessible PDF can be a complex (and often mind-numbingly tedious) process that requires a solid grasp of the semantics and hierarchy of tags. While Acrobat offers a handful of accessibility tools, such as an auto-tagger and a validator, fixing the issues these tools uncover is likely to require some expertise.

Even with these skills, the hierarchy of basic elements like lists and links is not intuitive and differs from that of markup languages like HTML. Resources on the topic of tags are scarce outside of the complex syntax guides.

Limited customization

You generally can't open a PDF and quickly modify it to suit your needs or preferences — by design, the format ensures consistency and conformity, regardless of the platform or device.

In HTML, a common practice for users with severe dyslexia is to replace or alter font stacks to increase legibility. While some PDF readers do allow a degree of customization, the options are generally limited and can cause reflow and other layout issues.

Poor mobile experience

Most PDF documents are designed for the desktop, resulting in tiny, unreadable text on small screens and forcing the user to zoom in and pan to read. Most PDF documents also do not reflow or adapt to the smaller viewport of mobile devices, which can cause horizontal scrolling.

In some cases, PDF documents are designed in landscape, making them even more difficult to read in portrait mode, and vice versa.

Illustration of a document displayed on square, mobile, and wide screen. The mobile view has horizontal scrolling and the wide screen a lot of white space.

No fallbacks

Unless a PDF is meticulously tagged, screen readers can struggle to interpret it accurately. An untagged PDF is virtually impossible to use, with most screen readers returning the document as "blank" even when plain text is available. Even worse, in most of these cases there is little recourse or fallback available to the user: the document is simply inaccessible to them as-is.
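
As a rough illustration of what "tagged" means at the file level, here is a minimal Python sketch that scans a PDF's raw bytes for the two markers a tagged document normally declares in its catalog: a /StructTreeRoot entry and a /MarkInfo dictionary with /Marked set to true. The function name and the sample byte strings are hypothetical, and the check is only a heuristic: files that store their catalog in compressed object streams can hide these markers, so a negative result does not prove a file is untagged, and a positive one says nothing about the quality of the tags.

```python
import re


def looks_tagged(pdf_bytes: bytes) -> bool:
    """Heuristic only: True if the raw bytes declare a structure tree
    and mark the document as tagged. Hypothetical helper, not a validator."""
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    # "/Marked true" lives inside the /MarkInfo dictionary of a tagged PDF.
    is_marked = re.search(rb"/Marked\s+true", pdf_bytes) is not None
    return has_struct_tree and is_marked


# Made-up, heavily simplified catalog fragments for demonstration.
TAGGED_SAMPLE = (
    b"%PDF-1.7\n1 0 obj\n"
    b"<< /Type /Catalog /MarkInfo << /Marked true >> /StructTreeRoot 5 0 R >>\n"
    b"endobj\n"
)
UNTAGGED_SAMPLE = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

if __name__ == "__main__":
    print(looks_tagged(TAGGED_SAMPLE))
    print(looks_tagged(UNTAGGED_SAMPLE))
```

To try it on a real file, read it in binary mode (for example with open(path, "rb")) and pass the bytes to looks_tagged. A proper accessibility checker is still needed to judge whether the tags are any good.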

Some screen readers, like JAWS, feature optical character recognition (OCR) tools that could potentially provide assistance in these cases, but these should only be used as an absolute last resort. Forcing the user to rely on another tool is not an acceptable "solution."

Forms are puzzling

Even when properly designed, PDF forms aren't great for accessibility. The format offers little support and a limited toolkit to work with. At best, an input can be assigned a short, plain-text tooltip, which leaves formatting information and other instructions off the table. There is no way to group a series of fields or divide a form into sections other than with headings, which isn't ideal.

While some readers offer basic validation, it rarely goes beyond flagging empty required fields or checking input against a regular expression. This limits the feedback that can be provided to a boolean state: either the value is okay or it's not, and the error message generally can't be customized. This can make a form frustrating to fill out, particularly because inputs lack the associated labels that would normally tell the user what is expected.
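
To make the "boolean state" problem concrete, here is a small Python sketch contrasting a pass/fail regex check with a validator that explains what went wrong. It is not PDF form scripting; the postal code field, its pattern, and the messages are all hypothetical and only serve to show the difference in feedback.

```python
import re
from typing import Optional

# Hypothetical field format: letter-digit-letter, optional space, digit-letter-digit.
POSTAL_CODE = re.compile(r"[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]\d")


def validate_boolean(value: str) -> bool:
    """Pass/fail only: the user learns that something is wrong, not what."""
    return POSTAL_CODE.fullmatch(value) is not None


def validate_with_message(value: str) -> Optional[str]:
    """Return None when the value is valid, otherwise a message
    describing what needs to be fixed."""
    stripped = value.strip()
    if not stripped:
        return "Postal code is required."
    if len(stripped.replace(" ", "")) != 6:
        return "Postal code must have 6 characters, for example A1A 1A1."
    if POSTAL_CODE.fullmatch(stripped) is None:
        return "Postal code must alternate letters and digits, for example A1A 1A1."
    return None


if __name__ == "__main__":
    for candidate in ["", "K1A0B", "123456", "K1A 0B1"]:
        print(repr(candidate), validate_boolean(candidate), validate_with_message(candidate))
```

The second function is the kind of feedback an HTML form can surface next to a labelled input, which is what makes errors recoverable for the user.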

Usability and other concerns

Indexing and tracking

Because PDF readers are external modules, it's nearly impossible to capture detailed analytics on the use of PDF documents other than the number of downloads or hits.

Additionally, search engines vastly prefer HTML to PDF documents because the HTML format provides more contextual information. Landmark elements such as <main> and <nav> provide an outline of an HTML document and its features, whereas PDF documents can only divide content into articles and generic sections.
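
To show how readily that outline can be read by a machine, here is a short Python sketch that lists the landmark elements of an HTML document in order, using only the standard library's html.parser module. The set of element names is a simplification I chose for the example (in practice, header and footer only act as landmarks in certain contexts), and the sample page is made up.

```python
from html.parser import HTMLParser
from typing import List

# Simplified, assumed set of landmark-producing elements.
LANDMARK_TAGS = {"header", "nav", "main", "aside", "footer"}


class LandmarkCollector(HTMLParser):
    """Records landmark element names in document order."""

    def __init__(self) -> None:
        super().__init__()
        self.landmarks: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lowercase.
        if tag in LANDMARK_TAGS:
            self.landmarks.append(tag)


def list_landmarks(html: str) -> List[str]:
    collector = LandmarkCollector()
    collector.feed(html)
    collector.close()
    return collector.landmarks


# Made-up page used only for demonstration.
SAMPLE_PAGE = """
<body>
  <header><h1>Annual report</h1></header>
  <nav><a href="#summary">Summary</a></nav>
  <main><h2 id="summary">Summary</h2><p>Revenue grew.</p></main>
  <footer>Contact us</footer>
</body>
"""

if __name__ == "__main__":
    print(list_landmarks(SAMPLE_PAGE))
```

A crawler or assistive tool can get this outline from a few lines of parsing; the tag tree of a PDF offers nothing comparable to landmarks.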

Difficult and expensive to maintain

Once a PDF document has been published, maintenance can be difficult and requires specialized, licensed, and expensive software. Even with the software, it's often impossible to modify a document's structure without an extensive amount of work as the layout tends to shift around to accommodate the changes.

It's considerably easier to generate a new PDF from the source material than it is to maintain an existing document. However, the originals are often difficult to track down within an organization or are discarded once the document is published.

Version control is impossible

Once a user has downloaded a PDF, it remains as-is indefinitely. Updates to the source document will not be distributed to the user. This can lead to the propagation of outdated information, broken links, and version control issues.

Lack of documentation

When it comes to the accessibility remediation of PDF documents, there's surprisingly little useful information available other than the deeply technical ISO specifications and syntax guides. You'll find a sea of companies willing to fix your documents for you, and plenty of general how-tos, but detailed information on the tag tree and other techniques is very sparse.

Other concerns

Other less impactful considerations are the significantly larger file size of PDF documents compared to HTML, and the security concern of attaching documents to emails — particularly forms — that could contain unencrypted personal information.

Ease of generation

The option to convert, export, or print to PDF is baked into most word processors and most office applications, allowing users to quickly and painlessly generate pristine-looking documents.

It's significantly easier to generate a perfect-looking PDF for the web than it is to encode the document in HTML, particularly if the document contains a lot of styling and a varied layout.

Sense of security

Given how difficult it can be to modify a PDF and that doing so requires specialized software, there's a general sense of security that comes with the format. Many seem to believe that once a PDF is exported, it cannot be modified and is therefore safer than other formats.

Design control

PDF documents are great at controlling how they are viewed regardless of the platform or device. The document will always look exactly as designed, and contain the appropriate fonts and images.

Portability and consistency

The user may require access to a document at any time, even while offline. Once a PDF is downloaded, it can be viewed at any time, anywhere, on any device, with or without a data connection, and can be printed exactly the same way every time.

Can a PDF be made accessible?

The short answer is yes. Followed by an asterisk and a flashing caution sign.

A PDF document is generally considered accessible when it meets a certain set of criteria: it's fully tagged, has a linear reading order, contains the relevant metadata, and is properly structured.

However, because of their limited customization options and fixed layout, PDF documents cannot be made fully accessible to everyone, on every device, like an HTML document can. The format simply wasn't — and still isn't — designed to be flexible in that way. Documents can be made accessible to most but there will always be a subset of users that will find them prohibitive due to the format's limitations.

When to use a PDF

You should not be using a PDF to disseminate information. Consider producing your documents in HTML, which is far more user-friendly and lightweight, and is fully compatible with assistive tools. If offline portability is a factor, consider EPUB (essentially an HTML package) as an alternative to PDF.

The PDF format excels at one thing: keeping a document exactly as it was designed and originally produced. This is particularly useful when it comes to archival, for visual or graphic design, for blueprints and plans, and for legal documents. Otherwise, stick to HTML to maximize the accessibility and usability of your documents.

Other resources

Create and verify PDF accessibility (Acrobat Pro)

Adobe

Tagged PDF Best Practice Guide: Syntax

PDF Association