Eric Niquette UI, UX, Accessibility

Validating and remediating PDF documents for accessibility


My high-level process for the review and remediation of PDF documents for accessibility using Adobe Acrobat Pro.

What to expect

The evaluation and subsequent remediation of a PDF document is a multi-step process of automated and manual checks. This guide provides a high-level overview of metadata, tags, bookmarks, and general structure, as well as basic semantics and other best practices.

Prerequisites

Within Adobe's lineup, viewing and modifying PDF tags requires Adobe Acrobat Pro. Third-party alternatives are available and likely offer similar functionality, but I'm assuming you're using Acrobat Pro given its widespread popularity.

Automated tools such as PAC 2021 will also be used but can be skipped entirely if desired.

Metadata

When reviewing a document, the first thing I check is the presence of appropriate metadata.

What is metadata?

Metadata is a set of data points that describes a document: its title, author, creation date, and language, as well as its settings and options. Search engine indexing services also use metadata, so it should not be omitted.

Document properties

In Adobe Acrobat, the metadata is found in the Document Properties panel, where there are several data points to validate or populate. To view the panel, open the File menu and select Properties from the dropdown list.

Description tab

The Description tab holds the document's title, author, subject, and keywords. The title should be unique to the document and descriptive of it, and is typically the same as the top-level heading. Should the document not contain a heading, a short descriptive text should be used.

The author is the original document author's name or organization rather than the person who converted the document to PDF. While optional, the Subject and Keywords fields should be populated as they are used by search engine indexing services.

Initial view tab

The Window Options section lets you choose whether the title bar displays the document's file name or its title. Set it to display the document title, which is the title entered in the Description tab.

If the document contains bookmarks, the Navigation Tab option should be set to Bookmarks Panel and Page. Otherwise, it can be left as Page Only.

Advanced tab

Critical to screen readers, the document's language is set in the Advanced tab. Ensure this field is populated, either by choosing one of the preset languages or by typing a language tag built from IANA-registered subtags, such as en-CA or fr-CA.
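The checks in the three tabs above can be collected into a small checklist. The sketch below is only an illustration: it assumes the property values have been copied by hand into a dictionary, and every key name in it is hypothetical rather than something Acrobat exports. The language check is a simplified shape test, not a full language tag validator.

```python
import re

# Hypothetical key names; Acrobat does not export a dictionary like this.
REQUIRED_FIELDS = ("title", "author", "language")

# Simplified shape: 2-3 letter language, optional 2-letter region (en, fr-CA).
# Real language tags allow more forms than this pattern accepts.
LANGUAGE_SHAPE = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{2})?$")


def metadata_issues(properties):
    """Return a list of human-readable problems found in the properties."""
    issues = []
    for field in REQUIRED_FIELDS:
        if not (properties.get(field) or "").strip():
            issues.append(f"{field} is missing")

    language = (properties.get("language") or "").strip()
    if language and not LANGUAGE_SHAPE.match(language):
        issues.append(f"language '{language}' is not in a form like en-CA")

    # Initial View tab: the title bar should show the title, not the file name.
    if properties.get("title_bar_shows") != "title":
        issues.append("title bar should display the document title")

    # Initial View tab: documents with bookmarks should open the panel.
    wants_panel = properties.get("has_bookmarks", False)
    if wants_panel and properties.get("navigation_tab") != "Bookmarks Panel and Page":
        issues.append("navigation tab should be Bookmarks Panel and Page")
    return issues


example = {
    "title": "Annual report",
    "author": "Example Organization",
    "language": "en-CA",
    "title_bar_shows": "title",
    "has_bookmarks": True,
    "navigation_tab": "Bookmarks Panel and Page",
}
print(metadata_issues(example))
```

An empty list means every check passed. Anything else is a to-do list for the Document Properties panel.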

Bookmarks

Often overlooked, bookmarks are an invaluable navigation aid and should not be omitted.

The general rule of thumb is that every heading should have a matching bookmark. If your document contains more than a handful of headings, is divided into sections, or otherwise spans multiple pages, it would likely benefit from the addition of bookmarks.
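The rule of thumb above amounts to comparing two lists. As a rough sketch, assuming you have written down the document's headings and its bookmark titles by hand (the function name and the sample data are made up for illustration):

```python
def missing_bookmarks(headings, bookmarks):
    """Return the headings that have no bookmark with the same title."""

    def normalise(text):
        # Ignore case and stray whitespace when comparing titles.
        return " ".join(text.split()).casefold()

    bookmarked = {normalise(title) for title in bookmarks}
    return [heading for heading in headings if normalise(heading) not in bookmarked]


headings = ["Metadata", "Bookmarks", "Tags"]
bookmarks = ["metadata", "  Tags "]
print(missing_bookmarks(headings, bookmarks))
```

The comparison is deliberately forgiving about case and spacing, since bookmark titles are often retyped rather than copied.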

Validating bookmarks

Open the bookmarks panel and select individual bookmarks. Ensure they lead to the correct section in the document. If not, highlight the appropriate destination, right-click the bookmark and select Set Destination from the context menu.

Also ensure that the bookmarks are nested correctly. If not, individual entries can be dragged to the correct location in the tree.

Adding bookmarks

Highlight the desired text, right-click, and select Add Bookmark from the context menu. A new entry will be created with the selected text as the bookmark's title.

Tags

Tags are semantic markers that provide the non-visual structure of a document, and they are arguably the most important part of an accessible PDF. Every element is assigned a tag that provides information about the type of content enclosed within it.

In addition to providing structure, tags also serve as a reading order for a document. The tag tree is a sequential list of elements that screen readers follow.

The Tags panel is not shown by default and must be enabled from the Navigation Panes menu.

  1. View
  2. Show/Hide
  3. Navigation Panes
  4. Tags

Validating tags

The process of navigating through tags is referred to as walking the tag tree. Select the first element in the list and press the down arrow to move to the next element. As you move down the tree, ensure every element is tagged and accurately represents the content found within.
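Walking the tag tree is a depth-first, top-to-bottom traversal, which is why the tree doubles as the reading order. The sketch below illustrates the idea on a hand-built tree. The nested dictionary shape is an assumption made for this example and is not how Acrobat stores tags.

```python
def walk(node, depth=0):
    """Yield (depth, tag, text) for a node, then for each of its children in order."""
    yield depth, node["tag"], node.get("text", "")
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


# Hypothetical tag tree for a one-page document.
tree = {
    "tag": "Document",
    "children": [
        {"tag": "H1", "text": "Annual report"},
        {"tag": "P", "text": "Introduction paragraph."},
        {
            "tag": "L",
            "children": [
                {"tag": "LI", "text": "First item"},
                {"tag": "LI", "text": "Second item"},
            ],
        },
    ],
}

# Print an indented outline in the order a screen reader would follow.
for depth, tag, text in walk(tree):
    print(("  " * depth + f"<{tag}> {text}").rstrip())
```

Reading the printed outline from top to bottom is the same exercise as pressing the down arrow through the Tags panel: each line should name a sensible tag for the content beside it.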

Managing tags

In some instances, it may be necessary to add, modify, or reorder tags in the tag tree. Managing tags relies on the Reading Order tool, which is used to select and tag elements. It is opened from the Accessibility tools.

  1. View
  2. Tools
  3. Accessibility
  4. Open

Modifying tags

If an element is tagged incorrectly, it can be corrected either by typing the correct tag manually or by right-clicking the tag and selecting Properties from the context menu. In the Properties panel, the correct tag can be selected from a list.

Adding tags

Using the Reading Order tool, click and drag around an element to create a selection. Assign the selection a tag by pressing the appropriate button in the Reading Order panel.

Autotagging

When a document is not tagged, or the tag tree was generated incorrectly, the easiest way to populate the tag tree is with the auto-tagging tool in the Accessibility panel.

Content

Page numbering

Page numbers printed in footers must match the PDF reader's page numbering. If the footer displays "Page 8", that page should also be "Page 8" in the PDF reader's navigation.
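Mismatches usually appear when a cover page or table of contents shifts the count. A minimal sketch of the comparison, assuming both sets of labels have been noted down by hand as lists of strings (the function name and sample values are illustrative only):

```python
def page_label_mismatches(printed_labels, reader_labels):
    """Return (position, printed, reader) for each page where the labels differ.

    position is the physical page position in the file, starting at 1.
    """
    pairs = zip(printed_labels, reader_labels)
    return [
        (position, printed, reader)
        for position, (printed, reader) in enumerate(pairs, start=1)
        if printed != reader
    ]


# Two front-matter pages numbered i and ii, then the body starts at 1.
printed = ["i", "ii", "1", "2"]
reader_before_fix = ["1", "2", "3", "4"]
reader_after_fix = ["i", "ii", "1", "2"]

print(page_label_mismatches(printed, reader_before_fix))
print(page_label_mismatches(printed, reader_after_fix))
```

The second call returns an empty list, which is the state the document should be in once the reader's page labels have been corrected.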

Alternative text

Images, graphs, charts, and figures should provide alternative text for assistive tools. To verify whether an element has alternative text, find the element's tag in the tree, right-click, and select Properties from the context menu.
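Checking figures one by one is the same test repeated over the tree. The sketch below shows that test on a hand-built tree. The dictionary shape and the "alt" and "id" keys are assumptions for this example, not an Acrobat export format.

```python
def figures_missing_alt(node):
    """Return the ids of every Figure tag whose alternative text is empty."""
    missing = []
    if node["tag"] == "Figure" and not (node.get("alt") or "").strip():
        missing.append(node.get("id", "unnamed figure"))
    for child in node.get("children", []):
        missing.extend(figures_missing_alt(child))
    return missing


# Hypothetical tag tree with three figures, two of them without usable alt text.
document = {
    "tag": "Document",
    "children": [
        {"tag": "Figure", "id": "logo", "alt": "Company logo"},
        {"tag": "Figure", "id": "sales-chart", "alt": ""},
        {"tag": "Sect", "children": [{"tag": "Figure", "id": "map"}]},
    ],
}
print(figures_missing_alt(document))
```

Note that this only detects missing text. Whether existing alternative text is accurate still needs a human reader.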

Use of colour

Ensure your document does not rely on colour alone to convey information, and that the colours used provide enough contrast.

WCAG 2.1 (Success Criterion 1.4.3, Level AA) requires a minimum contrast ratio of 4.5:1 for normal text and 3:1 for large text, defined as at least 18 pt, or 14 pt when bold. Headings often qualify as large text.
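The contrast ratio comes from the relative luminance of the two colours, as defined in WCAG 2.1. The sketch below follows that definition for six-digit hex colours. The function names are my own, and it is meant for spot checks rather than as a replacement for a dedicated contrast tool.

```python
def relative_luminance(hex_colour):
    """Relative luminance of an sRGB colour given as '#RRGGBB'."""
    value = hex_colour.lstrip("#")
    channels = [int(value[i:i + 2], 16) / 255 for i in (0, 2, 4)]

    def linearise(c):
        # Piecewise sRGB transfer function as written in WCAG 2.1.
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    red, green, blue = (linearise(c) for c in channels)
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(colour_a, colour_b):
    """Contrast ratio between two colours, from 1.0 up to 21.0."""
    lum_a = relative_luminance(colour_a)
    lum_b = relative_luminance(colour_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(foreground, background, large_text=False):
    """True when the pair meets the Level AA minimum for its text size."""
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


print(round(contrast_ratio("#767676", "#FFFFFF"), 2))  # 4.54
print(meets_aa("#777777", "#FFFFFF"))                  # False
print(meets_aa("#777777", "#FFFFFF", large_text=True)) # True
```

The two greys differ by a single step per channel, yet one passes for body text and the other does not, which is why measuring beats judging by eye.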

Tables

Tables should be simple, linear, and contain header cells and a caption where appropriate.

Table linearity

A table is considered linear when its data can be read from left to right, top to bottom, with no split, merged, combined, or empty cells. Columns and rows have the same layout throughout, and the data's formatting is regular and predictable.

Complex tables may need to be redesigned or split into multiple, smaller tables. Some tables may be converted to text format using lists and headings to delineate sections.
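The linearity definition above can be expressed as two checks: every row has the same number of cells, and no cell is empty. A rough sketch, assuming the table has been transcribed as a list of rows and that a merged cell shows up as a row with fewer cells than the others:

```python
def linearity_problems(rows):
    """Return a description of every uneven row and empty cell in the table."""
    problems = []
    if not rows:
        return problems
    expected_width = len(rows[0])
    for r, row in enumerate(rows, start=1):
        if len(row) != expected_width:
            problems.append(f"row {r} has {len(row)} cells, expected {expected_width}")
        for c, cell in enumerate(row, start=1):
            if not str(cell).strip():
                problems.append(f"row {r}, column {c} is empty")
    return problems


# Hypothetical table with one empty cell and one merged (shortened) row.
table = [
    ["Region", "Q1", "Q2"],
    ["East", "10", ""],
    ["West", "12"],
]
for problem in linearity_problems(table):
    print(problem)
```

A table that returns no problems is not automatically accessible, but one that returns several is a good candidate for being redesigned or split.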

Header cells

Tables must contain a row and/or column of header cells to provide context to their respective cells' data. To verify if a table is using header cells, locate the table in the tags panel and expand its tags. Table header cells will appear as <TH> elements, and data cells as <TD>.
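What to look for in the expanded table tags can be stated precisely: either the first row is made up entirely of <TH> cells, or every row begins with one. The sketch below applies that test to a hand-built tag structure. It assumes <TR> tags sit directly under <Table>, which will not hold for tables that group rows under other tags.

```python
def has_header_cells(table):
    """True when the first row is all TH, or every row starts with a TH."""
    rows = [
        [cell["tag"] for cell in row.get("children", [])]
        for row in table.get("children", [])
        if row["tag"] == "TR"
    ]
    if not rows or any(len(row) == 0 for row in rows):
        return False
    header_row = all(tag == "TH" for tag in rows[0])
    header_column = all(row[0] == "TH" for row in rows)
    return header_row or header_column


def make_row(*tags):
    """Helper for this example only: build a TR from a list of cell tag names."""
    return {"tag": "TR", "children": [{"tag": tag} for tag in tags]}


with_headers = {"tag": "Table", "children": [make_row("TH", "TH"), make_row("TD", "TD")]}
without_headers = {"tag": "Table", "children": [make_row("TD", "TD"), make_row("TD", "TD")]}

print(has_header_cells(with_headers))
print(has_header_cells(without_headers))
```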

Accessibility checker

While it can only perform high-level tests, the Accessibility Checker is a great tool for catching issues with alt text, missing tags, and various faults with the document's structure.

  1. View
  2. Tools
  3. Accessibility
  4. Open
  5. Accessibility Check

Comprehensive tests

The following checks aim to ensure a comprehensive assessment and are best left to experienced users.

Screen readers

The use of a screen reader will highlight potential issues that may have gone unnoticed in the tags or missed by the automated validation. Issues such as non-standard characters, a broken reading order, or inaccurate alternative text can easily be missed during other steps.

PDF Accessibility Checker

The PDF Accessibility Checker (PAC) by the PDF/UA Foundation is a free and powerful tool that validates documents against the WCAG or PDF/UA (Universal Accessibility) standards.

Other common issues

Empty paragraphs used for spacing

In word processors, users often insert multiple spaces or carriage returns to create visual spacing. The practice generates empty paragraphs, which some screen readers announce as "blank". These empty tags should be removed from the tree.
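Empty paragraphs are easy to miss when scrolling through a long tag tree. As an illustration of what counts as empty, the sketch below flags every <P> tag that has neither text nor children. The nested dictionary shape is an assumption made for this example.

```python
def empty_paragraphs(node, trail=()):
    """Return the tag path of every P tag with no text and no children."""
    trail = trail + (node["tag"],)
    found = []
    is_blank = not (node.get("text") or "").strip() and not node.get("children")
    if node["tag"] == "P" and is_blank:
        found.append(" > ".join(trail))
    for child in node.get("children", []):
        found.extend(empty_paragraphs(child, trail))
    return found


# Hypothetical tag tree with two spacer paragraphs left over from a word processor.
document = {
    "tag": "Document",
    "children": [
        {"tag": "H1", "text": "Title"},
        {"tag": "P", "text": ""},
        {"tag": "P", "text": "Body text."},
        {"tag": "Sect", "children": [{"tag": "P", "text": "   "}]},
    ],
}
for path in empty_paragraphs(document):
    print(path)
```

A paragraph containing only spaces is treated as empty here, since it conveys nothing to the reader.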

Charts and graphs

When a chart or graph is present, a text version of the data must also be provided. One method is to provide a synopsis of the data in the chart's alternative text and provide a table or full description of the data after.

Other resources

Tagged PDF Best Practice Guide: Syntax

PDF Association

PDF Techniques

WCAG 2.1