
PDF Structure

PDF is a file format for presenting documents, including text, images, and multimedia, consistently across platforms and operating systems. The PDF specification has been an open standard (ISO 32000) since 2008, meaning anyone can use it.

Read “PDF Format: All you need to know” for more on this widely used file format.

How is a PDF structured?

Although the PDF file format has been in use since 1993, when Adobe Systems developed it, its documentation is extensive enough to be hard to digest. Nonetheless, the basic structure of a PDF file is one of the easiest things to grasp.

A PDF document has four basic parts: the header, the body, the cross-reference table, and the trailer.

Header

Every PDF file has a header, the first line of the file, which identifies the version of the PDF specification that the file follows. For example, if the header contains “%PDF-1.4”, the file follows version 1.4 of the specification. To see the header, open the file in a hex editor or a text editor.
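To make the header check concrete, here is a minimal Python sketch that pulls the version out of the first bytes of a file. The function name `parse_pdf_version` and the 1024-byte search window are illustrative assumptions, not part of any particular library.

```python
import re

# Matches the "%PDF-M.N" marker that opens a PDF file.
HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")


def parse_pdf_version(data: bytes):
    """Return (major, minor) from the header, or None if no header is found."""
    # Assumption: only the first 1024 bytes are searched, because some
    # files carry a few stray bytes before the header line.
    match = HEADER_RE.search(data[:1024])
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


# Illustrative input: the opening bytes of a file written to version 1.4.
sample = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n"
print(parse_pdf_version(sample))
```

For a file on disk, the same function can be called on `open(path, "rb").read(1024)`.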

Body

The part that forms the “body” of a PDF document contains all the objects that make up the data that users get.

The body holds everything that is presented to the user. It is in the body that the PDF file carries information such as multimedia elements, interactive elements like animations or graphics, web page links, and signatures.

Typically, it consists of streams, dictionaries, annotations, arrays, numbers, and much more. The settings that password protect the document, including against unauthorized editing and printing, are stored in an encryption dictionary that the trailer refers to.

The Cross-Reference Table

This is also referred to as the xref table. It lets applications locate specific objects in the document without having to read the entire file.

The cross-reference table records the byte offset of each object in the file. Each entry is exactly 20 bytes long and is readable in a text editor. Once you open the PDF file in a text editor, scroll toward the bottom to locate the xref table.
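As a rough illustration of those 20-byte entries, the sketch below decodes a single entry of a classic xref table. It assumes the usual layout: a 10-digit number, a space, a 5-digit generation number, a space, the letter n (in use) or f (free), and a 2-character line ending. The names `XrefEntry` and `parse_xref_entry` are hypothetical.

```python
from typing import NamedTuple


class XrefEntry(NamedTuple):
    # For in-use entries this is the byte offset of the object; for free
    # entries the same field holds the number of the next free object.
    offset: int
    generation: int
    in_use: bool


def parse_xref_entry(entry: bytes) -> XrefEntry:
    """Decode one fixed-width, 20-byte cross-reference entry."""
    if len(entry) != 20:
        raise ValueError("an xref entry must be exactly 20 bytes long")
    offset = int(entry[0:10])       # 10-digit number, zero-padded
    generation = int(entry[11:16])  # 5-digit generation number
    kind = entry[17:18]             # b"n" = in use, b"f" = free
    if kind not in (b"n", b"f"):
        raise ValueError("entry type must be 'n' or 'f'")
    return XrefEntry(offset, generation, kind == b"n")


# Illustrative entry: an in-use object that starts at byte 17.
print(parse_xref_entry(b"0000000017 00000 n \n"))
```

Because every entry has the same width, a reader can jump straight to the entry for a given object number instead of scanning the table.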

The Trailer

The trailer makes it possible for applications to read PDF files by locating the cross-reference table from the end of the document. PDF readers process documents starting from this end; without a valid trailer, the document may open incorrectly or fail to open at all.

The trailer also points to the document’s root object, from which a reader finds the pages, and you can easily identify it by the “%%EOF” characters at the end.

The last line of the PDF document contains the end-of-file string “%%EOF”. Just before it, the keyword startxref is followed by a number that specifies the byte offset from the beginning of the file to the cross-reference table.
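The lookup from the end of the file can be sketched in Python as follows. The function name `find_xref_offset`, the 1024-byte tail window, and the tiny sample bytes are assumptions made for illustration; the sample is a fragment shaped like the end of a PDF, not a complete valid document.

```python
import re

# "startxref", then the offset, then the "%%EOF" marker.
STARTXREF_RE = re.compile(rb"startxref\s+(\d+)\s+%%EOF")


def find_xref_offset(data: bytes):
    """Return the byte offset that follows the last startxref, or None."""
    # Assumption: the marker sits within the last 1024 bytes of the file.
    matches = STARTXREF_RE.findall(data[-1024:])
    if not matches:
        return None
    # An updated file can contain several markers; the last one applies.
    return int(matches[-1])


# Illustrative fragment: a header and one object, then an xref table,
# a trailer, the startxref line, and the end-of-file marker.
body = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n"
xref_offset = len(body)
data = (
    body
    + b"xref\n0 1\n0000000000 65535 f \n"
    + b"trailer\n<< /Size 1 >>\n"
    + b"startxref\n" + str(xref_offset).encode() + b"\n%%EOF\n"
)

offset = find_xref_offset(data)
print(offset, data[offset:offset + 4])
```

The bytes at the returned offset begin with the keyword xref, which is how a reader knows it has landed on the cross-reference table.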

Conclusion

The PDF structure of different documents may vary or change if users update files. However, the basic structure always has the above four parts.
