PDF and HTML Comparison

PDF and HTML are not equivalent technologies, but both are commonly found on the Web. When the reflow feature is used, however, tagged PDF documents behave much like HTML.

HTML describes the content of a web page in a way that leaves the final presentation to the browser that renders it on the user's screen. This allows content to be rendered to suit the viewer rather than the content provider, and it also means that an HTML file will not necessarily look the same in different browsers. PDF without reflow, on the other hand, describes a document so that its original layout and typesetting are fully preserved.

Because many content providers dislike the fluid nature of HTML rendering, PDF without reflow has been widely used to enforce a particular layout. The same can be achieved in HTML by presenting text as an image, either raster graphics or, more recently, SVG (a vector graphics standard), but then the text cannot be copied as text, nor can a substring be searched for within it. Using images also leads to larger file sizes. (Text is sometimes stored as an image in a PDF file too, and the same disadvantages apply.)

Zooming is a typical example of how this difference affects the viewer:

Enlarging a PDF document without reflow magnifies the text but preserves the original layout and spacing. A practical limit on zooming follows from the need to keep a text column within the width of the screen; otherwise horizontal scrolling would be needed while reading each line, which is very cumbersome. With HTML, and with tagged PDF using reflow, a larger font size is used and lines re-wrap to fit the window.
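
The zooming difference can be illustrated with a small Python sketch. It is only a toy model, not how any real browser or PDF viewer works: it assumes a fixed-width font in which every character is `char_px` pixels wide, and the function names, the sample text, and the pixel values are all hypothetical, chosen for illustration.

```python
import textwrap

# Sample text (hypothetical, for illustration only).
TEXT = ("PDF preserves the original layout, while HTML lets the browser "
        "wrap lines again to fit the window at any font size.")


def reflow(text, viewport_px, char_px):
    """Re-wrap text so that every line fits the viewport (HTML-like)."""
    columns = max(1, viewport_px // char_px)
    return textwrap.wrap(text, width=columns)


def fixed_layout_overflow(lines, viewport_px, char_px):
    """Pixels by which the widest fixed line extends past the viewport.

    Models a layout whose line breaks never change (PDF without reflow):
    zooming only makes each character, and so each line, wider.
    """
    widest = max(len(line) for line in lines) * char_px
    return max(0, widest - viewport_px)


VIEWPORT_PX = 400

# Layout at 100% zoom: 8 px per character, so 50 characters per line.
original = reflow(TEXT, VIEWPORT_PX, char_px=8)

# Zoom to 200% (16 px per character).
# Fixed layout: same line breaks, wider lines, horizontal scrolling needed.
overflow = fixed_layout_overflow(original, VIEWPORT_PX, char_px=16)

# Reflow: line breaks are recomputed, so every line still fits.
rewrapped = reflow(TEXT, VIEWPORT_PX, char_px=16)

print("fixed layout overflows by", overflow, "px")
for line in rewrapped:
    print(line)
```

In this model the fixed layout overflows the viewport as soon as the zoomed line width exceeds it, while the reflowed version simply produces more, shorter lines.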

