About the Author
Sean M. Burke is an active member in the Perl community and one of CPAN's most prolific module authors. He has been a columnist for The Perl Journal since 1998, and is an authority on markup languages. Trained as a linguist, he also develops tools for software internationalization and native language preservation. Sean is also the author of O'Reilly's Perl & LWP.
Excerpt. © Reprinted by permission. All rights reserved.
This book is a convenient reference for Rich Text Format (RTF). It covers the essentials of RTF, especially the parts that you need to know if you're writing a program to generate RTF files. This book is also a useful introduction to parsing RTF, although that is a more complex task.
RTF is a document format. RTF is not intended to be a markup language anyone would use for coding entire documents by hand (although it has been done!). Instead, it's meant to be a format for document data that all sorts of programs can read and write. For example, if you even just skim this book, you should be able to write a program (in the programming language of your choice) that can analyze the contents of a database and produce a summary of it as an RTF document with whatever kinds of formatting you want. The flexibility of RTF makes it an ideal format for everything from generating invoices or sales reports, to producing dictionaries based on databases of words.
This book is not a complete reference to every last feature of RTF; Microsoft's comprehensive but terse Rich Text Format (RTF) Specification is the closest you will find to that. In the Microsoft Knowledgebase at support.microsoft.com, its access number is 269575. Versions 1.5 and earlier of the specification are more verbose, and might be more useful. Microsoft doesn't distribute copies of them anymore, but you can find them all over the Internet by running a search on "Rich Text Format (RTF) Specification" in Google or a similar search engine.
RTF is a handy format for several reasons. RTF is a mature format. RTF's syntax is stable and straightforward, and its specification has existed for over a decade, an eternity in computer years. In fact, while there has been a proliferation of incompatible binary formats calling themselves "Microsoft Word file format," RTF has stayed the course and evolved along backward-compatible lines. That means if you generate an RTF file today, you should be able to read it in 10 years, and you should have no trouble reading an RTF file generated 10 years ago.
Many applications understand RTF. Since RTF has been around for so long, just about every word processor since the late 1980s can understand it. While not every word processor understands every RTF feature perfectly, most of them understand the RTF commands discussed in this book quite well. Moreover, RTF is the data format for "rich text controls" in MS Windows APIs; RTF-rendering APIs are part of the Carbon/Cocoa APIs in Mac OS X; and you can even read RTF documents on iPods, Apple's portable music players.
Most people have the software to read RTF. That is, if you email an RTF file to a dozen people you know, chances are that almost all of them can read it with a word processor already on their system, whether it's MS Word, some other word processor (AbiWord, StarOffice, TextEdit), or just the RTF-literate write.exe that has been part of MS Windows since at least Windows 98.
In RTF, format control is straightforward. In HTML, if you want to control the size and style of text or the positioning and justification of paragraphs, the best you can do is try a long detour through CSS, a standard that is erratically implemented even today. In RTF, font size and style, paragraph indenting, page breaks, page numbering, page headers and footers, widow-and-orphan control, and dozens of other features are each a single, simple command.
RTF is a multilingual format. RTF now supports Unicode, so it can represent text in just about every human language ever written.
RTF is easy to generate. You can produce RTF without any knowledge of the font metrics needed for Adobe PostScript or PDF. In addition, since RTF files are text files, it's easy to produce RTF with a program in any programming language, whether it's Perl, Java, C++, Pascal, COBOL, Lisp, or anything in between.
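To give a rough sense of how little code this can take, here is a minimal sketch in Python. It is an illustration and not an example taken from the book: the function names rtf_escape and build_rtf are hypothetical, the font choice is an arbitrary assumption, and only a handful of common RTF commands are used (\b for bold, \fs for font size in half-points, \par to end a paragraph, and \uN for a Unicode character followed by a "?" fallback).

```python
def rtf_escape(text: str) -> str:
    """Escape plain text for use in an RTF document body (hypothetical helper)."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces are RTF syntax characters, so prefix them.
            out.append("\\" + ch)
        elif ch == "\n":
            # Represent a newline as an RTF line break.
            out.append("\\line\n")
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # \uN takes a signed 16-bit decimal number; characters outside
            # the Basic Multilingual Plane are sent as two UTF-16 code units.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536
                out.append("\\u%d?" % unit)
    return "".join(out)


def build_rtf(title: str, paragraphs: list[str]) -> str:
    """Build a small RTF document: a bold 16-point title, then 12-point paragraphs."""
    # Header: RTF version 1, ANSI character set, default font 0.
    # The font name is an assumption made for this sketch.
    parts = ["{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Times New Roman;}}\n"]
    parts.append("{\\b\\fs32 " + rtf_escape(title) + "}\\par\n")
    for para in paragraphs:
        parts.append("\\fs24 " + rtf_escape(para) + "\\par\n")
    parts.append("}")
    return "".join(parts)


if __name__ == "__main__":
    print(build_rtf("Sales Report", ["Total: 42 items", "Café {draft}"]))
```

Because the result contains only ASCII characters, it can be written to a file with a .rtf extension in any ASCII-compatible encoding and opened in a word processor.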