Copyright (c) 2000 by Charlie Calvert
After having written literally millions of words using the proprietary word processing formats such as those supported by Microsoft Word, I have, in the last year, started doing all my word processing with HTML editors. This article explains why I made the switch. It will hopefully also encourage others to embrace open standards.
Without further ado, let's plunge into the heat of the battle and see why I got tired of using Word, and why I love HTML.
I've been using the Linux operating system a lot in the last few months. Nothing is more frustrating to a Linux user than getting an attachment to a mail message that was saved in some proprietary word processing format. "Please see the attached message, vital information enclosed," a message might read. Then you click on the attachment and see that it has the dreaded "doc" extension, meaning that it is saved in Bill Gates' proprietary word processing format!
Unlike many in the industry, I'm not a Bill Gates or Windows hater, but still I resent having to pay to use his operating system and his word processors just to read a message from a friend or colleague. Most of the time you can't read his documents at all unless you are using Windows, and even if you have his operating system, sometimes you can't read a document unless you are using his $200 word processor. (I know, of course, about WordPad and Write, but they don't give good support for all Word documents, and they don't help at all if you are using Linux.)
Of course, one of the ironies of this situation is that most of the documents that I get in Word format are from our marketing department. Leave it to the communication experts to figure out a way to communicate in a proprietary format that a lot of people can't even read. It's that professional touch that endears them to us!
Why do large companies encourage their employees to standardize on Microsoft Word? Word is an expensive tool to buy, and it is not an open standard. These companies are locking their employees into word processing formats which cannot be read on all the machines used by their employees.
One of the interesting ramifications of using proprietary formats is that you are forced to upgrade your product as each new version of the tool comes out. One person in the company buys the latest version and sends you a copy of the document he made with that version. In order to read it, you too have to upgrade. Then once you upgrade and send a document to someone else.... Well, there is no need to go on, as everyone knows this particular song and dance. The point is that with HTML your documents need never go out of date. HTML is an open standard, and all decent browsers, which are usually free, support the code you produce with an HTML editor.
Proprietary formats tend to become outdated. We can read documents that are two or three versions of a product old. But what about documents that are ten years old? Are those old documents still supported by the latest version of your word processing tool? Maybe they are, but I would not count on it! The documents created in proprietary word processing formats have a built in obsolescence, even if you do commit to using the same brand over a long period of time. And if you decide to switch brands, then you may be completely out of luck, and find a whole series of older documents unreadable! Why should we risk locking ourselves into a product that might not even exist five or ten years from now?
I recently tried to open some documents that I wrote back in the late eighties when I was using WordPerfect 4.2. My first thought was to open them in my current version of Word, which generally has support for older WordPerfect documents. But my copy of Word's support for WordPerfect goes back only as far as version 5.0, and it cannot (as far as I can tell) open documents saved in WordPerfect 4.2 format. Fortunately, I have copies of WordPerfect lying around, and I'm sure I will get these documents open eventually. But it was frustrating finding that I had important and meaningful documents that I could not easily access. If they had been written in an open standard like HTML, it is unlikely that such a problem would ever arise! (I should add, in defense of WordPerfect, that it works on both Windows and Linux, which makes it a relatively viable platform for publishing documents.)
Of course, if I'm going to rail against Word, I need to propose an alternative. As I'm sure you can guess, the alternative I embrace is HTML.
For the last year or so, I've been doing nearly all my word processing in HTML editors. I can do pretty much anything I can do in Word in HTML. Good HTML editors are not more trouble to use than Word, and the documents I create with them can be shared with anyone in the industry who is using a modern operating system. In short, I can create documents as easily as before, and yet share them with even more people.
Two or three years ago it was difficult to do word processing in HTML format. The available editors were less than optimal, and some people might have had trouble reading the files I produced. Now everything has changed.
All computers come equipped with good browsers that can read HTML files, and even most mail programs will read HTML automatically. Even more important, there are numerous great HTML editors available. My favorites are Allaire HomeSite (written in Delphi), Microsoft FrontPage, and HotMetal Pro. Occasionally I have to wrestle a bit with HotMetal, but both FrontPage and HomeSite are at least as easy to use as Word, and are in many ways superior products. FrontPage even has many of the advanced features of Word, such as automatically underlining misspelled words! In fact, HomeSite has the same feature.
I also frequently edit HTML documents in EMACS or Visual Slick Edit, but that is not an option for most people who want to do generic word processing.
On the Linux platform, I have had some success with WYSIWYG HTML editing using the free download from Sun called Star Office. In fact, it is arguable that Star Office represents a reasonable solution to some of the problems posed in this article. Star Office is similar to MS Office in scope. It runs on multiple platforms, and so it does not lock you into Windows. Furthermore, it supports both Word and WordPerfect formats, as well as Excel. However, it does not run on all possible platforms, it is not bug free, and it is a large and complex application. It seems silly to have to download and learn a 50 MB plus tool just to read a 5 KB mail message. If we all used HTML for our word processing documents, then we could all read the documents in our HTML browsers, and compose them on the tool of our choice. That seems to me much simpler than committing ourselves to any one big tool, no matter how cross platform it is, and no matter what the price.
As everyone knows, the letters WYSIWYG, pronounced whizzy-wig, stand for "what you see is what you get." HomeSite, FrontPage and HotMetal are all WYSIWYG editors, meaning that you don't have to type in the codes for HTML, but rather you can just type normally, as you would in Word, and the necessary codes are inserted when needed to make your text bold, normal, italic, etc. The code is handled behind the scenes, and all you see is the finished product, formatted more or less as it will be in a browser.
But there is more to good HTML editors than mere WYSIWYG functionality: HTML editors let you see the actual code used in your word processing documents. This is an extremely important feature.
I defy anyone who has used Word extensively to deny that there are times when you can't figure out why it is not formatting your document as you wish. Sometimes when using Word, you can do everything right, turn on all the reveal codes buttons, dig deep into the Style menus, and still some chunk of text just won't behave! On such occasions, I would do anything just to be able to see the actual codes that Word is inserting into my documents. "Let me see what you think I'm trying to do, and then I can put in the darn codes myself!" is my battle cry in such situations. When I used WordPerfect, it would let me do that. Why can't Word?
Of course, the beauty of HTML is that you can see absolutely all the codes in your text anytime you want to see them. If you have a good editor, such as HomeSite, you don't have to look at these codes unless you want. But if you want to see them, then you can see them. Even better, the codes you see are the real codes used to format your text! This is the real "What You See is What You Get." In HTML, if a chunk of text is wrapped in <P> tags, then there is no question about what that text should, God willing, do when displayed in a browser! You can see exactly what your editor is doing, because you can see exactly which codes it is inserting. Even better, the codes are part of a cross platform standard that means the same thing on a Windows machine, a Linux machine, a Mac or a Solaris box! (Okay, it means almost the same thing on all those machines....)
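As a small, hypothetical illustration (the sentence is invented for this example), typing one line with a single bolded word in a WYSIWYG editor and then switching to the source view might show something like this:

```html
<!-- One paragraph with one bolded word; nothing is hidden from view -->
<p>The meeting has moved to <b>Tuesday</b> at ten.</p>
```

The exact markup varies from editor to editor, but whatever appears in the source view is what the browser receives.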
In the bad old days, HTML editors frequently made a hash of my HTML tags. I can remember carefully formatting an HTML document in nice, neatly laid out code, then opening it once in FrontPage. After making some changes I saved the document. When I opened it again in a text editor my code was mush. It looked like someone had run it through a blender. In fact, if I didn't look carefully, my code looked like a lowly RTF document, with everything piled in on top of everything else, in the patented Simonyi style, as if some mad Hungarian had been set loose on my variable names in a C++ program! (Charles Simonyi is a Microsoft employee from Hungary who created the RTF format. He also invented a convention for naming variables that is, in my opinion at least, extremely difficult to read. His code is often jokingly referred to as Hungarian code, because it is obscure and foreign looking.)
The new breed of HTML editors no longer mangles your code. HomeSite, FrontPage and HotMetal all turn out HTML that looks pretty much exactly the way I would want it to look were I crafting it myself by hand. Even better, these editors let me edit the HTML codes myself whenever I need to get fine control over the text.
Word has a feature called Styles, which I found extremely useful. Fortunately, HTML has an equivalent feature, called cascading style sheets. Cascading style sheets (CSS) allow you to do the same things that styles did in Word, only they are an open standard supported by most of today's browsers.
If a company wants to create a standard format for its documents, then it can give each employee a small text file with a dot CSS extension. By referencing this file from your document, you can give your text a standardized look. For instance, you can make all the normal text (the prose with <p> tags around it) appear in a particular font and color. You can make the background of all documents appear the same, and you can make H1, H2 and H3 tags follow a predefined format. In fact, you can take any HTML element, and make it behave more or less exactly as you wish. The end result is that you can have very good control over the appearance of your documents, and can easily create a standardized look and feel for your prose.
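A minimal sketch of what such a standardized document might look like follows. The file name company.css, the fonts, and the colors are all assumptions made up for this example. The rules are shown inline in a style element so the page is self-contained, but the same rules could be moved into the shared file and pulled in with the link element shown in the comment.

```html
<html>
<head>
<title>Quarterly Memo</title>
<!-- Hypothetical shared file; the rules below could live there instead:
     <link rel="stylesheet" type="text/css" href="company.css"> -->
<style type="text/css">
  /* Same background for every document */
  body { background-color: #ffffff; }
  /* Normal text: one font and color for all paragraphs */
  p  { font-family: Georgia, serif; font-size: 12pt; color: #000000; }
  /* Headings follow a predefined format */
  h1 { font-family: Arial, sans-serif; color: #003366; }
  h2 { font-family: Arial, sans-serif; color: #336699; }
</style>
</head>
<body>
<h1>Quarterly Memo</h1>
<h2>Summary</h2>
<p>Normal text picks up the font and color defined for the p element.</p>
</body>
</html>
```

Every document that references the same rules gets the same look, and changing one rule changes all of them at once.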
Because cascading style sheets are a standard, any one person can edit their personal version of the style sheet in a text editor to make their text appear in the format they most desire. For instance, if you are no longer as young as your colleagues, you can edit your version of the company CSS style sheet to make the normal text appear in a font larger than the head of a pin. Alternatively, if you are a burgeoning young phreak, you can edit your version of the style sheet so all the company memos appear in psychedelic tie dye colors! (Or is it aging hippies who want to do this?) In short, style sheets let you get control over the documents you read or produce.
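A personal tweak of this kind can be as small as one rule. The sketch below assumes that normal text is styled through the p element, and the 18pt size is an arbitrary choice. It is shown inside a style element, but the same rule could sit in a personal copy of the style sheet file.

```html
<style type="text/css">
  /* Personal override: larger, easier-to-read body text */
  p { font-size: 18pt; }
</style>
```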
If you think about some of the more impressive things you have seen browsing the web, then you know that an in depth knowledge of HTML will allow you to create very elaborate documents. In fact, you can create even fancier documents in HTML than the ones you made in Word. If you want, you can add scripting, VRML, even CGI to your HTML documents. Granted, this would be an extreme thing to do, but it can be done. If you exercise a modicum of self restraint, you can even share your creations with virtually anyone who has a modern computer.
Despite the attraction of these advanced features, I want to put in a short plea for the use of simple technologies like basic HTML. There is a place for XML, XSL, and all the other fancy accoutrements of modern web pages. But when doing word processing, I usually find that the basic features of HTML more than meet my needs. In particular, I can do almost everything I need with <P>, <PRE>, <B>, <I>, <UL>, <OL>, <LI> and <TABLE> tags. There is nothing innately wrong with using fancier tags and techniques, but most of the time I don't need them. When creating documents, your prose style is your weapon of choice, and usually a lot of fancy formatting does little more than obscure your meaning.
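To give a sense of how far those few tags go, here is a rough sketch of a complete document built from nothing else. All of the content is invented placeholder text.

```html
<html>
<head>
<title>Release Notes</title>
</head>
<body>
<!-- Paragraph with bold and italic text -->
<p>This release fixes <b>two</b> bugs and adds <i>one</i> feature.</p>
<!-- Unordered (bulleted) list -->
<ul>
  <li>Fixed the crash on startup</li>
  <li>Fixed the printing error</li>
</ul>
<!-- Ordered (numbered) list -->
<ol>
  <li>Download the update</li>
  <li>Run the installer</li>
</ol>
<!-- Preformatted text keeps its spacing and line breaks -->
<pre>
Column A    Column B
one         two
</pre>
<!-- Simple table with a visible border -->
<table border="1">
  <tr><td>Version</td><td>Status</td></tr>
  <tr><td>1.1</td><td>Released</td></tr>
</table>
</body>
</html>
```

A file like this can be read and edited in any text editor, which is much of the appeal of sticking to the basics.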
There are some reasons not to move to HTML:
None of these arguments mean much to me personally, but it seems worthwhile mentioning them. For those concerned about the third point, I recommend FrontPage, which is very similar to Word in terms of look and feel. It can help you break the addiction.
I like open standards and I like HTML in particular. I find the new HTML editors at least as good as the old proprietary word processors, and in many ways they are much better. For instance, they let me see exactly what codes are being inserted in my documents. Best of all, they create documents that can be read on any platform.
For me, at least, the day of proprietary word formats is over. The new HTML editors are better than the old editors that locked us into formats that we couldn't always share with our friends and colleagues, and that we might not even be able to open ourselves some number of years down the line.
(This document was written in HomeSite, and then published on the Web using FrontPage.)