Semantic text editor for the web

July 30, 2013

In his article "The Web is not Microsoft Word", Thomas Bradley argues against What-You-See-Is-What-You-Get (WYSIWYG) text editors and in favor of markup that conveys meaning instead, that is, What-You-See-Is-What-You-Mean (WYSIWYM). As fervent supporters of accessibility and maintainability, we share Professor Bradley's philosophy and his frustration.

DPEditor Example

The problem

In the Microsoft Word method of document preparation, the author styles a selection of text in order to imply a special meaning, such as "this is a heading" or "this is emphasized text". This approach no doubt intends to reduce the work required to prepare a document. But the simplicity comes with several major drawbacks.

Consistency

The first is lack of consistency. The next time a heading is desired, the author has to make sure that the styles match exactly. And if, by some miracle, the author is careful enough to succeed, a far more tormenting task awaits when the time comes to change the look and feel of these headings: every instance must be found and restyled by hand, and inevitably one gets missed here and there.

Accessibility

Another deficiency of this approach is seldom seriously considered, even when accessibility is legally required. When visual styles alone are used to give meaning to a chunk of text, only visual consumers have access to that meaning. In so doing, the author is disenfranchising a segment of society.

What is happening is that the author is concentrating more on the look of the document than on its meaning. To be sure, this is an appropriate tactic when the document is a poster, or when the document is destined for physical paper; after all, once printed, only visual cues are left.

But on the web, structure is paramount. The same text may be consumed by an ever-increasing number of visual (desktops, tablets, phones) and non-visual (screen readers, search engines, web crawlers) clients. The only way to preserve meaning across these different representations is to assign meaning to the text explicitly, via markup.
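To make the point concrete, here is a minimal sketch of how a non-visual client might look for headings. It uses Python's standard `html.parser` module; the class name `HeadingOutline`, the helper `extract_outline`, and the two sample strings are made up for this illustration and are not part of any real crawler or screen reader.

```python
from html.parser import HTMLParser


class HeadingOutline(HTMLParser):
    """Collects the text of h1-h6 elements, roughly as a non-visual client might."""

    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []    # (level, text) pairs, in document order
        self._level = None   # level of the heading currently open, if any
        self._buffer = []    # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._level is not None:
            self.outline.append((self._level, "".join(self._buffer).strip()))
            self._level = None


def extract_outline(html):
    """Return the list of (level, text) headings found in an HTML fragment."""
    parser = HeadingOutline()
    parser.feed(html)
    parser.close()
    return parser.outline


# The same heading, once implied by style and once stated by markup.
styled = '<p><span style="font-size: 24px; font-weight: bold">The problem</span></p><p>Some text.</p>'
semantic = "<h2>The problem</h2><p>Some text.</p>"

print(extract_outline(styled))    # []
print(extract_outline(semantic))  # [(2, 'The problem')]
```

Both fragments may look alike in a browser, but only the second one tells a program that "The problem" is a heading.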

The solution

In a way, the solution consists of going back to basics. Before the visual text editor came on the scene, computers were already being used to prepare documents via "markup": small snippets of instruction mixed with the content. (Indeed, HTML is the HyperText Markup Language.) A markup that is still popular in the scientific community is LaTeX, the de facto document preparation system for academic journals and books.

What a semantic text editor enforces is separation of content from presentation. This is a critical and underrated benefit. The author concentrates on the stuff that matters: getting the text, the graphs, the tables, etc., in place. A different agent, such as the graphic designer, can use the meaning embedded in the markup to style the document: assign a certain typeface and size to first-level headings, a specific color to hyperlinks, etc.
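The separation can be sketched in a few lines. In the example below, the sample document, the two themes, and the helper names `to_html` and `to_css` are all hypothetical; the only point is that the content is written once and the presentation lives somewhere else.

```python
from html import escape

# Content: what the author writes. Each block carries a role, not a look.
document = [
    ("h1", "Quarterly report"),
    ("p", "Sales grew in every region."),
    ("h1", "Outlook"),
    ("p", "We expect the trend to continue."),
]

# Presentation: what the designer maintains, keyed by the same roles.
desktop_theme = {"h1": "font: bold 28px sans-serif", "p": "font: 16px sans-serif"}
phone_theme = {"h1": "font: bold 20px sans-serif", "p": "font: 14px sans-serif"}


def to_html(blocks):
    """Serialize the content; no styling information is involved."""
    return "\n".join(f"<{tag}>{escape(text)}</{tag}>" for tag, text in blocks)


def to_css(theme):
    """Serialize one theme as CSS rules, one rule per role."""
    return "\n".join(f"{selector} {{ {rules}; }}" for selector, rules in theme.items())


print(to_html(document))
print(to_css(desktop_theme))
print(to_css(phone_theme))
```

Restyling every first-level heading means editing one entry in a theme; the document itself is never touched.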

Maintainability

When the time comes to (a) update the look and feel, or (b) migrate the document to a different format altogether, the bulk of the work has already been done. The content need not change. Only the applied styles (in web-speak, the CSS) need to be tweaked.

Styling for multiple platforms

When meaning is properly assigned to the elements of a document, the same text can be styled specifically for each client. On computer monitors, sans-serif typefaces are often considered easier to read. Size requirements are different on a handheld device. Line widths vary. The list goes on and on.

Stability

Before word processors adopted XML-based formats such as OpenDocument (and, in Microsoft Word's case, Office Open XML), they saved documents in binary files with extensions like .doc, .wpd, etc. Binary is perfect for digital parsers, but difficult for the human agent. And so, if the application that saved the file can no longer open it (due to a version upgrade, for instance), the document and its contents may be effectively lost.

With markup, that risk largely disappears, because the document is plain text. Plain text can be read by computers and, more importantly, by humans. This means that the contents of a document prepared with markup remain readable even without the software that produced it.

It also means that converting from one markup to another, should the need arise, is always technically feasible. As technologies evolve, therefore, your documents can easily adapt. In this way, documents are truly timeless.
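As a rough illustration of such a conversion, the sketch below parses a toy markup into a neutral list of blocks and then writes those blocks out as either HTML or Markdown-style text. The toy syntax (a line starting with "= " is a heading) is invented for this example and is not the DPEditor syntax; the function names are likewise hypothetical.

```python
from html import escape


def parse(source):
    """Parse a toy markup (hypothetical, for illustration only).

    Lines starting with '= ' are headings; other non-blank lines are paragraphs.
    """
    blocks = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("= "):
            blocks.append(("heading", line[2:]))
        else:
            blocks.append(("paragraph", line))
    return blocks


def to_html(blocks):
    """Emit the blocks as HTML, escaping special characters in the text."""
    tags = {"heading": "h1", "paragraph": "p"}
    return "\n".join(f"<{tags[kind]}>{escape(text)}</{tags[kind]}>" for kind, text in blocks)


def to_markdown(blocks):
    """Emit the blocks as Markdown-style text (no escaping; a sketch only)."""
    prefix = {"heading": "# ", "paragraph": ""}
    return "\n\n".join(prefix[kind] + text for kind, text in blocks)


source = "= Stability\nPlain text stays readable."
blocks = parse(source)
print(to_html(blocks))
print(to_markdown(blocks))
```

Because the source is plain text with explicit roles, adding another output format only requires another small function over the same blocks.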

The Semantic Editor

What web authors need, then, is a truly semantic text editor. Ideally, everyone would simply use HTML itself; but that markup, though robust, is complicated and error-prone.

To fulfill this need, several (lightweight) markup languages have been created, with the following as the most popular:

At OpenWeb Solutions, we found that even these markups are at times cumbersome to use, especially for the sporadic editor. This is why, for our Resource Management System, we developed a simple, intuitive markup. At this time, we have deployed JavaScript (browser-side) and PHP (server-side) parsers for the DPEditor, which can be discreetly deployed onto any existing plain-text (textarea) input.

Explore the DPEditor and the advantages of meaning over style by using this tutorial.

Conclusion

Your sites will be more future-proof, easier to migrate, and more versatile if your authors use markup to assign meaning to the text, instead of following the Microsoft Word philosophy of implied meaning. Most of the big players in user-generated content (Wikipedia, et al.) have come around to this realization.

This extra step can seem like more work, especially for authors new to the concept of markup. With a little practice and the right editor, however, you will notice increased productivity, simply by focusing all your attention on content and eschewing the distraction of presentation. In addition, markup is stable over the long term, since it is nothing more than plain text. And, because every command is a few keystrokes away, you will rely less on a pointing device (such as a mouse), thereby actually reducing the amount of work.

And isn't that the whole point?