Well-Finished Text

8 thoughts
last posted Dec. 19, 2013, 4:46 p.m.

I write in plain text, but "publish" in HTML or print. I'm always paying attention to how my workflow treats certain punctuation marks when generating the final product:

  • Real quotation marks: they should look like “this”, not "this" (more info)
  • Em- and en-dashes: — and – (not the - hyphen character except when actually hyphenating words) (more info)
  • Use of the ellipsis character instead of three periods (more info)

Option 1: Type in ASCII, deferring the conversion of special characters.

The quickest thing to do while writing is just to type these characters the fast/conventional way: straight quotes instead of curly ones, two hyphens -- for a dash, and three periods . . . for an ellipsis.

The problem then is to ensure these get properly converted during the publishing process.
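As a rough illustration of what such a conversion step does, here is a minimal Python sketch. It is not SmartyPants itself, and the function name `educate` and its rules are my own simplifications: quotes are treated as opening quotes only at the start of the text or after whitespace, an opening bracket, or a dash, and en-dashes are not handled at all.

```python
import re

LDQUO, RDQUO = "\u201c", "\u201d"   # curly double quotes
LSQUO, RSQUO = "\u2018", "\u2019"   # curly single quotes / apostrophe
EMDASH, ELLIPSIS = "\u2014", "\u2026"

# A quote counts as "opening" at the start of the text, or after
# whitespace, an opening bracket, or an em-dash (a simplifying assumption).
_BEFORE_OPENING = r"(^|[\s(\[{" + EMDASH + r"])"


def educate(text: str) -> str:
    """Convert ASCII typing conventions to typographic characters."""
    # Three periods, with or without single spaces between them.
    text = re.sub(r"\. ?\. ?\.", ELLIPSIS, text)
    # Two hyphens become an em-dash; single hyphens are left alone.
    text = text.replace("--", EMDASH)
    # Opening double quotes first, then every remaining one is closing.
    text = re.sub(_BEFORE_OPENING + '"', r"\1" + LDQUO, text)
    text = text.replace('"', RDQUO)
    # Same for single quotes; leftovers double as apostrophes.
    text = re.sub(_BEFORE_OPENING + "'", r"\1" + LSQUO, text)
    text = text.replace("'", RSQUO)
    return text
```

A real converter also has to skip code spans, HTML tags, and edge cases such as abbreviated years ('13), which this sketch ignores.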


Modern Markdown converters have an annoying habit of not offering a typography conversion layer such as SmartyPants.

  • The original Markdown Dingus includes a SmartyPants layer, but it doesn't handle em-dashes. Also, the original Markdown doesn't support the use of footnotes and other newer conveniences.
  • The PHP Markdown Extra dingus includes a SmartyPants layer, but some months ago the author removed the option to use Markdown Extra (which supports footnotes) and SmartyPants at the same time. (Why?!) The workaround is to run the text through twice: first with the SmartyPants filter, then copy the typographically converted Markdown text into the source box and wash it through the Markdown Extra filter (the order is important).
  • WriteMonkey includes an extremely handy Copy-as-HTML feature which converts your plain text (as MultiMarkdown) to HTML and places the result in the clipboard. Again, though, no typography layer is included or supported. WriteMonkey also has a SmartyPants plugin, but it must be run as a separate step. It also alters the original text rather than presenting the converted result separately.
  • Drafts (an online text editor similar to Editorially, which it actually predates) can convert your Markdown text to HTML, but weirdly translates your straight quotes into HTML entities for straight quotes (not curly ones). It doesn't touch ellipses or dashes.

I don't use WordPress, but I understand plugins are available which add a SmartyPants-type conversion layer. WordPress recently added native support for posts written in Markdown (plain text).

It should be possible to roll my own online dingus using the source published at the sites above, but I never got around to it.


Option 2: Use your text editor's auto-replace features to insert Unicode characters at the time of writing

This is the approach taken by Editorially -- when you type two hyphens and a space, for example, the two hyphens are automatically replaced with an em-dash. Thus the typography is handled inline within the source text rather than being converted during the publishing process.
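To make the idea concrete, here is a small Python sketch of replace-as-you-type. It is only a guess at the general mechanism, not how Editorially or any other editor actually implements it; the trigger table and the names `type_char` and `type_text` are hypothetical.

```python
# Hypothetical trigger table: what was just typed -> what it becomes.
REPLACEMENTS = {
    "-- ": "\u2014 ",   # two hyphens and a space -> em-dash and a space
    "...": "\u2026",    # three periods -> ellipsis character
}


def type_char(buffer: str, ch: str) -> str:
    """Append one typed character, then rewrite the tail of the
    buffer if it now ends with one of the triggers."""
    buffer += ch
    for trigger, replacement in REPLACEMENTS.items():
        if buffer.endswith(trigger):
            return buffer[: -len(trigger)] + replacement
    return buffer


def type_text(text: str) -> str:
    """Simulate typing a whole string one keystroke at a time."""
    buffer = ""
    for ch in text:
        buffer = type_char(buffer, ch)
    return buffer
```

Because the replacement happens in the buffer itself, the saved source file ends up containing the Unicode characters rather than the ASCII stand-ins.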


Other tools offering this option:

  • Byword (mac app) offers this functionality according to this review.
  • WriteMonkey has a built-in (and nearly undocumented) option for smart quotes: typing CTRL+SHIFT+' turns it on or off. Auto-replacements for dashes and ellipses can be supported by copying and pasting these characters into the Replacements tab in the preferences.
    • You have to remember to turn this setting on every time you start the program.

Doing it this way means you have to be marginally more careful about your tools -- specifically, both your text editor and your publishing platform have to be able to handle your text in the UTF-8 character set.

For example, if you're publishing on the web and your text is stored in a MySQL database, you'd want to be sure it's using UTF-8.

You also need to be sure your text editor is saving your source text in UTF-8. Many editors have an option for forcing this. Others use UTF-8 natively (i.e., behind the scenes) but don't do a good job of documenting it anywhere.
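If you are unsure what an editor actually wrote to disk, one way to check is to try decoding the file's bytes as UTF-8. The following Python sketch shows both sides, writing with an explicit encoding and verifying afterwards; the helper names `save_utf8` and `is_valid_utf8` are made up for illustration.

```python
from pathlib import Path


def save_utf8(path: str, text: str) -> None:
    """Write text with an explicit encoding instead of relying on
    the platform default, which is not always UTF-8."""
    Path(path).write_text(text, encoding="utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def file_is_utf8(path: str) -> bool:
    """Check a file on disk by reading its raw bytes."""
    return is_valid_utf8(Path(path).read_bytes())
```

One caveat: a file containing only ASCII passes this check no matter which encoding the editor thinks it is using, so the test only tells you something once the text contains curly quotes, dashes, or other non-ASCII characters.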


UTF-8 encoded plain-text files served on the web may look garbled in the browser if the server isn't sending the correct headers. I'm currently having this issue with the transcript links on my podcast episodes.
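The header in question is Content-Type, which needs a charset parameter, e.g. "text/plain; charset=utf-8". How you set it depends on your server; as one illustration, here is a sketch using Python's built-in http.server module. The class name `Utf8TextHandler` and the helpers `with_utf8_charset` and `serve` are my own, and this assumes every text file being served really is UTF-8.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler


def with_utf8_charset(content_type: str) -> str:
    """Add a UTF-8 charset to text/* types that do not declare one."""
    if content_type.startswith("text/") and "charset=" not in content_type.lower():
        return content_type + "; charset=utf-8"
    return content_type


class Utf8TextHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, labeling text as UTF-8."""

    def guess_type(self, path):
        # The base class picks the MIME type from the file extension;
        # its return value is sent as the Content-Type header.
        return with_utf8_charset(super().guess_type(path))


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Run the server until interrupted."""
    HTTPServer((host, port), Utf8TextHandler).serve_forever()
```

Calling `serve()` from the directory holding the transcripts should make a .txt file arrive with "text/plain; charset=utf-8" instead of a bare "text/plain", which is what lets the browser decode curly quotes and dashes correctly.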