Formatting: Double-space mangled into one at display time

Some of us type two space characters after a full stop. I find this style improves readability.

This platform allows writing and editing posts with consecutive space characters, but collapses consecutive whitespace into a single space character when displaying the post. This is a common pitfall on forum and blogging platforms.

However, if I return to my post and try to edit it, the double spaces are still there, so they are mangled only when displayed.

Why so?
 
so mangled only when displayed.
That's the HTML standard. It's one of the reasons why we have a [code] BBCode tag for code or system output: plain text rendered as HTML would have its whitespace, indentation, newlines, etc. mangled, and those can actually matter in code (Python and YAML are prime examples).

In the case of HTML, whitespace is largely ignored — whitespace in between words is treated as a single character, and whitespace at the start and end of elements and outside elements is ignored.
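For illustration, here is a rough Python sketch of that collapsing rule. It only approximates what a browser does for normal inline text, and the function name `collapse_whitespace` is made up for this example. Note that only ASCII whitespace is affected, so other space characters pass through unchanged.

```python
import re

# ASCII whitespace as HTML treats it: space, tab, line feed, form feed, carriage return.
ASCII_WHITESPACE = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Roughly mimic how normal inline text is displayed (hypothetical helper)."""
    # Any run of ASCII whitespace becomes one space;
    # leading and trailing whitespace is dropped.
    return ASCII_WHITESPACE.sub(" ", text).strip(" ")


stored = "First sentence.  Second sentence."
print(repr(stored))                       # what the editor still contains
print(repr(collapse_whitespace(stored)))  # roughly what gets displayed
```

This is why the double spaces are still there when editing: the stored text is untouched, only the rendering collapses it.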

 
That answered my question, thank you.

Still, that is faulty behavior, IMO. Maybe the HTML/CSS/DOM standards should be upgraded by whoever is in charge of them, so future versions won't have this issue.

However, not all forum platforms mangle spaces; SMF forums, for example, don't. I know nothing about web standards or design, but here's an example of an SMF forum where double spaces are somehow preserved (as can be seen in the replies of users like 'gmb42' or 'Nominal Animal' in this random thread):

Please consider this a feature request, a nice-to-have, in case an option to preserve spaces becomes available in a future version of this forum platform.
 
Still, that is faulty behavior, IMO. Maybe the HTML/CSS/DOM standards should be upgraded by whoever is in charge of them, so future versions won't have this issue.
No, HTML does it correctly for being a language aimed at logically describing a document. Any "basic" (ASCII) whitespace (and any number of it) is treated the same. Furthermore, there are no typography rules anywhere that would ever use two spaces.

What exists indeed is a typographic style that uses wider spaces after the end of a sentence. But this wider space is still a single character (historically, a single spacer-piece for the typecase). This makes sense, e.g. you don't want line breaks in the middle of such a space.

In Unicode, the character for a wide (en-width) space is U+2002. Unfortunately, I don't know of any compose-sequence that would allow entering it directly. Note there's also an extra-wide space (em-width) with code U+2003, but that's really not suitable here.

Demo:
(ASCII) space: Foo. Bar.
EN space: Foo. Bar.
EM space: Foo. Bar.
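If the demo above doesn't survive copy and paste, a small Python sketch can regenerate it. The helper name `demo_line` is just for this example; the code points are the ones mentioned above.

```python
import unicodedata

# Plain space, EN space, EM space.
SPACES = ("\u0020", "\u2002", "\u2003")


def demo_line(ch: str) -> str:
    """Build one demo line using the given space character between sentences."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}: Foo.{ch}Bar."


for ch in SPACES:
    print(demo_line(ch))
```

How wide each space actually looks depends on the font used to display the output.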

edit: Note some forum software could of course add some "magic":
  • Replace spaces with &nbsp; (non-breaking space), but that's bad for line-break reasons, see above (to allow breaks at all, you'd have to keep one "normal" space, but which one?)
  • Replace two consecutive spaces with an EN-space and four consecutive spaces with an EM-space
But I don't think that's a necessary feature. If you find yourself using wider spaces regularly, define your own compose-sequence for it and just type it ;)
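Just to illustrate the second kind of "magic" from the list above, here's a minimal Python sketch. It is not how any particular forum software does it; `widen_spaces` is a made-up name, and what should happen to runs of three, five or more spaces is my own assumption (leftovers are simply kept as they are).

```python
EN_SPACE = "\u2002"
EM_SPACE = "\u2003"


def widen_spaces(text: str) -> str:
    """Replace four spaces with an EM space and two spaces with an EN space."""
    # Handle four spaces first, otherwise they would be consumed as two pairs.
    text = text.replace("    ", EM_SPACE)
    # Remaining pairs become EN spaces; a leftover single space stays as-is.
    return text.replace("  ", EN_SPACE)


print(repr(widen_spaces("Foo.  Bar.")))
print(repr(widen_spaces("Foo.    Bar.")))
```

Since the result is a single character per gap, it would survive HTML whitespace collapsing and still allow a line break there.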
 
Adding an example of how to add a custom compose sequence yourself, for completeness. Create ~/.XCompose:
Code:
include "%L"

<Multi_key> <space> <colon>            : " "   U2002 # EN SPACE
<Multi_key> <colon> <space>            : " "   U2002 # EN SPACE
The character between the double quotes must be the EN space. I used a sequence here that's unused in the system-wide sequences for UTF-8; you can use any sequence you like.
 
No, HTML does it correctly for being a language aimed at logically describing a document.
<nitpick>Or this might be a side effect of allowing HTML to be formatted with indentation (which is absolutely necessary to make it remotely readable). We don't really know.</nitpick>
 
shkhln see for example https://en.wikipedia.org/wiki/Space_(punctuation)#Between_sentences
Double space (English spacing). It is sometimes claimed that this convention stems from the use of the monospaced font on typewriters. However, instructions to use more spacing between sentences than words date back centuries, and two spaces on a typewriter was the closest approximation to typesetters' previous rules aimed at improving readability.[6] Wider spacing continued to be used by both typesetters and typists until the Second World War, after which typesetters gradually transitioned to word spacing between sentences in published print, while typists continued the practice of using two spaces.

This style does exist, but is uncommon nowadays and using two spaces is the "typewriter workaround": The correct way is to indeed use the wider spacer.
 
I see.

This style does exist, but is uncommon nowadays and using two spaces is the "typewriter workaround": The correct way is to indeed use the wider spacer.
Strictly speaking, actual book typography uses spaces of more than one or two widths (not randomly, it is still subject to those rules), but we aren't going to reproduce this on the web.
 
Strictly speaking, this typographic style is just obsolete nowadays, with most typesetters using the normal "word spacing" for sentences as well :cool:.

But if you want to use it anyways, just use the correct spacing characters. They exist for a reason in Unicode, there's no need any more for this "typewriter workaround".

And btw, especially in monospaced text, I think this style is very annoying, because two monospace spaces create a huge "hole" that just looks horrible.
 
This one is kind of amusing, but I'd still argue it starts on "wrong" premises: Just using two spaces is clearly a typewriter phenomenon.

Typography (originally using a typecase) always had spacers of different widths and combining two spacers wasn't a common thing to do.

edit: BTW, LaTeX is typically very close to traditional "good" typography. And, surprise, sentence spacing is configurable there. It defaults to a wider space between sentences (but NOT a "double space"), but can be set to use normal word spacing with the \frenchspacing command. It's a matter of preference, and the traditional style is quite uncommon nowadays, but just using two spaces is clearly a typographic error outside the "monospace" world of typewriters.
 
When I learned to type on a mechanical typewriter in the 1960s, we were taught to put two spaces in between sentences and (I think) five spaces before new paragraphs.

As mentioned earlier, HTML collapses spaces. I believe this is because HTML is based on SGML, where that is done, but I no longer remember the reasoning, other than that HTML is a markup language and not a formatting language.

You can use <pre></pre> if you want to maintain spaces.
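A small sketch of what that could look like if you generate the HTML yourself, in Python. The helper name `as_preformatted` is invented for this example; `html.escape` is from the standard library and keeps characters like < from being read as markup.

```python
import html


def as_preformatted(text: str) -> str:
    """Wrap text in <pre> so consecutive spaces and newlines are kept on display."""
    # Escape &, < and > first; the whitespace itself is left untouched.
    return "<pre>" + html.escape(text) + "</pre>"


print(as_preformatted("Foo.  Bar."))
```

Keep in mind that <pre> content is usually shown in a monospaced font. As far as I know, the CSS property white-space: pre-wrap preserves spaces too without changing the font, but whether a given forum lets you apply it is another question.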
 