Minor update to encoding documentation

author: Milo Yip <miloyip@gmail.com> 2014-07-15 21:56:11 +0400
committer: Milo Yip <miloyip@gmail.com> 2014-07-15 21:56:11 +0400
commit: 7cfe718d3d1abbb15676b8f83e001b00eb2f1473 (patch)
tree: 62d0a84ba57f4894cd57575d2fa7b7bc432d2beb /doc
parent: e590e0757ec8cb6f602e4668defdae2c2593b042 (diff)
1 files changed, 5 insertions, 6 deletions
diff --git a/doc/encoding.md b/doc/encoding.md
index cc764c2e..bc5c178f 100644
--- a/doc/encoding.md
+++ b/doc/encoding.md
@@ -6,8 +6,7 @@ According to [ECMA-404](http://www.ecma-international.org/publications/files/ECM
 
 The earlier [RFC4627](http://www.ietf.org/rfc/rfc4627.txt) stated that,
 
-> (in §3) JSON text SHALL be encoded in Unicode.  The default encoding is
-   UTF-8.
+> (in §3) JSON text SHALL be encoded in Unicode.  The default encoding is UTF-8.
 
 > (in §6) JSON may be represented using UTF-8, UTF-16, or UTF-32. When JSON is written in UTF-8, JSON is 8bit compatible.  When JSON is written in UTF-16 or UTF-32, the binary content-transfer-encoding must be used.
 
@@ -28,9 +27,9 @@ Those unique numbers are called code points, which is in the range `0x0` to `0x1
 
 There are various encodings for storing Unicode code points. These are called Unicode Transformation Format (UTF). RapidJSON supports the most commonly used UTFs, including
 
-* UTF-8: 8-bit variable-width encoding. It maps a code point to 1-4 bytes.
-* UTF-16: 16-bit variable-width encoding. It maps a code point to 1-2 16-bit code units (i.e., 2-4 bytes).
-* UTF-32: 32-bit fixed-width encoding. It directly maps a code point to 1 32-bit code unit (i.e. 4 bytes).
+* UTF-8: 8-bit variable-width encoding. It maps a code point to 1–4 bytes.
+* UTF-16: 16-bit variable-width encoding. It maps a code point to 1–2 16-bit code units (i.e., 2–4 bytes).
+* UTF-32: 32-bit fixed-width encoding. It directly maps a code point to a single 32-bit code unit (i.e. 4 bytes).
 
 For UTF-16 and UTF-32, the byte order (endianness) does matter. Within computer memory, they are often stored in the computer's endianness. However, when it is stored in file or transferred over network, we need to state the byte order of the byte sequence, either little-endian (LE) or big-endian (BE). 
 
@@ -78,7 +77,7 @@ For a detail example, please check the example in [DOM's Encoding](doc/stream.md
 
 ## Character Type {#CharacterType}
 
-As shown in the declaration, each encoding has a `CharType` template parameter. Actually, it may be a little bit confusing, but each `CharType` stores a code unit, not a character (code point). As mentioned in previous section, a code point may be encoded to 1-4 code units for UTF-8.
+As shown in the declaration, each encoding has a `CharType` template parameter. Actually, it may be a little bit confusing, but each `CharType` stores a code unit, not a character (code point). As mentioned in previous section, a code point may be encoded to 1–4 code units for UTF-8.
 
 For `UTF16(LE|BE)`, `UTF32(LE|BE)`, the `CharType` must be integer type of at least 2 and 4 bytes  respectively.
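
Note on the code-unit counts touched by this patch: the ranges (1–4 code units for UTF-8, 1–2 for UTF-16, exactly 1 for UTF-32) can be checked with a small sketch. The helper names below are hypothetical and are not part of RapidJSON. The sketch assumes the standard Unicode range `0x0` to `0x10FFFF` and treats surrogate code points (`0xD800` to `0xDFFF`) as invalid, returning 0 for them.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only (not RapidJSON API).
// Each returns the number of code units needed to encode one code point,
// or 0 if the value is not a valid Unicode scalar value.

// Valid scalar values: 0x0..0x10FFFF, excluding the surrogate range.
constexpr bool IsScalarValue(std::uint32_t cp) {
    return cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
}

// UTF-8: 1 to 4 code units of 8 bits each.
constexpr unsigned Utf8CodeUnits(std::uint32_t cp) {
    return !IsScalarValue(cp) ? 0
         : cp < 0x80          ? 1
         : cp < 0x800         ? 2
         : cp < 0x10000       ? 3
                              : 4;
}

// UTF-16: 1 code unit inside the Basic Multilingual Plane,
// otherwise 2 (a surrogate pair), i.e. 2 or 4 bytes.
constexpr unsigned Utf16CodeUnits(std::uint32_t cp) {
    return !IsScalarValue(cp) ? 0 : cp < 0x10000 ? 1 : 2;
}

// UTF-32: always a single 32-bit code unit (4 bytes).
constexpr unsigned Utf32CodeUnits(std::uint32_t cp) {
    return IsScalarValue(cp) ? 1 : 0;
}
```

For example, U+20AC (the euro sign) needs 3 code units in UTF-8 but only 1 in UTF-16 and UTF-32, while U+1F600 needs 4, 2, and 1 respectively. This is why a `CharType` holds a code unit rather than a whole character.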