github.com/bestpractical/rt.git
author michel <michel@bestpractical.com> 2019-11-26 18:26:07 +0300
committer sunnavy <sunnavy@bestpractical.com> 2020-02-13 00:06:13 +0300
commit c59003c543e2db4a6e18d8bcc7764ce4b2af5447
tree 690784091fee57d6386c13f6038778be1e52d95a /README.MariaDB
parent 57eb2ba67d17a4462b9b8c19943a6045d6012a58
Explain utf8mb4 character set updates
Diffstat (limited to 'README.MariaDB')
-rw-r--r-- README.MariaDB 58
1 files changed, 58 insertions, 0 deletions
diff --git a/README.MariaDB b/README.MariaDB
new file mode 100644
index 0000000000..149f1ffa23
--- /dev/null
+++ b/README.MariaDB
@@ -0,0 +1,58 @@
+Starting with RT 5.0.0, the minimum supported MariaDB version is 10.2.5
+because this is the first version to provide full support for 4 byte
+utf8 characters in tables and indexes. Read on for details on this
+change.
+
+RT 5.0.0 now defaults MariaDB tables to utf8mb4, which is available in
+versions before 10.2.5. However, before MariaDB version 10.2.5, utf8mb4
+tables could not have indexes on VARCHAR(255) columns: the default size
+for index entries was 767 bytes, which is enough for 255 chars stored
+as at most 3 bytes each (the utf8 format), but not as 4 bytes (utf8mb4).
+10.2.5 sets the default index size to 3072 bytes for InnoDB tables, resolving
+that issue.
+
+https://mariadb.com/kb/en/changes-improvements-in-mariadb-102/
+https://mariadb.com/kb/en/mariadb-1025-changelog/ (search for utf8)
+
+In MariaDB, RT uses the utf8mb4 character set to support all
+Unicode characters, including the ones that are encoded with 4 bytes in
+utf8 (some Kanji characters and a good number of emojis). The DB tables
+and RT are both set to this character set.
+
+If your MariaDB database is used only for RT, you can consider
+setting the default character set to utf8mb4. This will
+ensure that backups and other database access outside of RT have the
+correct character set.
+
+This is done by adding the following lines to the MariaDB configuration:
+
+[client-server]
+character-set-server = utf8mb4
+
+You can check the values your server is using by running this command:
+ mysqladmin variables | grep -i character_set
+
+Setting the default is particularly important for mysqldump, to keep
+backups from being silently corrupted.
+
+If the MariaDB database is shared with other applications and the default
+character set cannot be set to utf8mb4, the command to back up the
+database must set it explicitly:
+
+ ( mysqldump --default-character-set=utf8mb4 rt5 --tables sessions --no-data --single-transaction; \
+ mysqldump --default-character-set=utf8mb4 rt5 --ignore-table rt5.sessions --single-transaction ) \
+ | gzip > rt-`date +%Y%m%d`.sql.gz
+
+Restoring a backup is done the usual way. Since the character set for
+all tables is set to utf8mb4, there is no further need to tell MariaDB
+about it:
+
+ gunzip -c rt-20191125.sql.gz | mysql -uroot -p rt5
+
+These character set updates now allow RT on MariaDB to accept and store 4-byte
+characters like emojis. However, searches can still be inconsistent. You may be
+able to get different or better results by experimenting with different collation
+settings. For more information:
+
+https://stackoverflow.com/a/41148052
+https://mariadb.com/kb/en/character-sets/
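
The index-size reasoning in the README text can be checked with simple
arithmetic. The sketch below is illustrative only and is not part of the
commit: the two limits (767 and 3072 bytes) are the figures quoted in the
README, not values queried from a server, and the fits helper is a
hypothetical name introduced here.

```shell
#!/bin/sh
# Worst-case bytes needed to index a full VARCHAR(255) column.
chars=255
utf8_bytes=$((chars * 3))     # utf8: at most 3 bytes per character
utf8mb4_bytes=$((chars * 4))  # utf8mb4: at most 4 bytes per character

# Index entry limits as quoted in the README (assumed, not read from MariaDB).
old_limit=767    # default before 10.2.5
new_limit=3072   # InnoDB default from 10.2.5

# fits BYTES LIMIT: prints "fits" when BYTES is within LIMIT, else "too large".
fits() {
    if [ "$1" -le "$2" ]; then
        echo "fits"
    else
        echo "too large"
    fi
}

echo "utf8    in $old_limit: $utf8_bytes bytes, $(fits "$utf8_bytes" "$old_limit")"
echo "utf8mb4 in $old_limit: $utf8mb4_bytes bytes, $(fits "$utf8mb4_bytes" "$old_limit")"
echo "utf8mb4 in $new_limit: $utf8mb4_bytes bytes, $(fits "$utf8mb4_bytes" "$new_limit")"
```

With these figures, utf8 needs 765 bytes and fits the old limit, while
utf8mb4 needs 1020 bytes and only fits once the limit is 3072.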