How a single whitespace can triple your cost

With SMS messages, there’s more than meets the eye. Your message doesn’t pass through the carrier as plain text; before it is sent, it is encoded using one of several character-encoding standards.

 

The most widely used encoding is the GSM 7-bit alphabet (GSM-7), named after the Global System for Mobile Communications. It covers the basic Latin letters, digits and a number of special characters. An SMS sent using the GSM alphabet can contain up to 160 characters. Then there’s the Unicode UCS-2 encoding, which goes beyond the GSM character set to support a wide range of languages and technical symbols. For instance, if you’d like to send an emoji, you can, but the message will be sent as UCS-2. If your SMS contains even a single character that isn’t in the GSM alphabet, the whole message is encoded as UCS-2 and the limit drops to 70 characters per message.
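To make the rule concrete, here is a minimal Python sketch of how a message could be classified. The names `detect_encoding` and `non_gsm_characters` are hypothetical helpers made up for this illustration, and the alphabet is typed out from the GSM 7-bit default alphabet and its extension table; a real gateway may also support other character tables, so treat this as an approximation rather than how any particular provider does it.

```python
import unicodedata

# GSM 7-bit default alphabet plus its extension table (assumed complete for
# this sketch; national language shift tables are deliberately ignored).
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENDED = "^{}\\[~]|€\f"
GSM7_ALPHABET = set(GSM7_BASIC + GSM7_EXTENDED)


def detect_encoding(text: str) -> str:
    """Return "GSM-7" if every character is in the GSM alphabet, else "UCS-2"."""
    return "GSM-7" if all(ch in GSM7_ALPHABET for ch in text) else "UCS-2"


def non_gsm_characters(text: str) -> list:
    """List (position, code point, Unicode name) for each character outside GSM-7."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(text)
        if ch not in GSM7_ALPHABET
    ]


print(detect_encoding("It's 5pm, see you soon"))       # GSM-7 (straight apostrophe)
print(detect_encoding("It\u2019s 5pm, see you soon"))  # UCS-2 (curly apostrophe)
print(non_gsm_characters("It\u2019s 5pm"))
```

The second helper is handy when a message unexpectedly comes out as UCS-2: it points at the exact position and name of the character responsible.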

 

Now, why am I telling you all of this? Take a look at this table:

U+2018 ‘
U+2019 ’
U+02BB ʻ
U+02C8 ˈ
U+02BC ʼ
U+02BD ʽ
U+02B9 ʹ
U+201B ‛
U+FF07 ＇
U+00B4 ´
U+02CA ˊ
U+0060 `
U+02CB ˋ
U+275B ❛
U+275C ❜
U+0313 ̓
U+0314 ̔

 

The table above lists most of the apostrophe-like characters in the UCS-2 character set. Wait, it gets better. The table below shows some of the different whitespace characters that exist in the UCS-2 character set.

 

U+2004 whitespace: three-per-em space
U+2005 whitespace: four-per-em space
U+2006 whitespace: six-per-em space
U+2007 whitespace: figure space
U+2008 whitespace: punctuation space
U+2009 whitespace: thin space
U+200A whitespace: hair space
U+202F whitespace: narrow no-break space
U+205F whitespace: medium mathematical space

 

Often these characters, whether an apostrophe (❛) or a Unicode whitespace, slip in unnoticed. They are hard to spot, and they can cause your message to split. Remember how a message encoded in UCS-2 has a maximum of 70 characters? Well, if you were to send a message with 150 GSM-7 compatible characters and a single UCS-2 character, the whole message would be encoded as UCS-2. A multi-part UCS-2 message carries 67 characters per part, because a few are reserved for the header that stitches the parts back together, so those 151 characters are split into 3 messages. That means 3 outgoing messages against your account instead of 1, tripling the cost.
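The arithmetic can be checked with a short Python sketch. `count_segments` is a hypothetical helper written for this post, and it relies on the commonly used limits: 160 characters for a single GSM-7 message and 153 per part once it is split, 70 for a single UCS-2 message and 67 per part once it is split. It is a simplification; real gateways also avoid splitting an escape sequence or an emoji across two parts, so counts can differ slightly right at a boundary.

```python
import math

# GSM 7-bit default alphabet, typed out for this sketch (escape code omitted).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension characters are sent as escape + character, so each uses two slots.
GSM7_EXTENDED = set("^{}\\[~]|€\f")


def count_segments(text: str) -> int:
    """Estimate how many SMS parts a message will be billed as."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENDED for ch in text):
        units = sum(2 if ch in GSM7_EXTENDED else 1 for ch in text)
        single_limit, part_limit = 160, 153
    else:
        # UCS-2 is counted in 16-bit units; characters outside the Basic
        # Multilingual Plane (most emoji) take two units each.
        units = len(text.encode("utf-16-be")) // 2
        single_limit, part_limit = 70, 67
    if units <= single_limit:
        return 1
    return math.ceil(units / part_limit)


clean = "a" * 150
tainted = clean + "\u2009"      # the same text plus one thin space
print(count_segments(clean))    # 1
print(count_segments(tainted))  # 3
```

One invisible thin space is the only difference between the two messages, yet the second is billed as three parts.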

 

Enter Character Converter. This feature replaces ‘unusual’ characters with the closest equivalents supported by GSM-7. The replacements won’t be exact, but they will more than communicate the original intent. Best part? Once it’s turned on, it automatically detects and converts these characters for you, so that as many characters as possible fit into each SMS.
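If you want to see the general idea in code, the Python sketch below maps the apostrophes and whitespaces from the two tables in this post onto a plain apostrophe and a plain space. This is only an illustration of the technique, not the actual Character Converter implementation: the function name `to_gsm_friendly` and the choice of replacements are assumptions made for this example.

```python
# Code points taken from the two tables above.
APOSTROPHE_LIKE = (
    "\u2018\u2019\u02bb\u02c8\u02bc\u02bd\u02b9\u201b\uff07"
    "\u00b4\u02ca\u0060\u02cb\u275b\u275c\u0313\u0314"
)
SPACE_LIKE = "\u2004\u2005\u2006\u2007\u2008\u2009\u200a\u202f\u205f"

# Translation table for str.translate: code point -> GSM-7 replacement.
# Mapping everything to ' or a plain space is an assumption of this sketch.
REPLACEMENTS = {ord(ch): "'" for ch in APOSTROPHE_LIKE}
REPLACEMENTS.update({ord(ch): " " for ch in SPACE_LIKE})


def to_gsm_friendly(text: str) -> str:
    """Swap look-alike apostrophes and spaces for their GSM-7 equivalents."""
    return text.translate(REPLACEMENTS)


message = "It\u2019s\u2009here"
print(to_gsm_friendly(message))  # It's here
```

Characters that are not in the mapping pass through unchanged, so a message containing an emoji, for example, would still be sent as UCS-2.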

 

Don’t be fooled by innocent-looking UCS-2 characters. Be wise and aware of where your money is being spent.

