Character Sets and Encodings

What character sets do you use?

Almost all mobile handsets support two character sets for SMS messages: the GSM 03.38 character set and Unicode UCS-2 (with code points appropriate to the locale).

To cover the GSM 03.38 character set, you can submit messages either in Modified Latin-9 or directly in GSM 03.38.

For more information on submitting messages to the SMS Gateway via HTTP, please look here.
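
To decide which of the two character sets a given message needs, one approach is to test every character against the GSM 03.38 repertoire. The sketch below is only an illustration: the names `GSM_REPERTOIRE`, `non_gsm_chars` and `choose_charset` are hypothetical and not part of the SMS Gateway, and the repertoire is transcribed from the GSM 03.38 tables on this page.

```python
# GSM 03.38 basic characters (ESC is left out: it only introduces an
# escape sequence) plus the ten escaped characters.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅå"
    "Δ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./"
    "0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNO"
    "PQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmno"
    "pqrstuvwxyzäöñüà"
)
GSM_ESCAPED = "€\f[\\]^{|}~"
GSM_REPERTOIRE = frozenset(GSM_BASIC + GSM_ESCAPED)


def non_gsm_chars(text: str) -> list:
    """Return the characters GSM 03.38 cannot represent, in order of first use."""
    found = []
    for ch in text:
        if ch not in GSM_REPERTOIRE and ch not in found:
            found.append(ch)
    return found


def choose_charset(text: str) -> str:
    """Pick GSM 03.38 when possible, otherwise fall back to UCS-2."""
    return "UCS-2" if non_gsm_chars(text) else "GSM 03.38"


print(choose_charset("Hello €5"))   # GSM 03.38
print(choose_charset("Привет"))     # UCS-2
print(non_gsm_chars("naïve café"))  # ['ï']
```

Reporting the offending characters, rather than returning only a yes/no answer, makes it easier to see why a message was switched to UCS-2.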

What is the difference between UCS-2 and UTF-16?

UCS-2 and UTF-16 are virtually identical, but every UCS-2 character takes exactly 16 bits, whereas UTF-16 encodes code points above U+FFFF as a pair of 16-bit units (a surrogate pair). When parsing a UCS-2 message in your code, it is therefore safe to step forwards or backwards by 16 bits per character.
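
The difference can be illustrated with a short sketch. The helper names `fits_ucs2` and `encode_ucs2` are hypothetical, and big-endian byte order is an assumption made for the example. Python has no dedicated UCS-2 codec, so the sketch uses `utf-16-be`, which produces the same bytes for any text that stays within U+0000 to U+FFFF.

```python
def fits_ucs2(text: str) -> bool:
    """True if every code point fits in a single 16-bit unit (<= U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def encode_ucs2(text: str) -> bytes:
    """Encode text as big-endian UCS-2 (assumed byte order), 2 bytes per character."""
    if not fits_ucs2(text):
        raise ValueError("text contains code points above U+FFFF")
    # For code points up to U+FFFF, UTF-16 and UCS-2 are byte-for-byte identical.
    return text.encode("utf-16-be")


data = encode_ucs2("Ωmega")
print(len(data))  # 10: five characters, two bytes each

# Fixed-width parsing: stepping two bytes at a time visits one character per step.
chars = [data[i:i + 2].decode("utf-16-be") for i in range(0, len(data), 2)]
print(chars)  # ['Ω', 'm', 'e', 'g', 'a']

# An emoji lies above U+FFFF, so UTF-16 needs a surrogate pair (4 bytes).
print(fits_ucs2("😀"), len("😀".encode("utf-16-be")))  # False 4
```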

What is the GSM 03.38 Character Set?

The GSM 03.38 Character Set
     0×   1×   2×   3×   4×   5×   6×   7×
×0   @    Δ    SP   0    ¡    P    ¿    p
×1   £    _    !    1    A    Q    a    q
×2   $    Φ    "    2    B    R    b    r
×3   ¥    Γ    #    3    C    S    c    s
×4   è    Λ    ¤    4    D    T    d    t
×5   é    Ω    %    5    E    U    e    u
×6   ù    Π    &    6    F    V    f    v
×7   ì    Ψ    '    7    G    W    g    w
×8   ò    Σ    (    8    H    X    h    x
×9   Ç    Θ    )    9    I    Y    i    y
×A   LF   Ξ    *    :    J    Z    j    z
×B   Ø    ESC  +    ;    K    Ä    k    ä
×C   ø    Æ    ,    <    L    Ö    l    ö
×D   CR   æ    -    =    M    Ñ    m    ñ
×E   Å    ß    .    >    N    Ü    n    ü
×F   å    É    /    ?    O    §    o    à

GSM 03.38 Escaped Characters
Character  Escape Sequence  Hex
€          ESC e            1B 65
FF         ESC LF           1B 0A
[          ESC <            1B 3C
\          ESC /            1B 2F
]          ESC >            1B 3E
^          ESC Λ            1B 14
{          ESC (            1B 28
|          ESC ¡            1B 40
}          ESC )            1B 29
~          ESC =            1B 3D
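
The two tables above can be turned into a small encoder. The following is a minimal sketch, not the gateway's implementation: `gsm_encode` is a hypothetical name, and the output holds one 7-bit code per byte (unpacked) rather than the packed septets used on the air interface.

```python
# Basic table in code order 0x00-0x7F, read column by column from the table above.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅå"
    "Δ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./"
    "0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNO"
    "PQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmno"
    "pqrstuvwxyzäöñüà"
)
ESC = 0x1B

# Character -> code, skipping ESC itself (it is not a printable character).
BASIC = {ch: code for code, ch in enumerate(GSM_BASIC) if code != ESC}

# Escaped characters: each is sent as ESC followed by the code listed here.
EXTENDED = {
    "€": 0x65, "\f": 0x0A, "[": 0x3C, "\\": 0x2F, "]": 0x3E,
    "^": 0x14, "{": 0x28, "|": 0x40, "}": 0x29, "~": 0x3D,
}


def gsm_encode(text: str) -> bytes:
    """Encode text as unpacked GSM 03.38 codes, one code per byte."""
    out = bytearray()
    for ch in text:
        if ch in EXTENDED:
            out += bytes([ESC, EXTENDED[ch]])
        elif ch in BASIC:
            out.append(BASIC[ch])
        else:
            raise ValueError(f"{ch!r} is not in the GSM 03.38 character set")
    return bytes(out)


print(gsm_encode("Hi").hex(" "))   # 48 69
print(gsm_encode("[x]").hex(" "))  # 1b 3c 78 1b 3e
```

Note that every escaped character occupies two codes, so a message containing them is longer than its visible character count suggests.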

What is the Modified Latin-9 Character Set?

This character set is based heavily on ISO-8859-15 (Latin-9). However, in order to fully cover the GSM character set, it adds the Greek letters used by GSM 03.38 between 0x80 and 0x8A, and it places ¤ at 0xA8, where Latin-9 has š.

The characters not present in the GSM character set are shown on a grey background.


Modified Latin-9
     0×   1×   2×   3×   4×   5×   6×   7×   8×   9×   A×   B×   C×   D×   E×   F×
×0             SP   0    @    P    `    p    Δ         NBSP °    À    Ð    à    ð
×1             !    1    A    Q    a    q              ¡    ±    Á    Ñ    á    ñ
×2             "    2    B    R    b    r    Φ         ¢    ²    Â    Ò    â    ò
×3             #    3    C    S    c    s    Γ         £    ³    Ã    Ó    ã    ó
×4             $    4    D    T    d    t    Λ         €    Ž    Ä    Ô    ä    ô
×5             %    5    E    U    e    u    Ω         ¥    µ    Å    Õ    å    õ
×6             &    6    F    V    f    v    Π         Š    ¶    Æ    Ö    æ    ö
×7             '    7    G    W    g    w    Ψ         §    ·    Ç    ×    ç    ÷
×8             (    8    H    X    h    x    Σ         ¤    ž    È    Ø    è    ø
×9             )    9    I    Y    i    y    Θ         ©    ¹    É    Ù    é    ù
×A   LF        *    :    J    Z    j    z    Ξ         ª    º    Ê    Ú    ê    ú
×B             +    ;    K    [    k    {              «    »    Ë    Û    ë    û
×C   FF        ,    <    L    \    l    |              ¬    Œ    Ì    Ü    ì    ü
×D   CR        -    =    M    ]    m    }              SHY  œ    Í    Ý    í    ý
×E             .    >    N    ^    n    ~              ®    Ÿ    Î    Þ    î    þ
×F             /    ?    O    _    o                   ¯    ¿    Ï    ß    ï    ÿ
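
Because Modified Latin-9 differs from ISO-8859-15 in only a few positions, a converter can be sketched as a thin layer over a standard Latin-9 codec. The code below is an illustration under that assumption: the function names are hypothetical, the override positions are read from the table above, and Python's built-in `iso8859_15` codec handles every other byte.

```python
# Positions where Modified Latin-9 departs from ISO-8859-15 (from the table above).
OVERRIDES = {
    0x80: "Δ", 0x82: "Φ", 0x83: "Γ", 0x84: "Λ", 0x85: "Ω",
    0x86: "Π", 0x87: "Ψ", 0x88: "Σ", 0x89: "Θ", 0x8A: "Ξ",
    0xA8: "¤",
}
REVERSE = {ch: byte for byte, ch in OVERRIDES.items()}


def decode_modified_latin9(data: bytes) -> str:
    """Decode bytes, applying the overrides before falling back to Latin-9."""
    return "".join(
        OVERRIDES.get(b) or bytes([b]).decode("iso8859_15") for b in data
    )


def encode_modified_latin9(text: str) -> bytes:
    """Encode text; raises ValueError for characters the set cannot hold."""
    out = bytearray()
    for ch in text:
        if ch in REVERSE:
            out.append(REVERSE[ch])
            continue
        # UnicodeEncodeError (a ValueError subclass) if not in Latin-9 at all.
        byte = ch.encode("iso8859_15")[0]
        if byte in OVERRIDES:
            # The Latin-9 character at this position was replaced, e.g. š at 0xA8.
            raise ValueError(f"{ch!r} is not in Modified Latin-9")
        out.append(byte)
    return bytes(out)


print(decode_modified_latin9(b"\x80\xa4\xa8"))  # Δ€¤
print(encode_modified_latin9("Ω = 5€").hex(" "))  # 85 20 3d 20 35 a4
```

Checking the fallback byte against the override positions prevents a Latin-9 character such as š from silently turning into ¤ on the receiving side.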