Control Your Characters, For Better Understanding

When we open an XML file, we see human-readable text. Thanks to Unicode, a character set that represents nearly every written character, XML can contain virtually anything. Special characters and control characters are the exception, however. Don’t enter them directly into XML. Overlooking them can make your XML unusable or even crash your application.

Special characters, such as the less-than and greater-than signs (‘<’ and ‘>’), need to be escaped. How do we escape a character? We use an entity.

Entities are easy to spot when we read an XML file. An entity always starts with an ampersand (&) and ends with a semicolon (;). For example, insert less-than and greater-than signs by entering “&lt;” and “&gt;” respectively. Because the ampersand itself starts an entity, insert a literal ampersand by entering “&amp;”. When your application reads the file, the parser converts each entity back into the character it stands for.
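
As a rough illustration of that round trip, here is a minimal Python sketch using the standard library's xml.sax.saxutils helpers. The sample string and the element name "expr" are made up for the example; your own tooling may escape text for you.

```python
from xml.sax.saxutils import escape, unescape
import xml.etree.ElementTree as ET

# Hypothetical text containing all three special characters.
raw = "if a < b && b > c"

# escape() replaces &, < and > with &amp;, &lt; and &gt;.
escaped = escape(raw)

# unescape() reverses the substitution.
restored = unescape(escaped)

# A parser does the same conversion when it reads the document.
# The element name "expr" is only an illustration.
parsed_text = ET.fromstring("<expr>" + escaped + "</expr>").text

print(escaped)
print(restored == raw, parsed_text == raw)
```

Escaping the ampersand first matters: if it were done last, the ampersands inside “&lt;” and “&gt;” would be escaped a second time. The library function takes care of that ordering.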

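When converting data from other kinds of files, watch out for control characters. Control characters were sometimes used to delimit data fields, but they can crash your application. Worse, control characters usually aren’t visible to the human eye. Once you find them, though, you can simply delete them. If a character is needed, delete it, then enter ‘&#number;’ in its place, replacing ‘number’ with the character’s decimal Unicode number (or enter ‘&#xhex;’ with the hexadecimal number). Be aware that XML 1.0 permits only three control characters (tab, line feed, and carriage return), even when written this way. XML 1.1 accepts the rest, except the null character, when they are written as references.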
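
One possible way to find and delete these characters before the data reaches XML is a small filter like the Python sketch below. The function name strip_control_chars and the sample record are hypothetical, and the pattern covers only the control characters below U+0020 that XML 1.0 rejects; it is a starting point, not a complete validator.

```python
import re
import xml.etree.ElementTree as ET

# Control characters below U+0020 that XML 1.0 does not allow:
# everything except tab (0x09), line feed (0x0A), carriage return (0x0D).
_ILLEGAL_CONTROLS = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F]")

def strip_control_chars(text: str) -> str:
    """Return text with the disallowed control characters removed."""
    return _ILLEGAL_CONTROLS.sub("", text)

# Hypothetical legacy record that uses the unit separator (0x1F) and
# record separator (0x1E) as field delimiters.
raw = "Smith\x1fJohn\x1e"

clean = strip_control_chars(raw)

# The cleaned text can now be placed in an element and parsed.
# (The name contains no <, > or &, so no escaping is needed here.)
doc = ET.fromstring("<name>" + clean + "</name>")
print(repr(clean), doc.text)
```

If the delimiters carry meaning, replacing them with a visible separator or with separate elements may be a better choice than deleting them outright.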
