ASCII and EBCDIC
What is a “Character Set”?
A character set is a collection of characters (letters, digits and symbols), each paired with an identifying number. Alongside the printable characters (letters and punctuation) are some unprintable yet important characters. Characters are used to form messages.
Characters are not fonts. A font only determines how a character is drawn; the character underneath, which carries the meaning, stays the same. When you change the font on a document, the A takes on a different shape, but the underlying character that identifies its meaning remains the same. You can even convert to Wingdings and the underlying character remains the same.
Imagine that I wrote this post using a character set that I created myself. If you then tried to read it without knowing what my character set was, you would see a bunch of garbage on the screen, much like when you go to a foreign-language web page without the correct fonts loaded. Even worse would be if we used the same characters but had different identifiers for them. Suppose an A is 001 in my set (it starts at A and moves on numerically), but in your character set you numbered the vowels after the consonants, so 001 to you is B. Now all of the letters will be wrong, and we get garbage.
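The mismatch described above can be sketched in a few lines of Python. Both character sets here are made up for illustration (`my_set` and `your_set` are hypothetical names, not real standards): mine numbers the letters alphabetically, yours numbers the consonants first and the vowels last.

```python
import string

letters = string.ascii_uppercase
vowels = "AEIOU"

# My (made-up) set: A=1, B=2, ... in alphabetical order.
my_set = {ch: number for number, ch in enumerate(letters, start=1)}

# Your (made-up) set: consonants first, then vowels, so 1 means B.
your_order = [ch for ch in letters if ch not in vowels] + list(vowels)
your_set = {number: ch for number, ch in enumerate(your_order, start=1)}


def encode(text):
    """Turn text into identifiers using my character set."""
    return [my_set[ch] for ch in text]


def decode(numbers):
    """Turn identifiers back into text using your character set."""
    return "".join(your_set[number] for number in numbers)


message = "CAB"
numbers = encode(message)
garbled = decode(numbers)
print(message, numbers, garbled)
```

The numbers travel between us unchanged; only the lookup table differs, and that is enough to turn the message into garbage.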
Fortunately, some people got together early on and created a standard for characters. The American Standards Association published the American Standard Code for Information Interchange, the character set we call ASCII. The Extended Binary Coded Decimal Interchange Code (EBCDIC) was created by IBM, but they use ASCII now as well.
What is ASCII
ASCII is the acronym for American Standard Code for Information Interchange, and is a collection of characters numbered from 0 to 127. These definitions cover the standard English letters, the digits and the common punctuation symbols. A number of other, unprintable, characters are also included. You use one or two of these each time you hit the “Enter” key on your keyboard. Depending on your operating system, this produces a “carriage return”, a “line feed”, or both.
A “carriage return” comes from a printer where the head would move back and forth along the roller. CR would tell the printer to move the head all the way to the left of its printing area. A “line feed” also comes from the printer's perspective. It tells the printer to roll the paper so that the head will be writing on the next line. These are both examples of unprintable characters. You can probably think of others. For a complete list of ASCII characters, you can check out this table in my toolbox.
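As a quick illustration, Python's built-in `ord` returns the identifying number of a character, which for these characters is also its ASCII code. The helper name `describe` and the sample lines are just for this sketch.

```python
def describe(ch):
    """Return the character's number in decimal and in hex."""
    code = ord(ch)
    return code, hex(code)


# A few printable characters plus carriage return and line feed.
for ch in ["A", "a", "0", "|", "\r", "\n"]:
    print(repr(ch), describe(ch))

# The same line of text ended the Windows way (CR + LF) and the Unix way (LF).
windows_line = "hello\r\n".encode("ascii")
unix_line = "hello\n".encode("ascii")
print(list(windows_line))
print(list(unix_line))
```

Carriage return is 13 and line feed is 10, so a line ended with both finishes with the byte pair 13, 10.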
What is EBCDIC
EBCDIC is the acronym for Extended Binary Coded Decimal Interchange Code. It was created by IBM back in the 1960s. IBM now uses ASCII on most of its systems, but there are legacies that are still with us. IBM mainframes, old terminals like the IBM 3270 and some legacy communications equipment still expect messages using the EBCDIC character set.
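To see how different the two numbering schemes are, here is a minimal sketch using Python's built-in `cp037` codec. Code page 037 is only one of several EBCDIC variants, and picking it is an assumption for this example; a given legacy system may use another one.

```python
text = "HELLO"

# The same five letters under the two character sets.
ascii_bytes = text.encode("ascii")
ebcdic_bytes = text.encode("cp037")  # assumed EBCDIC variant

print(ascii_bytes.hex())
print(ebcdic_bytes.hex())

# Reading the EBCDIC bytes with the wrong table gives garbage...
misread = ebcdic_bytes.decode("latin-1")
print(misread)

# ...while reading them with the right table gives the message back.
print(ebcdic_bytes.decode("cp037"))
```

None of the letters share a number between the two sets, which is why a receiver has to know which one the sender used.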
What is the big deal
As I said, some systems still want to use some of the characters in the EBCDIC system. Even fancy new systems producing XML will sometimes fall into this trap and cause problems. The one that I have run into is the | character, called “pipe”. ASCII and EBCDIC use different character IDs for it: the ASCII pipe is 124 (hex 7C), while in the common EBCDIC code page 037 it is 79 (hex 4F). I have seen e-commerce systems that use ASCII for everything else throw in an EBCDIC pipe as a control character. When this happens, other systems will choke on it.
When you find yourself getting an invalid character message but the characters look fine, remember that there may be some twists in the underlying character set. If you can, manually replace the character with the character that it looks like (in my case, the EBCDIC | with an ASCII |) and see if the parser likes the file now. If it does, you have encountered the same character set problem that I have. This can be a difficult problem to solve if you have never encountered it.
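When the offending character is hard to spot by eye, a small scan of the raw bytes can point at it. This is only a rough sketch: the function name `find_suspect_bytes` and the sample data are made up, and it flags nothing more than bytes outside printable ASCII (tab, CR and LF are allowed). A stray byte that happens to land on a printable ASCII value would slip through.

```python
# Control characters we expect to see in an ordinary text file.
ALLOWED_CONTROLS = {0x09, 0x0A, 0x0D}  # tab, line feed, carriage return


def find_suspect_bytes(data):
    """Return (offset, hex value) for every byte outside printable ASCII."""
    suspects = []
    for offset, value in enumerate(data):
        printable = 0x20 <= value <= 0x7E
        if not printable and value not in ALLOWED_CONTROLS:
            suspects.append((offset, hex(value)))
    return suspects


# Made-up sample: the second record uses a non-ASCII byte as its separator.
sample = b"id|name\r\n1\xa6Bob\r\n"
print(find_suspect_bytes(sample))
```

For a real file, you could read it with `open(path, "rb").read()` and pass the result in; the reported offsets tell you where to look in a hex editor.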
If the character that is causing problems is not a pipe, you may want to look at IBM’s ASCII to EBCDIC conversion table. This problem can be difficult to explain to others who have never encountered it, so quoting the ASCII and EBCDIC identifiers for the character can make it clear what we mean in email and documentation when we are trying to correct the issue.
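If you do not have a conversion table handy, a short helper can produce the identifiers to quote. This sketch assumes the EBCDIC side uses code page 037 (Python's `cp037` codec); other EBCDIC code pages put some punctuation at different positions, so check which one the other system really uses. The function name `identifiers` is made up for this example.

```python
def identifiers(ch, ebcdic_codec="cp037"):
    """Describe one character by its ASCII and EBCDIC identifiers."""
    ascii_id = ch.encode("ascii")[0]
    ebcdic_id = ch.encode(ebcdic_codec)[0]  # assumed code page
    return f"{ch!r}: ASCII 0x{ascii_id:02X}, EBCDIC ({ebcdic_codec}) 0x{ebcdic_id:02X}"


for ch in "|A0":
    print(identifiers(ch))
```

A line like the one printed for the pipe can be pasted straight into an email or a ticket.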