ASN1 is a data interoperability format that is widely used in directory, security and network management systems. Data is stored in triplets of TLV – type, length, value (in BER, DER encoding rules). TLV allows a format that is efficient, recursive and self-describing.
The type system is the interesting aspect. A “type” is a descriptor for the data, that the TLV triplet holds. The “type” is stored as a sequence of one or more bytes. This sequence can be as small as a single byte or be as large as needed (unlimited length). In case of a single byte “type”, the bits 7,8 represent the class of the type (4 classes exist), bit 6 represents whether type is single atomic data element or nested, and bits 5-1 encode the tag of the type. This single byte type can hold tags from 0 to 30. If the type is outside this range 0-30, the 5 to 1 bits are set to 1, and the actual tag starts in the following bytes of the now multi-byte type. In case of multi-byte “types”, the most significant bit of each byte must be 1, except for the last byte, which must be 0.
There are atomic types and component types.
Atomic types include OBJECT IDENTIFIER types and various strings (bits, ascii, octet), integers, null .
Component types include the ordered SEQUENCE and the unordered SET, both of which types can contain one or more occurrences of different types of data. SEQUENCE OF and SET OF are component types which contain zero or more occurrences of the same types of data.
There is potential for ambiguity as to whether an Object Identifier (OID) in the tagged notation is described as a multi-byte type since the OID is itself multi-byte. It is not, it is described as a single byte ASN1 type with tag = 06 as described in the tag table here and clarified by this Microsoft example of an OID encoding. So the OID value sits in the value field of the TLV triplet, not in the type.
An example of the encoding for RSA private key in PKCS#1 is here.
oid-info.com allows lookup of Object Identifiers. Here is a tree display for RSA private key which has OID 1.2.840.113518.104.22.168 – http://oid-info.com/cgi-bin/display?oid=1.2.840.113522.214.171.124+&submit2=Tree+display
What are examples of multi-byte ASN1 types ? The EMV format used in payments and smartcards use two-byte types.
The “abstract” in the name came from a contrast to “transfer syntax notation” which is the on-wire format. The “abstract syntax” maps to “transfer syntax” via encoding rules.
For comparison, consider the XDR scheme used in SunRPC. Here the types (metadata) are not included within the protocol as tags, but defined externally in a .x file which is an input to an rpcgen compiler. Protobuf and capnproto also use external medata in a .proto file. ASN1 now supports Packed Encoding Rules (PER) which remove the tag information for greater compactness and efficiency. Finally, while ASN1 continues to be used for highly structured information, the rapid growth of JSON/REST protocols in the identity space has been interesting.