Monday, 25 February 2008

GUIDs, their representation and Base64

Everyone knows what a GUID is: a 128-bit number that is supposed to be globally unique and is used as an identifier for different types of objects, most notably COM objects.

Usually they are formatted as a hex string like {F7F052A2-8BC7-4b84-8330-228BCA8A6E19}. A tool for creating GUIDs is guidgen.exe; there are also the System.Guid class and the CoCreateGuid API.
Sometimes GUIDs have to be formatted more compactly. For instance, the IFC specification requires GUIDs to be formatted as Base64, which makes them strings of 22 characters.
Sadly, this Base64 encoding is not compatible with the .NET implementation, which makes it hard to convert a System.Guid object to the required format.
IFC Base64 uses the characters 0-9A-Za-z_$, whereas the .NET implementation (Convert.ToBase64String) uses the standard table A-Za-z0-9+/ and pads its output with '='.
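
To make the mismatch concrete, here is a minimal sketch of what the built-in encoder produces for a GUID. The class StandardBase64Demo and its members are names made up for this illustration. A GUID has 16 bytes (128 bits) and every Base64 character carries 6 bits, so 22 characters are enough; the standard encoder additionally pads the result to 24 characters.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class StandardBase64Demo
{
    // Standard Base64 table, as used by Convert.ToBase64String.
    public const string StandardAlphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // The 0-9A-Za-z_$ table: same size, different order and symbols.
    public const string IfcAlphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$";

    // Number of 6-bit characters needed to hold byteCount bytes.
    public static int CharsNeeded(int byteCount)
    {
        return (byteCount * 8 + 5) / 6; // integer ceiling of bits / 6
    }

    // Encodes the 16 GUID bytes with the built-in standard encoder.
    public static string Encode(Guid guid)
    {
        return Convert.ToBase64String(guid.ToByteArray());
    }
}
```

CharsNeeded(16) gives 22, while Encode(Guid.Empty) returns 22 'A' characters followed by "==". Both the alphabet and the padding therefore differ from what is needed here.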
There is some sample C code on the IFC wiki site, so I went for the easy solution: build a DLL and call it from .NET.
The problems with this approach kept coming one after another. Mostly, the DLL was not always found in web scenarios because of deployment issues, but there are also other possible hurdles: 64-bit migration, deployment on Mono and so on.
So I needed a pure managed implementation of Base64 encoding for GUIDs.
Some googling brought me to sample code that I adjusted to the spec, and here is the solution:


using System;

public class Managed {

    // Encodes an existing GUID as a 22-character string.
    // Note: Guid.ToByteArray() returns the first three fields in little-endian byte order.
    public static string GetId(Guid guid) {
        return ToBase64String(guid.ToByteArray());
    }

    // Creates a new GUID and returns its 22-character encoding.
    public static string GetId() {
        return ToBase64String(Guid.NewGuid().ToByteArray());
    }

    // Encoding table in the order 0-9, A-Z, a-z, '_', '$'.
    public static readonly char[] base64Chars = new char[]
    { '0','1','2','3','4','5','6','7','8','9',
      'A','B','C','D','E','F','G','H','I','J','K','L','M',
      'N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
      'a','b','c','d','e','f','g','h','i','j','k','l','m',
      'n','o','p','q','r','s','t','u','v','w','x','y','z',
      '_','$' };

    public static string ToBase64String(byte[] value) {
        int numBlocks;
        int padBytes;

        // Work in blocks of 3 bytes; an incomplete last block is zero-padded.
        if ((value.Length % 3) == 0) {
            numBlocks = value.Length / 3;
            padBytes = 0;
        } else {
            numBlocks = 1 + (value.Length / 3);
            padBytes = 3 - (value.Length % 3);
        }
        if (padBytes < 0 || padBytes > 3)
            throw new ApplicationException("Fatal logic error in padding code");

        // Copy the input into a zero-initialized buffer of whole blocks.
        byte[] newValue = new byte[numBlocks * 3];
        for (int i = 0; i < value.Length; ++i)
            newValue[i] = value[i];

        byte[] resultBytes = new byte[numBlocks * 4];
        char[] resultChars = new char[numBlocks * 4];

        // Split every 3 bytes (24 bits) into four 6-bit values, most significant bits first.
        for (int i = 0; i < numBlocks; i++) {
            resultBytes[i * 4 + 0] =
                (byte)((newValue[i * 3 + 0] & 0xFC) >> 2);
            resultBytes[i * 4 + 1] =
                (byte)((newValue[i * 3 + 0] & 0x03) << 4 |
                       (newValue[i * 3 + 1] & 0xF0) >> 4);
            resultBytes[i * 4 + 2] =
                (byte)((newValue[i * 3 + 1] & 0x0F) << 2 |
                       (newValue[i * 3 + 2] & 0xC0) >> 6);
            resultBytes[i * 4 + 3] =
                (byte)((newValue[i * 3 + 2] & 0x3F));
        }

        // Look up each 6-bit value in the encoding table.
        for (int i = 0; i < numBlocks * 4; ++i)
            resultChars[i] = base64Chars[resultBytes[i]];

        // This assumes a 16-byte GUID: 16 bytes fill 6 blocks (24 characters),
        // and the last two characters come only from the zero padding.
        string s = new string(resultChars);
        return s.Substring(0, 22);
    }
}
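
As a cross-check, the same 22 characters can also be derived from the built-in encoder, because both approaches cut the bytes into 6-bit groups starting at the most significant bit and differ only in the lookup table and the padding. The following is a minimal sketch under that assumption and is not part of the original code; the class IfcStyleEncoder and its Encode method are hypothetical names. It encodes with Convert.ToBase64String, drops the '=' padding and swaps every character for the one at the same index in the 0-9A-Za-z_$ table.

```csharp
using System;
using System.Text;

// Hypothetical alternative, for illustration only.
public static class IfcStyleEncoder
{
    // Standard Base64 table, as used by Convert.ToBase64String.
    private const string StandardAlphabet =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

    // Target table in the order 0-9, A-Z, a-z, '_', '$'.
    private const string IfcAlphabet =
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz_$";

    public static string Encode(Guid guid)
    {
        // 16 bytes give 22 data characters followed by "==", which is trimmed.
        string standard = Convert.ToBase64String(guid.ToByteArray()).TrimEnd('=');

        StringBuilder result = new StringBuilder(standard.Length);
        foreach (char c in standard)
        {
            // Same 6-bit value, looked up in the other table.
            result.Append(IfcAlphabet[StandardAlphabet.IndexOf(c)]);
        }
        return result.ToString();
    }
}
```

For a 16-byte GUID this should give the same string as ToBase64String above, so it can serve as a second opinion in a unit test. For example, Guid.Empty becomes 22 '0' characters.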


So if you have to encode something as Base64 or deal with GUIDs in one way or another, this may be helpful to you.
Original code by James McCaffrey
