How to verify if a UUID follows the IETF specification

May 20, 2014

I was playing around with creating UUIDs to add uniqueness to a project, so I started reading the spec for creating one in RFC 4122. As a side note, the spec sheets there are really interesting and detailed enough for almost anyone to understand. I don’t know why I spent so much time on this; I was just having fun and the time seemed to pass by.

Understanding the spec

The one I was after was a type 4 (version 4) UUID, which is built from randomly generated hex values and is described in section 4.4. Luckily, of everything explained in the RFC, this was the smallest algorithm to implement:


4.4.  Algorithms for Creating a UUID from Truly Random or
      Pseudo-Random Numbers

   The version 4 UUID is meant for generating UUIDs from truly-random or
   pseudo-random numbers.

   The algorithm is as follows:

   o  Set the two most significant bits (bits 6 and 7) of the
      clock_seq_hi_and_reserved to zero and one, respectively.

   o  Set the four most significant bits (bits 12 through 15) of the
      time_hi_and_version field to the 4-bit version number from
      Section 4.1.3.

   o  Set all the other bits to randomly (or pseudo-randomly) chosen
      values.


That may be easier to understand if you read the RFC from the top, but what it essentially says is that you can generate random hex values for 30 of the 32 digits. The 13th digit must be 4, which is a way of identifying what type of UUID it is (i.e. type 4), and the 17th digit must be 8, 9, a or b, because its two most significant bits are fixed to 1 and 0. Here’s an example:


a0424604-03c6-4468-963b-002e5fbe2812
              ^    ^
              |    |
           always 4|
                   |
               either 8,9,a,b
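
To make the three steps from section 4.4 concrete, here is a minimal Python sketch. It is only an illustration, not the generator from my GitHub project: the function name `make_uuid4` is made up for this example, and `secrets.token_bytes` is assumed to be a good enough source of randomness.

```python
import secrets


def make_uuid4() -> str:
    """Build a type 4 UUID string by following RFC 4122 section 4.4."""
    raw = bytearray(secrets.token_bytes(16))  # 128 random bits

    # time_hi_and_version: the top four bits of byte 6 become 0100,
    # so the 13th hex digit is always 4.
    raw[6] = (raw[6] & 0x0F) | 0x40

    # clock_seq_hi_and_reserved: the top two bits of byte 8 become 10,
    # so the 17th hex digit is always 8, 9, a or b.
    raw[8] = (raw[8] & 0x3F) | 0x80

    digits = raw.hex()  # 32 lower-case hex digits
    return "-".join(
        (digits[0:8], digits[8:12], digits[12:16], digits[16:20], digits[20:32])
    )


if __name__ == "__main__":
    print(make_uuid4())
```

The masks do the real work here: `& 0x0F` and `& 0x3F` keep the random low bits, and `| 0x40` and `| 0x80` stamp in the fixed version and variant bits.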


The code for writing this was fairly simple and can be found on GitHub, but I wanted a way to verify that each UUID was created correctly.

Verifying the correct form

After asking around and reading a few Stack Overflow questions, it seemed easiest to solve this with a regular expression. I came up with one that tests for all four formats a UUID could be written in. They look like this:


Lower case without hyphens: 7185f40e722c4cfa8de5daedf048ea12
Upper case without hyphens: 21A338B30A57462780450D4B6AF7A3EE
Lower case with hyphens: 7e0b2da6-38c3-4873-83f7-aab0cacb7603
Upper case with hyphens: FDFC7265-BA5E-4A63-9A51-AC661107EB37
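
All four spellings carry the same 32 hex digits; only the letter case and the hyphens differ. As a small Python sketch (the helper name `four_formats` is hypothetical and not part of my generator), one hyphenated lower-case UUID can be turned into the other three like this:

```python
def four_formats(canonical: str) -> dict:
    """Return the four spellings of one hyphenated UUID string."""
    lower = canonical.lower()
    bare = lower.replace("-", "")  # same digits, hyphens dropped
    return {
        "lower, no hyphens": bare,
        "upper, no hyphens": bare.upper(),
        "lower, hyphens": lower,
        "upper, hyphens": lower.upper(),
    }


if __name__ == "__main__":
    sample = "7e0b2da6-38c3-4873-83f7-aab0cacb7603"
    for label, text in four_formats(sample).items():
        print(label + ": " + text)
```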


This is the regular expression I finally ended up using:


[0-9a-fA-F]{8}-?[0-9a-fA-F]{4}-?4[0-9a-fA-F]{3}-?[89abAB][0-9a-fA-F]{3}-?[0-9a-fA-F]{12}


To break down what the regex says:

  1. The first 8 characters can be anything from 0-9, a-f, A-F: [0-9a-fA-F]{8}
  2. A single hyphen may follow: -?
  3. The next 4 characters can be anything from 0-9, a-f, A-F: [0-9a-fA-F]{4}
  4. A single hyphen may follow: -?
  5. A single ‘4’ must follow: 4
  6. The next 3 characters can be anything from 0-9, a-f, A-F: [0-9a-fA-F]{3}
  7. A single hyphen may follow: -?
  8. The next character must be one of 8, 9, a, b, A or B: [89abAB]
  9. The next 3 characters can be anything from 0-9, a-f, A-F: [0-9a-fA-F]{3}
  10. A single hyphen may follow: -?
  11. The next 12 characters can be anything from 0-9, a-f, A-F: [0-9a-fA-F]{12}
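
The breakdown above can be tried out with a short Python sketch. The pattern is the one from this post; the names `UUID4_PATTERN` and `is_uuid4` are made up for the example, and using `fullmatch` (so that the whole string has to match, not just a part of it) is my own assumption about how strict the check should be.

```python
import re

# The pattern from this post, split over two lines for readability.
UUID4_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-?[0-9a-fA-F]{4}-?4[0-9a-fA-F]{3}-?"
    r"[89abAB][0-9a-fA-F]{3}-?[0-9a-fA-F]{12}"
)


def is_uuid4(candidate: str) -> bool:
    """Return True only if the entire string looks like a type 4 UUID."""
    return UUID4_PATTERN.fullmatch(candidate) is not None


if __name__ == "__main__":
    samples = [
        "7185f40e722c4cfa8de5daedf048ea12",      # lower case, no hyphens
        "21A338B30A57462780450D4B6AF7A3EE",      # upper case, no hyphens
        "7e0b2da6-38c3-4873-83f7-aab0cacb7603",  # lower case, hyphens
        "FDFC7265-BA5E-4A63-9A51-AC661107EB37",  # upper case, hyphens
        "7e0b2da6-38c3-1873-83f7-aab0cacb7603",  # 13th digit is 1, not 4
        "7e0b2da6-38c3-4873-c3f7-aab0cacb7603",  # 17th digit is c
    ]
    for sample in samples:
        print(sample, is_uuid4(sample))
```

One thing to keep in mind is that every hyphen is optional on its own, so this pattern also accepts mixed-case strings and strings with only some of the hyphens. If you only want the four formats listed earlier, you would need a stricter pattern for each one.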

Conclusion

Spec sheets explain everything you need to know about a topic, even though they look ugly and seem too monotonous to read. You can also find a couple of tests I wrote on GitHub, in an Objective-C implementation. This is what one of the tests looks like:


- (void)testCorrectUUIDFormat
{
    UUIDGenerator *u_generator = [[UUIDGenerator alloc] init];
    // Ask for lower-case, hyphenated output, so the hyphens in the pattern are mandatory here.
    NSString *sample_uuid = [u_generator uuid4WithCaps:NO hypenated:YES];
    // Anchored with ^ and $ so the whole string has to be a type 4 UUID.
    NSString *pattern = @"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-4[0-9a-fA-F]{3}-[89abAB][0-9a-fA-F]{3}-[0-9a-fA-F]{12}$";
    NSRange searchRange = NSMakeRange(0, [sample_uuid length]);
    NSError *error = nil;
    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    XCTAssertNotNil(regex, @"Pattern failed to compile: %@", error);
    NSArray *matches = [regex matchesInString:sample_uuid options:0 range:searchRange];

    // Log every match for debugging; the range at index 0 is the whole match.
    for (NSTextCheckingResult *match in matches) {
        NSString *matchText = [sample_uuid substringWithRange:[match rangeAtIndex:0]];
        NSLog(@"match: %@", matchText);
    }

    NSLog(@"Our UUID, %@", sample_uuid);
    NSLog(@"Our UUID length, %lu", (unsigned long)[sample_uuid length]);

    // Exactly one match means the generated string is a well-formed type 4 UUID.
    XCTAssertEqual([matches count], (NSUInteger)1, @"UUID generated doesn't match the type 4 UUID RFC");
}



I'm a software engineer who's worked on various Android projects for a while now. If you'd like to follow me on Twitter, I don't always post about tech things.