I’ve spent the last two days trying to find where Apple’s UDID, or Unique Device Identifier, comes from. The number is used to uniquely identify individual iPhone, iPod Touch, iPad, and Apple TV devices. I found out where it comes from.
Finding the origins of this number sent me on quite a chase. First, I checked the IOKit registry. After searching through the 9483 lines of the IORegistry dump, I found nothing that even mentioned the UDID. Having reached a dead end there, I decided to start at the source: libMobileGestalt, Apple’s private library for obtaining device information.
MobileGestalt
MobileGestalt’s main function is MGCopyAnswer(CFStringRef key). It returns device information based on the key given. For example:
MGCopyAnswer(CFSTR("UniqueDeviceID"))
…returns the device’s UDID as a CFStringRef.
Using IDA (the Interactive Disassembler), I started from here. Here’s MGCopyAnswer in ARMv7s assembly:
_EXPORT _MGCopyAnswer
_MGCopyAnswer
MOVS R1, #0
B.W sub_36EE8924
This function is very small; it’s just a wrapper around another static function, which was originally exported under a name along the lines of ‘realCopyAnswer’ (thanks to saurik (Jay Freeman) for this). Because the wrapper is only two instructions long, it can’t be hooked with CydiaSubstrate. Luckily, the realCopyAnswer function sits directly after the wrapper, so it can be hooked by adding 8 to the address of MGCopyAnswer.
Why, Apple?
I spent the next 5-6 hours simply going through the binary and taking notes. After a while, I began to notice something very odd: the keys for MGCopyAnswer are not stored in plaintext. They are stored as the Base64 encoding of the MD5 hash of “MGCopyAnswer” followed by the key. Why Apple decided to do this is beyond me, but it made my quest more difficult: MD5 is a one-way function, so I can’t just pull a list of all the keys out of the binary. I did, however, write a quick-and-dirty Objective-C tool to get the hash of each key:
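#import <Foundation/Foundation.h>
#import <CommonCrypto/CommonCrypto.h>

int main(int argc, char **argv) {
    if (argc < 2) return 1; // expects the key name as the first argument
    // The hashed input is the literal prefix "MGCopyAnswer" followed by the key.
    size_t formattedLength = strlen("MGCopyAnswer") + strlen(argv[1]);
    char *string = (char *)malloc(formattedLength + 1);
    snprintf(string, formattedLength + 1, "%s%s", "MGCopyAnswer", argv[1]);
    unsigned char md[CC_MD5_DIGEST_LENGTH];
    CC_MD5(string, (CC_LONG)formattedLength, md);
    // Log the 16-byte digest as Base64.
    NSLog(@"Hash: %@", [[NSData dataWithBytes:md length:CC_MD5_DIGEST_LENGTH] base64EncodedStringWithOptions:NSDataBase64Encoding64CharacterLineLength]);
    free(string);
    return 0;
}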
This tool helped me figure out that the hash of “UniqueDeviceID” is re6Zb+zwFKJNlkQTUeT+/w (the Base64 output minus its trailing “==” padding). Sure enough, that string of seemingly random characters was in the binary. With that string found, I cross-referenced it with a few functions that seemed to use it, and finally followed the trail back to a pretty complex function.
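A small aside on the key format: a 16-byte MD5 digest always encodes to 24 Base64 characters ending in "==", while the key above is 22 characters long, so the binary appears to store the unpadded form. Here is a minimal sketch of that encoding step in plain C. The name base64_unpadded is my own for illustration, not something from libMobileGestalt.

```c
#include <stddef.h>
#include <string.h>

/* Standard Base64 alphabet (RFC 4648). */
static const char B64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode len bytes from in as Base64 without '=' padding.
 * out must hold at least ((len + 2) / 3) * 4 + 1 bytes.
 * Returns the number of characters written (excluding the NUL).
 * Hypothetical helper, named for this sketch only. */
size_t base64_unpadded(const unsigned char *in, size_t len, char *out) {
    size_t o = 0;
    for (size_t i = 0; i < len; i += 3) {
        size_t remaining = len - i;
        /* Pack up to three input bytes into a 24-bit group. */
        unsigned long v = (unsigned long)in[i] << 16;
        if (remaining > 1) v |= (unsigned long)in[i + 1] << 8;
        if (remaining > 2) v |= in[i + 2];
        /* Emit only the characters that carry data; skip the padding. */
        out[o++] = B64[(v >> 18) & 63];
        out[o++] = B64[(v >> 12) & 63];
        if (remaining > 1) out[o++] = B64[(v >> 6) & 63];
        if (remaining > 2) out[o++] = B64[v & 63];
    }
    out[o] = '\0';
    return o;
}
```

Feeding it any 16-byte digest yields a 22-character string, which matches the length of the key found in the binary.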
At first, I wasn’t sure what the function was for. I could tell from the plaintext error messages it logs on failure that it retrieved the device’s serial number, Wi-Fi and Bluetooth MAC addresses, IMEI, and unique chip ID. Then something caught my eye: a call to CC_SHA1. Following it back, I saw a call to CFStringCreateWithFormat with the format string "%@%@%@%@", which takes four objects.
By this time, I had figured out what the UDID was: the SHA1 hash of some mixture of the retrieved values mentioned above. But I still didn’t know what order they came in, or which ones were used (five values are retrieved, but the format string only takes four).
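To give a sense of the search space: picking four of five values in a specific order gives 5 × 4 × 3 × 2 = 120 candidates, before even considering how each value is formatted. The sketch below just enumerates those orderings. The field labels and function names are made up for illustration and do not come from the binary.

```c
#include <stdio.h>

/* Hypothetical labels for the five values the function retrieves. */
static const char *FIELDS[5] = {"serial", "wifi_mac", "bt_mac", "imei", "chip_id"};

/* Example visitor: print one candidate ordering on a single line. */
void print_ordering(const char *order[4]) {
    printf("%s + %s + %s + %s\n", order[0], order[1], order[2], order[3]);
}

/* Visit every ordered choice of 4 distinct fields out of the 5.
 * Calls visit (when non-NULL) once per candidate; returns the candidate count. */
int enumerate_orderings(void (*visit)(const char *order[4])) {
    int count = 0;
    for (int a = 0; a < 5; a++) {
        for (int b = 0; b < 5; b++) {
            if (b == a) continue;
            for (int c = 0; c < 5; c++) {
                if (c == a || c == b) continue;
                for (int d = 0; d < 5; d++) {
                    if (d == a || d == b || d == c) continue;
                    const char *order[4] = {FIELDS[a], FIELDS[b], FIELDS[c], FIELDS[d]};
                    if (visit) visit(order);
                    count++;
                }
            }
        }
    }
    return count;
}
```

Calling enumerate_orderings(print_ordering) lists every candidate; each one would still have to be concatenated, hashed, and compared against the known UDID.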
Finding the Combination
After trying lots of different combinations by hand, I still hadn’t found a match. So I ended up hooking CC_SHA1 (inserting a call to log the input data) and calling the function to see what it hashes. Unfortunately, CC_SHA1 was never called, so I traced the check back to a static (presumably boolean) variable that appears to record whether the UDID has already been generated. Using the GNU Debugger, I set that value to false:
Evans-iPhone:~ root# gdb -p 1213
(gdb) x 0x3AEF65B0
0x3aef65b0: 0xffffffff
(gdb) set {int} 0x3AEF65B0 = 0
(gdb) x 0x3AEF65B0
0x3aef65b0: 0x00000000
Now that that was taken care of, I used saurik’s ‘cycript’ tool (thanks again!) to call the function:
Evans-iPhone:~ root# cycript -p SpringBoard
cy# genUDID = @encode(void*())(0x38350E7F)
0x38350e7f
cy# genUDID()
0x1
Sure enough, the data passed to CC_SHA1 appeared in my iPhone’s system log (I replaced it with random letters for privacy reasons):
Feb 23 20:31:46 Evans-iPhone SpringBoard[1213] : SHA1: <46414D4C 4C464638 39333233 34353233 34333132 35333237 3761613A 62623A63 633A6464 3A65653A 66666161 3A62623A 63633A64 643A6565 3A6666>
The Conclusion
Upon converting the bytes to their corresponding ASCII characters (FAMLLFF893234523431253277aa:bb:cc:dd:ee:ffaa:bb:cc:dd:ee:ff), I was able to conclude the following:
The UDID is the SHA1 hash of the ASCII representation of the serial number, followed by the unique chip ID, followed by the Wi-Fi radio MAC address, and finally followed by the Bluetooth radio MAC address.
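For reference, the byte-to-ASCII conversion of the logged dump can be sketched in a few lines of C. The function names here (hex_value, decode_hex_dump) are my own, chosen for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Convert one hex digit to its value, or -1 if it is not a hex digit. */
static int hex_value(int c) {
    if (c >= '0' && c <= '9') return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Decode an NSData-style dump such as "<46414D4C 4C46...>" into ASCII text.
 * Anything that is not a hex digit (angle brackets, spaces) is skipped.
 * Writes at most out_size - 1 characters plus a NUL; returns the byte count.
 * Hypothetical helper, named for this sketch only. */
size_t decode_hex_dump(const char *dump, char *out, size_t out_size) {
    size_t n = 0;
    int high = -1; /* pending high nibble, or -1 if none */
    if (out_size == 0) return 0;
    for (; *dump != '\0' && n + 1 < out_size; dump++) {
        int v = hex_value((unsigned char)*dump);
        if (v < 0) continue;
        if (high < 0) {
            high = v;
        } else {
            out[n++] = (char)((high << 4) | v);
            high = -1;
        }
    }
    out[n] = '\0';
    return n;
}
```

Running it over the dump from the system log reproduces the 59-character string shown above.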
I am very satisfied with my findings, but I don’t believe my research is concluded yet. There’s still the extra value that was never used: the IMEI. I assume this has something to do with devices that don’t have cellular radios (Wi-Fi-only iPads and iPod Touches), but that is for another day!
This is quite a large breakthrough. It means that if we were to change the serial number, the Wi-Fi/Bluetooth MAC addresses, or (somehow) the unique chip ID, the UDID would change as well, causing all sorts of trouble!
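To make the derivation concrete, here is a rough, self-contained sketch of it in plain C. It carries its own small SHA-1 routine so it builds anywhere; on a device you would simply call CC_SHA1 instead. The names sha1_hex and compute_udid are mine, and how the placeholder digits split between serial number and chip ID in the usage note is a guess for illustration only.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static uint32_t rotl32(uint32_t x, int n) { return (x << n) | (x >> (32 - n)); }

/* Process one 64-byte block of the SHA-1 compression function. */
static void sha1_block(uint32_t h[5], const unsigned char *p) {
    uint32_t w[80];
    for (int i = 0; i < 16; i++)
        w[i] = ((uint32_t)p[4 * i] << 24) | ((uint32_t)p[4 * i + 1] << 16) |
               ((uint32_t)p[4 * i + 2] << 8) | (uint32_t)p[4 * i + 3];
    for (int i = 16; i < 80; i++)
        w[i] = rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);
    uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
    for (int i = 0; i < 80; i++) {
        uint32_t f, k;
        if (i < 20)      { f = (b & c) | (~b & d);          k = 0x5A827999; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
        uint32_t t = rotl32(a, 5) + f + e + k + w[i];
        e = d; d = c; c = rotl32(b, 30); b = a; a = t;
    }
    h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e;
}

/* Compute SHA-1 of len bytes and write 40 lowercase hex chars plus a NUL. */
void sha1_hex(const unsigned char *data, size_t len, char out[41]) {
    uint32_t h[5] = {0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0};
    unsigned char tail[128] = {0};
    size_t full = len / 64 * 64;
    size_t rem = len - full;
    for (size_t i = 0; i < full; i += 64) sha1_block(h, data + i);
    /* Pad: 0x80, zeros, then the message length in bits (big-endian). */
    memcpy(tail, data + full, rem);
    tail[rem] = 0x80;
    size_t tail_len = (rem < 56) ? 64 : 128;
    uint64_t bits = (uint64_t)len * 8;
    for (int i = 0; i < 8; i++)
        tail[tail_len - 1 - i] = (unsigned char)(bits >> (8 * i));
    for (size_t i = 0; i < tail_len; i += 64) sha1_block(h, tail + i);
    for (int i = 0; i < 5; i++) sprintf(out + 8 * i, "%08x", (unsigned)h[i]);
}

/* Hash the four values concatenated as text, in the order found above.
 * Returns 0 on success, -1 if the inputs do not fit the local buffer. */
int compute_udid(const char *serial, const char *chip_id,
                 const char *wifi_mac, const char *bt_mac, char out[41]) {
    char input[256];
    int n = snprintf(input, sizeof input, "%s%s%s%s", serial, chip_id, wifi_mac, bt_mac);
    if (n < 0 || (size_t)n >= sizeof input) return -1;
    sha1_hex((const unsigned char *)input, (size_t)n, out);
    return 0;
}
```

For example, compute_udid("FAMLLFF89323", "4523431253277", "aa:bb:cc:dd:ee:ff", "aa:bb:cc:dd:ee:ff", udid) hashes the same 59 placeholder characters that showed up in my log. Change a single character of any input and the resulting 40-character identifier comes out completely different.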
Edit:
It turns out this isn’t new. The entire time, this was documented at The iPhone Wiki. There goes an entire weekend :-/