If you were born before 2011, there’s a good chance I can guess your social security number (SSN). June 25, 2011 was when the Social Security Administration finally started issuing random SSNs. But before then, they were surprisingly predictable.
If you’re American, you’re asked for your xxx-xx-xxxx 9-digit SSN all the time. It’s the closest thing we have in this country to a national ID number. Your first three digits are a geographic location code (area number) based on your birth ZIP code. The middle two digits are a non-sequential but deterministic time grouping (group number). The final four digits are a sequential counter (serial number).
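The three-part layout described above can be sketched in a few lines of Python. This is only an illustration of the format: the names SSNParts and split_ssn are made up for this example, and the sample number is an obvious placeholder, not anyone's real SSN.

```python
from typing import NamedTuple


class SSNParts(NamedTuple):
    area: str    # first three digits (area number)
    group: str   # middle two digits (group number)
    serial: str  # last four digits (serial number)


def split_ssn(ssn: str) -> SSNParts:
    """Split an SSN written as xxx-xx-xxxx (or as 9 bare digits) into its parts."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected 9 digits, optionally formatted as xxx-xx-xxxx")
    return SSNParts(area=digits[:3], group=digits[3:5], serial=digits[5:])


# Placeholder value used purely to show the layout.
parts = split_ssn("123-45-6789")
print(parts.area, parts.group, parts.serial)
```

Keeping the parts as strings rather than integers preserves leading zeros, which matter in all three fields.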
An Army-funded Carnegie Mellon experiment showed researchers could predict an SSN with alarming accuracy given only date and place of birth. They tested their algorithm on half a million records in the Social Security Administration (SSA) Death Master File, and found they could predict an individual’s SSN within a small range of likely values in many cases. In 2009 the researchers called on the US government to randomize SSNs. SSA initially responded by calling the findings “a dramatic exaggeration”, but they did adopt random SSNs just two years later.
A “predictable secret” is an oxymoron. Authenticating people by their SSN has led to countless security breaches, making SSNs increasingly valuable to criminals who buy and sell them in bulk. Imagine if banks assigned everyone’s ATM PIN based on their birthday (MMDD). To steal someone’s money, you'd just need to know when they were born. No one would accept such a crazy scheme, yet people willingly use SSN in much the same way. There’s been little public outcry. Some privacy groups have sounded alarms (EPIC for example) but progress is slow.
US government and industry largely responded to privacy concerns through SSN masking. Typically, this involves censoring the first 5 digits and displaying only the 4-digit serial number portion (e.g., “***-**-1234”). The IRS adopted this so-called TIN Masking approach in 2014; SSNs they display or print on forms now show only the last four digits. They call this identifier a Truncated Taxpayer ID Number (TTIN).
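As a rough sketch of what this kind of masking amounts to, the Python function below hides the first five digits and keeps the serial number. The function name mask_ssn is hypothetical and this is not the IRS's actual implementation, just the transformation described above.

```python
def mask_ssn(ssn: str) -> str:
    """Return the masked form of an SSN, showing only the 4-digit serial number."""
    digits = ssn.replace("-", "")
    if len(digits) != 9 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected 9 digits, optionally formatted as xxx-xx-xxxx")
    # Area and group numbers are censored; the serial number is left in the clear.
    return "***-**-" + digits[-4:]


# Placeholder value, not a real person's SSN.
print(mask_ssn("987-65-1234"))
```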
The problem with TIN Masking is it hides the part of the SSN that’s most predictable, while leaving the part that’s hardest to guess exposed for all to see. The Carnegie Mellon researchers were able to predict the first 5 digits of SSNs on the first guess with a 44% success rate.
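To put that 44% figure in perspective, here is a back-of-the-envelope comparison in Python. It assumes, purely for illustration, that a blind guesser treats all 100,000 five-digit prefixes as equally likely; in reality some prefixes were never issued, so treat the result as an order-of-magnitude estimate rather than a measurement.

```python
from fractions import Fraction

PREFIX_DIGITS = 5  # area number (3 digits) + group number (2 digits)

# Assumption: a blind guesser picks uniformly among all 5-digit prefixes.
blind_chance = Fraction(1, 10 ** PREFIX_DIGITS)

# First-guess success rate for the first 5 digits, as cited above.
informed_chance = Fraction(44, 100)

advantage = informed_chance / blind_chance

print(f"blind first-guess chance:    {float(blind_chance):.5%}")
print(f"informed first-guess chance: {float(informed_chance):.0%}")
print(f"advantage from birth data:   {int(advantage):,}x")
```

Under these assumptions, the masked digits give an attacker who knows date and place of birth very little protection, which is the point of the paragraph above.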
Also, unlike SSN, a TTIN is not unique. With only 10,000 possible values allotted to 300 million US citizens, collisions are guaranteed to occur at high rates. For example, dividing 300 million people among 10,000 values means that, on average, about 30,000 Americans have SSNs that end in any given four digits such as “1234”, and by the Pigeonhole Principle at least one ending must be shared by at least that many people. So TTIN cannot work as a unique user ID.
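The arithmetic behind that collision estimate is short enough to check in Python. The population figure is the round number used above, not an exact count, and the calculation assumes last-four-digit values are spread roughly evenly.

```python
POPULATION = 300_000_000  # round figure used in the text
TTIN_VALUES = 10 ** 4     # four decimal digits: 0000 through 9999

# Average number of people sharing any one 4-digit ending.
average_holders = POPULATION // TTIN_VALUES

# Pigeonhole bound: at least one ending must be shared by this many people
# (ceiling division, written with integer arithmetic).
pigeonhole_minimum = -(-POPULATION // TTIN_VALUES)

print(f"average holders per ending: {average_holders:,}")
print(f"some ending is shared by at least: {pigeonhole_minimum:,}")
```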
As an authentication credential, TTIN is also worse than SSN. Precisely because agencies and companies now feel free to display or print the last four digits of your SSN, it’s now the easiest part of your SSN for criminals to steal. They no longer have to go to the trouble of finding out your birthday and birthplace to guess your full 9-digit SSN. Instead they can snatch your last 4 digits via shoulder surfing, key logging, war driving, dumpster diving, etc.
A fundamental rule of knowledge-based authentication has been violated: Never use a piece of information as both identifier and credential. As it turns out, TTIN works poorly either way. With TTIN, the federal government took a bad ID and made it worse.
When SSA invented SSN in 1936, it was only meant to be your social security account number. No one imagined it would evolve into a de facto National ID number used by almost every agency at every level of government, as well as financial institutions and corporations. Nor was the predictability of SSNs considered a liability; in 1936 there seemed to be no reason to keep them secret. If anything, their predictability made them easier to verify.
In the decades since 1936 we’ve tried various gimmicks to preserve the security, privacy, and uniqueness of SSNs. Yet every band-aid we apply seems to make the bleeding worse. With the advent of TTIN in 2014 we’ve dug ourselves into such a deep hole it’s hard to see a way out. Eventually there will be no one alive in America born before 2011, at which time all SSNs in use should be unpredictable. But it seems absurd to let this problem fester for another ninety years.
The obvious solution is to invent a true US National Identifier to replace SSN as a random, private, permanent, unique citizen identifier. But since this idea is political suicide for any leader who advocates it, it too is probably generations away.
You may be thinking “Why don’t I just request a new random SSN to replace my predictable SSN?” Well, it turns out SSA does allow you to change your SSN but they don’t make it easy. You have to prove your current SSN is causing major problems such as serial identity theft or life endangerment. There’s an exception for people with “religious or cultural objections to certain numbers” but it requires documentation from a recognized religious group. (As a precaution, SSA has never allowed any SSN to begin with “666”.)
It seems in America we are stuck with a dysfunctional national ID scheme for another century, thanks to a government that won’t address the problem and won’t allow us to trade our bad IDs for better ones.
UPDATE 10/4/2017: In the aftermath of the Equifax breach, the Trump administration is considering ending federal use of SSN for citizen identification.
Michael McCormick is an information security consultant, researcher, and founder of Taproot Security.