Entity Resolution

Spock has a problem it wants to pay someone $50K to solve:

A common problem that we face is that there are many people with the same name. Given that, how do we distinguish a document about Michael Jackson the singer from Michael Jackson the football player?

That seems to me worth far more than $50K, since it directly affects everyone's privacy, not to mention the future of criminal investigations.
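To make the problem concrete, here is a minimal sketch of one naive approach: score a document against a small keyword profile for each candidate and pick the best overlap. This is only an illustration, not whatever Spock has in mind; the `PROFILES` table, its keyword lists, and the `resolve` function are all hypothetical names I made up for the example.

```python
import re

# Hypothetical context profiles. The keyword lists are illustrative, not real data.
PROFILES = {
    "Michael Jackson (singer)": {"singer", "album", "pop", "music", "song"},
    "Michael Jackson (football player)": {"football", "team", "season", "game", "player"},
}


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def resolve(document, profiles=PROFILES):
    """Return the profile whose keywords overlap the document most.

    Returns None when no profile shares a single keyword with the document.
    """
    words = tokens(document)
    scores = {name: len(words & keywords) for name, keywords in profiles.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(resolve("Michael Jackson released a new album of pop music"))
print(resolve("Michael Jackson played football all season"))
```

Keyword overlap breaks down quickly (a singer can perform at a football game), which is presumably part of why the problem carries a prize.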

The Chinese government is already working on a solution, from a slightly different perspective:

Police in China, where most of the 1.3 billion people share just 100 surnames, are considering rules which would combine both parents’ family names to prevent so much duplication, state media said yesterday.

[…]

“By adopting both parents’ names, 1.28 million new surnames will be added, which will greatly solve the problem of name duplication,” Xinhua news agency said, citing the regulations.

This is just the beginning of the problem. Future generations may come to treat names as something that evolves over a lifetime rather than something fixed. The question then becomes how to tie together the history of names a person has used throughout their life.
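One way to picture a name history is a stable identifier with a dated list of names hanging off it. The sketch below assumes such an identifier exists, which is of course the hard part; `Person`, `NameRecord`, `person_id`, and the sample names are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class NameRecord:
    name: str
    start: date  # first day this name was in use


@dataclass
class Person:
    person_id: str  # assumed stable identifier, independent of any name
    names: list = field(default_factory=list)

    def rename(self, name, start):
        """Record a new name and keep the history sorted by start date."""
        self.names.append(NameRecord(name, start))
        self.names.sort(key=lambda record: record.start)

    def name_on(self, day):
        """Return the name in use on the given day, or None if none was recorded yet."""
        current = None
        for record in self.names:
            if record.start <= day:
                current = record.name
        return current


# Invented example: one person, two names over time.
p = Person("p-001")
p.rename("Li Wei", date(1980, 1, 1))
p.rename("Li-Zhang Wei", date(2008, 6, 1))
print(p.name_on(date(1990, 1, 1)))
print(p.name_on(date(2010, 1, 1)))
```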

On a related note, mandarintools.com has a Chinese name generator. It would be funny if it only had ten names to choose from, based on the latest reports coming from China, but unfortunately it actually tries to create uniqueness.

Every time I run the program I get a completely different answer. Or should I say a differently resolved entity?
