Privacy protection in big data still a big concern

Protection of personally identifiable information (PII) in big data deployments remains a big concern because current technology designed to protect such information cannot guarantee its safety.

While businesses use PII to serve up targeted ads, products and services (something that’s considered a benefit by some), the exposure of PII can render a person vulnerable to unwarranted scrutiny, potential profiling and discrimination or exclusion based on demographic data, according to an article in the Stanford Law Review.

To protect people from bias and misuse of their personal information, some organizations use de-identification to uncouple information identifying a person from other data associated with that person.

De-identification methods such as anonymization, pseudonymization, encryption, key-coding and data sharding are used to separate PII from actual identities.

Anonymization involves removing names, addresses and social security numbers; pseudonymization replaces this data with nicknames and other artificial identifiers. Key-coding encodes PII and creates a key for decoding it.
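The three approaches can be illustrated with a minimal Python sketch. The record, the field names and the helper functions are hypothetical and chosen only for illustration; they are not taken from any product or from the sources quoted in this article.

```python
import secrets

# Hypothetical record; the field names are assumptions for illustration.
record = {
    "name": "Jane Doe",
    "address": "1 Main St",
    "ssn": "000-00-0000",
    "purchase": "laptop",
}
PII_FIELDS = ("name", "address", "ssn")


def anonymize(rec):
    """Anonymization: drop the PII fields entirely."""
    return {k: v for k, v in rec.items() if k not in PII_FIELDS}


def pseudonymize(rec, aliases):
    """Pseudonymization: replace each PII value with an artificial identifier.

    The same (field, value) pair always maps to the same alias,
    so records about one person remain linkable to each other.
    """
    out = dict(rec)
    for field in PII_FIELDS:
        if field in out:
            lookup = (field, out[field])
            if lookup not in aliases:
                aliases[lookup] = f"{field}-{secrets.token_hex(4)}"
            out[field] = aliases[lookup]
    return out


def key_code(rec, key_table):
    """Key-coding: swap PII for a random code and keep the mapping in a separate table."""
    code = secrets.token_hex(8)
    key_table[code] = {f: rec[f] for f in PII_FIELDS if f in rec}
    coded = anonymize(rec)
    coded["code"] = code
    return coded


def decode(coded, key_table):
    """Restore the PII; only possible for someone who holds the key table."""
    restored = {k: v for k, v in coded.items() if k != "code"}
    restored.update(key_table[coded["code"]])
    return restored
```

In this sketch the protection offered by pseudonymization and key-coding rests entirely on keeping the alias map and the key table away from whoever receives the data.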

Sharding breaks off parts of the data in a horizontal partition. This method provides just enough data to work with but not enough to identify a person.
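As a rough illustration of a horizontal partition, the sketch below spreads whole rows across a fixed number of shards using a hash of a record identifier. The function names and the "id" field are assumptions made for the example, and whether any single shard really holds too little data to identify someone depends on what each row contains.

```python
import hashlib


def shard_for(record_id, n_shards):
    """Map a record id to a shard index using a stable hash."""
    digest = hashlib.sha256(str(record_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_shards


def shard_rows(rows, n_shards, key="id"):
    """Horizontal partition: each whole row goes to exactly one shard."""
    shards = [[] for _ in range(n_shards)]
    for row in rows:
        shards[shard_for(row[key], n_shards)].append(row)
    return shards


# Hypothetical rows for illustration only.
rows = [{"id": i, "purchase": f"item-{i}"} for i in range(20)]
shards = shard_rows(rows, n_shards=4)
```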

However, the problem is that current de-identification techniques can be countered by re-identification strategies.

Once you have “even one type of data to work with,” according to Keith Carter, adjunct professor at the business school of the National University of Singapore, data can be pieced back together again in many ways.

Carter spoke recently at the Big Data World Asia 2013 conference.

For instance, he said, if a business or government is able to get hold of a list of GPS records covering a year, it could be used to determine the identity of a person.

Organizations can deduce the identity of a person, Carter said, by pinpointing the address that person “regularly come from at seven or eight in the morning.” Researchers will be able to determine if the person went to an office or a school.

From this point, addresses and names could be obtained with a high degree of accuracy using public address tools.
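The inference Carter describes can be sketched on synthetic data. The example below is an assumption-laden illustration, not his method: it assumes each GPS record is a (timestamp, latitude, longitude) tuple, rounds coordinates to three decimal places to group nearby points, and picks the location seen most often between 7 and 8 a.m. The function name and the sample coordinates are made up.

```python
from collections import Counter
from datetime import datetime


def likely_home(records, start_hour=7, end_hour=8, precision=3):
    """Return the most frequent rounded (lat, lon) seen in the morning window.

    Returns None when no record falls inside the window.
    """
    counts = Counter(
        (round(lat, precision), round(lon, precision))
        for ts, lat, lon in records
        if start_hour <= ts.hour <= end_hour
    )
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Synthetic records for illustration only.
records = [
    (datetime(2013, 1, 7, 7, 30), 1.30012, 103.80021),
    (datetime(2013, 1, 8, 7, 45), 1.30009, 103.80018),
    (datetime(2013, 1, 9, 8, 5), 1.30011, 103.80023),
    (datetime(2013, 1, 9, 13, 0), 1.28000, 103.85000),   # midday, ignored
    (datetime(2013, 1, 10, 8, 10), 1.35000, 103.70000),  # one-off morning location
]

home = likely_home(records)
```

Even this short sketch shows why location traces with the names stripped out are hard to treat as anonymous: a single recurring morning location narrows the data down to one address.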

Vulnerabilities like these exist because big data systems were never intended to do what they do today, said Brian Christian, chief technology officer for Zettaset Inc., a big data management platform firm.

He said enterprises manage big data systems that are complex and need to execute multiple hand-offs to other systems. Each hand-off is a vulnerable junction, he said.

While de-identification has become a key component of business models in areas such as healthcare, online advertising and cloud computing, it may not be able to provide an adequate solution to big data privacy concerns.

There’s a notion that businesses and governments have spent a lot of money on de-identification to protect PII, Carter said, but what they may have accomplished is to provide themselves with a safe harbor by using de-identification.



Nestor E. Arellano
Toronto-based journalist specializing in technology and business news. Blogs and tweets on the latest tech trends and gadgets.
