Wednesday, June 16, 2010

Your private Data is Unforgettable


Borges in 1951, by Grete Stern
Picture Source: Wikimedia Commons 


On June 14th I attended the Israeli Wikipedia Academy 2010 conference at Tel Aviv University.
The conference focused on the use of Wikipedia and Wiki technology in academia and in schools.
Most of the presentations covered Wiki or Wikipedia research, usage and projects.
The recurring theme in most presentations was that Wiki-based collaboration and participation change the rules of the game. In some contexts changing the rules is very useful, while in others its usefulness is questionable.
Changing the rules poses new challenges for all participants in the process: users, content creators, managers, auditors and others.
I have already described these challenges in previous posts: Wikipedia the Good the Bad and the Ugly and Web 2.0 for Dummies – Part 7: Wikipedia.


A Keynote Presentation on Remembering and Forgetting
In my opinion, the keynote by Prof. Viktor Mayer-Schönberger was the most interesting presentation. Prof. Mayer-Schönberger is the Director of the Information and Innovation Policy Research Center at the National University of Singapore. His presentation, titled Remembering and Forgetting, looked beyond the Wikipedia perspective.
We already know that human beings do not remember everything: they forget.
As a Psychology student in the 1970s I studied a paper by George A. Miller titled The Magical Number Seven, Plus or Minus Two.
His work indicated that the capacity of human Short Term Memory is about seven items. It is true that Long Term Memory capacity is less limited, but human beings are still not capable of remembering everything.
There are some exceptions to that rule. The Argentinean writer and poet Jorge Luis Borges describes in one of his stories a man who remembers everything.
When that man is asked to tell about an event that happened to him, he describes every detail, so the description is virtually another occurrence of the event. For example, describing a one-hour event takes exactly one hour.
Even from this short reference to the Borges story it is clear that remembering everything is a barrier to human happiness and adaptation.
According to Prof. Mayer-Schönberger, current computing systems "remember" everything. Like unlimited human memory, unlimited remembering by computing systems is not a recipe for happiness.

He described examples of people who published personal information that ended up in Web pages and damaged them a few years later. For example, a Canadian professor who mentioned in a scientific article in 2001 that he had used drugs in the 1960s was barred forever from entering the USA after traveling from Vancouver to the Seattle airport to pick up a friend. One of the American border clerks googled him and found the article, and the professor was accused of not disclosing that information.

Another example he described was non-computerized information held in the Dutch Population Registry. The registry included the nationality and religion of each person recorded in this "database". During World War II the Nazis used the information to find the Dutch Jews. As a result, a higher percentage of Jews was killed in the Netherlands than in other countries. The lesson from this example is that private data may be used for bad purposes that have nothing to do with the original reasons for collecting it.

The effects of not forgetting

Power – The holder of private data, e.g. Google, can use the data for its own benefit. Even if the holder never uses the data, the mere possibility of using it empowers the holder.

Time – Private data can be used many years after it was published. Once you have published it you cannot eliminate it, even if you would like to many years later.

The presentation described means of addressing these effects. None of them fully addresses the problem. Prof. Mayer-Schönberger suggested that deletion mechanisms could be a relatively effective means.
A deletion mechanism is the same in principle as deleting a file in a File System or a record in a Database.
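
To make the comparison with deleting a database record concrete, here is a minimal sketch in Python using the standard sqlite3 module. The table name private_data, its columns and the helper delete_subject_data are my own hypothetical names for illustration; the presentation did not describe a specific implementation.

```python
import sqlite3

# Hypothetical table and column names, chosen only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE private_data (subject TEXT, item TEXT)")
conn.execute("INSERT INTO private_data VALUES (?, ?)", ("alice", "old article"))
conn.execute("INSERT INTO private_data VALUES (?, ?)", ("bob", "old photo"))


def delete_subject_data(connection, subject):
    """Delete every record about one subject and return how many were removed."""
    cursor = connection.execute(
        "DELETE FROM private_data WHERE subject = ?", (subject,)
    )
    connection.commit()
    return cursor.rowcount


removed = delete_subject_data(conn, "alice")
remaining = [row[0] for row in conn.execute("SELECT subject FROM private_data")]
```

The deletion only affects this one database. It says nothing about copies of the same records held elsewhere.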


My Take

The unforgettable Private data issue
Prof. Mayer-Schönberger's good presentation sheds some light on the Privacy issue; however, he focused on a limited part of the problem: Private Data published intentionally by the subject of that data.
We should also think about other scenarios of Private data publishing and retention, such as:

1. Unintentional publishing by the subject, e.g. attaching the wrong file to an e-mail message.
2. Publishing through unauthorized access to the subject's computer.
3. Publishing from a Governmental Data Source through unauthorized access or an employee's mistake.
4. Publishing through unauthorized access to data held by other organizations, e.g. Hospitals, Insurance Companies, Banks etc.
5. Publishing of wrong private data by the subject, e.g. claiming that he completed his studies at a well-known university.
It could be almost impossible to demonstrate that the data is wrong, especially a long time after it was published.
6. Publishing of wrong private data by others, e.g. anonymously publishing that the subject is suffering from a disease or is responsible for a project failure.

Although the Privacy violation act could be the same, data deletion rights may differ.  
      
Addressing the problem
I am very skeptical about mechanisms for preventing abuse of private data.
The only way to fully address the problem is not to publish the data at all.
However, this is unrealistic in many cases.
The trade-off is between the availability of a large amount of data on the Web, easily accessible to everyone, and the exposure of sensitive data.

User awareness is key to Security on the Web, just as it is in Enterprise Systems:
The subject should restrict the amount of Private data published to the required minimum.
Access to the minimal Private data that is exposed should also be restricted, by methods such as Encryption.

I am purposely using the word Subject and not the word Owner. The Private Data Owner should be defined and agreed upon. For Enterprise Systems resources, e.g. Files and Database Tables, the owner is usually defined. The owner is authorized to delete the data.
Defining the Private data owner in the Web context is more complicated.
For example, if a person is defined as the owner of Private data pertaining to him, he may be authorized to delete the data or to require its deletion.
Even if the owner is defined, deleting the data probably will not make it forgettable.
There will almost always be someone else who copied or backed up the data prior to its deletion.
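
A small sketch of why deletion by the owner does not remove copies, written in Python with only the standard library. The file names and the function delete_original_keep_copy are hypothetical and serve only to illustrate the point: a copy made before the deletion is untouched by it.

```python
import os
import shutil
import tempfile


def delete_original_keep_copy():
    """Return (original_exists, copy_contents) after deleting only the original."""
    with tempfile.TemporaryDirectory() as workdir:
        original = os.path.join(workdir, "profile.txt")        # hypothetical file name
        backup = os.path.join(workdir, "profile_backup.txt")   # copy held by someone else

        with open(original, "w", encoding="utf-8") as f:
            f.write("private detail")

        shutil.copyfile(original, backup)  # someone else copies the data first
        os.remove(original)                # the owner later deletes the original

        with open(backup, encoding="utf-8") as f:
            copy_contents = f.read()
        return os.path.exists(original), copy_contents


original_exists, copy_contents = delete_original_keep_copy()
```

After the run the original file no longer exists, yet the copy still holds the private detail. The owner's deletion right reaches only the data the owner can locate.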
  



