PRIVACY RISK ASSESSMENT WITH BOUNDS DEDUCED FROM BOUNDS
Abstract
As organizations collect, store, and release growing amounts of personal information, they increasingly need to conduct privacy risk assessments in order to comply with emerging privacy laws and to meet the demands of information providers. Existing statistical database security and inference control solutions may not be appropriate for protecting privacy in many new uses of data, because these methods tend to be either under-restrictive or over-restrictive in disclosure limitation, or prohibitively complex in practice. We address a fundamental question in privacy risk assessment: how can bounds on protected information be accurately derived from inaccurate released information or, more specifically, from bounds on released information? We give an explicit formula for calculating such bounds from bounds, which we call square bounds or S-bounds. The classic F-bounds in statistics become a special case of S-bounds when all released bounds degenerate to exact values. We propose a recursive algorithm that extends our S-bounds results from two dimensions to higher dimensions. To assess the privacy risk for a protected database of personal information given some bounds on released information, we define typical privacy disclosure measures. For each type of disclosure, we use our S-bounds results to investigate the distribution patterns of privacy breaches and to identify effective and efficient controls that can be used to eliminate privacy risk.
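As a point of reference for the special case mentioned above, the following minimal sketch computes cell bounds for a two-way table of non-negative counts when the released marginal totals are exact. It assumes that "F-bounds" refers to the classic Fréchet bounds; the function name `frechet_bounds` and the example totals are hypothetical and chosen only for illustration. It does not reproduce the S-bounds formula, which handles released bounds rather than exact values.

```python
def frechet_bounds(row_total, col_total, grand_total):
    """Classic Frechet bounds for one cell of a two-way table of counts.

    Assumes all cells are non-negative and the marginals are exact.
    """
    lower = max(0, row_total + col_total - grand_total)
    upper = min(row_total, col_total)
    return lower, upper


# Hypothetical 2x2 table in which only the marginal totals are released.
row_totals = [30, 70]
col_totals = [40, 60]
n = sum(row_totals)  # grand total, equal to sum(col_totals)

# Bounds an attacker could deduce for every protected cell.
bounds = [[frechet_bounds(r, c, n) for c in col_totals] for r in row_totals]

for i, row in enumerate(bounds):
    for j, (lo, hi) in enumerate(row):
        print(f"cell ({i}, {j}): {lo} <= x <= {hi}")
```

A narrow interval for a cell indicates that the released marginals reveal much about the protected value, which is the kind of disclosure a risk assessment would look for.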