I have heard a lot about differential privacy lately, but I am unable to find out what exactly it is in non-mathematician's language. So far, my understanding is that it adds noise to the answer one gets for a query to a database.
1 Answer
The Wikipedia page has mathematics but also an example which may or may not be enlightening: if you can ask a database for the sum of the values of a column for rows 1 to n, then issuing the query for n-1 and for n and subtracting the two results reveals the exact value in row n. Thus, allowing the query "sum of all values for the first n rows" and returning the exact result can be abused into learning exact information for each row.
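To make the attack concrete, here is a minimal sketch of it. The `salaries` column and the helper names `prefix_sum` and `recover_row` are made up for illustration; the only thing assumed about the database is that it answers "sum of the first n rows" exactly.

```python
# Hypothetical sensitive column, one value per row (made-up data).
salaries = [52_000, 61_500, 48_200, 75_000, 58_300]


def prefix_sum(values, n):
    """Exact answer to the query "sum of the values for the first n rows"."""
    return sum(values[:n])


def recover_row(values, n):
    """Recover row n (1-indexed) from two exact prefix-sum queries."""
    return prefix_sum(values, n) - prefix_sum(values, n - 1)


# Repeating the trick for every n rebuilds the whole column.
recovered = [recover_row(salaries, n) for n in range(1, len(salaries) + 1)]
print(recovered == salaries)  # True
```

Each individual query looks like a harmless aggregate; it is the difference between two of them that leaks a single row.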
Differential privacy is a mathematical way of measuring how well (or how poorly) a database preserves anonymity, i.e. avoids issues like the one above. Adding random noise is one method to achieve (hopefully) some given level of differential privacy. It is not the only possible method, but it is relatively simple to implement, and we can compute how much it protects anonymity: in the mathematical formalism, this is the achieved value of "ε" in the expression "the database ensures ε-differential privacy" (a smaller ε means stronger privacy).
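As an illustration of "adding random noise", here is a sketch of one standard way to do it, the Laplace mechanism. The choice of mechanism, the function names, the made-up data, and the assumption that every value is clamped to the range 0 to `max_value` are all mine. Treat it as a toy, not as a production implementation.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential variables with mean
    # `scale` follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_sum(values, epsilon, max_value, rng):
    """Noisy sum; assumes each row contributes a value in [0, max_value]."""
    clamped = [min(max(v, 0), max_value) for v in values]
    # Adding or removing one row changes the clamped sum by at most max_value.
    sensitivity = max_value
    return sum(clamped) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the demo repeatable
salaries = [52_000, 61_500, 48_200, 75_000, 58_300]  # made-up data
print(private_sum(salaries, epsilon=0.5, max_value=100_000, rng=rng))
```

The noise scale is `sensitivity / epsilon`, so the noise is large enough to hide the contribution of any single row. Note that every additional noisy answer reveals a bit more, so a real system also has to limit how many queries it answers.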
Adding noise is a trade-off: it gives some privacy at the expense of usability, since the returned values are "noisy" and thus imprecise. If you want more privacy, you must degrade the quality of the database answers. Research on differential privacy concentrates on finding new algorithms that return more accurate answers to statistical queries while better protecting privacy. Such novel algorithms are heavy in mathematics.
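The trade-off can be seen numerically. The sketch below assumes Laplace noise with scale `sensitivity / epsilon` (a standard mechanism, used here only as an example) and estimates the average error of a noisy answer for a few values of ε. All names and parameter values are illustrative.

```python
import random


def laplace_noise(scale, rng):
    # Difference of two exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mean_abs_error(epsilon, sensitivity, trials, rng):
    """Average absolute noise added to an answer at a given epsilon."""
    scale = sensitivity / epsilon
    return sum(abs(laplace_noise(scale, rng)) for _ in range(trials)) / trials


rng = random.Random(1)  # seeded only to make the demo repeatable
for epsilon in (0.1, 1.0, 10.0):
    error = mean_abs_error(epsilon, sensitivity=1.0, trials=10_000, rng=rng)
    print(f"epsilon={epsilon}: average error is about {error:.3f}")
```

For this kind of noise the expected absolute error is exactly `sensitivity / epsilon`, so making ε ten times smaller (more privacy) makes the answers about ten times less accurate.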

I wonder if you can answer this question on sensitivity: https://security.stackexchange.com/questions/206345/differential-privacy-understanding-sensitivity – akilat90 Mar 29 '19 at 05:47