Personal Data Management Systems (PDMS), data platforms such as Cozy Cloud that let users easily store all of their personal data in a single place, have recently been gaining traction as more and more users become aware of how this information is used. However, as exciting as this prospect is, the PDMS approach raises two issues: security and collaboration. Indeed, each PDMS can potentially store the entire digital life of its owner, thereby proportionally increasing the impact of a leak. Collaboration in this context is also much more complex: centralizing all users’ data on a few powerful servers is risky, since these servers become genuine honeypots (e.g., the Equifax and Facebook leaks), and makes little sense, as the data is naturally distributed on the users’ side.
This thesis aims to provide a secure way to execute distributed queries while protecting the privacy of the participants, i.e., to lay the foundation for a secure form of collaboration. Our first contribution tackles the problem of selecting the actors (the nodes that will be involved in the computation) for the execution: as in all distributed systems, we cannot exclude the possibility of corrupted nodes, so it is important to select actors randomly. This actor selection process is called “imposed randomness”: the randomness is guaranteed by a set of constraints that we enforce. Our second contribution details a generic protocol for executing distributed queries in a privacy-respecting manner: we minimize the information accessed by each actor (through “task compartmentalization” and “knowledge dispersion”) so that, if a corrupted node does get involved, the leaked information is both minimal and incomprehensible for lack of context.
To validate our contributions, we implemented a simulator that analyzes the performance of our protocol for all possible configurations in a given network. Our results show that our security measures guarantee a leakage proportional to the maximum percentage of colluding corrupted nodes (our model is only threatened by colluding adversaries), this percentage being optimal, and that the induced cost is extremely small in comparison to the number of colluding nodes.