This threat model assumes that the databases (DBs) do not trust each other and that the data of one owner should not be learned by any other data owner. We also require that the computing engine learns nothing about the underlying data, the exact query (for example, the selection conditions are hidden, although the computing engine must know the query type in order to respond accordingly), or its result. We make the following trust assumptions.
- All DBs are honest but curious.
- For correctness of the result, the computing engine will perform the algorithm faithfully.
- For confidentiality of the intermediate result, the computing engine will not collude with any
of the DBs. Moreover, there is no communication between the randomizer and the computing engine.
- The querier can encrypt to the randomizer, the randomizer can encrypt to all DBs (as a whole,
not necessarily individually), and all DBs can encrypt to the querier. These can be easily realized
if the randomizer, the DBs, and the querier have authenticated public keys.
- To prevent replay attacks and to ensure the correctness of the end result received by the
querier, these ciphertexts should carry a cryptographic checksum, e.g., a message authentication
code (MAC), so that no outside adversary can inject arbitrary messages into the protocol.
- The querier is distinct from the computing engine, and it cannot collude with any of the DBs.
This can be realized by restricting access to the querier terminal, which also conforms to
regulations; e.g., HIPAA requires that access to hardware and software be limited to properly
authorized individuals.
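The checksum-and-replay requirement above can be sketched as follows. This is one possible instantiation, not the paper's specific construction: the helper names `protect`/`verify`, the 16-byte nonce, and the choice of HMAC-SHA256 are illustrative assumptions.

```python
import hmac
import hashlib
import os

def protect(key: bytes, ciphertext: bytes) -> bytes:
    """Attach a fresh nonce and a MAC tag (encrypt-then-MAC style)
    so that an outsider cannot inject or replay messages."""
    nonce = os.urandom(16)  # freshness value against replay
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + tag + ciphertext

def verify(key: bytes, msg: bytes, seen_nonces: set) -> bytes:
    """Return the ciphertext if the tag verifies and the nonce is fresh;
    reject forged or replayed messages otherwise."""
    nonce, tag, ciphertext = msg[:16], msg[16:48], msg[48:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("invalid MAC: message rejected")
    if nonce in seen_nonces:
        raise ValueError("replayed nonce: message rejected")
    seen_nonces.add(nonce)
    return ciphertext
```

A receiver keeps `seen_nonces` across the protocol run, so a second delivery of the same protected message is rejected even though its MAC is valid.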