In the cloud computing era, data privacy is a critical concern. Memory
access patterns can leak private information. This leakage is particularly
concerning for deep learning recommendation models, where data associated with
a user is used to train the model. Recommendation models use embedding tables to
map categorical data (embedding table indices) into a large vector space, which is
easier for recommendation systems to learn from. Oblivious RAM (ORAM) and its
enhancements have been proposed to prevent memory access patterns from
leaking information. ORAM solutions hide access patterns by fetching multiple
data blocks per demand fetch and then shuffling the locations of blocks
after each access.
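To make this baseline concrete, here is a minimal, illustrative sketch of a textbook Path ORAM access in Python. The class name, parameters, and eviction policy below are our own simplifications, not the paper's implementation.

```python
import random
from collections import defaultdict

class PathORAM:
    """Minimal textbook Path ORAM sketch (illustrative only).

    Blocks live in a binary tree of buckets. Each block is mapped to a
    uniformly random leaf and is guaranteed to reside somewhere on the
    path from the root to that leaf. A fat-tree variant, as proposed in
    the paper, would let bucket capacity vary across levels.
    """

    def __init__(self, num_levels, bucket_size=4):
        self.num_leaves = 2 ** (num_levels - 1)
        self.bucket_size = bucket_size
        self.tree = defaultdict(list)   # node id -> [(block_id, data)]
        self.position = {}              # block_id -> assigned leaf
        self.stash = {}                 # client-side overflow blocks

    def _path(self, leaf):
        """Node ids from the root (id 1) down to the given leaf."""
        node = leaf + self.num_leaves   # heap layout: leaves fill the last level
        path = []
        while node >= 1:
            path.append(node)
            node //= 2
        return path[::-1]

    def access(self, block_id, new_data=None):
        """One oblivious access: fetch a whole path, remap, write back."""
        leaf = self.position.get(block_id, random.randrange(self.num_leaves))
        # 1. Read every block on the path into the stash. This multi-block
        #    fetch per single demand access is the overhead ORAM pays.
        for node in self._path(leaf):
            for bid, data in self.tree[node]:
                self.stash[bid] = data
            self.tree[node] = []
        # 2. Serve the request and remap the block to a fresh random leaf,
        #    so the next access to it touches an unlinkable path.
        if new_data is not None:
            self.stash[block_id] = new_data
        result = self.stash.get(block_id)
        self.position[block_id] = random.randrange(self.num_leaves)
        # 3. Evict: write stash blocks back along the fetched path, as
        #    deep as their (possibly new) leaf assignment allows.
        for node in reversed(self._path(leaf)):
            fits = [bid for bid in self.stash
                    if node in self._path(self.position[bid])]
            for bid in fits[:self.bucket_size]:
                self.tree[node].append((bid, self.stash.pop(bid)))
        return result

oram = PathORAM(num_levels=4)
oram.access(block_id=42, new_data="embedding row 42")
print(oram.access(block_id=42))   # -> "embedding row 42"
```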
In this paper, we propose a new PathORAM architecture
designed to protect user input privacy when training recommendation models.
Look Ahead ORAM exploits the fact that during training, the embedding table
indices that will be accessed in a future batch are known beforehand. Look
Ahead ORAM preprocesses future training samples to identify indices that will
co-occur and groups these accesses into large superblocks. By grouping multiple
data blocks into a superblock, Look Ahead ORAM can assign them all to the same
path, so accessing a superblock requires fetching fewer data blocks than
accessing its constituent blocks individually. Effectively, Look Ahead ORAM
reduces the number of reads and writes per access. Look Ahead ORAM also
introduces a fat-tree structure for PathORAM, i.e., a tree with variable
bucket sizes. Look Ahead ORAM achieves a 2x speedup over PathORAM and
reduces the bandwidth requirement by 3.15x while providing the same security
guarantees as PathORAM.
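As an illustration of the preprocessing step, the sketch below groups co-occurring indices from a look-ahead window into superblocks. The greedy pair-merging heuristic and all names (build_superblocks, superblock_size) are our assumptions; the abstract does not specify the paper's exact grouping algorithm.

```python
from collections import Counter
from itertools import combinations

def build_superblocks(future_batches, superblock_size=4):
    """Group embedding-table indices that co-occur in upcoming batches.

    Hypothetical greedy heuristic: count pairwise co-occurrences across
    the look-ahead window, then grow superblocks from the hottest pairs.
    Blocks in one superblock are later assigned to the same ORAM path,
    so a single path fetch serves the whole group.
    """
    pair_counts = Counter()
    for batch in future_batches:
        for pair in combinations(sorted(set(batch)), 2):
            pair_counts[pair] += 1

    grouped, superblocks = set(), []
    for (a, b), _ in pair_counts.most_common():
        if a in grouped or b in grouped:
            continue
        block = {a, b}
        # Greedily pull in the ungrouped index that co-occurs most often
        # with any current member, until the superblock is full.
        while len(block) < superblock_size:
            best, best_count = None, 0
            for (x, y), count in pair_counts.items():
                for inside, outside in ((x, y), (y, x)):
                    if (inside in block and outside not in block
                            and outside not in grouped and count > best_count):
                        best, best_count = outside, count
            if best is None:
                break
            block.add(best)
        grouped |= block
        superblocks.append(tuple(sorted(block)))
    return superblocks

# Indices 3 and 7 co-occur in several upcoming batches, so they end up in
# one superblock and can share a single path in the ORAM tree.
batches = [[3, 7, 12], [3, 5, 7], [1, 3, 7, 12], [2, 9, 12]]
print(build_superblocks(batches, superblock_size=3))
# e.g. [(3, 7, 12), (2, 9)]
```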
