Privacy by Construction

From the EU's General Data Protection Regulation (GDPR) to the California Consumer Privacy Act (CCPA), governments have enacted a flurry of new laws that seek to protect people's privacy by regulating the use of Personally Identifiable Information (PII). These laws grant users new rights, such as the right to request access to and deletion of their data. Despite steep fines, compliance in the software industry has been spotty at best; privacy breaches abound and companies are paying millions as a result. Handling these requests manually is error-prone and costly, especially for smaller organizations. The need for automated compliance is clear, but existing database systems are ill-equipped to provide it.

K9db: Privacy-Compliant Storage for Web Applications by Construction

K9db is a new, MySQL-compatible database that complies with privacy laws by construction. The core idea is to make data ownership and sharing semantics explicit in the storage system. K9db has been open-sourced, and the paper was accepted to OSDI '23.

What was your role in this?

I started working on K9db in April 2021 and co-authored the paper published at OSDI '23. Over that time, I worked on several aspects of the system, including building the database proxy that makes K9db compatible with MySQL, implementing variable data ownership to support complex privacy policies, designing test infrastructure, and benchmarking sample applications.

Database Proxy

Unlike a traditional relational database, K9db structures data primarily by user. This allows web applications to service GDPR-style requests without having to rely on error-prone custom scripts.

Figure: Most databases versus K9db
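
To make the idea of structuring data by user concrete, here is a minimal Python sketch. It is a toy in-memory model under my own assumptions, not K9db's implementation: the class `UserShardedStore` and its methods `insert`, `access`, and `forget` are hypothetical names chosen for illustration.

```python
from collections import defaultdict


class UserShardedStore:
    """Toy in-memory store that keeps every row under the user who owns it.

    Hypothetical illustration only; this is not K9db's storage layout.
    """

    def __init__(self):
        # owner id -> table name -> list of rows (each row is a dict)
        self._shards = defaultdict(lambda: defaultdict(list))

    def insert(self, owner_id, table, row):
        """Store a copy of the row in the owner's shard."""
        self._shards[owner_id][table].append(dict(row))

    def access(self, owner_id):
        """Access request: return a copy of everything the user owns."""
        shard = self._shards.get(owner_id, {})
        return {table: [dict(r) for r in rows] for table, rows in shard.items()}

    def forget(self, owner_id):
        """Deletion request: drop the user's shard and return the row count removed."""
        shard = self._shards.pop(owner_id, {})
        return sum(len(rows) for rows in shard.values())


store = UserShardedStore()
store.insert("alice", "orders", {"id": 1, "item": "book"})
store.insert("alice", "reviews", {"id": 7, "stars": 5})
store.insert("bob", "orders", {"id": 2, "item": "lamp"})

alice_data = store.access("alice")  # everything Alice owns, across tables
removed = store.forget("alice")     # removes Alice's rows, leaves Bob's intact
```

Because each user's rows live together, both requests reduce to a single lookup on the owner rather than a hand-written script that has to visit every table.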

One of my first contributions was a database proxy designed to make K9db compatible with existing applications that use a MySQL database. This enables developers using MySQL to gain the privacy benefits of K9db without having to rewrite their application code. The proxy can also be extended to support other widely used relational databases.

The proxy converts MySQL queries into C/C++-compatible types and calls K9db’s database API through a Foreign Function Interface (FFI). Query responses from K9db are converted back into MySQL-compatible types that the web application can use. The proxy supports multiple concurrent client connections: it is multi-threaded and synchronized. More details and performance benchmarks can be found in this research symposium presentation.

Figure: MySQL <-> K9db Proxy
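
The flow described above (convert values to C-compatible types, call the database, convert the response back, and do so safely for many clients at once) can be sketched in Python. This is a simplified illustration under stated assumptions, not the actual proxy: `FakeBackend` is a hypothetical stand-in for the API that would be reached over FFI, `Proxy`, `to_c_value`, and `from_c_value` are names I made up, and the single coarse lock is just one possible way to synchronize.

```python
import ctypes
import threading
from concurrent.futures import ThreadPoolExecutor


def to_c_value(value):
    """Convert a Python value into a C-compatible ctypes value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return ctypes.c_bool(value)
    if isinstance(value, int):
        return ctypes.c_int64(value)
    if isinstance(value, str):
        return ctypes.c_char_p(value.encode("utf-8"))
    raise TypeError(f"unsupported type: {type(value).__name__}")


def from_c_value(c_value):
    """Convert a ctypes value back into a Python value for the client."""
    raw = c_value.value
    return raw.decode("utf-8") if isinstance(raw, bytes) else raw


class FakeBackend:
    """Hypothetical stand-in for the database API reached over FFI."""

    def __init__(self):
        self.rows = []

    def execute(self, c_args):
        # Record the row and echo it back as the "query result".
        self.rows.append(c_args)
        return c_args


class Proxy:
    """Handles requests from many client threads against one backend."""

    def __init__(self, backend):
        self._backend = backend
        self._lock = threading.Lock()  # assumption: one coarse lock around backend calls

    def handle(self, params):
        c_args = [to_c_value(p) for p in params]  # client types -> C types
        with self._lock:                          # one backend call at a time
            c_result = self._backend.execute(c_args)
        return [from_c_value(c) for c in c_result]  # C types -> client types


proxy = Proxy(FakeBackend())
requests = [[i, f"user{i}"] for i in range(8)]

# Simulate concurrent client connections with a small thread pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    responses = list(pool.map(proxy.handle, requests))
```

Type conversion happens outside the lock so that only the backend call itself is serialized; a real proxy could use finer-grained synchronization.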

Variable Data Ownership

I implemented variable data ownership to support the privacy policies of Shuup, an open-source e-commerce platform that allows users to request anonymization of their data. More details can be found in this research paper.
