AntidoteDB (Bachelor / Master Theses)
Antidote is a planet-scale, highly available, transactional distributed key-value database with support for replicated data types (CRDTs). Antidote started in the Syncfree EU project and now continues as part of the Lightkone EU project. The development is led by Annette Bieniusa and includes contributions from many people around Europe. Antidote itself is written in Erlang, a programming language for concurrent and distributed systems. Therefore, topics extending the database itself require you to know or learn Erlang.
We are always looking for Bachelor and Master students who want to contribute to this open-source effort for their thesis. Please contact Annette Bieniusa if you are interested in working with Antidote. We often have additional topics related to Antidote, which are not listed below.
Further Reading:
- Conflict-free Replicated Data Types (CRDTs)
Nuno Preguiça, Carlos Baquero, Marc Shapiro
- A comprehensive study of Convergent and Commutative Replicated Data Types
Marc Shapiro, Nuno Preguiça, Carlos Baquero, Marek Zawirski
- Free book for learning Erlang: Learn You Some Erlang for Great Good!
- Paper about the main protocol behind Antidote:
Cure: Strong Semantics Meets High Availability and Low Latency
Deepthi Devaki Akkoorath, Alejandro Z. Tomsic, Manuel Bravo, Zhongmiao Li, Tyler Crain, Annette Bieniusa, Nuno M. Preguiça, Marc Shapiro
Topic 1: Deployment
Antidote is a geo-replicated cloud datastore that can be deployed across different data centers. In this thesis, you will simplify the deployment, configuration, and operation of Antidote using orchestration software such as Kubernetes.
Languages/technologies: Docker, Erlang scripts
Topic 2: Support for Elixir
Antidote offers bindings for a number of languages; the Erlang binding is the one covering the most functionality today. We want to investigate how we can make Antidote easily accessible to Elixir programmers.
Knowledge prerequisites: Basic knowledge of Erlang/Elixir
Topic 3: Security for Antidote
Databases often store sensitive information that needs to be secured against unauthorized access. We also need to guard against malicious modifications of metadata, which could otherwise crash the datastore.
For this thesis, you will extend the implementation to secure the communication between different nodes of Antidote and with clients.
Implementation language: Erlang
Topic 4: Property-based testing for Antidote
We are currently decomposing Antidote into separate components (persistent storage, cross-site communication, transaction management, frontend, etc.). These components are stateful and therefore difficult to cover with simple unit tests. In addition to the unit and system tests that are already in place, we want to add property-based tests.
Knowledge prerequisites: Knowledge of Erlang/OTP
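To illustrate what a property-based test for a stateful component looks like, here is a minimal sketch in Python (used for illustration only; the thesis itself would use Erlang, for example with a library such as PropEr). `ToyStore` is a hypothetical stand-in, not part of Antidote, and the random command generator is hand-rolled with the standard library. The property checked is that every read from the store agrees with a simple in-memory model, for any randomly generated command sequence.

```python
import random


class ToyStore:
    """Hypothetical stateful component under test (not Antidote's API)."""

    def __init__(self):
        self._log = []  # append-only list of (key, delta) updates

    def increment(self, key, delta):
        self._log.append((key, delta))

    def read(self, key):
        return sum(d for k, d in self._log if k == key)


def random_commands(rng, length):
    """Generate a random sequence of increment and read commands."""
    keys = ["a", "b", "c"]
    cmds = []
    for _ in range(length):
        if rng.random() < 0.5:
            cmds.append(("increment", rng.choice(keys), rng.randint(-5, 5)))
        else:
            cmds.append(("read", rng.choice(keys)))
    return cmds


def run_against_model(cmds):
    """Return True if the store agrees with a dict-based model on every read."""
    store, model = ToyStore(), {}
    for cmd in cmds:
        if cmd[0] == "increment":
            _, key, delta = cmd
            store.increment(key, delta)
            model[key] = model.get(key, 0) + delta
        else:
            _, key = cmd
            if store.read(key) != model.get(key, 0):
                return False
    return True


def check_property(trials=200, seed=0):
    """Return a failing command sequence, or None if all trials pass."""
    rng = random.Random(seed)
    for _ in range(trials):
        cmds = random_commands(rng, rng.randint(0, 30))
        if not run_against_model(cmds):
            return cmds
    return None


if __name__ == "__main__":
    print("counterexample:", check_property())
```

A dedicated property-based testing library would additionally shrink a failing command sequence to a minimal counterexample, which this sketch does not attempt.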
Topic 5: Implementing Sequence/JSON CRDTs
Antidote already includes many replicated data types in the Antidote CRDT library. There are different kinds of counters, flags, maps, registers, and sets. In this thesis, you will implement your own datatype for sequences. Since maps are already available, the sequence type can then be combined with them into a JSON datatype that can represent arbitrary JSON documents.
Implementation language: Erlang
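As a rough idea of what a replicated sequence has to achieve, the following Python sketch (for illustration only; it is not the Antidote CRDT library's interface, and the thesis itself would be written in Erlang) implements a deliberately simplified state-based sequence. Every element gets a unique, totally ordered identifier built from a fractional position, a replica name, and a local counter; deletions are recorded as tombstones; merging is a set union. All names here are hypothetical. The sketch converges, but it does not correctly handle an insert between two elements that were concurrently inserted at the same position, which is one of the problems a real design has to solve.

```python
from fractions import Fraction


class SeqReplica:
    """Toy state-based sequence CRDT (illustrative assumption, not Antidote code)."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counter = 0         # local counter, keeps identifiers unique
        self.elements = {}       # identifier -> value, only ever grows
        self.tombstones = set()  # identifiers of deleted elements

    def _visible_ids(self):
        return sorted(i for i in self.elements if i not in self.tombstones)

    def value(self):
        return [self.elements[i] for i in self._visible_ids()]

    def insert(self, index, value):
        """Insert value at the given local index (clamped to the valid range)."""
        ids = self._visible_ids()
        index = max(0, min(index, len(ids)))
        left = ids[index - 1][0] if index > 0 else Fraction(0)
        right = ids[index][0] if index < len(ids) else Fraction(1)
        self.counter += 1
        new_id = ((left + right) / 2, self.replica_id, self.counter)
        self.elements[new_id] = value

    def delete(self, index):
        """Mark the element at the given local index as deleted."""
        self.tombstones.add(self._visible_ids()[index])

    def merge(self, other):
        """Union of both states; commutative, associative, and idempotent."""
        self.elements.update(other.elements)
        self.tombstones |= other.tombstones


if __name__ == "__main__":
    a, b = SeqReplica("A"), SeqReplica("B")
    a.insert(0, "x")
    a.insert(1, "z")
    b.merge(a)           # both replicas now hold [x, z]
    a.insert(1, "y")     # replica A: [x, y, z]
    b.delete(0)          # replica B: [z]
    b.insert(1, "w")     # replica B: [z, w]
    a.merge(b)
    b.merge(a)
    print(a.value(), b.value())  # both print ['y', 'z', 'w']
```

After exchanging state in either order, both replicas show the same sequence, which is the convergence property any new datatype has to provide.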
Further Reading:
- Martin Kleppmann, Alastair R. Beresford:
A Conflict-Free Replicated JSON Datatype
- L. Briot, P. Urso, M. Shapiro:
High Responsiveness for Group Editing CRDTs
Topic 6: Implementing a Filesystem with Antidote
A distributed filesystem manages files and folders stored on multiple machines. Dropbox is a popular example of such a system. For this project, you should develop a distributed filesystem which is highly available and provides low latency. This means that it should be possible to perform operations on a single server without waiting for other servers. The system should still be running, even if all servers but one are down. Thankfully, Antidote already provides these properties, so you will build the distributed filesystem on top of Antidote.
One difficulty in this project is to handle concurrent updates in a meaningful way. For example, how should the system behave if a folder is moved on server A while, concurrently, a file in that folder is changed on server B?
Implementation language: Java, TypeScript, or Erlang
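The concurrent move and edit scenario described above can be made concrete with a small Python sketch (illustration only; all function names and data layouts are assumptions, not a prescribed design). It compares a store keyed by full path with a store keyed by stable node identifiers. With path keys, the final state depends on the order in which the two operations arrive; with identifier keys, the move and the write touch different fields and therefore commute. This covers only this one conflict. Other cases, such as two concurrent moves of the same folder, need further thought.

```python
def apply_path_op(files, op):
    """Apply an operation to a path-keyed store {path: content}; returns a new dict."""
    files = dict(files)
    if op[0] == "move":                      # ("move", old_prefix, new_prefix)
        _, old, new = op
        for path in list(files):
            if path == old or path.startswith(old + "/"):
                files[new + path[len(old):]] = files.pop(path)
    else:                                    # ("write", path, content)
        _, path, content = op
        files[path] = content
    return files


def apply_id_op(tree, op):
    """Apply an operation to an ID-keyed store; returns a new dict.

    tree maps node_id -> {"parent": id or None, "name": str, "content": str or None}.
    """
    tree = {node_id: dict(node) for node_id, node in tree.items()}
    if op[0] == "move":                      # ("move", node_id, new_parent, new_name)
        _, node_id, parent, name = op
        tree[node_id]["parent"] = parent
        tree[node_id]["name"] = name
    else:                                    # ("write", node_id, content)
        _, node_id, content = op
        tree[node_id]["content"] = content
    return tree


def path_of(tree, node_id):
    """Reconstruct the full path of a node by walking up to the root."""
    parts = []
    while node_id is not None:
        parts.append(tree[node_id]["name"])
        node_id = tree[node_id]["parent"]
    return "/" + "/".join(reversed(parts))


if __name__ == "__main__":
    # Path-keyed store: the two delivery orders disagree.
    files = {"/docs/a.txt": "v1"}
    move, write = ("move", "/docs", "/archive"), ("write", "/docs/a.txt", "v2")
    print(apply_path_op(apply_path_op(files, move), write))
    print(apply_path_op(apply_path_op(files, write), move))

    # ID-keyed store: both delivery orders give the same result.
    tree = {
        "d": {"parent": None, "name": "docs", "content": None},
        "f": {"parent": "d", "name": "a.txt", "content": "v1"},
    }
    move_id, write_id = ("move", "d", None, "archive"), ("write", "f", "v2")
    one = apply_id_op(apply_id_op(tree, move_id), write_id)
    two = apply_id_op(apply_id_op(tree, write_id), move_id)
    print(one == two, path_of(one, "f"), one["f"]["content"])
```

In the path-keyed variant, applying the move first leaves the old folder resurrected with the new file content, while applying the write first does not, so two servers would end up with different states.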
Further Reading:
- V. Tao, M. Shapiro, V. Rancurel:
Merging Semantics for Conflict Updates in Geo-Distributed File Systems
- M. Najafzadeh, M. Shapiro, P. Eugster:
Co-Design and Verification of an Available File System
- Mehdi Ahmed-Nacer, Stéphane Martin, Pascal Urso:
File system on CRDT