GSoC Project Ideas

Below, you can find some ideas on the directions in which we could jointly push Polypheny forward. Please consider them as starting points for your proposal. Of course, if you have other ideas, we would be very happy to hear them. Feel free to contact us beforehand to get feedback on what you plan to do.

Simply copying and pasting one of the ideas will not work. On the other hand, creating a completely new idea without first consulting the mentors might be difficult as well.

Query the Blockchain

A blockchain can be seen as a distributed append-only database. The aim of this project is to build a data source adapter for executing (read) queries against (different) blockchains like the Bitcoin blockchain or the Ethereum blockchain.

Because Polypheny can join and combine data from different adapters in a single query, this project will make it possible to integrate the latest data from a blockchain into arbitrary queries.

Difficulty: medium-hard

Quality Check and Assurance

A major problem in developing any kind of software is ensuring that a change does not introduce new bugs in a completely different subsystem. For a database system, this also includes ensuring the completeness and correctness of query results.

Continuously and automatically checking that a system behaves and works as expected is therefore important to ensure consistent software quality and to avoid regressions. Usually, this is done using unit tests and integration tests. While unit tests check that individual parts (units) of the code (typically individual methods) work as expected, integration tests check whether the whole application works correctly.
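The difference between the two kinds of tests can be illustrated with a minimal sketch. It uses Python's unittest module and an in-memory SQLite database purely as stand-ins; it is not taken from Polypheny's test suite, and the helper quote_identifier is a hypothetical function invented for this example.

```python
import sqlite3
import unittest


def quote_identifier(name: str) -> str:
    """Hypothetical helper: quote an SQL identifier, doubling embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


class QuoteIdentifierUnitTest(unittest.TestCase):
    """Unit test: exercises one function in isolation."""

    def test_embedded_quote_is_doubled(self):
        self.assertEqual(quote_identifier('a"b'), '"a""b"')


class QueryIntegrationTest(unittest.TestCase):
    """Integration test: runs a query end to end against a real engine."""

    def setUp(self):
        # SQLite is only a stand-in for the system under test.
        self.conn = sqlite3.connect(":memory:")
        table = quote_identifier("emp")
        self.conn.execute(f"CREATE TABLE {table} (id INTEGER, name TEXT)")
        self.conn.executemany(
            f"INSERT INTO {table} VALUES (?, ?)", [(1, "Ada"), (2, "Bob")]
        )

    def tearDown(self):
        self.conn.close()

    def test_result_is_complete_and_correct(self):
        rows = self.conn.execute(
            "SELECT id, name FROM emp ORDER BY id"
        ).fetchall()
        # Completeness (no missing rows) and correctness (right values).
        self.assertEqual(rows, [(1, "Ada"), (2, "Bob")])


def run_suite() -> unittest.TestResult:
    """Load both test classes and run them with a quiet text runner."""
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(QuoteIdentifierUnitTest))
    suite.addTests(loader.loadTestsFromTestCase(QueryIntegrationTest))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() executes both tests. The integration test would catch a regression in any layer between the query text and the returned rows, while the unit test only guards the single helper.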

Currently, Polypheny has only a few integration tests, which makes it hard to avoid regressions and unintended side effects. The aim of this project idea is to systematically add tests that cover as many features as possible. In particular, this means writing checks for the SQL query interface.

Difficulty: easy

Visualize It

In debugging mode, Polypheny creates a lot of log output on all decisions and optimizations taken while processing a query. This output is hard to read but contains a lot of useful and interesting information for developing and optimizing Polypheny.

In the query optimization process, several candidate plans are generated by applying optimization rules. Every candidate plan has a certain cost assigned. While Polypheny-UI contains support for visualizing query plans, this is currently only done for the selected plan and not for all candidate plans.

The idea of this project is to visualize this debugging information in the UI. Furthermore, the existing query plan visualization should be extended to allow browsing through the candidate plans including a list of the applied rules and the associated costs.
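As a rough sketch of the data such a view would need, the following Python snippet models candidate plans with their costs and applied rules, picks the cheapest one, and lists all candidates with the selected plan marked. The class, the field names, the rule names, and the cost values are hypothetical and do not reflect Polypheny's internal data structures.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class CandidatePlan:
    # All fields are illustrative assumptions, not Polypheny's actual model.
    description: str
    cost: float
    applied_rules: Tuple[str, ...] = ()


def select_plan(candidates: Sequence[CandidatePlan]) -> CandidatePlan:
    """Return the candidate with the lowest cost (first one wins on ties)."""
    if not candidates:
        raise ValueError("at least one candidate plan is required")
    return min(candidates, key=lambda plan: plan.cost)


def format_candidates(candidates: Sequence[CandidatePlan]) -> List[str]:
    """List all candidates by ascending cost; '*' marks the selected plan."""
    chosen = select_plan(candidates)
    lines = []
    for plan in sorted(candidates, key=lambda p: p.cost):
        marker = "*" if plan is chosen else " "
        rules = ", ".join(plan.applied_rules) or "(none)"
        lines.append(
            f"{marker} cost={plan.cost:.1f} {plan.description} [rules: {rules}]"
        )
    return lines


# Made-up example data for illustration only.
plans = [
    CandidatePlan("scan, then filter", 120.0),
    CandidatePlan("filter pushed into scan", 45.5, ("PushFilterIntoScan",)),
    CandidatePlan("index lookup", 60.0, ("UseIndex", "PushFilterIntoScan")),
]

for line in format_candidates(plans):
    print(line)
```

A UI would render the same information graphically, but the underlying list (plan, cost, applied rules, selected flag) stays the same.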

Difficulty: medium

Support for Contextual Query Language

The Contextual Query Language (CQL) is a formal language for representing queries to information retrieval systems such as search engines, bibliographic catalogs and museum collection information. The idea of this project is to add a read-only CQL query interface to Polypheny-DB.
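To give an impression of what such an interface has to do, here is a minimal Python sketch that translates a very small CQL subset into a parameterized SQL WHERE clause. It only handles search clauses of the form index relation "term", combined with and / or. Mapping CQL indexes directly to column names and the CQL = relation to SQL equality are simplifying assumptions made for this example. Since CQL evaluates boolean operators left to right with equal precedence, the sketch adds explicit parentheses.

```python
import re
from typing import List, Tuple

# One search clause: index, relation, quoted term (tiny subset of CQL).
_CLAUSE = re.compile(r'\s*(\w+)\s*(=|<>|<=|>=|<|>)\s*"([^"]*)"\s*')
# Boolean operator joining two clauses.
_BOOLEAN = re.compile(r"(and|or)\b", re.IGNORECASE)


def cql_to_sql_where(query: str) -> Tuple[str, List[str]]:
    """Translate a simplified CQL query into (where_clause, parameters)."""
    pos = 0
    params: List[str] = []
    expr = ""
    pending_op = ""
    while True:
        match = _CLAUSE.match(query, pos)
        if match is None:
            raise ValueError(f"expected a search clause at offset {pos}")
        index, relation, term = match.groups()
        # Assumption: the CQL index name is used directly as a column name.
        clause = f'"{index}" {relation} ?'
        params.append(term)
        # Left-to-right grouping, as CQL booleans share one precedence level.
        expr = f"({expr} {pending_op} {clause})" if expr else clause
        pos = match.end()
        if pos == len(query):
            return expr, params
        boolean = _BOOLEAN.match(query, pos)
        if boolean is None:
            raise ValueError(f"expected 'and' or 'or' at offset {pos}")
        pending_op = boolean.group(1).upper()
        pos = boolean.end()
```

For example, cql_to_sql_where('title = "fish" or title = "frog" and year >= "2000"') groups the or before the and. A real implementation would need a full parser covering prefixes, modifiers, named relations, and the remaining boolean operators.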

Difficulty: medium

Server-side Query to File

For some applications, especially those making use of the multimedia and file storage capabilities of Polypheny-DB, it is useful to represent and interact with a table (or the result of an arbitrary query) as a file system. With Query to File, we already have a prototype implementation of this concept that uses FUSE and runs on the client computer.

The idea of this project is to integrate this concept directly into Polypheny-DB. Instead of relying on an application running on the local machine, Polypheny-DB should provide an FTP or WebDAV share that can then be mounted on other machines.

Difficulty: medium

Support for Default Functions and Auto-Increment

Auto-incrementing a primary key and inserting the result of a function (e.g., the current timestamp) as a default value are common features in database systems. The idea of this project is to add support for these features in Polypheny-DB.
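For readers unfamiliar with these features, the following sketch shows how they behave in SQLite, accessed through Python's sqlite3 module. It only illustrates the general concept; the syntax that Polypheny-DB would eventually support is not specified here, and the table and column names are made up.

```python
import sqlite3


def demo():
    """Insert rows without an id or timestamp and let the database fill them in."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE event (
            id         INTEGER PRIMARY KEY AUTOINCREMENT,
            label      TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    # Only the label is supplied; id and created_at come from the database.
    conn.execute("INSERT INTO event (label) VALUES (?)", ("first",))
    conn.execute("INSERT INTO event (label) VALUES (?)", ("second",))
    rows = conn.execute(
        "SELECT id, label, created_at FROM event ORDER BY id"
    ).fetchall()
    conn.close()
    return rows


for row in demo():
    print(row)
```

Each inserted row receives the next integer key and the timestamp at insertion time, without the client having to compute either value.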

Difficulty: medium

Data Source Adapter for Excel Sheets

Data source adapters map existing data into the schema of Polypheny-DB. The mapped data can then be queried using the available query languages and features of Polypheny-DB. Furthermore, imported tables can be joined with other tables. Polypheny-DB currently supports mapping data from different JDBC databases and CSV files.

The goal of this project is to add an adapter similar to the CSV adapter, but for Excel files. You can also come up with your own idea for a data source adapter. Data source adapters do not necessarily need to support data modification queries.

Difficulty: easy-medium

Physical Query Plan Builder

The Polypheny-UI already comes with an integrated logical query plan builder. The aim of this project is to implement a graphical build tool for physical query plans similar to the one for logical query plans. In the polystore context, the physical query plan differs from the logical one in that it uses operators specific to the involved data store adapters.

A physical query plan builder would be extremely helpful for development. Because there is already an implementation for logical plans, parts of the code can be reused.

To get an impression of what a physical query plan in Polypheny looks like, you can simply execute a query in the UI and have a look at the plan by selecting “Physical Query Plan” in the left menu.

Difficulty: medium