"You should know your queries!" is the long version of the project title. It means that you should not merely want to have a database, but should also think about the evaluations (written down as queries) that you actually want to run against that database. Creating a database takes considerable effort: not only deploying the software on a computer, but even more so capturing all the data to fill it. This effort should be spent with a goal in mind. The project therefore collects queries, which can then even be used to design a database automatically. This saves resources and also supports the privacy goal of data minimization.
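The idea of deriving a database design from collected queries can be illustrated with a minimal sketch. It is not the project's actual method; the workload format, the table and column names, and the function `derive_schema` are all assumptions made for illustration. Each query simply lists the table columns it reads, and the derived schema keeps only columns that at least one query needs, which is one simple reading of data minimization.

```python
from collections import defaultdict

# Hypothetical workload: each named query lists the (table, column) pairs it reads.
QUERIES = {
    "monthly_revenue": [("orders", "order_date"), ("orders", "total")],
    "orders_per_customer": [("orders", "customer_id"), ("customers", "customer_id")],
}


def derive_schema(queries):
    """Return, per table, the sorted list of columns that at least one query reads."""
    schema = defaultdict(set)
    for accessed in queries.values():
        for table, column in accessed:
            schema[table].add(column)
    return {table: sorted(columns) for table, columns in schema.items()}


schema = derive_schema(QUERIES)
for table, columns in schema.items():
    print(table, columns)
```

Under these assumptions, a column such as a customer's date of birth would never be captured unless some collected query actually reads it.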
The world of data-management systems has become somewhat confusing in recent years. Alongside the well-established relational database systems, so-called NoSQL systems have been developed, which claim to cope with much larger data volumes. At the same time, they offer only limited functionality for efficient data access and give weaker consistency guarantees. This raises the question of when to stick with a relational database and when to move to a NoSQL system. This project collects the criteria that allow such a decision to be made on a well-founded basis.
The chair's sub-project within the EFRE project E|ASY-Opt deals with the storage of large datasets ("Big Data") produced by today's production processes. These data describe properties of products as well as of production machines, and it is assumed that valuable insights can be found in them, for instance with statistical methods or data mining. For that, the data must be stored in a way that allows them to be accessed with acceptable effort. The form of storage must therefore be adapted to the kinds of access. This is the task of the project.
The goal of this project is to provide novel hardware and optimisation techniques for scalable, high-performance processing of Big Data. We particularly target huge datasets with flexible schemas (row-oriented, column-oriented, document-oriented, irregular, and/or non-indexed) as well as data streams such as those found in click-stream analysis, in enterprise sources like e-mails, software logs and discussion-forum archives, and those produced by sensors in the Internet of Things (IoT) and in Industrie 4.0. In this realm, the project investigates the potential of hardware-reconfigurable, FPGA-based Systems-on-Chip (SoCs) for near-data processing, where computations are pushed towards such heterogeneous data sources. Based on FPGA technology, and in particular its dynamic reconfiguration, we propose a generic architecture called ReProVide for low-cost processing of database queries.
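The effect of pushing computation towards the data source can be sketched in plain software. The following is only a conceptual illustration and says nothing about how ReProVide is actually built or programmed; the function `scan`, the sample log records, and the predicate are hypothetical. It compares how many records leave the source when filtering and projection happen on the host versus next to the data.

```python
def scan(source, predicate=None, columns=None):
    """Simulate a storage node: optionally filter and project records before they leave it."""
    for record in source:
        if predicate is not None and not predicate(record):
            continue  # filtered out at the source, never transferred
        if columns is not None:
            record = {name: record[name] for name in columns}
        yield record


# Hypothetical software-log records held by the data source.
LOG = [
    {"id": 1, "level": "INFO", "msg": "start"},
    {"id": 2, "level": "ERROR", "msg": "disk full"},
    {"id": 3, "level": "INFO", "msg": "stop"},
]


def is_error(record):
    return record["level"] == "ERROR"


# Host-side processing: every record is transferred, then filtered and projected.
transferred_host = list(scan(LOG))
result_host = [{"id": r["id"], "msg": r["msg"]} for r in transferred_host if is_error(r)]

# Near-data processing: filter and projection run at the source.
transferred_near = list(scan(LOG, predicate=is_error, columns=["id", "msg"]))

print(len(transferred_host), "records transferred with host-side filtering")
print(len(transferred_near), "records transferred with near-data filtering")
```

Both variants produce the same query result; the difference in this toy setting is only the volume of data that has to be moved to the host.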