Langohr M (2020)
Publication Language: German
Publication Type: Thesis
Publication year: 2020
A constantly changing society must constantly face new problems. One problem that
has become increasingly important in recent years is the protection of personal data.
Large data-processing organizations want to know more and more about the people
interacting with them and gain new insights from this. The new General Data Protection
Regulation (GDPR) is intended to curb the resulting frenzy of personal-data collection.
However, implementing such laws in the everyday operations of
an organization is proving to be very difficult. Organizations still store most of their
data in relational database systems. In most cases, the Structured Query Language
(SQL) is used to extract information from this data. In theory, a data-protection
officer would have to decide for each SQL query whether it violates the GDPR or
not. Since this is not feasible in practice due to the high number and complexity of
the queries, a data-protection officer needs technical support.
In this work, a system consisting of a SQL parser and a rule language is designed
and implemented. SQL queries are parsed by Apache Calcite. The resulting logical
execution plan is then evaluated and persisted in a graph database using the Apache
TinkerPop3 Stack and the Object Graph Model (OGM) mapper Ferma. The rules
are formulated using a Domain-Specific Language (DSL) based on the graph traversal
language Gremlin. With this system, it is possible for a data-protection officer to
define rules that prohibit linking certain relations and attributes in certain ways. The
system presented in this thesis is based on an existing prototype, a query repository
(QRep). The QRep allows managing database queries and rules. It is also possible
to check queries automatically according to the previously defined rules. The system
can also prevent the future execution of illegal queries. The result of this work is the
extension of the prototype by a system for the creation of rules and the classification
of SQL queries based on these rules. The prototype is now able to support a large
part of the SQL language and process it accordingly. The rule language can also be
easily extended with additional rule components.
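The idea of a rule that prohibits linking certain relations can be illustrated with a minimal sketch. It does not use Apache Calcite, TinkerPop, Ferma, or the thesis's Gremlin-based DSL. Instead, it assumes a hand-built, simplified plan tree, and the names PlanNode, scanned_tables, violates_link_rule, and the tables patients and purchases are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class PlanNode:
    """Simplified stand-in for one operator of a logical execution plan."""
    op: str                                   # e.g. "Project", "Join", "TableScan"
    table: Optional[str] = None               # set only for "TableScan" nodes
    children: List["PlanNode"] = field(default_factory=list)


def scanned_tables(node: PlanNode) -> Set[str]:
    """Collect the names of all tables read in the subtree rooted at node."""
    tables: Set[str] = set()
    if node.op == "TableScan" and node.table is not None:
        tables.add(node.table)
    for child in node.children:
        tables |= scanned_tables(child)
    return tables


def violates_link_rule(node: PlanNode, forbidden: Tuple[str, str]) -> bool:
    """Return True if some Join combines both tables of the forbidden pair."""
    if node.op == "Join" and len(node.children) == 2:
        left = scanned_tables(node.children[0])
        right = scanned_tables(node.children[1])
        a, b = forbidden
        if (a in left and b in right) or (b in left and a in right):
            return True
    # Otherwise keep searching further down the plan.
    return any(violates_link_rule(child, forbidden) for child in node.children)


# Hypothetical plan for: SELECT ... FROM patients JOIN purchases ON ...
plan = PlanNode("Project", children=[
    PlanNode("Join", children=[
        PlanNode("TableScan", table="patients"),
        PlanNode("TableScan", table="purchases"),
    ])
])

is_illegal = violates_link_rule(plan, ("patients", "purchases"))
```

In this sketch a query is classified as illegal when a join operator has one forbidden relation on each side. The system described above expresses such conditions as graph traversals over the persisted plan rather than as recursive Python functions.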
APA:
Langohr, M. (2020). Konzeption und prototypische Implementierung einer regelbasierten Anfrageklassifizierung zur Einhaltung des Datenschutzes (Master thesis).
MLA:
Langohr, Maximilian. Konzeption und prototypische Implementierung einer regelbasierten Anfrageklassifizierung zur Einhaltung des Datenschutzes. Master thesis, 2020.