Inspired by recent successes in automating highly complex jobs such as programming and scientific experimentation, this project's ultimate goal is to automate the task of the data scientist when developing intelligent systems, namely extracting knowledge from data in the form of models. More specifically, the project seeks to develop the foundations of a theory and methodology for automatically synthesising inductive data models. An inductive data model (IDM) consists of:
While the DM can be used to retrieve information about the dataset and to answer questions about specific data points, the IMs can be used to make predictions, propose values for missing data, find inconsistencies and redundancies, etc. The task addressed in this project is to automatically synthesise such IMs from past data and to use them to support the user when making decisions. It is assumed that the dataset consists of a set of tables, that the end-user interacts with the IDM via a visual interface, and that the data scientist interacts with it via a unifying IDM language offering a number of core IMs and learning algorithms.
The key challenges to be tackled in SYNTH are:
The approach will be implemented as open-source software and evaluated in two challenging application areas: rostering and sports analytics.
Call identifier: ERC-ADG-2015
Project number: 694980
Duration: September 1, 2016 - August 31, 2021
Budget: around € 2,500,000