Profile description and modules
Data science, also known as knowledge discovery in databases (KDD), is an automated process for discovering new and interesting information in large quantities of data.
Artificial intelligence methods; machine learning methods; visual data analysis
The data science profile teaches the principles of data management and analysis, advanced methods of information preparation and visualisation, and the required basic principles of core computer science. In the seminars and internships that accompany the study programme, students learn these methods and apply them in practical contexts.
The area of data science is represented by two internationally known researchers (Professor Keim and Professor Berthold). Their many contacts in countries beyond Europe frequently give students the opportunity to spend part of their studies abroad, for example in the US.
A sample curriculum focusing on data mining could look as follows (lectures and their classification can change over time):
- Data mining 1
- Multimedia database systems
- Digital signal processing
- Introduction to economics (different department)
- Data mining 2
- Information visualisation 1
- Inorganic chemistry and analytical chemistry 1 (different department)
- Business intelligence: from reporting to analytics
- Algorithms for the analysis of large volumes of data
- Graph drawing
- Stochastics (different department)
- Text mining
- Master's project: Machine learning - implementing a hierarchical self-organising map
- Master's thesis in the field of data mining, machine learning, artificial intelligence, information visualisation or information retrieval, e.g. "Visual Clustering of Finance Arrays"
Research groups involved
Area of application
As the quantity and complexity of stored data from science and industry continue to increase, so does the need for intelligent methods of analysing this data, both automated and expert-supported. Due to the high demand for data mining, it has become an interface between a variety of research areas, such as machine learning, information visualisation, artificial intelligence and human-computer interaction. Naturally, the basic principles from the standard areas of computer science still apply, for instance with regard to databases, algorithms and software engineering.
Laboratories and features
A 5.20 m x 2.15 m Powerwall for the visualisation of huge quantities of data
KNIME, pronounced [naim], is a modular data exploration platform in which data flows, so-called "pipelines", are assembled visually. The pipelines are then executed, "pumping" the data through them, and the results can be inspected in interactive views of the data and models.
KNIME was (and continues to be) developed at the Chair for Bioinformatics and Information Mining. Michael Berthold's working group utilises this platform for teaching and research purposes. Almost all the data mining methods developed by the working group have been integrated into KNIME.
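The pipeline idea described above (processing steps combined into a data flow, data "pumped through", results inspected afterwards) can be illustrated with a minimal Python sketch. This is not KNIME's API: the function `run_pipeline`, the node names and the sample table are hypothetical and serve only to show the principle of chaining steps and keeping each intermediate result available for inspection.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only; not part of KNIME.
Table = List[dict]
Node = Tuple[str, Callable[[Table], Table]]


def run_pipeline(table: Table, nodes: List[Node]) -> Dict[str, Table]:
    """Push the table through each node in order and keep every intermediate result."""
    results: Dict[str, Table] = {}
    for name, step in nodes:
        table = step(table)
        results[name] = table
    return results


def drop_missing(table: Table) -> Table:
    """Remove rows whose 'value' field is missing."""
    return [row for row in table if row["value"] is not None]


def add_square(table: Table) -> Table:
    """Append a derived column holding the square of 'value'."""
    return [{**row, "square": row["value"] ** 2} for row in table]


# A two-node pipeline over a small, made-up table.
nodes = [("drop_missing", drop_missing), ("add_square", add_square)]
data = [{"id": 1, "value": 2}, {"id": 2, "value": None}, {"id": 3, "value": 3}]

results = run_pipeline(data, nodes)

# Inspect the output of every node, not only the final one.
for name, output in results.items():
    print(name, output)
```

Keeping the output of each node, rather than only the final table, mirrors the idea of inspecting results at any point in the flow.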