Data mining in multi-relation databases
Tools for knowledge discovery in relational databases typically operate on a single table. Unfortunately, the data needed for knowledge discovery is rarely confined to a single relation; it is usually spread over several. The relevant relations must then be joined to create a single relation, called a Universal Relation (UR). From a data mining point of view, however, this can cause problems, such as universal relations of unmanageable size. In this thesis, we consider the problem of knowledge discovery in multi-relation databases. In particular, we examine a knowledge discovery algorithm for multiple databases based on distributed decision tree induction, knowledge discovery algorithms based on primary and foreign keys, peculiar and surprising data, and the foreign set, which allows multi-relation mining without a primary or foreign key. Lastly, we propose extensions of these methods using the foreign set.
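The size problem with the Universal Relation can be illustrated with a minimal sketch. The relations, attribute names, and the `universal_relation` helper below are hypothetical and chosen only for illustration; they are not taken from the thesis. When one entity is linked to many rows in two independent relations, the join produces every combination of those rows.

```python
from itertools import product

def universal_relation(customers, orders, calls):
    """Join three toy relations on customer_id (hypothetical schema)."""
    rows = []
    for cid, name in customers:
        cust_orders = [item for c, item in orders if c == cid]
        cust_calls = [topic for c, topic in calls if c == cid]
        # Every order is paired with every call of the same customer.
        for item, topic in product(cust_orders, cust_calls):
            rows.append((cid, name, item, topic))
    return rows

# One customer with 10 orders and 10 support calls.
customers = [(1, "Ann")]
orders = [(1, f"item{i}") for i in range(10)]
calls = [(1, f"topic{i}") for i in range(10)]

ur = universal_relation(customers, orders, calls)
source_rows = len(customers) + len(orders) + len(calls)
print(source_rows, len(ur))  # 21 source rows become 100 joined rows
```

In this toy case the joined relation grows with the product of the per-customer row counts, while the source relations grow only with their sum.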