Course: Data Mining I

Course title Data Mining I
Course code USII/EDM1
Organizational form of instruction Lecture + Tutorial
Level of course Bachelor
Year of study 3
Semester Winter
Number of ECTS credits 10
Language of instruction English
Status of course Compulsory
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Kašparová Miloslava, Ing. Ph.D.
Course content
  • Introduction to data mining (DM): what DM is, taxonomy of DM methods, examples of DM use, methodologies, etc.
  • Phases and tasks of the CRISP-DM methodology: problem understanding, data understanding, data preparation, modeling, evaluation, deployment; hierarchical decomposition of CRISP-DM.
  • Data understanding: basic terms (data, data matrix, dependent and independent variables, data encoding, data classification, data file format, data dictionary, basic data visualization, data sources, etc.).
  • Data preparation for modeling: data manipulation with a focus on joining files, selection of records, filtering, generation of derived variables, aggregation, replacement of missing variable values, use of selected statistical methods for data preparation, etc.
  • Basics of model creation and evaluation: cluster analysis methods, multiple linear regression, models based on logistic regression, use of selected decision tree algorithms, etc.

Learning activities and teaching methods
Monologic (reading, lecture, briefing), Methods of individual activities, Monitoring, Demonstration
  • Term paper - 100 hours per semester
  • Preparation for an exam - 88 hours per semester
  • Home preparation for classes - 25 hours per semester
  • Contact teaching - 52 hours per semester
  • Independent critical reading - 15 hours per semester
  • Data/material collection - 20 hours per semester
Learning outcomes
The objective of the course is to acquaint students with the possibilities of data mining (DM). After an introductory part, the course covers the definition of objectives and techniques for DM, the selection of data sources and their preparation for modeling, and then modeling and evaluation.
Students will be able to define the individual phases of a data mining project and their content. Using software tools, they will be able to solve basic data preparation tasks and choose appropriate methods for model creation.
Prerequisites
The student is expected to have: the basics of working with databases within the scope of the subject Database Systems I (EDS1); basic knowledge of mathematics; basic data file processing, including basic editing of a text file in a text editor, within the scope of the subject Management Informatics I (EMI1); and basic knowledge of selected statistical methods within the scope of the subject Probability and Statistics (EPAS).

Assessment methods and criteria
Written examination, Home assignment evaluation, Work-related product analysis

Requirements for credit: attendance at seminars (75%), completion of the tasks assigned at seminars, and submission of the semester work according to the assignment.
Recommended literature
  • Albright, S. Christian. Data analysis & decision making. Mason: Thomson South-Western, 2006. ISBN 0-324-40086-1.
  • Berry, Michael J. A. Data mining techniques : for marketing, sales, and customer relationship management. Indianapolis: Wiley, 2004. ISBN 0-471-47064-3.
  • Berry, Michael J. A. Mastering data mining. New York: John Wiley & Sons, 2000. ISBN 0-471-33123-6.
  • Pyle, D. Data preparation for data mining. San Diego: Academic Press, 1999. 540 p.
  • Wendler, Tilo. Data mining with SPSS Modeler. Cham: Springer International Publishing, 2016. ISBN 978-3-319-28707-2.

