|
Lecturer(s)
|
-
Kašparová Miloslava, Ing. Ph.D.
|
|
Course content
|
Introduction to data mining (DM): what DM is, taxonomy of DM methods, examples of DM use, methodologies, etc. Phases and tasks of the CRISP-DM methodology (problem understanding, data understanding, data preparation, modeling, evaluation, deployment; hierarchical decomposition of CRISP-DM). Data understanding (basic terms: data, data matrix, dependent and independent variables, data encoding, data classification, data file format, data dictionary, basic data visualization, data sources, etc.). Data preparation for modeling (data manipulation with a focus on joining files, selection of records, filtering, generation of derived variables, aggregation, replacement of missing variable values, use of selected statistical methods for data preparation, etc.). Basics of model creation and evaluation (cluster analysis methods, multiple linear regression, models based on logistic regression, use of selected decision tree algorithms, etc.).
|
|
Learning activities and teaching methods
|
Monologic (reading, lecture, briefing), Methods of individual activities, Monitoring, Demonstration
- Term paper: 100 hours per semester
- Preparation for an exam: 88 hours per semester
- Home preparation for classes: 25 hours per semester
- Contact teaching: 52 hours per semester
- Independent critical reading: 15 hours per semester
- Data/material collection: 20 hours per semester
|
|
Learning outcomes
|
The objective of the course is to acquaint students with the possibilities of data mining (DM). After an introductory part, the course covers the definition of objectives and techniques for DM, the selection of data sources and their preparation for modeling, and finally modeling and evaluation.
Students will be able to define the individual phases of a data mining project and their content. They will be able to use software tools to solve basic data preparation tasks and to choose appropriate methods for creating a model.
|
|
Prerequisites
|
The student is expected to have: the basics of working with databases within the scope of the subject Database Systems I (EDS1); basic knowledge of mathematics; the ability to do basic processing of a data file within the scope of the subject Management Informatics I (EMI1), including basic editing of a text file in a text editor; and basic knowledge of selected statistical methods within the scope of the subject Probability and Statistics (EPAS).
|
|
Assessment methods and criteria
|
Written examination, Home assignment evaluation, Work-related product analysis
Requirements for credit: attendance at seminars (75%), completion of the tasks assigned at seminars, and submission of the semester work according to the assignment.
|
|
Recommended literature
|
-
Albright, S. Christian. Data analysis & decision making. Mason: Thomson South-Western, 2006. ISBN 0-324-40086-1.
-
Berry, Michael J. A. Data mining techniques : for marketing, sales, and customer relationship management. Indianapolis: Wiley, 2004. ISBN 0-471-47064-3.
-
Berry, Michael J. A. Mastering data mining. New York: John Wiley & Sons, 2000. ISBN 0-471-33123-6.
-
Pyle, D. Data preparation for data mining. San Diego: Academic Press, 1999. 540 p.
-
Wendler, Tilo. Data mining with SPSS Modeler. Cham: Springer International Publishing, 2016. ISBN 978-3-319-28707-2.
|