|
Lecturer(s)
|
-
Kašparová Miloslava, Ing. Ph.D.
|
|
Course content
|
Introduction to data mining (DM): what DM is, taxonomy of DM methods, examples of DM use, methodologies, etc. Phases and tasks of the CRISP-DM methodology (problem understanding, data understanding, data preparation, modeling, evaluation, deployment, hierarchical decomposition of CRISP-DM). Data understanding (basic terms: data, data matrix, dependent and independent variables, data encoding, data classification, data file format, data dictionary, basic data visualization, data sources, etc.). Data preparation for modeling (data manipulation with a focus on joining files, selection of records, filtering, generation of derived variables, aggregation, replacement of missing variable values, use of selected statistical methods for data preparation, etc.). Basics of model creation and evaluation (cluster analysis methods, multiple linear regression, models based on logistic regression, use of selected decision tree algorithms, etc.).
|
|
Learning activities and teaching methods
|
|
Monologic (reading, lecture, briefing), Methods of individual activities, Monitoring, Demonstration
|
|
Learning outcomes
|
The aim of the course is to acquaint students with the possibilities of data mining. The introductory part of the course is followed by a presentation of the aims and techniques of data mining. Next, the selection of data sources and their preparation for modelling are explained, followed by the creation of models and their evaluation.
Students will be able to define the individual phases of a data mining project and their content. Using software tools, they will be able to solve basic data preparation tasks and choose appropriate methods for creating a model.
|
|
Prerequisites
|
The student is expected to know the basics of working with databases within the scope of the subject Database Systems I (CDS1), to have basic knowledge of mathematics, to be able to perform basic processing of a data file within the scope of the subject Management Informatics I (CMI1), including basic editing of a text file in a text editor, and to have basic knowledge of selected statistical methods within the scope of the subject Probability and Statistics (CPAS).
|
|
Assessment methods and criteria
|
Oral examination, Written examination, Home assignment evaluation, Work-related product analysis
The assignment is granted upon completion of the given seminar tasks (a minimum of 60 percent is required) and submission of the seminar paper. Assessment methods: oral and written. The oral examination is based on the defence of the seminar paper. The final assessment consists of the following proportions: work at seminars, 40 percent; defence of the seminar paper and answers to the examiner's questions, 60 percent. A written examination may also be considered.
|
|
Recommended literature
|
-
Berka, Petr. Dobývání znalostí z databází. Praha: Academia, 2003. ISBN 80-200-1062-9.
-
Berry, Michael J. A. Data mining techniques : for marketing, sales, and customer relationship management. Indianapolis: Wiley, 2004. ISBN 0-471-47064-3.
-
Berry, Michael J. A. Mastering data mining. New York: John Wiley & Sons, 2000. ISBN 0-471-33123-6.
-
Petr, Pavel. Data Mining. Pardubice: Univerzita Pardubice, 2006. ISBN 80-7194-886-1.
-
Petr, Pavel. Metody Data Miningu. Pardubice: Univerzita Pardubice, 2014. ISBN 978-80-7395-872-5.
-
Petr, Pavel. Metody Data Miningu. Pardubice: Univerzita Pardubice, 2015. ISBN 978-80-7395-873-2.
-
Pyle, D. Data Preparation for Data Mining. San Diego: Academic Press, 1999. 540 pp.
-
Rud, O. L. Data Mining - Praktický průvodce dolováním dat pro efektivní prodej, cílený marketing a podporu zákazníků (CRM). Praha: Computer Press, 2001. 330 pp.
-
Wendler, T., Gröttrup, S. Data Mining with SPSS Modeler. 2016.
|