Course: Data Mining I

Course title Data Mining I
Course code USII/FDM1
Organizational form of instruction Lecture + Tutorial
Level of course Bachelor
Year of study 3
Semester Winter
Number of ECTS credits 10
Language of instruction Czech
Status of course Compulsory, Compulsory-optional
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Šanda Martin, Ing. Ph.D.
  • Kašparová Miloslava, Ing. Ph.D.
Course content
  • Introduction to data mining (DM): what DM is, taxonomy of DM methods, examples of DM use, methodologies, etc.
  • Phases and tasks of the CRISP-DM methodology: problem understanding, data understanding, data preparation, modeling, evaluation, deployment; hierarchical decomposition of CRISP-DM.
  • Data understanding: basic terms (data, data matrix, dependent and independent variables, data encoding, data classification, data file format, data dictionary), basic data visualization, data sources, etc.
  • Data preparation for modeling: data manipulation with a focus on joining files, selecting records, filtering, generating derived variables, aggregation, replacing missing variable values, using selected statistical methods for data preparation, etc.
  • Basics of model creation and evaluation: cluster analysis methods, multiple linear regression, models based on logistic regression, selected decision tree algorithms, etc.
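As an illustration of the data preparation topics listed above, the following is a minimal sketch in plain Python. It is not course material: the course does not prescribe Python, and the toy records, the field names (`customer_id`, `region`, `age`, `quantity`, `unit_price`), the `prepare` function, and the `min_amount` threshold are all hypothetical. It shows replacing a missing value with the mean, joining two record sets, generating a derived variable, filtering, and aggregating.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data standing in for two source files.
customers = [
    {"customer_id": 1, "region": "North", "age": 34},
    {"customer_id": 2, "region": "South", "age": None},  # missing value
    {"customer_id": 3, "region": "North", "age": 52},
]
orders = [
    {"customer_id": 1, "quantity": 2, "unit_price": 10.0},
    {"customer_id": 1, "quantity": 1, "unit_price": 25.0},
    {"customer_id": 2, "quantity": 5, "unit_price": 4.0},
    {"customer_id": 3, "quantity": 1, "unit_price": 3.0},
]


def prepare(customers, orders, min_amount=5.0):
    """Impute, join, derive, filter, and aggregate the toy records."""
    # Replacement of missing values: use the mean of the observed ages.
    known_ages = [c["age"] for c in customers if c["age"] is not None]
    mean_age = mean(known_ages)
    by_id = {
        c["customer_id"]: {**c, "age": mean_age if c["age"] is None else c["age"]}
        for c in customers
    }

    rows = []
    for order in orders:
        customer = by_id.get(order["customer_id"])
        if customer is None:
            continue  # joining: keep only orders with a matching customer
        amount = order["quantity"] * order["unit_price"]  # derived variable
        if amount < min_amount:
            continue  # filtering: drop small orders
        rows.append({**customer, **order, "amount": amount})

    # Aggregation: total order amount per region.
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return rows, dict(totals)


rows, totals = prepare(customers, orders)
```

Mean imputation is only one of several possible replacement strategies; it is used here because it is the simplest to show.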

Learning activities and teaching methods
Monologic (reading, lecture, briefing), Methods of individual activities, Monitoring, Demonstration
  • Term paper - 100 hours per semester
  • Independent critical reading - 15 hours per semester
  • Data/material collection - 20 hours per semester
  • Home preparation for classes - 25 hours per semester
  • Preparation for an exam - 88 hours per semester
  • Contact teaching - 52 hours per semester
Learning outcomes
The aim of the course is to acquaint students with the possibilities of data mining. The introductory part is followed by the definition of the aims and techniques of data mining. The course then explains the selection of data sources and their preparation for modeling, as well as the creation of models and their evaluation.
Students will be able to define the individual phases of a data mining project and their content. Using software tools, they will be able to solve basic data preparation tasks and choose appropriate methods for model creation.
Prerequisites
The student is expected to have: the basics of working with databases within the scope of the subject Database Systems I (FDS1); basic knowledge of mathematics; basic skills in processing a data file within the scope of the subject Management Informatics I (FMI1), including basic editing of a text file in a text editor; and basic knowledge of selected statistical methods within the scope of the subject Computerized Business Data Processing (FPZD) or Probability and Statistics (FPAS).

Assessment methods and criteria
Written examination, Home assignment evaluation, Work-related product analysis

Requirements for credit: 75% attendance at seminars, completion of the tasks assigned at seminars, and submission of the semester work according to the assignment.
Recommended literature
  • Berka, Petr. Dobývání znalostí z databází. Praha: Academia, 2003. ISBN 80-200-1062-9.
  • Berry, Michael J. A. Data mining techniques : for marketing, sales, and customer relationship management. Indianapolis: Wiley, 2004. ISBN 0-471-47064-3.
  • Berry, Michael J. A. Mastering data mining. New York: John Wiley & Sons, 2000. ISBN 0-471-33123-6.
  • Petr, Pavel. Data Mining. Pardubice: Univerzita Pardubice, 2006. ISBN 80-7194-886-1.
  • Petr, Pavel. Metody Data Miningu. Pardubice: Univerzita Pardubice, 2014. ISBN 978-80-7395-872-5.
  • Petr, Pavel. Metody Data Miningu. Pardubice: Univerzita Pardubice, 2015. ISBN 978-80-7395-873-2.
  • Pyle, D. Data Preparation for Data Mining. San Diego: Academic Press, 1999. 540 pp.
  • Rud, O. L. Data Mining - Praktický průvodce dolováním dat pro efektivní prodej, cílený marketing a podporu zákazníků (CRM). Praha: Computer Press, 2001. 330 pp.
  • Wendler, T., Gröttrup, S. Data Mining with SPSS Modeler. 2016.

