Course: Unstructured data processing

Course title Unstructured data processing
Course code USII/KZND
Organizational form of instruction Lecture
Level of course Master
Year of study 1
Semester Summer
Number of ECTS credits 4
Language of instruction Czech
Status of course Compulsory-optional
Form of instruction Face-to-face
Work placements This is not an internship
Recommended optional programme components None
Lecturer(s)
  • Hájek Petr, prof. Ing. Ph.D.
Course content
  • The role and specifics of unstructured data
  • Possibilities of inverted index design
  • Dictionary-based models
  • Statistical approaches to unstructured data extraction
  • Relation extraction from unstructured data
  • Semantic annotation and ontologies
  • Visual and text information extraction from images
  • Models for information retrieval from images
  • Models for automatic speech recognition
  • Evaluation of the quality of unstructured data processing

Learning activities and teaching methods
Monologic (reading, lecture, briefing), Dialogic (discussion, interview, brainstorming), Work with text (with textbook, with book), Methods of individual activities, Laboratory work
Learning outcomes
The course aims to develop a general understanding of the fundamental methods for processing unstructured data, in particular text documents, images, and audio data. This processing yields structured, or at least semi-structured, data, which enables further analyses, including data visualisation and knowledge discovery.
Students will understand both the theoretical and the practical aspects of unstructured data processing and of information retrieval from such data. They will also be able to design systems for automatic unstructured data processing.
Prerequisites
Basic skills in using a PC and MS Excel.

Assessment methods and criteria
Oral examination, Systematic monitoring

Assignment: successful completion of the assigned tasks with a score of at least 60%, and successful defense of a practical project that covers the theoretical knowledge gained in this course and includes the design of a system for automatic processing of a selected set of unstructured data. Examination: oral examination. Detailed information will be provided during the first lecture and in Stag.
Recommended literature
  • AUGER, A., BARRIÈRE, C. Pattern-based Approaches to Semantic Relation Extraction: A State-of-the-art. 2008.
  • BOULTON, D., HAMMERSLEY, M. Analysis of Unstructured Data. London, 2006.
  • DATTA, R., JOSHI, D., LI, J., WANG, J. Z. Image Retrieval: Ideas, Influences, and Trends of the New Age. 2008.
  • GRIMM, M., KROSCHEL, K. Robust Speech Recognition and Understanding. Vienna, 2007.
  • HEATH, T., BIZER, CH. Linked Data: Evolving the Web into a Global Data Space. 2011.
  • MANNING, C. D. Foundations of Statistical Natural Language Processing. Cambridge, 1999.
  • MANNING, CH. D., RAGHAVAN, P., SCHÜTZE, H. Introduction to Information Retrieval. New York, 2008.
  • MINER, G. Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications. Amsterdam, 2012.


Study plans that include the course
Faculty: Faculty of Economics and Administration
Study plan (Version): Informatics in Public Administration (2014)
Category of Branch/Specialization: Economy 1
Recommended year of study: 1
Recommended semester: Summer