Course information: ACIT4530, Autumn 2021
ACIT4530 Data Mining at Scale: Algorithms and Systems Course Description
 English course name
 Data Mining at Scale: Algorithms and Systems
 Study programme

Master's Programme in Applied Computer and Information Technology
 Credits
 10 ECTS credits
 Academic year
 2021/2022

Introduction
We are witnessing the era of big data, in which data is generated, collected, and processed at an unprecedented scale, and data-driven decisions influence many aspects of modern life.
Data mining is the process of discovering patterns in large data sets involving methods in statistics and database systems.
Many applications, such as IoT sensors, generate large data streams. Mining these streams and learning from the data is becoming increasingly important and urgent.
Extracting knowledge from data sets requires not only computational power but also programming abstractions and analytical skills. In this course, students will be exposed to different approaches to data mining and stream processing, such as association rule learning, anomaly detection, data clustering, visualization, and extracting statistical features on the fly from large data streams. Students will also be exposed to different data mining systems, including the landscape of MapReduce and the ecosystem it spawned, such as Spark and its contemporaries. With a focus on data mining applications, we will study some powerful numerical linear algebra methods.
Recommended prior knowledge
It is an advantage to have some experience with the following subjects:
 Mathematical Analysis
 Basic programming, such as scripting
 Statistics, specifically probability theory
Required prior knowledge
No formal requirements over and above the admission requirements.
Learning outcomes
The student should have the following outcomes upon completing the course:
Knowledge
Upon successful completion of the course, the student:
 has a deep understanding of how data mining can be used to extract knowledge from data sets.
 has advanced knowledge of the different data mining algorithms.
 can use data mining systems to mine data.
Skills
Upon successful completion of the course, the student:
 can design and implement data mining algorithms
 can deploy different data mining systems and configure them
 can utilize a specialized library for data mining
General competence
Upon successful completion of the course, the student:
 can analyse data mining solutions with regard to robustness and in relation to their intended tasks
 can explain how data mining can be used in different application areas such as business analytics
Content
 Data streaming systems
 Data mining systems and big data platforms
 Data stream processing methods, such as, but not limited to, anomaly detection, clustering, association rule learning
 Data visualization
 Statistical analysis on large data sets
 Linear algebra applied to big data
 Using programming to implement analysis and toolchaining
Teaching and learning methods
This course is divided into two parts. The first part focuses on the principles of data mining and stream processing. Seminars will be given on the methodological aspects of data mining and stream processing, as well as the programming paradigms and software tools that enable them.
The second part will focus on the students completing a programming project. The project can be chosen from a portfolio of available problems. The student will work in a group on the project and submit a final codebase with a report.
During this part, there may be lectures if needed, but most of the time will be spent on individual supervision of students in lab sessions.
Practical training
Lab sessions.
Coursework requirements and mandatory activities
None.
Assessment and exam
Group project (2-4 students) (15 000-17 500 words)
The exam can be appealed.
New/postponed exam
In case of a failed exam or legal absence, the student may apply for a new or postponed exam. New or postponed exams are offered within a reasonable time span following the regular exam. The student is responsible for applying for a new/postponed exam within the time limits set by OsloMet. The regulations for new or postponed examinations are available in Regulations relating to studies and examinations at OsloMet.
Permitted exam materials and equipment
All aids are permitted.
Grading scale
For the final assessment a grading scale from A to E is used, where A denotes the highest and E the lowest pass grade, and F denotes a fail.
Examiners
Two internal examiners. An external examiner is used periodically.
Course coordinator
Professor Anis Yazidi