
ACIT4530 Data Mining at Scale: Algorithms and Systems Course description

English course title
Data Mining at Scale: Algorithms and Systems
Credits
10.0 ECTS credits
Academic year
2021/2022
  • Introduction

    We are witnessing the era of big data, in which data is generated, collected, and processed at an unprecedented scale and data-driven decisions influence many aspects of modern life.

    Data mining is the process of discovering patterns in large data sets involving methods in statistics and database systems.

    Many applications, such as IoT sensors, generate large volumes of streaming data. Mining data streams and learning from them is therefore becoming increasingly important and urgent.

    Extracting knowledge from data sets requires not only computational power but also programming abstractions and analytical skills. In this course, students will be exposed to different approaches to data mining and stream processing, such as association rule learning, anomaly detection, data clustering, visualization, and extracting statistical features on the fly from large data streams. Students will also be exposed to different data mining systems, including the landscape of MapReduce and the ecosystem it spawned, such as Spark and its contemporaries. With a focus on data mining applications, we will study some powerful numerical linear algebra methods.

  • Recommended prior knowledge

    All modules will be taught as lectures / seminars with assignments for students. Throughout the course, students will work on an individual essay from their own specialization topic. The essay will contain:

    • a literature survey
    • a discussion on the methods applied by the researchers in the material reviewed in their survey
    • a discussion on the ethical challenges related to their topic, both with regard to the research applied and relative to uses in society

    The essay will give the student the opportunity to tie together all modules in this course into a cohesive document.

  • Required prior knowledge

    No formal requirements over and above the admission requirements.

  • Learning outcomes

    The student should have the following outcomes upon completing the course:

    Knowledge

    Upon successful completion of the course, the student:

    • has a deep understanding of how data mining can be used to extract knowledge from data sets
    • has advanced knowledge of the different data mining algorithms
    • can use data mining systems to mine data

    Skills

    Upon successful completion of the course, the student:

    • can design and implement data mining algorithms
    • can deploy different data mining systems and configure them
    • can utilize a specialized library for data mining

    General competence

    Upon successful completion of the course, the student:

    • can analyse data mining solutions with regard to robustness and in relation to his/her intended tasks
    • can explain how data mining can be used in different application areas such as business analytics
  • Content

    The following required coursework must be approved before the student can take the exam:

    Two mandatory assignments:

    • One recorded presentation of between 10 and 15 minutes
    • A spreadsheet containing the results of a research survey, covering at least 7 relevant research papers and including columns for relevant meta-information
  • Teaching and learning methods

    This course is divided into two parts. The first part focuses on the principles of data mining and stream processing. Seminars will be given on the methodological aspects of data mining and stream processing, as well as on the programming paradigms and software tools that enable them.

    The second part focuses on the students completing a programming project. The project can be chosen from a portfolio of available problems. The students will work in groups on the project and submit a final code base with a report.

    During this part, there may be lectures if needed, but most of the time will be spent on individual supervision of students in lab sessions.

    Practical training

    Lab sessions.

  • Coursework requirements and compulsory activities

    None.

  • Assessment and examination

    Group project (2-4 students) (15 000 - 17 500 words)

    The exam can be appealed.

    New/postponed exam

    In case of a failed exam or valid absence, the student may apply for a new or postponed exam. New or postponed exams are offered within a reasonable time span following the regular exam. The student is responsible for applying for a new or postponed exam within the time limits set by OsloMet. The rules for new or postponed examinations are set out in the Regulations relating to studies and examinations at OsloMet.

  • Permitted examination aids

    All aids are permitted.

  • Grading scale

    This course offers an introduction to the practice of writing and reading academic literature. Communicating properly, both in written and oral form, is a cornerstone of research. This course therefore also contains an element of rhetoric and writing techniques. The course will cover common concepts from research methods, such as qualitative and quantitative methods. A brief introduction to the philosophy of science helps put the wide variety of research in context. Finally, the course introduces research and professional ethics.

  • Examiners

    No formal requirements over and above the admission requirements.

  • Course coordinator

    A student who has completed this course should have the following learning outcomes defined in terms of knowledge, skills and general competence:

    Knowledge

    On successful completion of this course the student:

    • has thorough knowledge of writing in research processes
    • has advanced knowledge of forums and channels in which research results are published
    • has thorough knowledge of the formal academic writing conventions
    • has a thorough understanding of the common research methods within either qualitative or quantitative research

    Skills

    On successful completion of this course the student:

    • can find research results in literature databases
    • can analyse and critically evaluate various information sources
    • can write summaries using his/her own words
    • can formulate scientific reports
    • can use electronic reference tools
    • can carry out objective and constructive peer reviews on written work
    • can identify the research method used in a scientific text
    • can explain the application of a research method in a research project
    • can discuss and compare research approaches in the domain of quantitative or qualitative methods

    General competence

    On successful completion of this course the student:

    • can identify research fraud and plagiarism
    • has thorough knowledge of the responsibilities of authorship and co-authorship in accordance with the Vancouver Convention