====== Enterprise Data Analytics ======
===== News =====
**Exam 2020-07-30 results**: partial results are available [[https://drive.google.com/file/d/1DJvK3TXEe3ckvaSkH5yheyzDOzXvwSp5/view?usp=sharing | here]]. Complete results will be available after project evaluation.
**Exam rules update**: students must confirm their attendance at upcoming exam sessions at least 5 days before the exam date. Confirmation must be sent by e-mail to the teacher; ESSE3 registration alone is not sufficient.
**Exam 2020-07-09 results**: partial results are available [[https://drive.google.com/file/d/1VsmSYfgxvbM4LzpnA8gv2NHWzjtf-kfL/view?usp=sharing | here]]. Complete results will be available after project evaluation.
**Exam 2020-06-19 results**: partial results are available [[https://drive.google.com/file/d/1YIoCLpzknaFLJ54NuGM0o7xPIM5sCuBi/view?usp=sharing | here]]. Complete results will be available after project evaluation.
**Exam update**: exam rules are available [[https://docs.google.com/document/d/1WiqSQSDoQeeFMI1wVZ8hZE4zP17Te3u4zXyryqO0y64/edit?usp=sharing | here]]. Note that the exam will be held via Webex and students must have Acrobat Reader installed.
**Course update**: students are asked to complete the assessment questionnaire for this course.
**Material update**: the Other materials section now includes the EDA projects from A.Y. 2018-19.
**Project update**: the projects and the IoT dataset have been updated.
**Project update**: a draft list of available projects is [[https://docs.google.com/spreadsheets/d/1hGmHKzxT1qkP1gH9NielVhTGMRZPOtZi7B2liKJ3pKM/edit#gid=0|here]].
**Google Meet**: due to possible problems with Webex, lessons will be held via Google Meet. Instructions are [[https://drive.google.com/file/d/1H7QKgkeSFgWw_ydgSP8X0a_5Ado6JfJ3/view?usp=sharing|here]]. Room name ''edaunicam2020''.
**Course start**: the first lesson is planned for 12 March 2020 and will be held only in the Webex room.
----
===== General Info =====
**Teacher**:
* Dr. Massimo Callisto De Donato
**ESSE3 Link**
* ---
**Webex Link**
* https://unicam.webex.com/meet/massimo.callisto
**Lessons schedule**:
* 42 h - lectures, exercise sessions
* Thursdays from 14:00 to 18:00
* LB1 room
* Schedule (tentative)
- 12-mar-20 - 14:00-18:00
- 19-mar-20 - 14:00-18:00
- 26-mar-20 - 14:30-17:30
- 02-apr-20 - 14:00-18:00
- 09-apr-20 - 14:00-18:00
- 16-apr-20 - 14:00-17:00
- 23-apr-20 - 14:00-18:00
- 30-apr-20 - 14:30-17:30
- 07-may-20 - 14:00-18:00
- 14-may-20 - 14:00-18:00
- 21-may-20 - 14:00-18:00
- 28-may-20 - 14:30-16:30
**Student office hours**:
* Send an e-mail to the teacher to arrange an appointment.
----
===== Course Objectives =====
* Knowledge of business data analysis and modern scenarios such as the Internet of Things.
* Understanding of the main differences between classical data analysis methods and modern scenarios.
* Knowledge of and expertise in Big Data methodologies and technologies: basic principles, concepts, and the techniques that enable data analysis and management.
* Knowledge of the main Big Data technological frameworks and their application in real case studies.
* Familiarity with selected Intelligent Data Analysis techniques.
----
===== Course Contents =====
* Acquire knowledge and competence in Big Data methodologies, techniques and technologies.
* Know the most common Big Data analysis techniques and how they apply to real-world examples.
* Apply Big Data analysis techniques to practical case studies.
----
===== Syllabus =====
* Introduction to enterprise and data management.
* Analysis of scenarios and contexts of data generation: from the inter-organizational model to the Internet of Things.
* Introduction to the classic data analysis techniques: ETL (Extract, Transform, Load), Business Intelligence, Reporting Tools.
* Methodologies and technologies for the management and analysis of large amounts of data: introduction to the Big Data model, concepts, principles and technological frameworks.
* Data Analytics methodologies and techniques: batch analysis models, streaming computation.
* Data Analytics evolution towards intelligent data understanding models.
----
===== Study material =====
**Course Slides**
* Slides {{ https://drive.google.com/drive/folders/1dlVI_UJTVWd5WmJnPxRLQFFmlyd2uevL?usp=sharing | link}}
* Webex {{ https://docs.google.com/document/d/1yImTJQ7SM7-0mk-H3fpmreAdMtstrcKLnu1K3BX4QMw/edit?usp=sharing | doc with links}}
* Projects {{ https://docs.google.com/spreadsheets/d/1hGmHKzxT1qkP1gH9NielVhTGMRZPOtZi7B2liKJ3pKM/edit?usp=sharing | link}} | {{ https://drive.google.com/drive/folders/1Sop3tGrQng0dcjKDSCocArHXsMsIKhed?usp=sharing | submitted projects}}
* **Reference materials**
* Course slides.
* Material provided by the teacher.
* Examples
* [[https://drive.google.com/drive/folders/1swSIC7deh1X95j_cZKqJl3wBDXEyQlbk?usp=sharing | IoT Dataset]]
* [[https://drive.google.com/drive/folders/18GUqT-eMNx6zYN9f5dtTPXC3maBZdtrB?usp=sharing | Examples]]
* [[https://drive.google.com/file/d/12yW6sbKlSw4gpc0nPC6-ze1YNPG1KcE1/view?usp=sharing | PySpark Dataframe reference]]
* [[https://drive.google.com/drive/folders/1E4XHrTfmIsEGHo9G7IHMMdUf5iOO15_3?usp=sharing | Spark Standalone cluster installation guide]]
* Other materials
* [[https://shorturl.at/msI09 | Docker Deep Dive]]
* [[https://shorturl.at/wIJK6 | Kubernetes: Up and Running]]
* [[http://shorturl.at/clAR3 | A Gentle Introduction to Spark]]
* Projects completed in A.Y. 2018-19: [[https://docs.google.com/spreadsheets/d/1t-iwy3SD7LpzGmim4IgpXWT3JeU0ev-5xQrr4J-b430/edit?usp=sharing|Description]] [[https://drive.google.com/drive/folders/1KSrXuj1o9cawKBgtP2epSN6QIbGgol8W?usp=sharing|Projects]]
* [[https://www.researchgate.net/publication/299379163_A_formal_definition_of_Big_Data_based_on_its_essential_features | A formal definition of Big Data based on its essential features]]
----
===== Exams =====
**Exam Dates A.Y. 2019/2020 (tentative)**
* 06/18/2020
* 07/09/2020
* 07/30/2020
* 09/03/2020
* 10/01/2020
* 10/22/2020
* 12/10/2020
* 01/14/2021
* 02/11/2021
**Exam rules**:
* Written examination on the topics of the course
* Project lab (max 2 members per team)
* [[https://docs.google.com/document/d/1WiqSQSDoQeeFMI1wVZ8hZE4zP17Te3u4zXyryqO0y64/edit?usp=sharing | Rules]]
**Exam results**:
* Exam 2020-06-19 [[https://drive.google.com/file/d/1YIoCLpzknaFLJ54NuGM0o7xPIM5sCuBi/view?usp=sharing | results]]
* Exam 2020-07-09 [[https://drive.google.com/file/d/1VsmSYfgxvbM4LzpnA8gv2NHWzjtf-kfL/view?usp=sharing | results]]
* Exam 2020-07-30 [[https://drive.google.com/file/d/1DJvK3TXEe3ckvaSkH5yheyzDOzXvwSp5/view?usp=sharing | results]]