====== Enterprise Data Analytics ======

===== News =====

**Exam 2020-07-30 results**: partial results are available [[https://drive.google.com/file/d/1DJvK3TXEe3ckvaSkH5yheyzDOzXvwSp5/view?usp=sharing | here]]. Complete results will be available after project evaluation.

**Exam rules update**: students must confirm their attendance at upcoming exam sessions at least 5 days before the exam date. Confirmation must be sent by e-mail to the teacher; ESSE3 registration alone is not sufficient.

**Exam 2020-07-09 results**: partial results are available [[https://drive.google.com/file/d/1VsmSYfgxvbM4LzpnA8gv2NHWzjtf-kfL/view?usp=sharing | here]]. Complete results will be available after project evaluation.

**Exam 2020-06-19 results**: partial results are available [[https://drive.google.com/file/d/1YIoCLpzknaFLJ54NuGM0o7xPIM5sCuBi/view?usp=sharing | here]]. Complete results will be available after project evaluation.

**Exam update**: exam rules are available [[https://docs.google.com/document/d/1WiqSQSDoQeeFMI1wVZ8hZE4zP17Te3u4zXyryqO0y64/edit?usp=sharing | here]]. Note that the exam will be held via Webex and students must have Acrobat Reader installed.

**Course update**: students are asked to complete the assessment questionnaire for this course.

**Material update**: the Other materials section has been updated with EDA projects from A.Y. 2018-19.

**Project update**: the projects and the IoT dataset have been updated.

**Project update**: a draft list of available projects is [[https://docs.google.com/spreadsheets/d/1hGmHKzxT1qkP1gH9NielVhTGMRZPOtZi7B2liKJ3pKM/edit#gid=0|here]].

**Google Meet**: due to possible problems with Webex, lessons will be held via Google Meet. Instructions are [[https://drive.google.com/file/d/1H7QKgkeSFgWw_ydgSP8X0a_5Ado6JfJ3/view?usp=sharing|here]]. Room name ''edaunicam2020''.

**Course start**: the first lesson is planned for 12 March 2020, via Webex room only.

----

===== General Info =====

**Teacher**:
  * Dr. Massimo Callisto De Donato

**ESSE3 Link**
  * ---

**Webex Link**
  * https://unicam.webex.com/meet/massimo.callisto

**Lessons schedule**:
  * 42 h - lectures, exercise sessions
  * Thursdays from 14:00 to 18:00
  * LB1 room
  * Schedule (tentative)
    - 12-mar-20 - 14:00-18:00
    - 19-mar-20 - 14:00-18:00
    - 26-mar-20 - 14:30-17:30
    - 02-apr-20 - 14:00-18:00
    - 09-apr-20 - 14:00-18:00
    - 16-apr-20 - 14:00-17:00
    - 23-apr-20 - 14:00-18:00
    - 30-apr-20 - 14:30-17:30
    - 07-may-20 - 14:00-18:00
    - 14-may-20 - 14:00-18:00
    - 21-may-20 - 14:00-18:00
    - 28-may-20 - 14:30-16:30

**Students Office hours**:
  * Send an e-mail to the teacher to arrange an appointment.

----

===== Course Objectives =====

  * Knowledge about business data analysis and modern scenarios such as the Internet of Things.
  * Understanding of the main differences between classical methods in data analysis and new modern scenarios.
  * Knowledge and expertise on Big Data methodologies and technologies, basic principles and concepts, and techniques that enable data analysis and management.
  * Knowledge of the main Big Data technological frameworks and their application in real case studies.
  * Highlights of some Intelligent Data Analysis techniques.

----

===== Course Contents =====

  * Acquire knowledge and competence on Big Data methodologies, techniques and technologies.
  * Know the most common techniques of Big Data analysis and how they apply to real-world examples.
  * Apply Big Data Analysis techniques to practical case studies.

----

===== Syllabus =====

  * Introduction to enterprise and data management.
  * Analysis of scenarios and contexts of data generation: from the inter-organizational model to the Internet of Things.
  * Introduction to the classic data analysis techniques: ETL (Extract, Transform, Load), Business Intelligence, Reporting Tools.
  * Methodologies and technologies for the management and analysis of large amounts of data: introduction to the Big Data model, concepts, principles and technological frameworks.
  * Data Analytics methodologies and techniques: batch analysis models, streaming computation.
  * Data Analytics evolution towards intelligent data understanding models.

----

===== Study material =====

**Course Slides**
  * Slides {{ https://drive.google.com/drive/folders/1dlVI_UJTVWd5WmJnPxRLQFFmlyd2uevL?usp=sharing | link}}
  * Webex {{ https://docs.google.com/document/d/1yImTJQ7SM7-0mk-H3fpmreAdMtstrcKLnu1K3BX4QMw/edit?usp=sharing | doc with links}}
  * Projects {{ https://docs.google.com/spreadsheets/d/1hGmHKzxT1qkP1gH9NielVhTGMRZPOtZi7B2liKJ3pKM/edit?usp=sharing | link}} | {{ https://drive.google.com/drive/folders/1Sop3tGrQng0dcjKDSCocArHXsMsIKhed?usp=sharing | submitted projects}}

**Reference materials**
  * Course slides.
  * Material provided by the teacher.
  * Examples
    * [[https://drive.google.com/drive/folders/1swSIC7deh1X95j_cZKqJl3wBDXEyQlbk?usp=sharing | IoT Dataset]]
    * [[https://drive.google.com/drive/folders/18GUqT-eMNx6zYN9f5dtTPXC3maBZdtrB?usp=sharing | Examples]]
    * [[https://drive.google.com/file/d/12yW6sbKlSw4gpc0nPC6-ze1YNPG1KcE1/view?usp=sharing | PySpark Dataframe reference]]
    * [[https://drive.google.com/drive/folders/1E4XHrTfmIsEGHo9G7IHMMdUf5iOO15_3?usp=sharing | Spark Standalone cluster installation guide]]
  * Other materials
    * [[https://shorturl.at/msI09 | Docker Deep Dive]]
    * [[https://shorturl.at/wIJK6 | Kubernetes: Up and Running]]
    * [[http://shorturl.at/clAR3 | A Gentle Introduction to Spark]]
    * Projects completed in A.Y. 2018-19: [[https://docs.google.com/spreadsheets/d/1t-iwy3SD7LpzGmim4IgpXWT3JeU0ev-5xQrr4J-b430/edit?usp=sharing|Description]] [[https://drive.google.com/drive/folders/1KSrXuj1o9cawKBgtP2epSN6QIbGgol8W?usp=sharing|Projects]]
    * [[https://www.researchgate.net/publication/299379163_A_formal_definition_of_Big_Data_based_on_its_essential_features | A formal definition of Big Data based on its essential features]]

----

===== Exams =====

**Exam Dates A.Y. 2019/2020 (tentative)**
  * 06/18/2020
  * 07/09/2020
  * 07/30/2020
  * 09/03/2020
  * 10/01/2020
  * 10/22/2020
  * 12/10/2020
  * 01/14/2021
  * 02/11/2021

**Exam rules**:
  * Written examination on the topics of the course
  * Project lab (max 2 members per team)
  * [[https://docs.google.com/document/d/1WiqSQSDoQeeFMI1wVZ8hZE4zP17Te3u4zXyryqO0y64/edit?usp=sharing | Rules]]

**Exam Results**
  * Exam 2020-06-19 [[https://drive.google.com/file/d/1YIoCLpzknaFLJ54NuGM0o7xPIM5sCuBi/view?usp=sharing | results]]
  * Exam 2020-07-09 [[https://drive.google.com/file/d/1VsmSYfgxvbM4LzpnA8gv2NHWzjtf-kfL/view?usp=sharing | results]]
  * Exam 2020-07-30 [[https://drive.google.com/file/d/1DJvK3TXEe3ckvaSkH5yheyzDOzXvwSp5/view?usp=sharing | results]]