Academic year 2016-2017
16/12/2017

Data warehouses
INFO - H419

I. General information
Course unit title * Data warehouses
Language of instruction * Taught in English
Qualification framework level * Level 7 (2nd cycle-MA/MC/MA60)
Discipline * Computer science
Instructor(s) * [including the coordinator] Toon CALDERS (coordinator)
II. Position of the course unit
Co-requisite course unit(s) *
Prerequisite course unit(s) * INFO-H-415: Advanced databases
Prerequisite knowledge and skills *
Study programme(s) including the course unit - M-INFOS - Master in Computer Science (5 credits, optional)
- M-IRIFE - Master of science in Computer science and engineering, Focus Information Technologies for Business Intelligence (Erasmus Mundus) (5 credits, compulsory)
- M-IRIFS - Master in Computer Science and Engineering (ingénieur civil en informatique), specialised focus (5 credits, optional)
III. Objectives and methodologies
Contribution of the course unit to the teaching profile *
Objectives of the course unit (and/or specific learning outcomes) *

At the end of the course, students are able to:

  • Understand the difference between operational databases and data warehouses
  • Understand and be able to apply the principles of multidimensional modeling
  • Exploit a data warehouse for querying and reporting
  • Understand best practices and methodologies for data warehouse development
  • Understand the process of populating a data warehouse from internal and external sources
 
Course unit content *

Relational and object-oriented databases are mainly suited to operational settings, in which many small transactions query and write to the database. Consistency of the database (in the presence of potentially conflicting transactions) is of the utmost importance. The situation is quite different in analytical processing, where historical data is analyzed and aggregated in many different ways. Analytical queries differ significantly from typical transactional queries in the relational model:

  • Analytical queries typically touch a larger part of the database and run longer than transactional queries;
  • Analytical queries involve aggregations (min, max, avg, …) over large subgroups of the data;
  • When analyzing data it is convenient to see it as multi-dimensional.
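The contrast described above can be illustrated with a minimal sketch. It assumes a hypothetical `sales` table with invented sample rows and uses Python's built-in `sqlite3` module purely for convenience; the course itself uses the SQL Server tools, so none of the names below come from the course material.

```python
import sqlite3


def build_sales_db():
    """Create an in-memory database with a hypothetical `sales` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        "id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
    )
    # Invented sample data, for illustration only.
    rows = [
        ("North", 2015, 100.0),
        ("North", 2016, 150.0),
        ("South", 2015, 80.0),
        ("South", 2016, 120.0),
        ("South", 2016, 50.0),
    ]
    conn.executemany(
        "INSERT INTO sales (region, year, amount) VALUES (?, ?, ?)", rows
    )
    conn.commit()
    return conn


def transactional_lookup(conn, sale_id):
    """Transactional style: fetch a single row through its primary key."""
    return conn.execute(
        "SELECT region, year, amount FROM sales WHERE id = ?", (sale_id,)
    ).fetchone()


def analytical_summary(conn):
    """Analytical style: scan all rows and aggregate per (region, year)."""
    return conn.execute(
        "SELECT region, year, MIN(amount), MAX(amount), AVG(amount) "
        "FROM sales GROUP BY region, year ORDER BY region, year"
    ).fetchall()


if __name__ == "__main__":
    db = build_sales_db()
    print(transactional_lookup(db, 1))
    for line in analytical_summary(db):
        print(line)
```

The grouping columns `region` and `year` play the role of dimensions here: each output row is one cell of a small two-dimensional view of the data.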

For these reasons, data to be analyzed is typically collected into a data warehouse with Online Analytical Processing (OLAP) support. "Online" here refers to the fact that the answers to queries should not take too long to compute. The process of collecting the data is often referred to as Extract-Transform-Load (ETL). The data in the data warehouse needs to be organized so that analytical queries can be executed efficiently. For the relational model, star and snowflake schemas are popular designs. Besides OLAP on top of a relational database (ROLAP), native OLAP solutions based on multidimensional structures (MOLAP) also exist. To further improve query answering efficiency, some query results can be materialized in the database in advance, and new indexing techniques have been developed.
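As a rough sketch of these ideas, the example below loads a few records into a small star schema (one fact table, two dimension tables) through an Extract-Transform-Load step and then precomputes an aggregate. All table names, column names and source records are hypothetical. SQLite, used here through Python's `sqlite3` module, has no materialized views, so an ordinary summary table stands in for a materialized query result.

```python
import sqlite3

# Hypothetical source records, as they might be extracted from an
# operational system: untrimmed names, inconsistent case, amounts as text.
SOURCE_ROWS = [
    {"date": "2016-11-03", "product": " laptop ", "category": "Hardware", "amount": "1200.50"},
    {"date": "2016-11-03", "product": "Mouse", "category": "hardware", "amount": "25.00"},
    {"date": "2016-12-15", "product": "Laptop", "category": "Hardware", "amount": "999.50"},
]


def create_star_schema(conn):
    """One fact table referencing two dimension tables."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY, full_date TEXT UNIQUE,
            year INTEGER, month INTEGER);
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY, name TEXT UNIQUE, category TEXT);
        CREATE TABLE fact_sales (
            date_key INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            amount REAL);
    """)


def load(conn, rows):
    """Transform each extracted record and load it into the star schema."""
    for row in rows:
        # Transform: normalize names and convert types.
        name = row["product"].strip().title()
        category = row["category"].strip().title()
        year, month, _day = (int(part) for part in row["date"].split("-"))
        amount = float(row["amount"])
        # Load: add dimension members only if they are not present yet.
        conn.execute(
            "INSERT OR IGNORE INTO dim_date (full_date, year, month) VALUES (?, ?, ?)",
            (row["date"], year, month))
        conn.execute(
            "INSERT OR IGNORE INTO dim_product (name, category) VALUES (?, ?)",
            (name, category))
        date_key = conn.execute(
            "SELECT date_key FROM dim_date WHERE full_date = ?",
            (row["date"],)).fetchone()[0]
        product_key = conn.execute(
            "SELECT product_key FROM dim_product WHERE name = ?",
            (name,)).fetchone()[0]
        conn.execute(
            "INSERT INTO fact_sales (date_key, product_key, amount) VALUES (?, ?, ?)",
            (date_key, product_key, amount))
    conn.commit()


def materialize_monthly_totals(conn):
    """Precompute totals per month and category into a summary table."""
    conn.executescript("""
        DROP TABLE IF EXISTS agg_sales_by_month;
        CREATE TABLE agg_sales_by_month AS
        SELECT d.year, d.month, p.category, SUM(f.amount) AS total
        FROM fact_sales f
        JOIN dim_date d ON f.date_key = d.date_key
        JOIN dim_product p ON f.product_key = p.product_key
        GROUP BY d.year, d.month, p.category;
    """)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    create_star_schema(db)
    load(db, SOURCE_ROWS)
    materialize_monthly_totals(db)
    for line in db.execute(
            "SELECT year, month, category, total FROM agg_sales_by_month "
            "ORDER BY year, month"):
        print(line)
```

Later queries that only need monthly totals can read `agg_sales_by_month` directly instead of joining and scanning the fact table again. The trade-off is that the summary table has to be refreshed after each load.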

The first and largest part of the course covers traditional data warehousing techniques. The main concepts of multidimensional databases are illustrated using the SQL Server tools. The second part of the course consists of advanced topics such as data warehousing appliances, data stream processing, data mining, and spatio-temporal data warehousing.

 
Teaching methods and learning activities *
  • Theory lectures (24h) (including invited lectures by industrial partners)
  • Exercises: pencil-and-paper and lab sessions (24h)
  • Group project (12h)
 
Required course material(s) * No
Other course materials
The course website ( http://cs.ulb.ac.be/public/teaching/infoh419 ) offers:
  • Selected research papers and articles
  • Lecture slides
  • Access to the software used during practical exercises
 
References, bibliography and recommended reading *

Course books:

  • Christian S. Jensen, Torben Bach Pedersen, Christian Thomsen. Multidimensional Databases and Data Warehousing. Morgan and Claypool Publishers, 2010.
  • Ralph Kimball, Margy Ross, Warren Thornthwaite, Joy Mundy, Bob Becker. The Data Warehouse Lifecycle Toolkit (2nd ed.). Wiley, 2008.

Additional sources of information:

  • Golfarelli and Rizzi. Data Warehouse Design: Modern Principles and Methodologies. McGraw-Hill, 2009.
  • Elzbieta Malinowski, Esteban Zimányi. Advanced Data Warehouse Design: From Conventional to Spatial and Temporal Applications. Springer, 2008.
  • Kimball and Ross. The Data Warehouse Toolkit (2nd ed.). Wiley, 2002.
  • Inmon. Building the Data Warehouse (4th ed.). Wiley, 2005.
  • Paulraj Ponniah. Data Warehousing Fundamentals for IT Professionals (2nd ed.). Wiley, 2010.
 
IV. Assessment
Assessment method(s) *
  • Written exam
  • Group project
 
Grade composition (including the weighting of partial grades) *
  • Written exam (70%)
  • Group project (30%)
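As a small worked example of this weighting, the sketch below combines the two partial grades. The 0-20 scale is an assumption; the sheet states neither the scale nor any rounding or minimum-score rules, and the function name is hypothetical.

```python
def final_grade(exam, project):
    """Weighted final grade: written exam 70%, group project 30%.

    Assumes both partial grades are on the same scale (0-20 is assumed
    here); no rounding or per-component threshold is applied.
    """
    return (70 * exam + 30 * project) / 100


if __name__ == "__main__":
    # Hypothetical marks: 14/20 on the exam and 16/20 on the project.
    print(final_grade(14, 16))
```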
 
Language of assessment *

English

 
V. Practical organisation
Organising institution * ULB
Managing faculty * Ecole polytechnique Bruxelles
Term * First term (NRE: 25868)
Schedule * First term
Contact hours
VI. Pedagogical coordination
Contact *

toon.calders@ulb.ac.be

 
Teaching location *

ULB, Campus Solbosch, room and building TBD

 
VII. Other information about the course unit
Remarks
