Cyntech’s Tech Blog

Technical tidbits; coding and I.T.

Data Warehousing, Part 2

It now seems that the goalposts for the statistics solution have been moved. The Data Warehouse idea is a massive task and is ideally a project in its own right, so I’ve been instructed to just tailor it to Sakai and not the whole enterprise.

That is OK by me, as it has reduced my workload considerably. Mind you, everything I’ve read on Data Warehousing has helped me understand what I need to do a whole lot better.

Data Warehousing

I’m investigating Data Warehousing at the moment for work. It has been suggested that a Data Warehouse would be a suitable solution for the DIT Statistics for our implementation of Sakai. In the space of weeks, I have to learn the basics of something that some people base an entire career on. This should be fun.

I’ve started off reading through Ralph Kimball’s The Data Warehouse Toolkit (2nd Ed.), which is a bit like the Dummies guide but not quite as dumbed down. It’s not too bad at the moment, but there’s just a lot of information.

What’s interesting is that when you’ve had your head buried in relational databases for so many years, Data Warehousing can be quite tricky to get your head around. This is because the central idea behind Data Warehousing, or at least the Data Presentation side, is that the data is not normalised. The aim of the Warehouse is to be fast and easy to understand. A fully normalised design would mean referential links across a billion tables so that data isn’t duplicated anywhere. To the end user, who doesn’t necessarily have any SQL knowledge, that is not understandable.
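To make that contrast concrete, here is a minimal sketch using Python’s built-in sqlite3 module. The table and column names (site, tool, visit, visit_flat) and the sample rows are made up for illustration; they are not Sakai’s actual schema. It asks the same question, visits per site, against a normalised layout and against a flattened presentation table.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.executescript("""
-- Normalised layout: every name is stored once and referenced by key.
CREATE TABLE site (site_id INTEGER PRIMARY KEY, site_name TEXT);
CREATE TABLE tool (tool_id INTEGER PRIMARY KEY, tool_name TEXT);
CREATE TABLE visit (
    visit_id   INTEGER PRIMARY KEY,
    site_id    INTEGER REFERENCES site(site_id),
    tool_id    INTEGER REFERENCES tool(tool_id),
    visit_date TEXT
);

INSERT INTO site VALUES (1, 'Databases 101');
INSERT INTO site VALUES (2, 'Networking');
INSERT INTO tool VALUES (1, 'Forums');
INSERT INTO tool VALUES (2, 'Resources');
INSERT INTO visit VALUES (1, 1, 1, '2008-03-01');
INSERT INTO visit VALUES (2, 1, 2, '2008-03-01');
INSERT INTO visit VALUES (3, 2, 1, '2008-03-02');

-- Denormalised presentation table: names are repeated on every row,
-- so a reader never has to follow a key to another table.
CREATE TABLE visit_flat AS
SELECT v.visit_date, s.site_name, t.tool_name
FROM visit v
JOIN site s ON s.site_id = v.site_id
JOIN tool t ON t.tool_id = v.tool_id;
""")

# Visits per site from the normalised tables: needs a join.
normalised_counts = cur.execute("""
    SELECT s.site_name, COUNT(*)
    FROM visit v
    JOIN site s ON s.site_id = v.site_id
    GROUP BY s.site_name
    ORDER BY s.site_name
""").fetchall()

# The same question against the flat table: one table, no joins.
flat_counts = cur.execute("""
    SELECT site_name, COUNT(*)
    FROM visit_flat
    GROUP BY site_name
    ORDER BY site_name
""").fetchall()

print(normalised_counts)
print(flat_counts)
conn.close()
```

Both queries return the same counts, but the second one only needs one table and plain column names. A single flat table is a simplification; a dimensional model of the kind Kimball describes would typically keep a fact table surrounded by a handful of wide dimension tables, which still involves far fewer joins than a fully normalised schema.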

Thus, the need to map business processes into simple, easy-to-understand groups is the crux of Data Warehousing, and sometimes one of the most difficult things to get used to as a database programmer!