Sunday, November 8, 2009

BI 101 ADAPT modeling

A good white paper on ADAPT modeling:

http://www.symcorp.com/downloads/ADAPT_white_paper.pdf

Very useful if the party you communicating with is not deep into BI terminology. It comes with it own visio template:

http://www.symcorp.com/downloads/ADAPTv3_visio_stencil.zip

Till Next Time

Thursday, October 29, 2009

BI 101 a generic calendar dimension

Within BI at least 98 % of the reporting is time based. They show the facts aggregated over a certain period.

When designing a DWH often little time is invested in setting up the calendar dimension and it’s related tables. Net result is that later on certain info is missing and therefore making easy querying tricky. In this article I want to give you a head start in designing a future proof calendar dimension.

The basic calendar hierarchy often looks like this:

image

Now which information do we need to store for each level?

First of all it’s very handy to make all your key’s readable, so for instance make your YearId like 1900, 1901, 1902 … 2099 instead of 1,2,3, … 199.

Year

  • YearId, integer, Primary Key => 2000 (YEAR)
  • Year, varchar(10) => “2000”
  • YearShort, varchar(10) => “ ‘00”
  • YearLong, varchar(60) => “twothousand”
  • YearRemark, varchar(4000) => “take over over ACME inc.”
  • YearStartDT, DateTime => “01-JAN-2000 00:00:00”
  • YearEndDT, DateTime => “31-DEC-2000 23:59:59”
  • YearLastYear, interger => 1999
  • YearIsLeapYear, Bit => 1

Quarter

  • QuarterId, integer, Primary Key =>20001 (YEAR*10 + QUARTER)
  • Quarter, varchar(10) => “2000 Q1”
  • QuarterNumber, integer => 1..4
  • QuarterShort, varchar(10) => “ ‘Q1”
  • QuarterLong, varchar(60) => “ first quarter 2000”
  • QuarterRemark, varchar(4000) => “Arizona storage flooded”
  • QuarterStartDT, DateTime => “01-JAN-2000 00:00:00”
  • QuarterStartDT, DateTime => “31-MAR-2000 23:59:59”
  • YearId, integer, Foreign Key Year => 2000
  • QuarterLastYear, integer => 19991
  • QuarterLastQuarter, integer => 19994

Month

  • MonthId, integer, Primary Key => 200001 (YEAR*100 + MONTH)
  • Month, varchar(10) => “2000 M01”
  • MonthNumber, integer => 1..12
  • MonthShort, varchar(10) => “Jan”
  • MonthShortSort, varchar(10) => “(01) – Jan”
  • MonthLong, varchar(60) => “January 2000”
  • MonthRemark, varchar(4000) => “Alaska branch closed down”
  • MonthStartDT, DateTime => “01-JAN-2000 00:00:00”
  • MonthEndDT, DateTime => “31-JAN-2000 23:59:59”
  • YearId, integer, Foreign Key Year =>2000
  • QuarterId, integer, Foreign Key Quarter =>20001
  • MonthLastYear, integer =>199901
  • MonthLastQuarter, integer => 199909
  • MonthLastMonth, integer => 199912

Week

  • WeekId, integer, Primary Key => 200001 (YEAR*100 + WEEK)
  • Week, varchar(10) => “2000 W01”
  • WeekNumber, integer => 1..52(53)
  • WeekShort, varchar(10) => “week 01”
  • WeekLong, varchar(60) => “2000 week 01”
  • WeekRemark, varchar(4000) => “Texas branch closed down for holidays”
  • WeekStartDT, DateTime => “03-JAN-2000 00:00:00”
  • WeekEndDT, DateTime => “09-JAN-2000 23:59:59”
  • YearId, integer, Foreign Key Year => 2000
  • QuarterId, Integer, Foreign Key Quarter => 20001
  • MonthId, Integer, Foreign Key Month => 200001
  • WeekLastYear, integer => 199901
  • WeekLastQuarter, integer => 199939
  • WeekLastMonth, integer => 199950

Day

  • DayId, integer, Primary Key => 200000101 (YEAR*10000 + MONTH*100 + DAY)
  • Day, varchar(10) => “20000101”
  • DayNumber, integer => 1..365(366)
  • DayDate, Date => “01-JAN-2000”
  • DayShort, varchar(10) => “01/01/2000”
  • DayLong, varchar(60) => “January First 2000”
  • DayOfWeek, integer => 1…7
  • DayWeekDayName, varchar(60) => “Monday”
  • DayRemark, varchar(4000) => “boxing day”
  • DayStartDT, DateTime => “01-JAN-2000 00:00:00”
  • DayEndDT, DateTime => “31-JAN-2000 23:59:59”
  • DayIsHoliday, Bit, 1
  • YearId, integer, Foreign Key Year => 2000
  • QuarterId, Integer, Foreign Key Quarter => 20001
  • MonthId, Integer, Foreign Key Month => 200001
  • WeekId, Integer, Foreign Key Week => 200001
  • DayLastYear, integer => 19990101
  • DayLastQuarter, integer => 19990901
  • DayLastMonth, integer => 19991201
  • DayLastWeek, integer => 19991224

Shift (optional)

  • ShiftId, integer, Primary Key => 200010101 (YEAR*1000000 + MONTH*10000 + DAY*100 +SHIFTNUMBER)
  • Shift, varchar(60) => “2000101S01”
  • ShiftNumber, integer => 1..X
  • ShiftCodeShort, varchar(60), “01/01/2000 S1”
  • ShiftCodeLong, varchar(60), “01 January 2000 Shift 1”
  • ShiftRemark, varchar(60), “New years eve 1999”
  • ShiftStartDT, DateTime => “31-DEC-1999 22:00:00”
  • ShiftEndDT, DateTime => “01-JAN-2000 05:59:59”
  • YearId, integer, Foreign Key Year => 2000
  • QuarterId, integer, Foreign Key Quarter => 20001
  • MonthId, Integer, Foreign Key Month => 200001
  • WeekId, Integer, Foreign Key Week => 200001
  • DayId, integer, Foreign Key Day (optional) => 20000101
  • ShiftLastYear, integer => 1999010101
  • ShiftLastQuarter, integer => 1999090101
  • ShiftLastMonth, integer => 1999120101
  • ShiftLastWeek, integer => 1999122401
  • ShiftLastDay, integer => 1999123101
  • ShiftLastShift, integer => 1999123103

PeriodLastPeriod columns

These are used for easy Period Ago analysis. Simple join an alias of your fact table “shifted back” in time. This way you can easily created columns like YAGO, MAGO, WAGO

PeriodStartDT PeriodEndDT columns

Not all calendars start at the first of a period. Especially fiscal and production calendars will have there own start and end date.

PeriodRemark Columns

Used for keeping historical records of events which might have disrupted normal operations.

3NF or one flat table

The arguments for putting the whole calendar dimension into one table are often only technical. They are based on the believe that a flat table is always quicker then separate tables. With modern db like Oracle10G or MS-SQL 2005 the query analyzers are clever enough (with proper indexing) to handle separated tables without loss of performance. (I’ve seen case where separated tables are quicker.)

From a modeling standpoint it’s better to use separate tables especially if you are using aggregated tables. (It doesn’t need to perform costly distinct operations on the calendar table to get the unique month keys.)

Till Next Time

If you have any extra columns which I should add feel free to post them

Friday, February 15, 2008

Welcome to BI 101

Hi,

My name is John Minkjan and I'm a senior BI-consultant at CIBER in the Netherlands. I reserverd a couple of ...101 blogspots. At the moment I'm mostly active on OBIEE101. I will start posting here as soon as something intressting to report on BI in general about .

Till Next Time

John