Building a Raw Data Warehouse in SQL Server 2005 – Part 1
1. Project scope, deliverables & documentation
One of the most important things in life is having a solid foundation and goal before you do anything. If you build a house without a concrete foundation, or begin programming without an understanding of syntax, or taking a shower without soap your end result will be less than desirable. The same goes for any data warehouse project.
In this post I’ll outline #1 of the 5 major steps needed to build a Raw Data Warehouse in SQL Server 2005.
Project Scope:
When in the initial stages of a DW project, be sure to define the project scope and get it written down. If you do not document the exact project scope, your project will suffer from what’s commonly referred to as “scope-creep” and could fail (or at least miss deadlines.) Scope-creep is when a project is not properly planned and the scope of the project continues to creep up each meeting, conversation, email, etc as features or functionality continue to get added. A properly defined project scope is one of the most critical pieces of a DW project because they tend to be rather costly, long and large projects to implement.
The scope for this DW project was split into 2 phases:
- Phase I – Build a raw-data data warehouse. The problem is that the business unit spends too much time searching for data on the Mainframe and other data sources and needs a central location to query for individual records (basically to see the raw data).
- Phase II – Build a traditional, analytical data warehouse. This DW will consist of Fact and Dimension tables where the raw data will be summarized and can be reported against (for trend analyzing, future projections, data mining, etc.)
Deliverables:
The next step in defining a solid foundation for a DW project is to set expectations for your client(s). You want to make sure the deliverables (the end result(s) of the project) are properly defined and documented.
The deliverables for this DW project are:
- A front-end reporting tool that will allow the BU to create custom queries against the raw data and the summarized Data Warehouses
- Dashboards that will contain canned reports based off of the summarized data from the DW
Documentation:
Along with the solid foundation, your documentation must be exact when creating a DW.
This DW documentation should consists of 3 main documents:
- Issues list
- keeps track of outstanding issues that need to be resolved (it’s very important to maintain a list)
- Functional Specs
- This document is the living core for the ETL and specifically states how the column and table specific transforms should occur, and in what order the ETL should be processed.
- Database ER Diagram
- You should maintain this document to store the structure of the data warehouse database and what mappings will occur from your staging files to your DW databases, tables, and columns.
I hope that covers a bit about what work needs to be done before a DW project is started. Trust me the pre-work will go a long way.
I’ll be posting part 2/5 soon.
- Reagan
Professional SQL Server Experiences » Building a Raw Data Warehouse in SQL Server 2005 - Part 2 on April 24th, 2009
[...] http://www.rwsql.com/datawarehouse/building-a-raw-data-warehouse-in-sql-server-2005-part-1/ [...]