Monthly Archives: April 2009
Building a Raw Data Warehouse in SQL Server 2005 – Part 2
Data Warehouse Performance – Indexes
We are now dealing with index fragmentation on our data warehouse. After a month of loading, updating, deleting, and inserting hundreds of MB of data, the indexes that we initially created for the DW have become severely fragmented. This is one of the causes of the performance issues in the Data Warehouse.
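As a rough illustration of how fragmentation is usually acted on, here is a minimal sketch in Python that picks a maintenance action per index. It assumes the fragmentation percentages have already been read from `sys.dm_db_index_physical_stats` (the `avg_fragmentation_in_percent` column), and it uses the 5% and 30% thresholds from Microsoft's general guidance. The function name and the sample index and table names are hypothetical, and the right thresholds for a particular warehouse may well differ.

```python
def maintenance_statement(index_name, table_name, fragmentation_pct,
                          reorganize_at=5.0, rebuild_at=30.0):
    """Return an ALTER INDEX statement for one index, or None if none is needed.

    The default thresholds are assumptions based on general guidance,
    not values taken from this project.
    """
    if fragmentation_pct > rebuild_at:
        # Heavy fragmentation: rebuild the index from scratch.
        action = "REBUILD"
    elif fragmentation_pct > reorganize_at:
        # Moderate fragmentation: the lighter-weight reorganize is usually enough.
        action = "REORGANIZE"
    else:
        return None
    return "ALTER INDEX [{0}] ON [{1}] {2};".format(index_name, table_name, action)


# Hypothetical sample readings: (index, table, avg_fragmentation_in_percent)
samples = [
    ("IX_Fact_Date", "FactSales", 62.4),
    ("IX_Fact_Customer", "FactSales", 12.0),
    ("PK_DimDate", "DimDate", 1.3),
]

for index_name, table_name, pct in samples:
    statement = maintenance_statement(index_name, table_name, pct)
    print(statement or "-- {0}: no action needed".format(index_name))
```

The generated statements would still need to be reviewed and scheduled for a maintenance window, since a rebuild on a large table can take a long time.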
Raw Data Warehouse Performance Issues
I am going to take a small break from the first series of this blog to discuss performance issues and considerations. This applies only to data warehouses that hold raw data (relational, not OLAP cubes).
Building a Raw Data Warehouse in SQL Server 2005 – Part 1
1. Project scope, deliverables & documentation
One of the most important things in life is having a solid foundation and goal before you do anything. If you build a house without a concrete foundation, begin programming without an understanding of syntax, or take a shower without soap, your end result will be less than desirable. The same goes for any data warehouse project.
Building a Raw Data Warehouse in SQL Server 2005
I’m working on a project where we have to load over 700GB of raw data into a Data Warehouse to provide querying and analytics to the business unit. The choice was made to build this DW on an SQL Server 2005 3-node cluster with a 6TB network-attached SAN. (The excess disk space was purchased to support ~1GB of nightly loads and 3 years of growth. I’ll cover more on that later.)
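To put rough numbers on those figures, here is a minimal sketch in Python. It assumes a flat 1GB load every night of the year and counts raw data only; indexes, transaction logs, tempdb, and backups all need space too and are not modelled here. The function and variable names are illustrative.

```python
DAYS_PER_YEAR = 365


def projected_raw_size_gb(initial_gb, nightly_gb, years):
    """Initial load plus a constant nightly load over the given number of years."""
    return initial_gb + nightly_gb * DAYS_PER_YEAR * years


projected = projected_raw_size_gb(initial_gb=700, nightly_gb=1, years=3)
san_capacity_gb = 6 * 1024  # 6TB, treating 1TB as 1024GB

print("Projected raw data after 3 years: {0} GB".format(projected))
print("Share of SAN capacity: {0:.0%}".format(projected / san_capacity_gb))
```

Under these assumptions the raw data alone comes to roughly 1.8TB after three years, which leaves the rest of the SAN for everything the sketch leaves out.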