Thursday, January 18th, 2007

Audit Your ETL With CHECKSUM

Loading your target incrementally offers a huge performance benefit over running full truncate/reloads, but there is a danger of missing inserts or updates in your source system. It can take hours or days to track down the source of these problems (if it can be done at all!) and problems are generally not found until [...]

No Comments » - Posted in Other by Shane

Thursday, November 2nd, 2006

Testing Framework for ETL

The holy grail for this former .NET web developer is combining the techniques of agile and test-driven development with ETL tools and data volumes. Mark Rittman shows that he is a brother-in-arms in this quest.

No Comments » - Posted in Informatica, Oracle, SQL Server by Shane

Thursday, November 2nd, 2006

Why Do We Use Informatica?

This excellent piece in DMReview deliberates on some criteria for selecting (or justifying) an ETL tool. We are struggling with this right now. Our shop uses a combination of Informatica and SQL Server DTS for our ETL jobs. With Oracle Warehouse Builder 10gR2 and SQL Server Integration Services both released as viable alternatives, we need [...]

1 Comment » - Posted in Informatica, Oracle, SQL Server by Shane

Thursday, October 12th, 2006

Scaling to Infinity Seminar

I went to a seminar on Oracle partitioning and data warehousing earlier this week called “Scaling to Infinity” by Tim Gorman. It gave me some new ideas, and reinforced some others, for where we should take our architecture at the U.
Here are some notes that I hope to expand on as I research them.

Functions [...]

No Comments » - Posted in Oracle by Shane