Shane on November 2nd, 2006

Rule number two of blogging is to title your posts after search terms people might use on Google. I’m sure this title will get a lot of traffic. (Rule number one, by the way, is to write a lot.)

Anyway, Margy Ross and Ralph Kimball write an interesting article about interviewing customers during the analysis phase of your projects. What does “Hawkeye” Pierce have to do with it? You’ll have to read the link to find out.

Tags: ,

Shane on November 1st, 2006

So I need to import data from Excel into SQL Server. I whip up a new table, knock out a quick package, run the package, and look at the data. Looks fine, okay. Email the customer so they can look at the table. Two days later I get the email from the customer saying, “How come these rows have null values in the somekeyidentifier column? The data is in the Excel source.”

And thats when I realize that #1, I need to enforce some constraints when I’m whipping up new tables. And #2, Excel is not very good at guessing what datatypes I have. You see, an intelligent data profiling tool samples data when determining datatypes. Excel just looks at the top X rows and just ignores every row that doesn’t match that datatype.

The ever helpful Allan Mitchell at SQLDTS.com has some obscure tips for dealing with this.

Tags:

Shane on October 12th, 2006

I went to a seminar on Oracle partitioning and data warehousing earlier this week call “Scaling to Infinity” by Tim Gorman. It gave me some new ideas, and reinforced some others, for where we should take our architecture at the U.

Here are some notes that I will hopefully talk more about as I research them.

  1. Functions and query optimization
    One slide he had was a demo where
    TRUNC(date_posted) = '10/12/06'
    would force a table scan where
    date_posted >= '10/12/06'
    AND date_posted < '10/13/05'

    would use an index or use partition-pruning to execute. It seems to me that if you have good statistics, then the CBO would be smart enough to partition-prune even with the first method. The instructor recommended using a function-based index. There must be an easier way.
  2. Why do we get “corrupt index” errors when people are running queries during partition exchanges? According to the presenter, partition exchanges do not take effect until all the index rebuilds, etc. have completed. PE commits like a regular transaction. I’ll need to replicate this error to figure out what’s going on.
  3. “Range of ROWIDs stored in a direct-path log made visible by view ALL_SUMDELTA” Not very useful, but interesting nonetheless.
  4. DBMS_MVIEW.REFRESH Fast Refresh does not work if your MVIEW has unions, subqueries, etc.
  5. You can use SQL*Loader to generate external table syntax? I’ll need to check this out. Generating external table syntax, especially for wide flat files, is a major PITA.
  6. DBMS_XPLAN is a way to look at explain plans.
  7. Table compression. Test table compression (and exchange partitioning) for a large transactions table. Possibly run the loads in parallel in development with the regular production table for a while to see if there are any gotchas. Test ETL performance, maintenance “costs”, space usage, and query performance.
  8. DBMS_REDEFINITION is a package for doing online table reorgs. It would have been nice to have this during our tablespace move.
  9. Does Oracle Warehouse Builder (OWB) do exchange partitioning without a lot of manual coding?
  10. SYS.COL_USAGE$ keeps usage stats on columns. It tells you how often columns are used in WHERE clauses and how they are used? Equi-joins, range predicates, etc.
  11. Download exchpart.sql to automate partition exchange for ETL.
  12. Bitmap join indexes. Very cool technology. You basically create a bitmap index of your dimensions. Then if you are quering multiple dimensions versus the fact table, it can just join the BJI to the bitmap indexes on the fact.
  13. Transportable tablespaces for DWPROD. Could we create PCOM3 and then attach that tablespace to DWPROD to simply our ETL and eliminate those pesky database links.
  14. ALTER SESSION SET CURRENT_SCHEMA
  15. PARALLEL good for lots of processing that return small result sets or CREATE TABLE AS SELECT. If you return large result set, your bottleneck will be the query coordinator.
  16. Loading exchange partitions when your partition key is not the load date. Tim had a neat technique for dealing with the dribble-over between partitions.
  17. Data Vault. Looks promising for creating point-in-time representations of your data.

Tags: , ,

Shane on October 12th, 2006

I finished moving everything to the new host last night. I am done with the old host and their skimpy disk space allowance and am now on Dreamhost.com. They’ve given me 200GB disk space that will grow by 1GB per week. (My old host gave me 1GB total.) How did I learn about Dreamhost? Through their blog, of course!

I don’t need this disk space right now, but with all that space I can put up a lot of baby pictures and videos. All Dora, all the time.

For those of you reading this on the web, I apologize for the new theme. It looks like I forgot to install the theme I was using. I might change the theme a time or three before I settle on something I like. I read blogs through RSS so layout is the least of my concerns.

Tags:

Shane on October 3rd, 2006

So I’m reading this article by David Peterson in the Minneapolis Star Tribune this morning. (No link, they take down articles after two weeks anyway.) The article discusses how housing is accounting for larger percent of people’s take-home pay. And it starts out discussing how peoples’ mortgages have gone up. An example was $600 to $1100 per month.

I’m no finance major, but I’m pretty sure that mortgages don’t magically go up. When you cash out one mortgage and take on another, then the price changes. But this decision to refinance is one hundred percent on the person making the choice. You reap what you sow. Free money is never free.

The article goes on to discuss how this phenomenon has not hit the rental market as badly. No surprise here. Renters can’t make these stupid financial decisions. If your landlord refinances and doubles your rent, you move.

The real kicker though, is this anecdote.

Homeowners say many factors can lead to the same end point: a bigger housing bill than they can comfortably handle.

D’Heilly, who takes home about $2,000 a month from her job as commercial engineering services coordinator at the Toro Co., in Bloomington, pays more than half that for a longtime family home.

Her own refinance to remodel pushed costs higher. But the main thing she blames is surging home values in the city, which push up the value of her home on paper, and therefore her taxes, which have risen three times faster than her paycheck.

“It seems like everything goes up and up and up, except your pay,” she said.

Let me get this straight, so you refinanced your home, which increased your mortgage. Then you remodelled your house, which led to a reassessment and then increased your mortgage. (Assuming your taxes are paid from an escrow account.) But the real problem here is your neighbor’s home values, which are increasing your tax burden. You have got to be kidding?!

Now, I appreciate the article. I think it’s a discussion that is becoming increasingly relevant. I also realize that as a journalist, your job is to report and not debate. But how can you let this assertion go by without a response? How about a little counter-quote here? How about a little basic accounting 101 about how the $10 more in taxes pale in comparison to the hundreds extra you are paying for your refinance. How about correcting misinformation?

Tags:

Shane on July 25th, 2006

When I was 21, I dropped out of college and went to work at a start-up called Creative Internet Solutions. I had played with web development as a hobby since late 1995 but now I got to do it full time without those pesky classes occupying my time. And having money was nice. I was young and hungry with plenty of time, so I poured myself into my work. Many of my co-workers were at a similar point in their lives, so we lived and breathed our work.

After the owner sold the company to Control Data — yes, they were still around — things changed, people left, and we all eventually moved on. The ideas, innovation, and energy from that job are something I’ve never seen since. We didn’t do anything consciously, it just happened. We were lucky. Which is probably why it all fell apart so fast.

Now I work for a large research University. I must confess that part of the draw was that I hoped that I would find some of this energy.  Nope, not here. The focus here, as is in many IT shops, is on compliance, standardization, and process. Because that’s the way a “business” operates. Some businesses, however, pride themselves by cultivating innovation and can then capitalize on these ideas. Marissa Mayer, at Google, discusses their culture in this old Fast Company piece.

Tags:

Shane on June 30th, 2006

Donald Farmer is blogging again! Donald works as Project Manager of the SQL Server Integration Services team at Microsoft. Donald is a very smart guy; gifted at explaining technology in an engaging and enjoyable way. One technique Donald uses is to heavily salt his speeches with good personal anecdotes. This quote aptly demonstrates why I enjoy watching Donald’s presentations and why I am excited that he’ll be writing more regularly now.

… don’t expect formality. Chances are, I’ll be writing this with a glass of wine to hand. Often I’ll be writing at the end of a long day of difficult design decisions or working through customer issues. On other days, I’ll just have put down a new book of poetry, or have come in from the garden, and those will be bubbling in my mind.

Tags: