Archive for the ‘data warehouse’ Category

Column-oriented MySQL for VLDB

InfiniDB – Open Source BI/Analytic Database

I recently ran across some blog posts about InfiniDB, a MySQL-based DW and BI analytic database from a company called Calpont. I *think* InfiniDB is their only product. I could be wrong about that, though.
There are plenty of things to like about InfiniDB: it is multi-threaded and designed for multiple CPUs/cores, ACID compliant and recoverable, it supports SQL standards, online DDL, MVCC and dynamic data compression, and it's FREE! What attracted me first, though, was the open source implementation of columnar storage. That's the current biggie in VLDB; think Vertica or Oracle's Exadata.

Designing the Data Mart – Part 2

Continuing from Part 1.

So now we have our transactional model and a basic user story:

Our first request from the business for our data mart is that they want to be able to query all of the orders by date, by customer and/or by region (state, city or country). They want to be able to aggregate (sum and average) across those items.
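To make that request concrete, here is a minimal sketch of the kind of star schema query it implies, written in Python against an in-memory SQLite database. The table and column names (order_fact, customer_dim, date_dim, order_amount) and the sample rows are made up for illustration, and putting city, state and country on the customer dimension is an assumption; the real design depends on what the business actually means.

```python
import sqlite3


def build_mart() -> sqlite3.Connection:
    """Create a tiny, hypothetical star schema with a few sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE date_dim (
            date_key INTEGER PRIMARY KEY,
            calendar_date TEXT
        );
        CREATE TABLE customer_dim (
            customer_key INTEGER PRIMARY KEY,
            customer_name TEXT,
            city TEXT,
            state TEXT,
            country TEXT
        );
        CREATE TABLE order_fact (
            date_key INTEGER REFERENCES date_dim (date_key),
            customer_key INTEGER REFERENCES customer_dim (customer_key),
            order_amount REAL
        );
        INSERT INTO date_dim VALUES (1, '2011-01-01'), (2, '2011-01-02');
        INSERT INTO customer_dim VALUES
            (1, 'Acme', 'Austin', 'TX', 'US'),
            (2, 'Globex', 'Dallas', 'TX', 'US'),
            (3, 'Initech', 'Denver', 'CO', 'US');
        INSERT INTO order_fact VALUES
            (1, 1, 100.0), (1, 2, 200.0), (2, 3, 50.0), (2, 1, 300.0);
    """)
    return conn


def orders_by_date_and_state(conn: sqlite3.Connection) -> list:
    """Sum and average order amounts by calendar date and customer state."""
    return conn.execute("""
        SELECT d.calendar_date,
               c.state,
               SUM(f.order_amount) AS total_amount,
               AVG(f.order_amount) AS average_amount
        FROM order_fact f
        JOIN date_dim d ON d.date_key = f.date_key
        JOIN customer_dim c ON c.customer_key = f.customer_key
        GROUP BY d.calendar_date, c.state
        ORDER BY d.calendar_date, c.state
    """).fetchall()


if __name__ == "__main__":
    for row in orders_by_date_and_state(build_mart()):
        print(row)
```

Swapping c.state for c.city, c.country or c.customer_name in the SELECT and GROUP BY gives the other slices the users asked for.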

The first thing we need to do is talk to the business and find out exactly what that request means. Do the users want to see information about daily orders in general?

Designing the Data Mart – Part 1

As I mentioned a while back (a loooong while back), I have been thinking about writing up how I design data marts. The problem with that is that it is a huge topic. Even converting an existing schema (which doesn’t always exist) to a data mart (star schema style) still takes plenty of behind-the-scenes data analysis and prep work. Still, I am now going to take a shot at it.

I could start with a laundry list of requirements but I don’t think that would be interesting to very many people.

Calculating Business Days and Business Days Between

From the Database Geek.

I recently had a requirement to populate the day dimension of a data mart (I won’t put all of the code here as it’s pretty large). That’s not that big a deal, but part of the requirement was to set several columns: BUSINESS_DAY_FLAG, BUSINESS_DAY_NO and BUSINESS_DAYS_REMAINING_NO.

  • The BUSINESS_DAY_FLAG is Y if the date is MON-FRI and N if the date is SAT or SUN.
  • BUSINESS_DAY_NO is the business day of the month. There are 5 business days per week, so if the month started on a Monday, the second Monday would be business day 6.
  • BUSINESS_DAYS_REMAINING_NO is the number of business days remaining in the month.
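
The three definitions above can be sketched in a few lines of Python. This is not the original code, just an illustration under some assumptions: holidays are ignored, a weekend date carries the BUSINESS_DAY_NO of the preceding business day, and BUSINESS_DAYS_REMAINING_NO counts only the business days after the given date. The function names are made up.

```python
import calendar
from datetime import date


def is_business_day(day: date) -> bool:
    # weekday() returns 0 for Monday through 6 for Sunday, so 0-4 is MON-FRI.
    return day.weekday() < 5


def business_day_columns(day: date) -> dict:
    """Compute the three business-day columns for a single calendar date."""
    days_in_month = calendar.monthrange(day.year, day.month)[1]
    month = [day.replace(day=n) for n in range(1, days_in_month + 1)]
    # Business days in the month up to and including this date.
    elapsed = sum(1 for d in month if d <= day and is_business_day(d))
    # Business days in the month strictly after this date.
    remaining = sum(1 for d in month if d > day and is_business_day(d))
    return {
        "BUSINESS_DAY_FLAG": "Y" if is_business_day(day) else "N",
        "BUSINESS_DAY_NO": elapsed,
        "BUSINESS_DAYS_REMAINING_NO": remaining,
    }


if __name__ == "__main__":
    # June 2020 started on a Monday, so the second Monday (June 8)
    # is business day 6, matching the example in the list above.
    print(business_day_columns(date(2020, 6, 8)))
```

Running it for June 8, 2020 gives flag Y, business day 6 and 16 business days remaining, since June 2020 has 22 weekdays in total.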

A day with Ralph Kimball, Part 1

I had the opportunity to spend a day in a seminar with Ralph Kimball. If you don’t know who that is, he is a guru of data warehousing. If you’re involved in data warehouses, I hope you are at least familiar with his work. Currently in the industry there are two primary, competing warehousing methodologies, Kimball vs. Inmon, and to some they are practically religions. I think that’s kind of silly. A methodology is like a hammer or a drill; choose the best one for the job. If I absolutely have to pick, I’m in the Kimball camp.

The Coming of the Oracle Database Appliance

It looks like Oracle is making the move towards appliances, albeit in a more componentized way.

Oracle today announced the Oracle Optimized Warehouse Initiative to help accelerate data warehouse deployments by offering a choice of optimized solutions that combine the performance, reliability and scalability of Oracle® Database with hardware and storage from industry leading manufacturers.

As part of this initiative, Dell, EMC and Oracle today introduced the initial Oracle Optimized Warehouse. (See today’s related press release at: http://www.oracle.com/corporate/press/2007_sep/dell-emc-oracle-warehouse.html ). Available through Dell, the Oracle Optimized Warehouse for Dell and EMC is comprised of Dell PowerEdge servers, EMC CLARiiON networked storage systems and Oracle Database.

“As the data warehousing market continues to grow and mature, Oracle is evolving to meet the changing needs of our customers,” said Ray Roccaforte, vice president of Data Warehousing and Business Intelligence Platform, Oracle.

A Day with Ralph Kimball, Part 2

This continues the post begun in Part 1 of A Day with Ralph Kimball.

So, on to the seminar! Please remember that this isn’t what Ralph said as much as it’s my interpretation of what Ralph said. I’m trying to explain what you’ll get from his seminar but it’s through my eyes not yours. That’s why this is not a replacement for his seminar. Hopefully, I will induce you to attend if he makes it to your town. You should go just so that you can tell me where I got it wrong if for nothing else.

When we last saw our intrepid heroes they were devouring their lunch and asking Ralph inane questions.
