Group By Grouping Sets

Posted by the Database-Geek.

Summary Totals Using SQL

Have you ever needed to get totals for different combinations of columns in the same query? Did you ever want to create a nice little report from a simple SQL statement? Here’s a tip that can make your life a lot easier when you need it. The great thing about this is that you can use it in a ref cursor called from other programming languages and reduce the number of calls you need to make to the database.

Let’s say you are running queries against the Employees table of the sample HR schema provided by Oracle.

SQL> SELECT count(*) FROM Employees;

  COUNT(*)
----------
       107

Now, we want to get a total of salaries for each job in each department. We also want the max and min hire dates. We can do this with a simple query:

SELECT department_id,
       job_id,
       count(*),
       sum(salary),
       max(hire_date),
       min(hire_date)
FROM Employees
GROUP BY department_id, job_id
/
DEPARTMENT_ID JOB_ID       COUNT(*) SUM(SALARY) MAX(HIRE_ MIN(HIRE_
------------- ---------- ---------- ----------- --------- ---------
          110 AC_ACCOUNT          1        8300 07-JUN-94 07-JUN-94
           90 AD_VP               2       34000 13-JAN-93 21-SEP-89
           50 ST_CLERK           20       55700 08-MAR-00 14-JUL-95
           80 SA_REP             29      243500 21-APR-00 30-JAN-96
           50 ST_MAN              5       36400 16-NOV-99 01-MAY-95
           80 SA_MAN              5       61000 29-JAN-00 01-OCT-96
          110 AC_MGR              1       12000 07-JUN-94 07-JUN-94
           90 AD_PRES             1       24000 17-JUN-87 17-JUN-87
           60 IT_PROG             5       28800 07-FEB-99 03-JAN-90
          100 FI_MGR              1       12000 17-AUG-94 17-AUG-94
           30 PU_CLERK            5       13900 10-AUG-99 18-MAY-95
           50 SH_CLERK           20       64300 03-FEB-00 27-JAN-96
           20 MK_MAN              1       13000 17-FEB-96 17-FEB-96
          100 FI_ACCOUNT          5       39600 07-DEC-99 16-AUG-94
              SA_REP              1        7000 24-MAY-99 24-MAY-99
           70 PR_REP              1       10000 07-JUN-94 07-JUN-94
           30 PU_MAN              1       11000 07-DEC-94 07-DEC-94
           10 AD_ASST             1        4400 17-SEP-87 17-SEP-87
           20 MK_REP              1        6000 17-AUG-97 17-AUG-97
           40 HR_REP              1        6500 07-JUN-94 07-JUN-94

20 rows selected.

That’s nice but what we really want is to see sub-totals at each of the levels (job and department), as well as a grand total for everything. Enter GROUPING SETS. With a very small syntax change, you can get those totals.

SELECT department_id,
       job_id,
       count(*),
       sum(salary),
       max(hire_date),
       min(hire_date)
FROM Employees
GROUP BY GROUPING SETS (
     (department_id, job_id),
     (department_id),
     ())
/
DEPARTMENT_ID JOB_ID       COUNT(*) SUM(SALARY) MAX(HIRE_ MIN(HIRE_
------------- ---------- ---------- ----------- --------- ---------
              SA_REP              1        7000 24-MAY-99 24-MAY-99
                                  1        7000 24-MAY-99 24-MAY-99
           10 AD_ASST             1        4400 17-SEP-87 17-SEP-87
           10                     1        4400 17-SEP-87 17-SEP-87
           20 MK_MAN              1       13000 17-FEB-96 17-FEB-96
           20 MK_REP              1        6000 17-AUG-97 17-AUG-97
           20                     2       19000 17-AUG-97 17-FEB-96
           30 PU_MAN              1       11000 07-DEC-94 07-DEC-94
           30 PU_CLERK            5       13900 10-AUG-99 18-MAY-95
           30                     6       24900 10-AUG-99 07-DEC-94
           40 HR_REP              1        6500 07-JUN-94 07-JUN-94
           40                     1        6500 07-JUN-94 07-JUN-94
           50 ST_MAN              5       36400 16-NOV-99 01-MAY-95
           50 SH_CLERK           20       64300 03-FEB-00 27-JAN-96
           50 ST_CLERK           20       55700 08-MAR-00 14-JUL-95
           50                    45      156400 08-MAR-00 01-MAY-95
           60 IT_PROG             5       28800 07-FEB-99 03-JAN-90
           60                     5       28800 07-FEB-99 03-JAN-90
           70 PR_REP              1       10000 07-JUN-94 07-JUN-94
           70                     1       10000 07-JUN-94 07-JUN-94
           80 SA_MAN              5       61000 29-JAN-00 01-OCT-96
           80 SA_REP             29      243500 21-APR-00 30-JAN-96
           80                    34      304500 21-APR-00 30-JAN-96
           90 AD_VP               2       34000 13-JAN-93 21-SEP-89
           90 AD_PRES             1       24000 17-JUN-87 17-JUN-87
           90                     3       58000 13-JAN-93 17-JUN-87
          100 FI_MGR              1       12000 17-AUG-94 17-AUG-94
          100 FI_ACCOUNT          5       39600 07-DEC-99 16-AUG-94
          100                     6       51600 07-DEC-99 16-AUG-94
          110 AC_MGR              1       12000 07-JUN-94 07-JUN-94
          110 AC_ACCOUNT          1        8300 07-JUN-94 07-JUN-94
          110                     2       20300 07-JUN-94 07-JUN-94
                                107      691400 21-APR-00 17-JUN-87

33 rows selected.

Notice the change to the GROUP BY. I added a GROUPING SETS and then 3 columns lists.

  • (department_id, job_id) – Totals for the jobs within a department (listing each job_id)
  • (department_id) – Totals for the department over all (job not listed)
  • () – a grand total at the bottom

Play with this query a bit and try different combinations of GROUPING SETS. Add a new set, (job_id) and see what happens.

This is such a cool addition to the SQL language. In the future I will show some tips using GROUPING SETS and show how this relates to ROLLUP and CUBE.

LewisC

Technorati : , , ,

You can follow any responses to this entry through the RSS 2.0 feed. Both comments and pings are currently closed.

Comments are closed.