This is an Oracle-specific edition of the article explaining the same subject for MySQL.
It’s often helpful to use SQL to group and summarize information by periods of time. For example, we might like to examine sales data. For example, we might have a table of individual sales transactions like so.
Each time we make a sale, a new row is inserted into this table. This kind of information is called a time series: each row data has its own DATE time stamp. Notice that we could give each row a TIMESTAMP value instead.
Always keep in mind what the Oracle documentation says about DATE values:
DATEcolumns always contain fields for both date and time. If your queries use a date format without a time portion, then you must ensure that the time fields in the
DATEcolumn are set to midnight.
What can we do to report on this table? Here are several choices.
Generate a detail report
A detail report typically shows one row for each item in the raw data table. Here’s a (simple) query to do that.
SELECT TO_CHAR(sales_time, 'YYYY-MM-DD HH24:MI:SS') sales_time, gross FROM sales ORDER BY sales_time
This is trivial, but it illustrates an important concept: it’s possible to order by DATE values. You may notice that every query in this article has an ORDER BY clause; that’s because SQL doesn’t have to present its results in any given order unless it has an ORDER BY clause. It’s better not to leave the row ordering to the whim of the query engine in the database.
We can also generate a detail report showing all yesterday’s sales, using this query. The trick here is a WHERE clause restricting the range of sales_time values.
SELECT TO_CHAR(sales_time, 'YYYY-MM-DD HH24:MI:SS') sales_time, gross FROM sales WHERE sales_time >= CURRENT_DATE - 1 AND sales_time < CURRENT_DATE ORDER BY sales_time
Notice that subtracting the value 1 from an Oracle DATE item is the same as subtracting one day from it.
This query illustrates something important about using DATE ranges in WHERE clauses. When selecting a whole day’s worth of information, we want to include everything starting from midnight on the day in question (that is, “CURRENT_DATE – 1” meaning yesterday). We want our range of dates to run up to but not including midnight on the day after the day in question. A common mistake is to use a WHERE / BETWEEN clause like this one.
WHERE sales_time BETWEEN CURRENT_DATE - 1 AND CURRENT_DATE -- wrong!
This clause includes sales_time values equal to midnight on the current day, which is not desired.
Another common mistake is to write the WHERE clause using a TRUNC() function like this. (Using TRUNC() on a DATE item removes the time portion of its value.)
WHERE TRUNC(sales_time) = -- inefficient! CURRENT_DATE - INTERVAL 1 DAY
This yields correct results, but is inefficient. It needs to evaluate the TRUNC() function for every row in the table, so it can’t exploit a index.
Summarize by day
Now that we understand our basic detail data, we can generate all sorts of useful summary (or aggregate) reports. First, we can report gross daily sales. To do this, we need to be able to summarize our sales table by date. Here’s the SQL query to do that:
SELECT TRUNC(sales_time) AS sales_day, SUM(gross) AS total, COUNT(*) AS transactions FROM sales GROUP BY TRUNC(sales_time) ORDER BY TRUNC(sales_time)
This uses the little formula TRUNC(sales_time) to convert a DATE (which has both date and time in it) into a pure date. It then groups by those pure dates sums up the gross sales and counts the transactions. The result set will contain a row for every date on which there were any sales.
Summarize by month
Second, we can report gross monthly sales with a similar query. To do this we need a formula to convert any DATE expression to the date of the first day of the month. Oracle’s TRUNC function can do this, as follows:
We can apply this to our query to get a summary of sales by month.
SELECT TRUNC(sales_time, 'MM') AS month_beginning, SUM(gross) AS total, COUNT(*) AS transactions FROM sales GROUP BY TRUNC(sales_time, 'MM') ORDER BY TRUNC(sales_time, 'MM')
Summarize by calendar quarter
Now let’s report sales by calendar quarter. How do we convert a DATETIME value to a calendar quarter value? Fortunately, Oracle’s TRUNC() function does this.
Applying this formula to our report, here’s what we get:
SELECT TRUNC(sales_time, 'Q') AS quarter_beginning, SUM(gross) AS total, COUNT(*) AS transactions FROM sales GROUP BY TRUNC(sales_time, 'Q') ORDER BY TRUNC(sales_time, 'Q')
Using valid DATE expressions
Notice the discipline of always using valid dates for these reports. When we have valid dates, we gain all kinds of advantages in SQL. For example, if we know the first day of a quarter, we can find out the last day of that quarter with this expression.
ADD_MONTHS(quarter_beginning, 3) - 1
Summarize by calendar week
Summarizing time-series data by calendar week is useful, and it works very much like the other interval reports. To make this work, we need a formula that can yield the first day of the calendar week given any DATE value. Oracle’s TRUNC() function makes this easy.
It retrieves the date of the preceding Sunday from any DATE value. This calendar-week expression works properly for all weeks, including the special case of weeks that span the end of calendar years. For example, for both Tuesday, December 31, 2013, and Thursday, January 2, 2014, this expression returns Sunday, December 29, 2013.
SELECT TRUNC( DATE '2013-12-31', 'DAY'), TRUNC( DATE '2014-01-03', 'DAY') FROM DUAL
In some jurisdictions, Monday is considered the first day of the week. In Oracle, the TRUNC() function honors the session initialization parameter called “NLS_TERRITORY”. For example, in Europe Monday is considered the first business day of the week.
ALTER SESSION SET NLS_TERRITORY=GERMANY; SELECT TRUNC( DATE '2013-12-31', 'DAY'), TRUNC( DATE '2014-01-03', 'DAY') FROM DUAL;
If your Oracle DBMS is not set up to choose the first day of the week correctly, you have two choices.
- Ask your local database administrator to correct the localization settings
- Use this expression to compute the Monday week start explicitly
TRUNC(sales_time -1 , 'DAY') + 1
Here is the query to summarize sales by week (given a Sunday week start).
SELECT TRUNC(sales_time,'DAY') AS week_beginning, SUM(gross) AS total, COUNT(*) AS transactions FROM sales GROUP BY TRUNC(sales_time,'DAY') ORDER BY TRUNC(sales_time,'DAY')
Generate a summary of last week’s sales, day by day
Now that we know how to handle days and weeks, we can generate a day-by-day summary of last week’s sales. Here’s how we do that.
First, we need an expression for the first day of last week. Here it is:
TRUNC(CURRENT_DATE,'DAY') - 7
Next, we need an expression for the first day of the present week. Here it is:
TRUNC(CURRENT_DATE,'DAY') - 7
These two expressions need to be built into a WHERE clause, as follows.
WHERE sales_time >= TRUNC(CURRENT_DATE,'DAY') - 7 AND sales_time < TRUNC(CURRENT_DATE,'DAY')
Finally, this WHERE clause needs to be applied to our day-by-day summary of sales, and our report of the most recently completed week is done
SELECT TRUNC(sales_time) AS sales_date, SUM(gross) AS total, COUNT(*) AS transactions FROM sales WHERE sales_time >= TRUNC(CURRENT_DATE,'DAY') - 7 AND sales_time < TRUNC(CURRENT_DATE,'DAY') GROUP BY TRUNC(sales_time) ORDER BY TRUNC(sales_time)
Indexes and efficiency
When time series data tables are very large, indexes on DATETIME columns can be very helpful to speed up queries. In our example sales table, this is the index to use:
CREATE INDEX salestime ON SALES (sales_time)
When an index is present, a WHERE clause like the one in the previous section uses it to do a very efficient index range scan. Notice that it’s possible to write WHERE clauses in ways that make indexes useless, however. For example:
WHERE TRUNC(sales_time, 'DAY') = --inefficient! ADD_MONTHS(TRUNC(CURRENT_DATE,'DAY'), -7)
will correctly retrieve all sales from week. However, because it applies a function to the value of the column it’s using to search, it cannot use the index. So it will go through the whole table row by row, compute TRUNC(sales_time, ‘DAY’), and compare it.