Even with Oracle Database 19c and its automatic indexing features on Exadata, and especially on the Oracle Autonomous Database, people still come up to me on the street and ask me, “Should I drop all my indexes when I move my applications to Exadata?”
The answer, of course, is “sometimes”, and more often than not, many of the indexes you have in your existing application may not be needed on Exadata due to the features of the Exadata storage layer than make full table scans faster than ever.
Full index scans happen on Exadata too — but as you’ll see, sometimes the tried and true generic features of Oracle Database give you what you need. Serendipity led me to some features of Oracle Database that I more or less forgot about, and they are even faster on Exadata.
Setup and Scenario
I had a table called SAMP_TRANS with 185 columns and its size was 29 GB and approximately 250 M rows. There was a single-column primary key and the index was 6 GB. Do I need the index? The execution plan varied for some reason, but all I wanted to see was the minimum value of the primary key and one other derived value. The query looked like this:
SELECT /*+ monitor parallel(4) */ MIN(trans_id)-1, MIN(trans_id) FROM samp_trans t WHERE trans_id IS NOT NULL;
It could be argued that the “-1” calculation could be done downstream, and that I really didn’t need that WHERE clause, but to stabilize the plan all I could do was change the hint.
Exadata is really good at full table scans at the storage level, right? So I added a FULL hint in the query. I was already using a parallel plan, and that was icing on the cake.
/*+ monitor parallel(4) full(t) */
The query was fast — reading about 26 GB at the storage layer and only returning about 3 GB of data to the compute layer and finishing in 2 seconds. The plan statistics from the SQL Monitor report shows the offloading activity.
I also remembered that Exadata can do a full scan of an index at the storage layer, and since the index is only one column, I put an INDEX_FFS hint in the query.
/*+ monitor parallel(4) index_ffs(t pk_samp_trans) */
The storage layer did a full index scan, and the plan shows the full index scan and I/O equal to the size of the index at the storage layer:
Wait, 2 seconds??? That’s the elapsed time of the full table scan too. And the table has 185 columns. How could the elapsed time be as low with the full table scan as it is with the full index scan? Going back and looking at the execution statistics for the plan with the full table scan:
When an execution plan uses a full table scan at the storage layer, the cell servers not only can do column projection (only the columns needed for the query), but in this case the flash cache was automatically storing the table rows in columnar format, making the full table scan even faster.
It turns out that there was an underlying system problem that made the plan unstable, but once that was fixed, I still thought I could force an index “scan”. And I suddenly realized there was an Oracle feature that transcended the hardware… so I changed the hint to include just an INDEX hint:
/*+ monitor parallel(4) index(t pk_samp_trans) */
And I was surprised by the results:
The I/O was… almost nothing:
Since the index is sorted and has the information I need, the access path chosen (and I ensured it with the hint) was the Index Full Scan (Min/Max) access path, and it was around even before the first Exadata was released 10 years ago.
Still, it’s even faster on Exadata, especially if you’re using an X8M-2 with RoCE (RDMA over Converged Ethernet) hardware, and you can read those 4 index blocks really fast. In this case, the index blocks were already in the SGA. Running in 3.1 ms is faster than any full table or index scan — the Oracle Database engine will traverse one or the other side of the index to get the low or high value of the already sorted PK index blocks.
My takeaways are these — as much as many of the Exadata built-in features kick in automatically, get to know what Exadata can do at the storage layer and sometimes you have to steer it in the right direction. But also keep in mind that some of the classic Oracle database features, like the MIN/MAX index search in the execution plan, might be the best solution of all.