Indexes are Still a “Thing” on Exadata!

Even with Oracle Database 19c and its automatic indexing features on Exadata, and especially on the Oracle Autonomous Database, people still come up to me on the street and ask me, “Should I drop all my indexes when I move my applications to Exadata?”

The answer, of course, is “sometimes”, and more often than not, many of the indexes you have in your existing application may not be needed on Exadata due to the features of the Exadata storage layer than make full table scans faster than ever.

Full index scans happen on Exadata too — but as you’ll see, sometimes the tried and true generic features of Oracle Database give you what you need. Serendipity led me to some features of Oracle Database that I more or less forgot about, and they are even faster on Exadata.

Setup and Scenario

I had a table called SAMP_TRANS with 185 columns and its size was 29 GB and approximately 250 M rows. There was a single-column primary key and the index was 6 GB. Do I need the index? The execution plan varied for some reason, but all I wanted to see was the minimum value of the primary key and one other derived value. The query looked like this:

SELECT /*+ monitor parallel(4) */ 
    MIN(trans_id)-1, MIN(trans_id)
FROM
    samp_trans t
WHERE
    trans_id IS NOT NULL;

It could be argued that the “-1” calculation could be done downstream, and that I really didn’t need that WHERE clause, but to stabilize the plan all I could do was change the hint.

Hint #1

Exadata is really good at full table scans at the storage level, right? So I added a FULL hint in the query. I was already using a parallel plan, and that was icing on the cake.

 /*+ monitor parallel(4) full(t) */

The query was fast — reading about 26 GB at the storage layer and only returning about 3 GB of data to the compute layer and finishing in 2 seconds. The plan statistics from the SQL Monitor report shows the offloading activity.

Offloaded full table scan

Hint #2

I also remembered that Exadata can do a full scan of an index at the storage layer, and since the index is only one column, I put an INDEX_FFS hint in the query.

/*+ monitor parallel(4) index_ffs(t pk_samp_trans) */

The storage layer did a full index scan, and the plan shows the full index scan and I/O equal to the size of the index at the storage layer:

Offloaded full index scan
Bytes read from index at the storage layer

Wait, 2 seconds??? That’s the elapsed time of the full table scan too. And the table has 185 columns. How could the elapsed time be as low with the full table scan as it is with the full index scan? Going back and looking at the execution statistics for the plan with the full table scan:

FTS line 6 plan step: cell server execution statistics

When an execution plan uses a full table scan at the storage layer, the cell servers not only can do column projection (only the columns needed for the query), but in this case the flash cache was automatically storing the table rows in columnar format, making the full table scan even faster.

Hint #3

It turns out that there was an underlying system problem that made the plan unstable, but once that was fixed, I still thought I could force an index “scan”. And I suddenly realized there was an Oracle feature that transcended the hardware… so I changed the hint to include just an INDEX hint:

/*+ monitor parallel(4) index(t pk_samp_trans) */

And I was surprised by the results:

Index full scan (MIN/MAX)

The I/O was… almost nothing:

I can count the index block reads on one hand

Since the index is sorted and has the information I need, the access path chosen (and I ensured it with the hint) was the Index Full Scan (Min/Max) access path, and it was around even before the first Exadata was released 10 years ago.

Still, it’s even faster on Exadata, especially if you’re using an X8M-2 with RoCE (RDMA over Converged Ethernet) hardware, and you can read those 4 index blocks really fast. In this case, the index blocks were already in the SGA. Running in 3.1 ms is faster than any full table or index scan — the Oracle Database engine will traverse one or the other side of the index to get the low or high value of the already sorted PK index blocks.

Strategy

My takeaways are these — as much as many of the Exadata built-in features kick in automatically, get to know what Exadata can do at the storage layer and sometimes you have to steer it in the right direction. But also keep in mind that some of the classic Oracle database features, like the MIN/MAX index search in the execution plan, might be the best solution of all.

5 comments

  1. Hello. I am curious about this comment – “It turns out that there was an underlying system problem that made the plan unstable….” We are experiencing many performance issues since migrating our databases to Exadata. The optimizer is simply not choosing the most efficient execution plans without me forcing it using SQL profiles. Do you have any insight? Thanks.

    1. Justin, in my case it was an actual bug. Your situation sounds like the optimizer is having issues coming up with a good plan, assuming the version and RU level of the database is the same. I’d investigate statistics beyond the schema-level statistics (of course they should be updated at the same frequency as before). I’m thinking it could be system statistics or fixed object statistics that could be part of the culprit. System statistics contain things like single block read time, large 1M read time, CPU speed, and latency. Those will all enter the picture when the optimizer is trying to come up with the best plan. A good starting point is the DBMS_STATS procedure.

  2. Thanks for your reply! I have set all databases to: exec dbms_stats.gather_system_stats(‘EXADATA’); I assume that is the proper thing to do?

    1. Yes, that’s the way you’d collect system statistics on Exadata — but as with any complex Oracle feature, there are many experts even at Oracle that have some disagreements about whether to collect system statistics at all, whether to do it once and leave it alone until you have a hardware or software or application upgrade, or whether to collect those statistics on a regular basis. Any change to a production environment is ideally preceded by a similar implementation in a DEV or REL environment first. But wherever you’ve run that, I’d make sure the execution plan for one of the misbehaving queries is no longer in the shared pool then run it again. Look at the new execution plan (ideally a SQL Monitor report) and see if the cardinality estimates have changed and how much the overall plan has changed. If it has not changed at all, then you might not be on the right track with system statistics.
      For further reading on the discussions about statistics from Oracle experts, I’d start with this blog post:
      https://blogs.oracle.com/optimizer/should-you-gather-system-statistics

Leave a reply to Justin Cancel reply