Another issue that seems to overly concern many DBAs is extent counts and sizing. Gone are the days of fitting objects into single extents, yet I still see DBAs who want to keep their extent counts down, as if it really matters. Let's see why that concern just doesn't fly anymore.
First, direct load operations are a must-have, and they are going to create an extent per parallel degree. So, a nightly data load with a parallel degree of 12 is going to create at least 4,380 extents over just one year (assuming that each parallel process needs only one extent for its portion of the overall data load). However, with real-world volumes in a typical data warehouse, it's not uncommon for that count to be 10 to 20 times that number, or 43,800 to 87,600 extents. Even if you use large extents, such as 2GB initial and next, the counts will still be very high. Remember that you may be loading tens of millions of records per night, and that is going to take more than just a few extents. So, your extent count over time is going to be high, period.
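The arithmetic behind these counts is simple enough to sketch. The multipliers below (degree 12, one extent per slave in the best case, 10 to 20 extents per slave with real-world volumes) come straight from the example above:

```python
# Extent-count arithmetic for a nightly parallel direct load.
PARALLEL_DEGREE = 12      # each parallel slave allocates its own extent(s)
NIGHTS_PER_YEAR = 365

# Best case: one extent per slave per nightly load.
best_case = PARALLEL_DEGREE * NIGHTS_PER_YEAR
print(best_case)          # 4380 extents per year

# Real-world volumes: 10 to 20 extents per slave per load.
low, high = best_case * 10, best_case * 20
print(low, high)          # 43800 87600 extents per year
```

Even the best case runs into the thousands per year, which is why trying to keep extent counts low is a losing battle.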
Second, data warehouses should be using locally managed tablespaces with uniform-sized extents. This type of tablespace management is far superior in terms of raw performance to that of dictionary-managed tablespaces. Moreover, it does not create any dictionary entries, which are often the actual concern of those DBAs obsessed with keeping the extent counts low. I've easily seen 15 to 20% improvements in data load and index creation times from using locally managed tablespaces. Furthermore, it seems to add about a 2 to 4% improvement across the board for all other operations as well, including queries (which I cannot easily explain). The real trick is to pick a uniform extent size that makes sense.
So how do you pick a good extent size? That's actually quite easy; just ask yourself how much disk space you're willing to waste each day. Remember our earlier nightly load example being done in parallel with degree 12? Assume the worst-case scenario: each process gets one final record that will not fit into its current extent, so each allocates one more extent that holds just that single record. You end up with 12 extents that each contain a single record, meaning each is probably 95% or more unused. The next day's data load will create new extents rather than reuse these partial ones, because a direct-path load allocates fresh extents and then moves the high-water mark. So how much waste can you tolerate? If you have 10MB extents and parallel degree 12, then you're potentially wasting 120MB each day. And while smaller extents mean less waste, they do mean more extents.
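Under that worst-case scenario, the potential daily waste is just the extent size times the parallel degree. A minimal sketch (the function name is mine, for illustration):

```python
def daily_waste_mb(extent_size_mb, parallel_degree):
    """Worst case: every parallel slave strands one nearly empty
    extent at the end of each nightly direct-path load."""
    return extent_size_mb * parallel_degree

# The example from the text: 10MB uniform extents, parallel degree 12.
print(daily_waste_mb(10, 12))   # 120 MB of potential waste per day
```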
I've found that 1 to 4MB is a good uniform extent size when loading in parallel. While it does create more extents, the waste is kept to a minimum, and the parallelism means I can still process lots of data. You'll have to find your own sweet spot.
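The trade-off can be tabulated directly. The sketch below assumes a hypothetical 100GB nightly load (the load volume is my illustrative assumption, not from the text) and shows how extent count and worst-case waste move in opposite directions as the uniform extent size changes:

```python
NIGHTLY_LOAD_MB = 100 * 1024   # assumed 100GB nightly load, for illustration
PARALLEL_DEGREE = 12

for extent_mb in (1, 2, 4, 10):
    # Ceiling division: the load always rounds up to whole extents.
    extents_per_night = -(-NIGHTLY_LOAD_MB // extent_mb)
    worst_case_waste_mb = extent_mb * PARALLEL_DEGREE
    print(f"{extent_mb:>2}MB extents: ~{extents_per_night:>6} extents/night, "
          f"up to {worst_case_waste_mb:>3}MB wasted/day")
```

Smaller extents drive the nightly extent count up and the waste down; since the extent count is going to be high regardless, minimizing the waste is the side of the trade-off worth optimizing.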