The second Oracle upgrade I’ll tell you about was another export/import upgrade between 10.2 and 11.1.0.7.
The installation, empty 11g database build, and 10g export/11g import went fine on the test tier. The application testing of the upgraded 11g test database performed by the client also went fine – the client was satisfied with the performance. The run times of the key jobs that were tested bore a strong resemblance to those of the current 10g test database. Too strong, it turned out.
The installation, empty 11g database build, and 10g export/11g import on the production tier also went fine. The client ran a few quick reports to be sure they could connect to the new database, and all seemed well.
Then things went wrong. A job that had taken one or two minutes on 10g was still running nine hours later on the upgraded database. A couple of other jobs were also running much longer than before. Other jobs were running faster than on 10g. How could the results on the production tier be so different from the test tier? We’d even imported production data onto the test tier, to increase our possibility of creating a more similar environment. Importing data isn’t perfect – a physical clone is better, but at least we had the same number of rows, same indexes, etc.
At that point, the person who did the application testing then realized that, when he did the 11g testing, he had accidentally been pointed at the test 10g database. We were now running production on an untested version. The client made the decision to wait and see if we could fix the problems and remain on 11g instead of retreating, although retreating would have been fairly painless, since we didn’t harm the 10g database in the upgrade process, and the data in the database only changed weekly. (This easy retreat is one of the better features of an export/import upgrade.)
We tried standing on the nine-hour query’s throat by trying to force it to use the same execution plan as the 10g database, by hinting it, and then trying a stored plan. Both yielded new, but still long-running, execution plans. This was a complex query, which made things harder for us, since we were on a tight timeline. We tried setting optimizer_features_enabled to 10.2.0.2 (the source version), 10.2.0.3, and 10.2.0.4 (just in case those ran better) at the database level. All three settings were great for the nine-hour query but disastrous for some other queries. We tried restoring segment statistics from the 10g database, to see if that would help. It did help some queries, but harmed others.
Time was short, so the solution we settled on was to hint the nine-hour query and a couple of other queries to use optimizer_features_enabled=10.2.0.4 (this is an instruction to the query act like it’s running on a 10.2.0.4 database); to leave the default 11g optimizer_features_enabled set at the database-wide level (this is an instruction to all queries on the database to act like they’re running on an 11.1.0.7 database unless otherwise instructed); and to leave the 11g-level segment statistics in place. This returned the queries to the 10g level of performance or better.
The client was satisfied with this, as their goal wasn’t to improve performance; it was to get onto a version of Oracle that will be supported longer than 10gR2. (It would have been interesting to dig further into this issue, but the client was ready to move on to the next project on the list.)
It would have also been interesting to see if setting compatible to 10.2.0.4 would have also had the same impact to some queries versus others. Or if it would have toggled off what kernel aspects were conflicting in 11G.
I’ll keep the optimizer_features_enabled in mind. Thanks for the nugget of info.
I’m reluctant to upgrade one of my databases to 11G because it is using Streams and that is a whole other bunch of monkeys to deal with. Who knows what the future will bring.
The 10.2.0.4, 10.2.0.3 and 10.2.0.2 OFE settings on the upgraded (11g) database each gave the same (or what at a quick glance looked the same) results.
But some queries were happier with the 10.2.0.x OFE and some were happier with the 11.1.0.7 setting. I’m glad that OFE is available as query hint – a convenient sledgehammer, much helpful in my desperate situation that needed a quick fix!
About Streams, it’s going to be an even more interesting ride in future versions, with Oracle’s acquisition of Golden Gate.