Friday, May 29, 2009

The RECOMPILE Options

Recompile Options:
WITH RECOMPILE [at a Stored Procedure level only]
Option (Recompile) [at a TSQL statement level only]

Their functionality:
When used in the code of a particular Stored procedure, Option 1compiles that SP everytime it is executed by any user. It can not be used at a individual statement level. Any existing plan is never reused even if the new plan is exactly the same as any pre-existing plan for that SP. Once the SP is executed, the plan is discarded immediately. There is no caching of the execution plan for future reuse.

When used with a TSQL statement whether inside an SP or adhoc, Option 2 above creates a new execution plan for that particular statement. Any pre- existing plan even if it is exactly the same as the new plan, is not used. The new plan is discarded imeediately after execution of the statement.

General Plan Reuse in brief:
SP:
Assuming both of these options are not being used, an execution of an SP prompts a search for pre-existing plans (one serial plan and one parallel plan) in memory (plan cache). If found, it is reused. New input parameters in the current execution replace the previous input parameters from a previous execution plan in the execution context handle which is part of the overall execution plan. So usage of different input parameters doesn't cause a recompilation. Recompilation happens only when about 20% of the data in the tables being called from within the SP is found to have changed since the last time statistics was updated for those tables and its indexes. If say, you are accessing four tables in that SP and roughly 20% of the data for one table has been found to have changed since last statistics update, then that statement is recompiled. If all four tables have changed by about 20% since last statistics update, then the entire SP is recompiled. This plan reuse system works as long as objects are qualified with it owner and DatabaseName (for e.g. DBName.dbo.TableName instead of just TableName). Since we have auto-update statistics off, we have less recompilation of SP's to begin with. Use of a temporary table within an SP also causes recompilation of that statement. There are other factors also that affect plan reuse but that theory is too detailed. So I have just mentioned the most relevant points. Please note that while searching for an existing execution plan, SQL server searches by the ID of the SP being currently submitted for execution. So the search for existing execution plans is faster in case of SP's.

Adhoc TSQL:

Adhoc TSQL statements follow the same rules (statistics, etc) as an SP regarding plan reuse and some more. An adhoc SQL statement doesn't have an ID to begin with since it is not a database object. So existing plans are searched by the statement structure and the objects contained in that statement. So it takes a bit longer for SQL Engine to search for existing reusable plan in case of adhoc TSQL statements. Any existing execution plan in order to be reused is matched with the previous similar statement in the plan cache referencing the same object(s) (like tables). For e.g. the second statement below would reuse the plan created by the first statement. This is due to a database setting called "Simple Parameterization" which is ON by default (it treats the constant in the WHERE clause as a parameter and just switches the parameter value from 1 to 2 in the execution context of the existing plan created by the first statement). The third statement would also reuse the existing plan. The fourth statement below (it has the exact same functionality as the first statement) will not reuse the existing plan because the structure of the physical query is not exactly the same as the first query (extra spaces before the keyword WHERE). So the fourth statement would be compiled afresh and will have a new execution plan.

select a, b from db.user.abc where c = 1

select a, b from db.user.abc where c = 2

select a, b from db.user.abc where c = @input_parameter

select a, b from db.user.abc where c = 1

Again, if the statement is a bit complex unlike the statements above, even Simple Parameterization won't help and the statement would always generate a fresh execution plan unless the database option called "Forced Parameterization" is set. That again is a different ballgame altogether. The rules for exact match of the complex adhoc TSQL statement still hold good though.

Explanation:
SP:
It is for the reasons given in the theory above that RECOMPILE clause at an SP level should not be used unless we have just dropped and recreated/changed indexes, updated statistics, etc or if it is a new SP in which case it would be automatically generate a fresh plan anyway. Another drawback is that this SP-level RECOMPILE clause would erase any previous execution plan and create a new plan on each execution which is unnecessary and expensive.

AdHoc TSQL:
Using the RECOMPILE option would discard any reusable plans. But most of the existing plans do not get reused in case of adHoc TSQL due to reasons mentioned above. So plan cache ends up holding those un-reusable plans unnecessarily till the time comes when they need to be really discarded to make space for new plans in the procedure cache. Usage of the RECOMPILE option would help reduce the utilization of plan cache in such cases. But again the choice is between RECOMPILE option in adhoc TSQL or, the basic rules to be followed (same structure, same columns, same tables, fully qualified names, etc) for plan reuse. Changing an SP is simple. Changing an adhoc TSQL statement means making changes to the application as well that generates the adhoc TSQL code and recompiling the affected application dll's in most cases. So SP is always the best choice not only for this reason but for scores of other reasons as well whcih we can discuss in a separate thread.


Note: I would like to stress the point. that the Data Access Layer (DAL) would never pose a limitation on the ability to use SP's. If you can call even just one SP through DAL, you can call any number and any type of SP's from the same DAL whether the SP's use static or dynamic TSQL code inside them. DAL can not be blamed under any circumstance for excess usage of adhoc TSQL code in the application.

A slight clarification:
Once these two options are used (SP level and adhoc statement level), there would be no scope for reusing any "good" plans. This is because plans won't exist after an execution is over. They would be discarded immediately after the execution.

Caution:
Developers often tend to use OPTION (RECOMPILE) an an excuse for introducing more and more adhoc TSQL and then tend to eliminate SP's altogether. And then they reach a stage where plan cache gets filled up to the brim again. This defeats the very purpose of Option (recompile) for adhoc TSQL code. Please note that every recompile issues locks on the underlying tables. And every new plan generation process takes an additional time and overhead (CPU time and statistics analysis). Adhoc TSQL reflects/implies inappropriate application design. It also increases network roundtrips and length/duration of a business logic implementation and execution. OPTION (RECOMPILE) is to be only used as a temporary measure while a process for converting adhoc TSQL code to Stored Procedures is underway.

Suggestion/Conclusion:
Based on the explanation above, using WITH RECOMPILE clause for SP's is not suggested. Using OPTION (RECOMPILE) at a statement level can be tried for adhoc TSQL. The effort in changing this for each adhoc TSQL statement in the application can be of course more productive instead if conversion from adhoc TSQL to Stored Procedures is taken up. This is because SP's are generally much superior to adhoc TSQL statements in all respects.