User defined functions called from SQL

I was cc’d on an email yesterday morning that was sent out to the development group.  It was a pretty bluntly worded question asking why a monthly job was taking well over an hour to run. Time to investigate.

The issue was reproducible in our staging environment, so generating a SQL trace to take a look at the problem was a no-brainer.

After running the trace file through tkprof (all defaults) it was immediately apparent that the issue was with recursive statements. Recursive statements include all the SQL Oracle has to do in the background to parse incoming SQL. That overhead will vary depending on the type of parsing necessary but is unavoidable for newly created cursors. In addition, user defined PL/SQL functions called within SQL statements are also accounted for under the recursive statements summary.

Embedding user defined PL/SQL in a SQL statement or view definition has a number of disadvantages.

  • Any time you embed user defined PL/SQL calls within a SQL statement (or in this case a view definition) Oracle must perform a context switch between the SQL and PL/SQL engines. This implies CPU overhead.
  • If there are any performance problems embedded in the PL/SQL function, those problems are inherited by the calling SQL or view.
  • PL/SQL is not part of the relational model. Any time you do something non-relational in SQL, you limit the opportunities for the Cost Based Optimizer to do some behind the scenes magic to improve the efficiency of the statement.
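
To make the context switch point concrete, here is a hypothetical sketch (the function, table, and column names are invented for illustration and are not from the application in question):

    CREATE OR REPLACE FUNCTION LOOKUP_NAME (P_ID IN NUMBER)
      RETURN VARCHAR2
    IS
      L_NAME VARCHAR2(100);
    BEGIN
      SELECT NAME INTO L_NAME FROM REF_CODES WHERE ID = P_ID;
      RETURN L_NAME;
    END;
    /

    -- Every row fetched here forces a switch from the SQL engine to the
    -- PL/SQL engine and back, and the SELECT inside the function runs as
    -- a recursive statement once per row:
    SELECT T.ID, LOOKUP_NAME(T.CODE_ID)
      FROM BIG_TABLE T;

On a small table this is barely noticeable; against millions of rows, those per-row lookups dominate the runtime, which is exactly what the recursive statement summary below shows.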

Below is the summary information from tkprof.

    
    OVERALL TOTALS FOR ALL NON-RECURSIVE STATEMENTS

    call     count       cpu    elapsed       disk      query    current        rows
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    Parse        2      0.01       0.01          0          7          0           0
    Execute      2      0.88       5.98          0      12145      74414           2
    Fetch        0      0.00       0.00          0          0          0           0
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    total        4      0.89       5.99          0      12152      74414           2

    OVERALL TOTALS FOR ALL RECURSIVE STATEMENTS

    call     count       cpu    elapsed       disk      query    current        rows
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    Parse     2141     17.89      17.89          0          0          0           0
    Execute 13589173    492.64     489.04          5        180          0           0
    Fetch   13589083   4007.37    4035.89     778773  293428221        462    13589118
    ------- ------  -------- ---------- ---------- ---------- ----------  ----------
    total   27180397   4517.91    4542.83     778778  293428401        462    13589118

13.5 million executions of recursive SQL statements, returning 13.5 million rows. I always have mixed reactions when I see stuff like this. On the one hand, it is frustrating that easily avoidable inefficiencies are so common. On the other hand, these sorts of things are typically quite easy to fix, and the improvement in response time often seems like a miracle to the end user, which is always rewarding.

It's my understanding that the application that produced this trace file was written under heavy deadlines and constantly changing requirements. I know, nothing new there. However, the developers seemed very fond of writing views on top of views on top of views. That wasn't much of a problem when the data volume was relatively low, but now that the application has been running for quite a few years it is not scaling very gracefully.

Anyways, after seeing the recursive statement summary it was only a matter of identifying the user defined function and figuring out how to merge the logic of that function into a purely SQL solution. I went back through the trace file and found that the following SQL was executed nearly 13.5 million times.

    
    SELECT PERIODID
      FROM PERIOD 
      WHERE TRUNC(STARTDATE) <= :B1 AND 
            TRUNC(ENDDATE) >= :B1;
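
As a side note, the TRUNC calls wrapped around STARTDATE and ENDDATE prevent ordinary indexes on those columns from being used. If the bind variable always carries a date with no time portion (an assumption on my part, not something the trace proves), the same filter can be written against the bare columns:

    SELECT PERIODID
      FROM PERIOD
      WHERE STARTDATE <  :B1 + 1 AND
            ENDDATE   >= :B1;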
    

It took me a little while to figure out where this was coming from, but I eventually tracked it down to a view with a user defined function used to filter results for the current period. A few emails between the developers and myself and a simple view rewrite resulted in a response time improvement of several orders of magnitude. It was fun creating a new trace file after the rewrite and emailing out a difference report. It was a good day.
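
The actual view definitions aren't reproduced here, but the general shape of such a fix is to replace the per-row function call with a join so the optimizer sees a single, purely SQL statement. A sketch, with invented view, table, and function names:

    -- Before (sketch): the view filters through a user defined function,
    -- which runs its lookup query once for every row
    CREATE OR REPLACE VIEW CURRENT_PERIOD_TXNS AS
    SELECT T.*
      FROM TXNS T
      WHERE GET_PERIOD_ID(T.TXN_DATE) = GET_PERIOD_ID(SYSDATE);

    -- After (sketch): the same filter expressed as a join to PERIOD,
    -- evaluated within the one statement with no context switches
    CREATE OR REPLACE VIEW CURRENT_PERIOD_TXNS AS
    SELECT T.*
      FROM TXNS T
      JOIN PERIOD P
        ON T.TXN_DATE BETWEEN TRUNC(P.STARTDATE) AND P.ENDDATE
      WHERE SYSDATE BETWEEN TRUNC(P.STARTDATE) AND P.ENDDATE;

With the join in place, the lookup against PERIOD happens once as part of the statement rather than 13.5 million times as recursive SQL.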
