Monday, February 11, 2013

Getting a performance boost to SQL Report Builder Reports on SharePoint - Its the little things that count

Where I work, for a number of years now our analytics team has made use of the SQL report builder to provide report subscriptions and high-level insight into sales performance.  With us recently launching our SharePoint-enabled intranet, we have been porting over some of these reports off of our dedicated report server (using "Native Mode" reporting services) and instead using the newer "SharePoint Mode" that provides web integration with our SharePoint site.

After migrating over some reports, I was somewhat disappointed and concerned at the speed, or lack thereof, of the report completion.  Looking at what could have changed in the reports from the "old" version to the "new" one, it was easy to think that it was my SharePoint implementation.  I performed some "research" (aka Googling) on the issue and came across a few blog posts, technical articles and discussion board posts all indicating that it's due to the level of HTTP requests and responses between SharePoint and the report file that is processing the query.  And to be honest, there is most certainly truth in that statement... but it shouldn't have been as drastic as what I was seeing.  How bad was it?  Well, report's that would take (on the old server) 17 seconds on average, while the reports on the new server were taking 2:58 on average.  In the beginning, I chalked it up to SharePoint and it's usage of returning the results to the client, but over time, I realized I was being lazy and pawning off the blame onto Microsoft... like it could be MY problem.. right?

So in an effort to make the RPT files more maintainable, I decided to look into storing the Dataset queries in a SQL stored procedure, instead of having them as flat queries stored within the RPT file itself.  Once I dug these up, I began to find my issue...

I haven't been at my current occupation much (I started here in August '12, which at the time of this writing is 175 days, 6 hours, 14 minutes at 22 seconds), and our IT staff here is on the smaller side, so we tend to make power users out of the user community when possible.  In this case, our analytics team has controlled and maintained the reports that I had been converting.  The member's of this team are very, very smart, it's just that some of the technical skill sets have been.. tunneled.  To explain, their solutions at times seem to be "I need a report, so I'll create a view and use that for my report".  Not that that's wrong, but from a programmers perspective, you should strive to make things modular and re-usable.  So it's not that they're wrong at all.. it's just created more work for me :)  So, when digging into the report queries, I found a few changes that made a world of difference.  How much?  That report that was taking 17 sec. on average, then up to 2:58, is now down to.. 22 seconds!  Not that I wanted to see these reports take "longer" on SharePoint.. but 5 more seconds due to HTTP chatter is much more paletable than 161 seconds :)

So what were my changes?  In all, there were 3 distinct differences I had to make.
1. Stored procedures are just "better", mainly because of their "execution plan retention and reuse" - among many other items noted here by our very good and respected friend Pinal Dave - (on a side note, if you do *anything* database related, you really should follow his blog, its excellent). 

2. Stored procedures provide flexibility in your database programming.  In my scenario, there were times I needed to read data from another table in a separate database.  With a stored procedure, it made the query that much simpler, because I was able to read into a table variable and use that data only when necessary, rather than relying on the "query builder" in the report builder, which allows for a query with some parameters, but not much beyond that.

3. Looking at the query really made me realize how innefficient the queries were!  I didn't realize it until I gave it a full look, but the queries had several repeated items in the where clause.  As an example, what's easier to read?

a.) WHERE contractNumber = '' And VoidValue = 'AK' Or contractNumber = ' ' and VoidValue = 'AK' or contractNumber = '' And VoidValue = 'AR' Or contractNumber = ' ' and VoidValue = 'AR' and contractOwner = @user

b.) WHERE contractOwner = @user and contractNumber NOT IN ('',' ') and VoidValue IN ('AK,AR')

This is an oversimplified example of what I ran into (because, frankly, you wouldn't care about the exact SQL used, nor would my company appreciate me posting actual database structure :), but you can see that the query was much more complicated than it had to be.  As a rule of thumb, if it's complex to read, it's complex to compute as well.   However, the largest performance boost I found was moving the "contactOwner" condition to the beginning of the where clause.  Rather than first testing for contract numbers and void values that weren't pertinent to my user, I was now testing for the records that would only pertain to my user first.. greatly speeding up my query.

So what did I learn here?  Well, sometimes it's easy to overlook, discount or even ignore the simple stuff with your technical issues, immediately resolving that the problem must be complex and highly technical.  In the end, with only 15 minutes of analysis and work, you too can find that it's some pretty basic things that can provide a performance boost to your work.

No comments:

Post a Comment