Our client had been receiving complaints from users about the excessive page load times of particular pages in their legacy application. Our task was to improve the page load time of the problem area by 50% within 2 days. The improvements had to be made within the limitations of the legacy web application: old tech, coupled data sources, limited access to the SQL server and other data sources, no local testing, and no ability to inject performance profiling tools. Working within these limitations is traditionally very manual and time-consuming.
We began the investigation by attempting to replicate the issue in our test environments; however, we struggled to capture meaningful performance metrics. As a result, we chose to work with a performance monitoring tool, New Relic, as it worked within the limitations and captured enough information for us to perform analysis. New Relic allowed us to quickly identify the root causes of the performance issue by providing traces sorted by response time, along with the full invocation patterns and details. We found that caching of complex data, non-performant database queries, and logging all added directly to response times.
Once the most impactful areas were identified, we used New Relic's method tagging analysis to pinpoint the exact methods and SQL calls contributing most to response times. From there, we analysed the code to understand its intent and any dependencies it might have.
We then promptly workshopped solutions, estimating the effort of each with t-shirt sizing alongside its expected value. We found this a valuable exercise, as it ensured that we targeted the improvements with the highest return on investment. Our team began developing the solutions in the ‘Quick Wins’ bucket. If we reached a blocker or found that a fix required complex code, we moved on to the next improvement. This tactic kept us lean by preventing us from getting bogged down on a single improvement. It was also important not to over-optimise, as each optimisation was subject to diminishing returns.
The following changes required minimal investment and were enough to reach the goal of improving the page load time by 50%.
During the diagnostic phase, we found that the agent caching was not running efficiently: on a large page, the same agent appeared in the trace over 300 times. By fixing flaws in the caching logic, we were able to see an immediate improvement.
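The idea behind the caching fix can be sketched in a few lines. This is a minimal illustration, not the client's actual code: `load_agent` and the counter are hypothetical stand-ins for the expensive agent lookup, and Python's `functools.lru_cache` stands in for whatever caching layer the application used.

```python
from functools import lru_cache

call_count = 0  # tracks how often the expensive lookup actually runs


@lru_cache(maxsize=None)
def load_agent(agent_id):
    """Hypothetical expensive agent lookup (e.g. a database or service call)."""
    global call_count
    call_count += 1
    return {"id": agent_id, "name": f"agent-{agent_id}"}


# Render a large page that references the same agent 300 times.
page_agents = [load_agent(42) for _ in range(300)]
print(call_count)  # the expensive lookup runs only once; the cache serves the rest
```

Without a working cache, the lookup would run all 300 times; with it, 299 of those calls become cheap in-memory hits.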
We reduced the cost of non-performant database queries by tailoring each one to its specific function. For example, we were able to cut the amount of data pulled back by returning only the required results.
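The pattern looks roughly like this. The example uses an in-memory SQLite table as a stand-in for the client's SQL server, and the `orders` schema is invented for illustration; the point is pushing filtering into the database and selecting only the needed column rather than fetching everything and filtering in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, status TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(i, f"cust-{i % 10}", "open" if i % 2 else "closed", "x" * 100) for i in range(1000)],
)

# Before: fetch every column of every row, then filter in application code.
wasteful = conn.execute("SELECT * FROM orders").fetchall()
open_ids_wasteful = [row[0] for row in wasteful if row[2] == "open"]

# After: let the database filter and return only the single column required.
lean = conn.execute("SELECT id FROM orders WHERE status = 'open'").fetchall()
open_ids_lean = [row[0] for row in lean]

assert open_ids_wasteful == open_ids_lean  # same answer, far less data transferred
```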
We were able to combine similar queries and rewrite them as a single, concise query.
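A common shape of this change is replacing several near-identical queries with one aggregate. The sketch below is an assumption about the technique, not the client's queries: a hypothetical `tickets` table is counted once per status in three round trips, then in a single `GROUP BY` query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [(i, s) for i, s in enumerate(["open", "closed", "pending"] * 50)],
)

# Before: three near-identical round trips, one per status.
counts_before = {
    s: conn.execute("SELECT COUNT(*) FROM tickets WHERE status = ?", (s,)).fetchone()[0]
    for s in ("open", "closed", "pending")
}

# After: one GROUP BY query returns all the counts in a single round trip.
counts_after = dict(
    conn.execute("SELECT status, COUNT(*) FROM tickets GROUP BY status").fetchall()
)

assert counts_before == counts_after
```

Collapsing N round trips into one matters most when each query carries fixed network and connection overhead, as it typically does against a remote SQL server.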
We offloaded logging operations onto a separate thread so that the server didn’t have to finish the logging process before responding to user requests.
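The producer/consumer shape of that change can be sketched as follows. This is a minimal illustration in Python, assuming a hypothetical `handle_request` function and a slow log write simulated with `time.sleep`; the request path only enqueues the message, and a background thread does the slow write.

```python
import queue
import threading
import time

log_queue = queue.Queue()


def log_worker():
    """Drain the queue in the background; slow writes never block a request."""
    while True:
        message = log_queue.get()
        if message is None:  # sentinel tells the worker to stop
            break
        time.sleep(0.01)  # stand-in for a slow disk or network log write
        log_queue.task_done()


worker = threading.Thread(target=log_worker, daemon=True)
worker.start()


def handle_request(request_id):
    # Enqueue the log entry and return immediately instead of writing inline.
    log_queue.put(f"handled request {request_id}")
    return f"response for {request_id}"


start = time.monotonic()
responses = [handle_request(i) for i in range(20)]
elapsed = time.monotonic() - start  # far below the ~0.2s the 20 log writes will take

log_queue.join()     # wait for pending writes (e.g. at shutdown), not per request
log_queue.put(None)  # stop the worker
```

The trade-off is that log entries can be lost if the process dies before the queue drains, which is why the shutdown path waits for `log_queue.join()`.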
We hit the target of reducing page load time by 50% in 2 days, which meant users were no longer required to wait excessive amounts of time for their data to load in the legacy system.
By focusing our efforts on the quick wins, we were able to implement high-impact changes with minimal investment.