2100 Solutions – NEW Ruby on Rails website NOW in production on Heroku

After a few embarrassing announcements and failed launches, I have successfully re-crafted my website for 2100 Solutions Consulting, LLC. This branding of my services includes Program Management, BPM Strategy, Quality Assurance, Testing, Performance Preparation and Strategy, Automated Testing, and Product Development.

Here is the URL for the Ruby on Rails website I launched in the Heroku environment this weekend: http://2100solutions.com

Heroku was by far the easiest host to implement.

The site is now a stable 24/7 presence with links to my external blogs and pursuits.

I have other RoR websites completed but not yet launched on Heroku. All published websites will be managed through GitHub. My GitHub account is WAFulbright, if you’d like to see, follow, or contribute to my code. Let’s talk before you contribute! Thanks!

Bill Fulbright

 


14 Factors Slowing Your Website & How to Handle Them

Techno Functional Consultant at HighRadius

 

“250 milliseconds, either slower or faster, is close to the magic number for competitive advantage on the Web” – Harry Shum, Executive Vice President of Technology and Research, Microsoft.

According to a report published by the site optimization company Strangeloop (now Radware), 57% of online customers will abandon a website after waiting 3 seconds for it to load; of those, 80% will never return, and half of that fraction will tell others about their bad experience. Add to this the fact that Internet users are reported to have faulty perceptions of time spent waiting (perceived as 15% slower than the actual load time), and we can see how crucial a role web performance (speed, stability and availability) plays in determining the success or failure of an enterprise in this exceedingly online world, as it directly impacts user retention, online feedback, number of downloads (for mobile apps), conversions, and thus revenue.

The quantitative rewards of having a fast and robust online presence are illustrated by the fact that President Obama’s fundraising site raised an additional $34 million for his campaign in 2011 after increasing page speed by 60%, and that Intuit saw a 14% increase in conversions after cutting its load time in half (Ref.). Conversely, any adverse behavior brings with it the cost of lost opportunity, potential loss of customer loyalty and severe damage to brand reputation.

An ideal way to mitigate such risks would be to estimate the impact of web performance and downtime on business revenue using production load statistics juxtaposed with the business revenue model, and then to use the findings to optimize the capacity planning process and fund performance tuning/testing activities appropriately.

Large enterprises mostly have dedicated teams to avert any ‘slowness’ in production, armed with state-of-the-art 24x7 real-time monitoring and diagnostic tools equipped with dashboards and alerts/triggers, self-healing mechanisms, graceful fail-overs (like an Oracle RAC implementation) and multiple levels of redundancy and data backups ready to take over from the primary configuration in case things go south.

The precursor to such monitoring, and arguably a much more vital step, is to test the code and all the pieces of production hardware where it will be deployed against predefined performance goals for throughput and latency, through a battery of tests (load, stress, scalability, soak) usually performed by a dedicated Performance Testing and Engineering team (or the development team itself) over a period of time before release. Since prevention is always better than cure, almost 90% of performance issues in production can be prevented if the hardware, configurations and code are thoroughly tested together by performance experts and the results carefully analyzed (avoiding wrongful extrapolations) for any anomalies and deviations from the standard expected behavior agreed upon by all the stakeholders.

Easier said than done: there are a plethora of challenges that the performance engineering team usually faces in this process. The most important are the absence of a 100% mirror of the production environment (due to cost constraints); a lack of enough unique test data for applications under heavy load (in the tune of millions of hits per hour); stubbing or virtualization of test data in the absence of enough unique test data (which fails to simulate realistic, production-like loads); and lastly the common practice, driven by time constraints, of selectively targeting only business-critical and high-volume business flows for load and stress testing while neglecting other flows, which become potential threats because a simple memory leak or unexpected load behavior in them can trigger a slowdown and an eventual crash over time.

How the end user perceives the application or website performance will depend on a host of factors, like the end device being used for access, network speed in that geographic location, code behavior and hardware capacity at different levels (back end, middleware and front end), the overall user load at that point in time, and a multitude of configuration settings (like thread pool configuration, message queue lengths, etc.) at the application, database and web servers.

A performance engineer, in the true sense, has the responsibility of tuning all these components and the configuration settings within them so that, working together, they deliver peak performance and do so at minimum cost, almost with the precision of conducting a 1,000-piece symphony orchestra. Though this might sound overwhelming at first and admittedly takes years to master, performance experts know from experience that most of these performance issues can usually be attributed to a few broad areas and improperly configured parameters which might have gone unnoticed or were overlooked by the developer and tester during the normal development cycle.

The remaining part of this article is a non-exhaustive list of such probable problem areas and configurations to inspect when facing a performance degradation. Holistic and agnostic to any specific technology, database or middleware/web server (but mostly revolving around Java Enterprise implementations), the list is based on my personal experiences in similar roles and also covers detecting and resolving the related performance issues using appropriate tools and methodology, assuming a basic understanding of performance engineering terms and concepts by the reader. It can also be used in parts as a high-level checklist for a performance engineer to refer to before finally stamping the code and hardware configuration fit to go live.

  1.  Memory Leaks: Memory leaks are one of the most common causes of response times degrading over time in Java code, arising out of improper JVM configuration and object references that are never released. If left unchecked, such leaks can cause a complete shutdown (OutOfMemoryError). There are multiple ways to check for memory leaks, depending on tool and time availability, like interpreting the output of -XX:+PrintGCDetails on the console directly or monitoring GC behavior in Introscope (a rising saw-tooth pattern). For digging deeper, taking heap dumps (jmap) and analyzing them in tools like IBM PMAT or the Eclipse MAT plugin can give in-depth information on the classes and objects causing the leak (a small illustrative sketch follows this list). Memory leaks can be resolved by handling the discovered classes/objects at code level and by tuning the memory arguments for the JVM. JVM tuning is an extensive topic in itself, requiring a thorough understanding of memory management in Java, and is beyond the scope of this article, but at a high level it involves using the most suitable GC algorithms for minor and major GC along with setting optimum values of the Xmx, Xms, SurvivorRatio, NewSize and MaxTenuringThreshold parameters as per the expected allocation rate. It is common practice to test different JVM configuration settings under the same load and select the one giving the best performance. One can refer to this wonderful guide for in-depth information on JVM tuning. Tools: JMX clients (JConsole, JVisualVM), jstat (command-line tool), GCViewer, HP JMeter, Introscope LeakHunter, profilers (hprof, AProf).
  2. Thread Blocks & Deadlocks: Abnormally high CPU usage, requests timing out and abysmally slow processing can be caused by thread blocks (one thread holding a lock and preventing others from obtaining it), deadlocks (thread A needs thread B’s lock to continue its task while thread B needs thread A’s lock at the same time) or waiting threads at the CPU level. The best way to identify and fix a thread issue is to take thread dumps (also known as Java dumps) at frequent intervals (using the jstack utility) and analyze them with tools like TDA (Thread Dump Analyzer) to check the number of blocked threads, waiting threads or a potential deadlock pattern (such pattern finding can be done manually too, but tools make life easier). The ThreadMXBean getThreadInfo call returns information that is difficult to acquire from thread dumps, like the amount of time that threads waited or were blocked, along with a list of threads that have been inactive for abnormally long periods; a small sketch of this API follows the list. Tools: Thread Dump Analyzer, Introscope (can take thread dumps directly), Samurai (open source), JConsole.
  3.  Gridlocks: Too much synchronization to avoid thread deadlocks might inadvertently ‘single-thread’ the application. It can lead to very slow response times coupled with low CPU utilization, as each thread reaches the synchronized code and enters a waiting state. This can be confirmed by checking thread dumps for a large number of threads in a wait state (Waiting or Timed Waiting) across multiple dumps taken consecutively. Gridlocks can be avoided at the code level by preferring immutable resources and using synchronization sparingly.
  4. Thread Pool Size: In web applications, the thread pool size determines the number of concurrent requests that can be handled at any given time. A properly sized thread pool allows as many requests to run as the hardware and application can support without straining the hardware. A larger-than-optimum thread pool puts unnecessary overhead on the CPU, which can further slow down responses, while a smaller-than-optimum pool leaves a lot of threads waiting for execution, again slowing down transactions. An ideal way to determine the pool size is to estimate the average number of requests in the system using Little’s Law (average service time multiplied by arrival rate) and size the thread pool to match this number (keeping a buffer for occasional spikes), while also ensuring that this size is ably supported by the hardware (i.e. the number of CPUs at our disposal); a worked example appears after this list. If there is a sudden increase in user load (beyond the expected spikes) or even gradual growth over time (for example, with an increasing customer base), the thread pool size needs to be realigned accordingly for responses to remain equally fast.
  5. Database Connection Pool Configuration: Database connections are relatively expensive to create, so they are created beforehand and reused whenever access to the database is needed. This also limits the amount of load a particular application can put on the database, since too much load can crash the database and impact other applications (in the case of a shared database). The pool size has to be optimized according to the expected load and the hardware capacity of the database server. As with thread pools, a small DB connection pool will force business transactions to wait for a connection to become available. This can be confirmed by monitoring the queue time and length, both of which will be increasing rapidly; the majority of business transactions will also wait on a DataSource.getConnection() call while the database shows low resource utilization. On the contrary, a larger-than-optimum pool size will allow too much load to flow to the database and can slow down business transactions across all application servers accessing that database. This is characterized by high SQL query processing times and high CPU utilization on the database, observed from DB logs or AWR reports, and the application will wait on DB query executions (PreparedStatement.execute()). In general, the golden value for pool size is just below the saturation point.
  6. Poor/Corrupt Table Indexes: This is a very common cause of SQL queries taking a long time to process. Once we have identified the query causing the slowness (say from an AWR report or DB logs), we need to list all the tables being called in that query and check the indexes on those tables. More often than not, if the query has shown a sudden degradation in performance and the query itself has not been modified, it can be fixed simply by rectifying or rebuilding the faulty indexes.
  7.  Badly Designed Stored Procedures/SQL Queries: Many times existing stored procedures are modified (making calls to new tables, adding new join statements, etc.) to support new functionality. If there is a noticeable degradation in query processing time compared to a previous codebase, these modified procedures should be checked as the first suspects. It is very possible that some performance-impacting change has crept into the query (an increased number of sorts or full table scans) which is causing such behavior. There are multiple query optimization tools available in the market, like DBOptimizer by Idera, SQL Sentry Plan Explorer and the Query Profiler in dbForge Studio, but the best way to get around such issues is to get hold of a DB expert (if you are not one) and let him/her tune the query!
  8. Non-effective Bundling / Excessive HTTP Requests: While testing front-end systems, if a degradation in page load times is observed at the browser, one of the easiest ways to do root-cause analysis is to watch the Network tab in Chrome's developer tools while keeping the page under inspection open. This not only lists all the objects (stylesheets, images, etc.) being downloaded, with their size and total count, but also the time the browser takes to download each of them. The list can be sorted by the most time-consuming objects or by object size and then compared with the older codebase from before the degradation (if available) for the number of components being downloaded and their respective load times, to pinpoint the cause. An increased number of components being downloaded (without any change in functionality) requires more HTTP(S) requests, adding to page load times and indicating non-effective bundling/packaging of objects which needs to be fixed at the code level. For a long-running test, we can use the Web Page Diagnostics tool offered by HP LoadRunner to get similar metrics on component-level response times for a webpage.
  9. Faulty Methods in Specific Transactions: If there are specific transactions in test or production which appear to be problematic with respect to response times, we can trace their execution paths and component response times using Introscope Transaction Tracer. Transaction Tracer not only filters poorly performing transactions based on the given filters but also helps identify the components causing such behavior within those transactions. Introscope also offers Dynamic Instrumentation on the fly (attaching byte code to return more metrics from a particular method selected for deeper investigation). The identified components need to be isolated and fixed (mostly at code level) in order to get things back to normal.
  10. Poor Load Distribution / Improper F5 Configuration: Load balancers are used to increase the concurrency and reliability of large systems (through redundancy), but if they do not have enough resources or are not configured properly (for example, incorrect weights assigned in Weighted Round Robin (WRR) or Weighted Least Connection (WLC) algorithms), they can actually decrease application performance by putting too much traffic on a single server in the cluster or by acting as a single point of failure. A load-balancing issue leads to unequal distribution of load across servers, and this can be easily confirmed by monitoring CPU usage, memory utilization and logging activity on all the servers sharing the load relative to each other (the instances not taking load will be sitting idle). If left unchecked, this overload of traffic on the remaining servers can cause severe performance degradation and might eventually lead to a crash because there is not enough hardware available to handle the excess load. Apart from the load balancer configuration, such behavior can also be caused by some server instances not coming up properly to take load after a restart or after a new code deployment or configuration change.
  11. Too Many Context Switches: A high context-switching rate (the CPU switching from one thread to another) indicates an excess of threads competing for the processors on the system. Context switching is a computationally intensive task, and an unusually high switching rate will adversely impact the performance of multiprocessor computers. Excessive switching can be caused by excessive page faults due to insufficient memory, or by a processor bottleneck relative to the load. It can be monitored using sar -w and /proc/[pid]/status on the server, or by enabling rstatd on the server under test and observing the ‘Context switch rate’ metric in LoadRunner. The context-switching rate can be kept in check by reducing the number of active threads on the system through thread pooling, or by disabling hyper-threading if enabled. The ultimate solution is to bring in a more powerful processor or simply add an additional one to the existing configuration.
  12. JDBC Connection Leaks: Connection leaks happen when we open a connection to the database from our application and forget to close it, or for some reason it does not get closed by the code. Connection leaks saturate the connection pool so that no new connections can be established, causing major timeout and slowdown issues (a short before/after code sketch follows this list). Such leaks can be investigated by analyzing thread dumps for the total number of created, active and idle threads, or by monitoring JDBC pool utilization in Introscope. Once confirmed, there are multiple ways of resolving the issue depending on the implementation: for example, using the WebLogic Profile Connection Leak mechanism to pinpoint the root cause, using Spring JDBC templates, or using connection pool implementations which offer the option of forcefully closing connections (or notifying about connection leakage) based on predefined conditions (the Tomcat connection pool has logAbandoned and suspectTimeout properties to configure the pool to log possible leak suspects).
  13. Saturated Message Queues (MQs): A much larger than expected load can saturate the message queues and cause timeouts at the application level. If there is a message queue implementation in place and a slowdown is observed beyond a particular load during testing, it is a good idea to check the queues for saturation and flush them (in test) if saturated. Queue lengths can be monitored using SiteScope or by logging into the MQ servers. In case of repeated saturation, the queues need to be optimally reconfigured, as very large queue lengths can also cause a slowdown by hogging too much memory. A relevant attribute to configure is the maximum queue depth in IBM WebSphere MQ.
  14. Saturated Disk Space Due to Excessive Logging: As trivial as it might sound, it is quite possible for disk space to get saturated earlier than expected (on account of some recurring error making the logs grow 100 times faster than usual, or an inadvertent change in logging mode from info to error/debug), causing response times to slow down before the server starves and refuses to process any further requests. With hundreds of metrics to monitor and inspect, this is also an area that tends to be easily overlooked. Disk space utilization can be checked using the du/df commands or monitored through SiteScope in real time. Once detected, we need to figure out the cause (look for errors and the logging mode), fix it, clean up the disk space (or take a backup) and then start all over again.
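To make the memory-leak discussion in item 1 a little more concrete, here is a minimal, hypothetical Java sketch (the class and names are mine, purely for illustration, not from any real codebase) of the kind of unbounded static cache that heap-dump analysis typically flags as the leak suspect, run with the GC and heap-dump flags mentioned above:

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative only: a classic unbounded-cache leak of the kind item 1 describes.
    // Each request caches its result under a unique key and nothing is ever evicted,
    // so the old generation fills up and major GCs reclaim less and less space
    // until an OutOfMemoryError occurs.
    public class LeakyCache {

        // Static reference: cached entries are never eligible for garbage collection.
        private static final Map<String, byte[]> CACHE = new HashMap<>();

        public static byte[] handleRequest(String requestId) {
            // Hypothetical payload standing in for a real response object.
            return CACHE.computeIfAbsent(requestId, id -> new byte[10_240]);
        }

        public static void main(String[] args) {
            // Run with, for example:
            //   java -Xmx64m -XX:+PrintGCDetails -XX:+HeapDumpOnOutOfMemoryError LeakyCache
            // The GC log shows the rising saw-tooth pattern, and the heap dump written
            // on OutOfMemoryError can be opened in Eclipse MAT or IBM PMAT, which will
            // point at CACHE as the dominant retained object.
            long i = 0;
            while (true) {
                handleRequest("request-" + i++);
            }
        }
    }

The usual fix is to bound the cache (eviction, weak references or an LRU policy) at code level rather than to keep raising Xmx.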

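Staying with item 2: the getThreadInfo data mentioned there comes from the JDK’s standard ThreadMXBean. The following is a small, self-contained sketch (my own illustrative watchdog class, not a feature of any of the tools named above) showing how that API can surface deadlocked threads and their blocked counts programmatically:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    // Illustrative only: an in-process check built on the standard JMX ThreadMXBean.
    public class DeadlockWatchdog {

        public static void report() {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();

            // Returns the ids of threads deadlocked on object monitors or
            // java.util.concurrent synchronizers, or null if none are found.
            long[] deadlocked = mx.findDeadlockedThreads();
            if (deadlocked == null) {
                System.out.println("No deadlocked threads detected");
                return;
            }

            // getThreadInfo also exposes per-thread blocked counts and lock owners
            // that are tedious to read off a raw thread dump.
            ThreadInfo[] infos = mx.getThreadInfo(deadlocked, true, true);
            for (ThreadInfo info : infos) {
                System.out.printf("%s is %s, blocked %d times, waiting on %s owned by %s%n",
                        info.getThreadName(),
                        info.getThreadState(),
                        info.getBlockedCount(),
                        info.getLockName(),
                        info.getLockOwnerName());
            }
        }

        public static void main(String[] args) {
            // In a real system this would run on a schedule or behind an alert;
            // here it is simply invoked once.
            report();
        }
    }

In practice this complements, rather than replaces, taking consecutive jstack dumps and reading them in a thread dump analyzer.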
In addition to the problem areas listed above, it is a good idea to keep an eye on the Stall Count metric in Introscope, where consistently high values imply a slow backend, periodically high values imply load-related bottlenecks in the system, and a progressively increasing count implies resource leaks. For more advanced diagnostics, custom ProbeBuilder Directive (PBD) files can be written and deployed, and the metrics thus configured can be monitored graphically in Introscope.
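Returning to item 4, the Little’s Law estimate is easy to work through with numbers. The figures below are invented for illustration (200 requests per second, 50 ms average service time); the resulting pool of roughly 10 to 13 threads is only as good as the measured inputs behind it:

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;

    // Illustrative only: a back-of-the-envelope application of Little's Law from item 4.
    public class PoolSizing {

        public static void main(String[] args) {
            double arrivalRatePerSecond = 200.0;   // expected request arrival rate (lambda)
            double avgServiceTimeSeconds = 0.050;  // average time a request holds a thread (W)
            double spikeHeadroom = 1.25;           // 25% buffer for occasional spikes

            // Little's Law: L = lambda * W -> average number of requests in the system
            double concurrentInService = arrivalRatePerSecond * avgServiceTimeSeconds; // = 10
            int poolSize = (int) Math.ceil(concurrentInService * spikeHeadroom);       // = 13

            System.out.printf("Average concurrency %.1f -> pool size %d%n",
                    concurrentInService, poolSize);

            // The computed size still has to be sanity-checked against the CPUs actually
            // available, as noted above; 13 threads on a 2-core box is a very different
            // conversation than 13 threads on a 16-core box.
            ExecutorService workers = Executors.newFixedThreadPool(poolSize);
            workers.shutdown();
        }
    }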

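Finally, for the JDBC connection leaks in item 12, the most common code-level culprit is simply a connection that is not closed on every path. A hedged before/after sketch follows (the DAO, DataSource wiring and query are hypothetical); pool-side settings like Tomcat’s logAbandoned only report the symptom, while a fix like this removes it:

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.sql.DataSource;

    // Illustrative only: the code-level side of item 12.
    public class CustomerDao {

        private final DataSource dataSource;

        public CustomerDao(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        // LEAKY: if executeQuery throws, conn is never closed and the pool slowly drains.
        public String findNameLeaky(long id) throws SQLException {
            Connection conn = dataSource.getConnection();
            PreparedStatement ps = conn.prepareStatement("SELECT name FROM customer WHERE id = ?");
            ps.setLong(1, id);
            ResultSet rs = ps.executeQuery();
            String name = rs.next() ? rs.getString(1) : null;
            rs.close();
            ps.close();
            conn.close();
            return name;
        }

        // FIXED: try-with-resources closes Connection, PreparedStatement and ResultSet
        // on all paths, including exceptions.
        public String findName(long id) throws SQLException {
            try (Connection conn = dataSource.getConnection();
                 PreparedStatement ps = conn.prepareStatement("SELECT name FROM customer WHERE id = ?")) {
                ps.setLong(1, id);
                try (ResultSet rs = ps.executeQuery()) {
                    return rs.next() ? rs.getString(1) : null;
                }
            }
        }
    }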
References:

  1. http://javaeesupportpatterns.blogspot.in/2011/02/jdbc-pool-leak-using-weblogic-90.html
  2. http://businesswireindia.com/news/news-details/wily-technology-introduces-new-tools-introscope-transaction-tracer-int/2491
  3. https://plumbr.eu/blog/io/acquiring-jdbc-connections-what-can-possibly-go-wrong
  4. https://docops.ca.com/ca-apm/9-6/en/using/ca-apm-metrics/the-five-basic-metrics#TheFiveBasicMetrics-StallCount

2015 Hospital Hacks, Banking and Retail Hacks, Entertainment Hacks – Attack On Financial Services

This post on my blog, QA2100.com, is in reference to a great post highlighted by Jaden Turner’s share on 2015 Hospital Hacks, posted into our LinkedIn group, QA2100 Testing Strategy: Financial Services.

Every week we hear about another leak, hack or break-in, and millions of credit card holders are exposed, at risk, or invaded. Who are these hackers? Why are they hacking? Money. Greed. Something for nothing. Retribution. All of it vicious, criminal and destructive to infrastructure, commerce, and consumer confidence.

Security – is this an oxymoron? We hear it, and are taught to believe it, so we trust that others are responsible for implementing it. Real security means real testing dollars are spent beyond the boundaries of a new project launch… Usually only minimal security testing is considered, if at all. When it is, it is usually not part of projects; rather, it belongs to the ‘network’ group or the ‘infrastructure’ group.

So this is not about the kind of job our network guys are doing, but rather the kind of budget that gets allocated to supporting enough security measures, plus the budget to ensure they are being implemented and maintained at a deep enough and broad enough level. This means maintenance, and keeping up with the latest shenanigans of our nefarious ‘hackers’.

I have the same issues with performance testing and automation for regression.

I could go on, but these areas are allowed to weaken in favor of higher-priority, profit-making budgets, and on and on, until another emergency effort to shore up security is required. Security = insurance. If you don’t spend the money on the protection, it won’t be there when you need it.

This is the tip of the iceberg, and we need to be vigilant and attentive to the looming prospects of risk.

BPM Testing – Get the Best Bang for your Buck!


Interested in getting the best bang for your buck with BPM Product Design, Development, Strategy, Testing, and Implementation?

Need a lift?  We can help!

Give us the opportunity to provide you with our assessments.  We have USA resources, and fully experienced offshore capacities for development, testing and delivery.

We have lived it for over 8 years and provided some of the finest products in the Insurance and Banking Industries.

Contact Bill Fulbright
Company: QA 2100 Test Strategy and Consulting
Website: http://qa2100.com
Email: bill@qa2100.com

BPM Testing in Today’s Market – QA 2100’s BPM Testing Toolkits

QA 2100’s BPM Open Source and Web Service Testing Toolkits

The behavior of many BPM service-based applications is governed by business processes and workflows which are defined by business rules. These business rules must be validated during application testing. For many firms, testing business rules is a costly and complicated process which involves business users and testers. QA 2100 has invested in state-of-the-art automated BPM test methods and tools integrated by Pegasystems into PegaRULES Process Commander® (PRPC) V.X and the Test Management Framework, Bonita, and other open-source BPM products. Within the framework of PRPC and other BPM products is a design process which utilizes not only business processes but also a Requirements Definition tool that clarifies the requirements process. This process turns use cases based on requirements into design, thus providing fundamental testing paths for automated testing of the BPM framework. It allows you to develop an application using a design based upon business rules, use cases, best-practice development and quality principles.

Automated business rules and workflow validations can lower your testing time by 95%

Test Automation Using QA 2100 BPM Testing Toolkits
QA 2100 takes business rules validation testing one step further by automating the creation of test scripts using parameterized data and automating the execution of test cases. For example, QA 2100’s accelerator can execute 65 rule validations in 1.1 minutes using automation, versus 32.5 hours for manual execution. We use the Automated Unit Testing functionality within Pega PRPC to help you build a series of test cases to satisfy test requirements defined by the business requirements and use cases. These test cases are the foundation for automated test scripts. Automated test scripts can be built to pass from workflow to workflow, thus describing a partial or complete path through the application for scenario or end-to-end testing.

With the use of the Test Management Framework (TMF) and other test repository tools, the use case steps and parameters described within the automated test scripts can be satisfied using the Scenarios and Suites features. The Scenarios and Suites test the behavior of the application and verify compliance with the original requirements. Besides providing significant savings in cost, time and effort, automation lets you run many more tests during your testing process as a suite, providing hands-off BPM testing results.

Boundary Testing
QA 2100 provides boundary or negative testing of the business rules in the BPM framework and process to confirm the effectiveness of rule sets by requesting conditions that don’t exist. This helps ensure the business rules engine returns the correct value or an appropriate error. These boundary tests are set up as part of the actual application within each workflow.
QA 2100 has experience with automated tools to accelerate testing and improve accuracy
Employing automation tools to test and validate business rules adds breadth and depth to your testing efforts. By using pre-defined testing parameters, hands-off automation methodologies, and innovative solutions, you can accelerate and simplify a complex process.


32.5 hours to perform 65 rules tests manually
1.15 minutes to perform 65 rules tests using automation


8 Key Factors for Cloud Delivery: Eight CIO Recommendations

To thrive in today’s swift-changing and unforgiving marketplace, companies need accessible, agile and adaptable IT. Flexible service delivery is the answer. Here’s how to employ it. In the post-PC era, IT decision makers have a choice to make: Stay with the platform that got them here? Adopt a private or public cloud? Perhaps IT-as-a-service, or a mix of all of the above?


Whatever you decide, a move away from rigid IT infrastructures is a move in the right direction. According to a recent McKinsey & Company survey, more than half of the surveyed officers cited the switch to flexible service delivery as a top priority.

The reason is simple: Flexible delivery is more adaptable and costs less. It’s a smarter way to distribute IT to users.

We recently confirmed this with a large U.S.-based telecommunications client that needed its technology to scale to tens of millions of subscribers on demand and then let those same subscribers pick and choose online the services most important to them.

Originally, the company considered a traditional infrastructure and then, briefly, a public cloud. But after completing a holistic analysis, we determined an internal private cloud would be the best option for three reasons: 1) it met the client’s needs (i.e. time-to-market, minimal downtime, continuity and rapid scalability requirements); 2) it was less expensive over time when compared with the public cloud; and 3) it left open the option of later incorporating a public cloud if desired.

The beta test verified that the model enabled rapid elasticity, continuity and increased uptime. Mission accomplished.

You can achieve the same results. But only after you consider the following eight recommendations we’ve used to transition clients to flexible service delivery:

1.  Determine your biggest pain point.
No one’s going to say “no” to faster time to market, enhanced computing flexibility, improved performance, tighter security and better service at a lower cost. While all of those areas can certainly be addressed through flexible service delivery, the model you choose will largely depend on your top pain point. In other words, you’ll need to honestly answer the following: What keeps you up at night?

2.  Decide which functions to shift.
Next, you’ll need to designate which functions and processes to switch to flexible service delivery. This entails a “core vs. context” analysis, in which you distinguish business activities that provide you with a competitive advantage from those that should be offloaded to external providers. Remember, what was previously considered a core activity is now often viewed as a contextual one, including (but not limited to) network management, tech support, performance measurement and financial planning.

3.  Start with a low-risk pilot program.
For established companies, it’s usually best to dip your toes into flexible delivery before diving in headfirst. You can achieve this by developing a pilot program for non-production processes, back-office functions or anything else that has lower performance requirements and less impact on end-users, such as testing and development. In some cases, however, it might make sense to start with customer-facing applications if there’s a pressing need to market new products.

4.  Identify “chatty” applications.
To get the most for your money, you’ll need to determine the consumption levels of your CPU, memory, network and disk storage. In doing so, you’ll be able to identify “chatty” applications that sometimes incur surprising surcharges between the cloud provider and the business. The more you know, the more you save.

5.  Mind those non-IT bottlenecks.
Once IT has been upgraded to the equivalent of a 12-lane highway with the help of flexible delivery, other parts of the organization cannot remain in horse-and-buggy mode. Well, they can. But they’ll become a bottleneck to your business. The challenge is to optimize the entire enterprise so that other areas don’t hold up the software lifecycle. To do this, you’ll need to educate and update your company culture.

6.  Compare service requirements.
Buyer beware: Most public cloud providers offer standard service level agreements that cannot be customized according to client needs. In some cases, general service levels may be sufficient for a development environment. But they often fall short of the demands of a production environment. To find a service level best suited to you, you’ll need to know the difference.

7.  Check security qualifications.
Security is a top-of-mind consideration, particularly for applications and systems that handle personal information. To ensure your data is protected, always certify a provider’s security qualifications. And know that cloud providers typically conduct security audits at a more intensive level than companies hosting internal private clouds.

8.  Demand transparency with a daily dashboard.
To manage variable costs, you’ll want to monitor your capacity and all of its operational parameters — straight down to the lowest server — with a daily dashboard. With this level of transparency, you can make day-to-day decisions about the level of CPU, memory and storage required and use metrics and trends to make decisions about future capacity financial modeling.

Admittedly, the changes required to move to a flexible service delivery can seem overwhelming. But by following the above advice, you’ll put yourself in a better position to find your own success.

For more information, read our white paper on Flexible Service Delivery (pdf), get inspired by our enabler series on The Future of Work or visit Cognizant Business Consulting.

Uncover the key factors to consider before designing a flexible service delivery model. Read on to find out more:

http://www.cognizant.com/latest-thinking/perspectives/Pages/finding-flexible-delivery-success-eight-cio-recommendations.aspx#.VE1ilqK-f8R.twitter

Lift Your Performance Testing in Financial Services and Reduce Risk

So… you have to improve system performance and the rules, meet new demands on the call center, and move to an architecture that is faster and more automated than the legacy systems…

Performance testing is a great way to take a look at the whole system before you begin even a legacy lift.

I say this because, in order to prepare for a robust performance testing program, the whole system should be ‘decomposed’ and analyzed from the final endpoint of delivery of services all the way back to the input of information at the beginning of the system. This type of ‘backing through the system’ ensures that each workflow, each touchpoint (SOAP service, MQ, etc.), each system, and each process supporting a given line of business has been assessed and evaluated.

In this process, many important metrics are discovered: response times based on normal and peak usage of a given work type, workflow, call center, or actual system, as well as E2E coherency. This process is effective whether or not performance testing has ever been done in the organization; often it has been limited to only a short throw at a specific system’s performance. With expanding technology, a system designed even 5 years ago can easily be out of date. So a full E2E performance test of a given system has merit, especially when upgrading the platform: a new claims processing platform/application, a loan application process, a policy management system, or even a workflow management system.

So, in a way, performance testing is the elephant in the corner, because it reveals system weaknesses and needs, broken processes, antiquated code, antiquated data, rules needing revision, risks/dependencies, and of course the costs to shore up or replace hardware and monitoring tools, or to review process efficiencies and inefficiencies. All of these items can be confronting and require capital investment.

With that said, performance testing, in the best light, is mostly preparation, analysis, education, and a great investment in system fluency. The information gained in this process alone will provide the following benefits: a general health check on the system, prioritization of system needs based upon risk and cost, and planning for implementation of needed changes to applications, hardware, processes and product output.

So when I hear “let’s bring in performance testing at the end, after everything is built”, you might say I have some concerns. Did you do a system analysis, or did you just build another project/product to operate on the same system without assessing the impact on present load and operations, or the impact this product will bring in the next year, or two, or five?

What are the plans to begin ongoing monitoring and assessment of system capability to handle the growing needs and market demands?

Just a few thoughts in case a new system lift, upgrade, transformation, or product release is being considered. Consider performance testing the same as an insurance policy, together with the regular reviews of your portfolio that ensure you have sufficient coverage if you suffer a loss. Without regular checkups, your ‘policy’ could be out of date, ‘out of force’, or non-renewed.

One last thought: trying to cram it in at the last minute to use up the remaining budget for the year is highly risky. In the past few months, I have seen just that in several cases. Performance testing is not a fix; rather, it is a validation of what you have. If what you have at the last minute is insufficient, going to production will be risky and costly.

Sorry for the bad news, but it is better to do these things FIRST to prepare the way for a successful launch.

Bill Fulbright
10/24/14

A posting from my LinkedIn article with the same title.