Over the last couple of months an unusually high number of customers (i.e. more than zero) have all been asking the same question: “Can you tell us how to create a DR failover solution for IBM Cognos Controller?” It is a question that had me scratching my head somewhat.
We already know that you can “split” the COM+ components in Controller to separate out functions such as Consolidation, but this isn’t so much fail-over as a crude form of load-balancing or load-sharing. Can you actually create a fail-over for IBM Cognos Controller?
One of the methods we considered initially involved the use of Microsoft Clustering. This sounds viable in theory; however, a phone call to IBM support suggested that it would not be supported.
After this blow, a couple of more “realistic” options came to mind:
Cold-Standby
DNS Alias
With one particular client we decided to use their test environment to trial one of the above. The cold-standby method was not desirable to them: they did not want to have to power one server up manually, and risk both servers being available at once if the other box had not been shut down. That left the DNS alias approach.
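The risk described above, both boxes answering at once, can be reduced with a simple check before the standby is powered up or put into service. The following is only a rough sketch, not something we ran at the client: the function names are made up for illustration, and port 80 is an assumption (use whichever port your Controller web server actually listens on).

```python
import socket


def is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable or unresolvable: treat as "not answering".
        return False


def safe_to_start_standby(primary_host, port=80, probe=is_listening):
    """Hypothetical guard: only bring up the standby if the primary is not answering.

    port=80 is an assumption, not a documented Controller default.
    """
    return not probe(primary_host, port)
```

A failed probe only shows that the port could not be reached from where the check ran. A firewall or network fault looks the same as a server that is switched off, so treat the result as a prompt to investigate rather than as proof.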
What we did was install two separate servers, each with its own IBM Cognos BI and Controller components.
The BI server has to point at its own “real” server name or DNS name. If you use the same DNS alias for both servers within BI, it causes problems within the Content Manager when the second server tries to register its dispatchers.
The IBM Cognos Controller Configuration application, however, needs to be pointed at the DNS alias. This way, when the Controller client and the other components look for the server, they are looking for the same name each time.
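With two real server names sitting behind one alias, it is useful to be able to confirm which server the alias currently resolves to. Here is a minimal sketch of that check, assuming hypothetical host names (controller.example.com as the alias, ctrl-a and ctrl-b as the real servers) and assuming each server has a single, distinct IP address.

```python
import socket


def alias_target(alias, candidates, resolve=socket.gethostbyname):
    """Return the candidate server whose IP address matches the alias, or None.

    The resolver is injectable so the logic can be tried without touching real DNS.
    """
    alias_ip = resolve(alias)
    for name in candidates:
        if resolve(name) == alias_ip:
            return name
    return None


if __name__ == "__main__":
    # Hypothetical names: substitute your own alias and real server names.
    servers = ["ctrl-a.example.com", "ctrl-b.example.com"]
    try:
        print(alias_target("controller.example.com", servers))
    except OSError as exc:
        print(f"lookup failed: {exc}")
```

Bear in mind that clients and intermediate DNS servers cache records, so a machine may keep resolving the alias to the old server until the record's TTL expires.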
On the face of things, this seemed to work. Basic testing showed that we could switch servers quite easily by altering the DNS alias record. However, I still had an uneasy feeling about this, as the COM+ components for Controller would still be running on both servers at the same time.
A week or so later, there had been a number of complaints about strange behaviour with Controller: the software was randomly unavailable or not responding. Each time, a reboot of the active server seemed to solve the problem.
We suggested that it may be worthwhile turning off the “standby” server for a week or so to see if things improved.
After just over a week there had been no problems, which pointed to an issue with having two identically configured Controller servers turned on at once.
So it seems that, for the time being at least, you cannot automatically, or even semi-automatically, fail over IBM Cognos Controller. We reverted to a cold-standby scenario, which seems to be working well for now.
If you have had different experiences, or if you have managed to make this work somewhere, please feel free to get in touch. I am sure there are many customers out there looking for a similar solution.