13th July 2017, 13h25: Tridion Content Manager is still encountering issues (left over from yesterday's major incident). No content updates are possible in Solvay ONE or in any Internet websites. No impact on LIVE websites (viewing).
Chronological incident timeline (most recent first):
Timestamp
Description
13/07 4.50pm
After tests by a few key users and synchronisation with SBS comm, a communication was sent out through Flexmail to all Tridion users + Service centers LO + CGI support announcing that the platform is available again as normal
13/07 4.25pm
A second wave of key-user testing has been performed
Tridion platform is now back to life!
13/07 4pm
After some investigation by the Adagio team and the application team, the MS DTC error is no longer appearing in the logs
However, other error messages occurred
Meanwhile, found out that the MS DTC configuration had been reset
Reconfigured as per requirement
First wave testing seems good
13/07 1.25pm
Restart did not work
IS adagio will continue to investigate with IBM on the SQL database side
Meanwhile, the application side is working with the SDL provider to find a way to test the MS DTC connections
13/07 11am
IS adagio has identified certain technical issues left over from the IBM incident, which will require some interruptions to the production databases
13/07 9am
After synchronisation with SBS comm, a communication was sent out through Flexmail to all Tridion users + Service centers LO + CGI support regarding the downtime of the platform (no content contribution possible)
13/07 5am
Application team still sees traces of SQL errors
Application team also raised a ticket with the SDL provider regarding the error message received
13/07 2am
Still an issue on the Tridion platform. Not fully functional, as we get connectivity issues on the SQL side
Application team continues to investigate with IS adagio
12/07 10.45pm
IS adagio reported that the network is stable again
However, the Tridion platform is still not up. Contacted the IS adagio team to restart the correct server
12/07 9pm
IS adagio confirmed that the issue was due to a change at network level and reverted the changes to resolve it
12/07 7pm
IS adagio communicated that IBM is aware of the issue and working on it at the network level
Started to receive incoming tickets from Freshdesk
Asked CGI support to reply to the tickets with an update on the situation
12/07 5.28pm
IS adagio sent a mass communication to application leaders that there were problems on the MS SQL 2008 shared production environment at IBM, affecting a few applications
12/07 5pm
Noticed that the Tridion Content Manager had gone down (no content updates were possible)