Warning

13th July 2017, 16h19 update (follows the 13h25 notice):

Tridion Content Manager is now back online.

Sorry for the inconvenience caused.


13th July 2017 - 13h25: Tridion Content Manager is still encountering issues due to the major incident reported at network level at the IBM data center (left over from the major incident that started yesterday). No content updates are possible in Solvay ONE or on any Internet websites. There is no impact on LIVE websites (viewing).

Sorry for the inconvenience caused.

Chronological incident timeline (most recent first):

Timestamp / Description
13/07 4.50pm
  • After a few key-user tests and synchronisation with SBS comm, a communication was sent out through Flexmail to all Tridion users, Service centers LO and CGI support regarding the normal availability of the platform


13/07 4.25pm
  • A second wave of key-user testing was performed
  • The Tridion platform is now back up!

13/07 4pm
  • After investigation by the IS Adagio team and the application team, the MS DTC error no longer appears in the logs
  • However, other error messages appeared
  • Meanwhile, the MS DTC configuration was found to have been reset
  • MS DTC was reconfigured as required
  • First-wave testing looked good
13/07 1.25pm
  • The restart did not resolve the issue
  • IS Adagio will continue to investigate the SQL database side with IBM
  • Meanwhile, the application team is working with the SDL provider to find a way to test the MS DTC connections
13/07 11am
  • IS Adagio has identified technical issues left over from the IBM incident; these will require some interruptions to the production databases
13/07 9am
  • After synchronisation with SBS comm, a communication was sent out through Flexmail to all Tridion users, Service centers LO and CGI support regarding the downtime of the platform (no contribution possible)


13/07 5am
  • The application team still sees traces of SQL errors
  • The application team also raised a ticket with the SDL provider regarding the error message received
13/07 2am
  • The Tridion platform is still affected: it is not fully functional because of connectivity issues with SQL
  • The application team continues to investigate with IS Adagio
12/07 10.45pm
  • IS Adagio reported that the network is stable again
  • However, the Tridion platform is still not up. The IS Adagio team was contacted to restart the correct server
12/07 9pm
  • IS Adagio confirmed that the issue was caused by a change at network level and reverted the changes to resolve it
12/07 7pm
  • IS Adagio communicated that IBM is aware of the issue and working on it at network level
  • Incoming tickets started to arrive from Freshdesk
  • CGI support was asked to reply to the tickets with an update on the situation
12/07 5.28pm
  • IS Adagio sent a mass communication to application leaders about troubles on the MS SQL 2008 shared production environment at IBM, affecting a few applications
12/07 5pm
  • Noticed that the Tridion Content Manager went down (no content updates were possible)
  • Escalated immediately to the IS Adagio team