LON01 – CORE03 – 24/08/2016 – 20:00 Till 23:59 *Emergency Work*

Following on from recent repeated hardware failures on core03.structuredcommunications.co.uk, as detailed HERE, the decision has been taken to fully replace the device. We will also take the opportunity to upgrade the IOS image to bring this device in line with the current images across the rest of our network.

This work will involve powering down and physically moving the current device along with all installed line cards. Because of this, all directly connected services (listed below) will be unavailable for the duration of the works.

> Bonded DSL on AG1 & AG2

> Unmanaged SIP trunking provided via sipwise.easyipt.co.uk

> Managed VoIP services provided via primary-sw.r03.core03 & primary-sw.r04.core03

> Webhosting via server01.easyhttp.co.uk

> VPS sessions on esxi10.r02.structuredcommunications.co.uk

Other services will remain unaffected. Redundant services provided via other parts of the network (such as DNS & SMTP) will take over. Please ensure your configuration is up to date.

UPDATE01 – 20:10 – 24/08/2016 Engineers are on site and these works have started.

UPDATE02 – 22:06 – 24/08/2016 Engineers have completed the above works ahead of schedule and we can confirm all services have returned to normal. We apologise for the inconvenience caused.

We will continue to monitor the new device to ensure continued operation.

LON01 – CORE03 – 21/08/2016 – 08:33 *At Risk*

Our network monitoring has alerted us to a fault on CORE03 within our Goswell Road network. This fault is a recurrence of an issue identified yesterday that was resolved without impact. Additional logging was added at the time to assist further should it be required.

The issue has been tracked down to the “Ethernet Out of Band Channel” (EOBC) control channel on the device's backplane.

Due to the number of line cards automatically taken out of service by the device, we are currently investigating whether this is caused by a fault in a common hardware component, such as the currently active supervisor module.

We diversely route our internal backhaul fibre up-links across each core to ensure that a single line card failure does not result in an outage. This is currently in operation; however, we have lost a number of links due to the fault and the device is classed as at risk, along with any directly connected equipment.

We are currently reviewing the logs and will update with further information and an action plan as soon as possible.

UPDATE01 – 09:49
After reviewing the logs we have concluded that the next step is to swap over to the other supervisor within that device. This will cause a brief outage to all services connected to that device. We will monitor the device closely after the change to see if the same issue occurs. This reload has been scheduled for 10:00 today.

UPDATE02 – 10:08
The swap completed as expected; however, despite this supervisor showing OK and passing diagnostics, it failed to fully take the system load and the change was reverted. We now suspect a possible backplane issue on this device. Further updates to follow.

UPDATE03 – 14:08
Further observations have been made and the log files reviewed in depth. At this stage we can advise that the backup supervisor within CORE03 has been reporting errors; however, Cisco IOS listed these as “Non-fatal” and as such they were not flagged within our monitoring platform.

We suspect a fault had occurred on the standby supervisor which was not picked up by the device's internal diagnostics until we brought the card fully into operation. We suspect this fault was having an impact on the EOBC reporting and thus causing line cards to be disabled. As the previous fault took 24 hours to resurface, we are continuing to monitor. An emergency maintenance window will also be scheduled for CORE03 to replace the suspected failed card, along with an IOS update.

UPDATE04 – 14:30 – 22/08/2016
Despite seeing the device operate for over 24 hours without further errors, we have just observed the fault conditions triggering line cards to be disabled. We therefore now suspect this is a problem with the chassis itself and the backplane. We will now be replacing the entire device as a matter of course to prevent this escalating to an outage. Further works will be scheduled and notified via the NOC, as we do not have a pre-built device on site.

EasyIPT – 0207 Numbers – 08/08/2016 – 16:56

We are aware of some intermittent inbound call issues on 0207 numbers from a single carrier.

We have raised a support query with this carrier and are awaiting an update from them. At the moment we are not aware of this affecting outbound calls, nor are we aware of any other affected inbound area codes. If you are having issues we would advise you to contact us so your examples can be logged to further assist with a quicker resolution.

Virgin Media Based Circuits – 03/08/2016 – Incident – Cleared

Update: 15:52
Virgin Media have advised that all repair work is now complete and all services restored. Circuits should no longer be considered at risk.

We apologise for the inconvenience caused during the course of this incident.

————-

We are currently aware of an issue affecting a number of on-net Virgin Media based circuits. We believe this is due to a fibre break within the Virgin Media network. No ETR has been given yet.

Services with backup media such as DSL will have been re-routed.

Update: 09:35
Virgin Media have advised that the location of the break has been identified as 7.5km away from a local Virgin Media hub. Splicing engineers are currently attending the closest joint to determine the exact location of the break.

Update: 10:26
Following our previous post, several engineers are now on site identifying the exact location of the fibre break. We are expecting more specific details on the break in due course.

We will keep in contact with the relevant escalation points within Virgin Media to obtain the latest information, as it becomes available.

Update: 11:30
Virgin Media have advised that engineers have attended the next joint along from the break and completed optical time-domain reflectometer (OTDR) testing between the joints. Red light testing is currently underway to confirm the precise location of the break. Virgin Media have also confirmed splicing teams are on site to begin work as soon as the break location is confirmed.

We will post updates as soon as they become available.

Update: 12:58
Virgin Media engineers have located the break between Corbet Place and Brick Lane in London. Engineers are currently accessing the pits this cable passes through to visually inspect the cable and identify if any spare fibres can be used. If any spare fibres are identified they will be prioritised for our services.

Further updates will be posted as soon as available.

Update: 13:22
Virgin Media have advised that spare fibres have been located. Splicing will begin on the first end of the cable shortly and should be complete within 30 minutes. Splicing at the second end of the cable involves more work and is expected to take up to a further 60 minutes to complete.

Further updates will be posted when available.

Update: 14:55
Splicing work on the cables that carry our services has been completed and connectivity restored to all affected circuits. Virgin Media have advised that work is still ongoing at the site of the break, so circuits should still be considered at risk.

We will advise once Virgin Media confirm that all repair work is complete.

Update: 15:52
Virgin Media have advised that all repair work is now complete and all services restored. Circuits should no longer be considered at risk.

We apologise for the inconvenience caused during the course of this incident.