Incidents – Structured Communications NOC

Incident: sbc03.easyipt.co.uk – 13/12/2024 – 09:15

We are aware of a possible registration issue with SBC03 and are currently investigating.

UPDATE01 – 09:35

This has now been resolved. We apologise for any inconvenience caused.

Notice: Draytek Vulnerability

We have seen an increasing number of attacks targeting Draytek routers, following reports that more than 700,000 devices globally are vulnerable due to recently discovered security flaws. You can read more about this issue here: The Hacker News – Draytek Router Vulnerability.

These attacks are causing significant connection instability for affected users. To mitigate potential risks, we strongly advise all users with Draytek devices to ensure their routers are updated with the latest firmware versions as soon as possible.

If you require assistance with updating your device or have any concerns, please contact your IT provider.

UPDATE01 – 06/10/2024

We have seen a number of compromised routers first-hand since Friday. The changes we have seen include VPN users being created, the HTTP and HTTPS ports being changed, and IP objects being created and enabled.

We would highly recommend updating your Draytek router and talking with your IT company. We have a number of providers we can recommend if you don’t have one.
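As an illustration of the indicators listed in this update, the sketch below compares a known-good record of a router's settings against its current settings and reports new VPN users, changed HTTP/HTTPS ports and new IP objects. The `RouterSnapshot` structure and `find_suspicious_changes` function are hypothetical names made up for this example, with values entered by hand or from an export; this is not a Draytek API and does not talk to a router.

```python
from dataclasses import dataclass, field


@dataclass
class RouterSnapshot:
    """Hypothetical summary of router settings (assumption: filled in manually)."""
    vpn_users: set = field(default_factory=set)
    http_port: int = 80
    https_port: int = 443
    ip_objects: set = field(default_factory=set)


def find_suspicious_changes(baseline, current):
    """Return a list of findings where `current` differs from `baseline`."""
    findings = []
    # VPN users that exist now but were not in the known-good record.
    for user in sorted(current.vpn_users - baseline.vpn_users):
        findings.append(f"unexpected VPN user: {user}")
    # Management ports that no longer match the known-good record.
    if current.http_port != baseline.http_port:
        findings.append(
            f"HTTP port changed: {baseline.http_port} -> {current.http_port}"
        )
    if current.https_port != baseline.https_port:
        findings.append(
            f"HTTPS port changed: {baseline.https_port} -> {current.https_port}"
        )
    # IP objects that exist now but were not in the known-good record.
    for obj in sorted(current.ip_objects - baseline.ip_objects):
        findings.append(f"unexpected IP object: {obj}")
    return findings
```

An empty result only means these three indicators were not found; it does not show that a router is unaffected, so updating the firmware is still advised.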

Incident: sip03.easyipt.co.uk – 29/07/2024 – 12:20

We are aware SIP03 is suffering from issues following an emergency reboot. We are currently working on this as a matter of urgency.

UPDATE01 – 12:42

We are currently waiting for the database to recover. As a precaution, we are also restoring a backup of the platform to another host.

UPDATE02 – 13:10

A restore of the platform has resolved the issue. We are sorry for the disruption.

UPDATE03 – 13:15

After review we will be migrating this server to a new version of the software.

UPDATE04 – 17:37

A new software build has been deployed and accounts migrated over. A further window will be raised for the migration works.

UPDATE05 – 02/08/2024 – 21:00

SIP03 accounts have been migrated to the new platform. Test calls have been completed successfully.

Incident: partner01.easyipt.co.uk – 20/05/2024 – 09:01

We are aware partner01 is suffering a repeat of the issues experienced on Friday and are currently investigating.

UPDATE 01 – 09:15

This has been resolved and calls are now processing. We are aware of the root cause of the issue.

UPDATE 02 – 14:00

We have made changes to the server to prevent a service being reloaded that we believe to be the root cause of the issue.

UPDATE 03 – 17:00

In light of the recent issues, we have decided to replace this host and deploy a new version of the software. This will be done manually to ensure a clean build. No changes are expected to be required from users.

UPDATE 04 – 22/05/2024 – 16:00

The new platform has been built, audited and deployed to the relevant ESXi host. We expect to take the current host out of service shortly after 20:00 and bring the new one into service at around 20:05.

This should trigger a re-registration of end points to the new host.

We will update this once tested.

UPDATE 05 – 22/05/2024 – 20:00

This work has started.

UPDATE 06 – 22/05/2024 – 20:05

We have not seen subscribers reconnect as expected. This has been traced to a service that is not operating as expected because it was loaded while VPS resources were disconnected. A reload is required.

UPDATE 07 – 22/05/2024 – 20:09

We have encountered a problem with incoming calls due to a firewall rule. This is being corrected.

UPDATE 08 – 22/05/2024 – 20:15

We are seeing trunks re-register and testing is now under way.

UPDATE 09 – 22/05/2024 – 20:20

We have tested a number of online trunks without issue. We have also tested on our test platform, again without issue. We will monitor for another 30 minutes.

UPDATE 10 – 22/05/2024 – 21:15

We have continued to make test calls which have all completed successfully. We are seeing calls progress as expected.

UPDATE 11 – 23/05/2024 – 14:45

We have seen the platform operate without issue so are now closing this.

Incident: partner01.easyipt.co.uk – 17/05/2024 – 09:01

We are aware the media gateway partner01 has stopped processing inbound and outbound calls. We are currently investigating as a matter of urgency.

We apologise for the disruption this is causing.

UPDATE01 – 09:10

The issue has been traced to a backend service and database. We are working to restore service.

UPDATE02 – 09:45

Service has been restored and we are looking into why this occurred. Once again, we apologise for the disruption caused.

Incident: Connectivity – 15/04/2024

We have been aware of website reachability issues between approximately 16:30 and 18:00 this afternoon. Initial diagnostics indicated that the issue was beyond our network.

Subsequently, we have discovered that several Tier1 networks in London are experiencing significant network disruptions. To address this, we have identified and isolated the common transit provider between our networks and removed it from service.

We apologise for any disruption this may have caused and we are monitoring the situation.

UPDATE01 – 21:30

We have re-enabled our GTT transit and are currently not experiencing any issues. We suspect the issue may have been related to a third-party CDN network or, as our contacts suggest, a subsea cable.

Incident: Broadband – 18/04/2024 – 13:00

We are aware of an MTU issue with our wholesale upstream broadband provider.

This will be impacting web browsing and other services.

We are working with them to isolate the issue.

We apologise for the inconvenience.

UPDATE 01 – 14:00

We have seen services return to normal after changes made by our wholesale provider. We are awaiting details as to the root cause.

We apologise for the inconvenience.

Incident: Horsham DC CORE – 18/04/2024 – 12:15

We have identified an issue within our Horsham Data Center that has impacted several services. Service stability has been restored, and we will provide a further update shortly.

We apologise for any inconvenience caused.

UPDATE01 –

At 11:30 AM, our network monitoring system detected a service disruption affecting multiple platforms situated outside our network.

Preliminary diagnostics indicate that the issue was confined to our Horsham data center.

Subsequent investigations identified the root cause as an MTU (Maximum Transmission Unit) anomaly, implicating either the core infrastructure in Horsham or an interconnect linking our two locations.

Our engineering team pinpointed the issue to a specific interconnect and promptly isolated it from service. This action restored full network operations.
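As an illustration only, and not a record of the diagnostics actually run, an MTU problem on a path can be narrowed down by searching for the largest packet that gets through unfragmented. The sketch below does that with a binary search. The `probe` callable is an assumption: it stands in for whatever sends one packet of a given size with the "don't fragment" flag set and reports whether it arrived. The search also assumes that if a given size passes, every smaller size passes too.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def payload_for_mtu(mtu):
    """ICMP echo payload size that yields an IPv4 packet of exactly `mtu` bytes."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def find_path_mtu(probe, low=576, high=1500):
    """Return the largest size in [low, high] accepted by probe, or None.

    probe(size) is a caller-supplied (hypothetical) function that returns True
    if a packet of `size` bytes crossed the path without fragmentation.
    """
    if not probe(low):
        return None  # even the smallest size fails; not an MTU-only problem
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid       # mid fits; the answer is mid or larger
        else:
            high = mid - 1  # mid is too big; the answer is smaller
    return low
```

Running the same search across each candidate path and comparing the results is one way to see which link carries a smaller MTU than expected. For a standard 1500-byte Ethernet MTU, `payload_for_mtu(1500)` gives the 1472-byte payload commonly used with ping.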

Ethernet EAD – 22/07/2022 – 11:30

We are aware that a small number of layer 2 ethernet services are currently down. Internal investigations show this to be an upstream supplier issue and we are currently engaging with them to locate the fault.

UPDATE01: 11:40

We are seeing services down from BT Wholesale, Openreach EAD Direct and TTB, so we suspect this may be a common POP failure between layer2 providers landing circuits in London. For customers with a backup service, this will have kicked in automatically and re-routed traffic.

UPDATE02: 12:00

We are still waiting for an official update from our layer2 provider as to the root cause of this issue, but they have advised they are seeing circuits down, with a large number of calls on hold to the service desks.

We have already escalated to our management contacts to push for information so we can provide detailed updates.

UPDATE03: 12:25

Our layer2 provider has now declared a “Major Incident”.

They have advised this appears to be related to a “DNS issue” but we have disputed that. We are continuing to chase for updates. We apologise to affected customers for the inconvenience this is causing.

UPDATE04: 12:41

Our layer2 provider has now advised of a major internal core network problem affecting more than just layer2 services. This is currently affecting fewer than 10% of our overall EAD circuits with this layer2 provider, and services delivered via other layer2 partners are unaffected.

We have been advised internal teams are working to identify the root cause and issue a fix. We again apologise to affected customers for the inconvenience this is causing.

UPDATE05: 12:55

Following further pressure on our account manager, we have been advised this is affecting layer2 services delivered to us via a core device located in their network at Interxion London, with multiple service tunnels flapping.

We have been advised there will be a further update by 13:30.

UPDATE06: 13:40

A further update has been provided to advise they are still working on why network tunnels on this device are “flapping”. They have advised a further update by 15:30, but we will keep pushing for information and an ETA on service resolution.
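For readers unfamiliar with the term, "flapping" means a link or tunnel repeatedly changing between up and down in a short period. The sketch below shows one simple way to flag that from a list of timestamped up/down observations. The function names, the 300-second window and the threshold of 4 state changes are all assumptions chosen for the example, not values taken from the provider or from our monitoring.

```python
def state_change_times(events):
    """Return the timestamps at which the up/down state changed.

    events is a list of (timestamp_seconds, is_up) tuples sorted by time.
    """
    changes = []
    for (_, prev_up), (ts, up) in zip(events, events[1:]):
        if up != prev_up:
            changes.append(ts)
    return changes


def is_flapping(events, window_seconds=300, threshold=4):
    """True if any window of `window_seconds` holds >= `threshold` state changes."""
    changes = state_change_times(events)
    start = 0
    for end in range(len(changes)):
        # Slide the window start forward until it spans at most window_seconds.
        while changes[end] - changes[start] > window_seconds:
            start += 1
        if end - start + 1 >= threshold:
            return True
    return False
```

A single outage followed by a recovery produces only two state changes, so it is not reported as flapping under these example settings.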

UPDATE07: 15:05

Our network monitoring has shown our end points are reachable again on the affected circuits. We have not been given an official all-clear yet and services should still be classed as at risk.

UPDATE08: 17:00

Our layer2 provider has advised they have re-routed around the affected device and are currently working with the hardware vendor to establish why the device has failed in the way it has. We suspect there may be a short outage at a later date once services are re-routed back via this device but we will advise at the time.

HOR-DC – at risk

Our network monitoring has alerted us to multiple circuit failures within our Horsham facility. Initial diagnostics seem to show fibre breaks and we suspect this may be the result of civil contractors. Traffic is flowing across redundant paths into the building with no loss of primary peering or transit, but should be considered “at risk” due to operating on redundant links.

Ethernet services that terminate in our Horsham facility will have automatically failed over to backup if purchased.

Faults have been logged with Openreach and we will keep updating this page as we know more.

UPDATE 01 – 12:01

We have seen all our “primary” fibre links recover and service has been restored, however no official update has been provided. We are still awaiting recovery of the other affected fibre links.

UPDATE 02 – 12:10

Openreach engineering teams are en route to our facility.

UPDATE 03 – 14:50

Openreach are on site.

UPDATE 04 – 15:00 *FINAL*

All fibre links have been restored. Contractors working on the Openreach network had trapped one of our fibre tubes running that route and caused bends on the groups of affected fibre to the point that light was unable to pass.

Tubing and fibres have been re-run in the AG Node by Openreach and service has been restored.