Network Updates

If a major outage or disruption affects multiple customers on our network, we will post regular updates on this page. During a major outage our phone lines may become congested, so please check here regularly and we will keep you up to date with progress reports.

Progress reports for the following services will be posted here:

  • Horizon Cloud Phone System
  • Business Phone Lines
  • Business Broadband including Fibre Broadband
  • Private Fibre Ethernet including MPLS
  • SIP Trunking
  • Mobile Networks
  • Non-Geographic Services

There are no network updates to report. Please check this page regularly for new outage notifications.

Service Incident Report

Incident Reference: Horizon Cloud Telephony Platform – loss of Horizon connectivity and audio-related issues during the outage.

Start Date: Wednesday 14th November 2018 – 08:55

End Date: Wednesday 14th November 2018 – 17:45

Summary

Firstly, it is important to remind our customers that this is the first major outage on Horizon in more than five years. The platform is very stable, and we have every faith and confidence that our supplier will ensure that this does not happen again.

Horizon users will have experienced registration issues on their Horizon devices between 08:55 and 17:45 on Wednesday 14th November. Devices which were able to register may have experienced audio issues on connected calls until 17:45. The incident was caused by a bug discovered on the platform in the early hours of Wednesday 14th November. This bug prevented a large proportion of Horizon phones from registering and sending subscription updates to the platform. Although the bug was resolved at 10:18, the volume of re-registration attempts, coupled with peak call traffic, created instability on the wider platform. Our engineering teams worked with the technology partners that support various parts of the Horizon platform for the remainder of the day, introducing various measures to stabilise the platform.
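For readers who want a feel for why recovery continued for several hours after the bug itself was fixed, the short sketch below models a backlog of devices all retrying registration against a server that only admits a fixed number per interval. It is an illustration only: the function name, the figures and the "every device retries every interval" behaviour are assumptions made for the example, and it does not describe how the Horizon platform or the SBCs are actually implemented.

```python
def drain_backlog(devices: int, capacity_per_tick: int, max_ticks: int = 10_000) -> list[int]:
    """Return how many devices are still unregistered after each interval.

    Hypothetical model: every unregistered device retries once per interval,
    and the server admits at most `capacity_per_tick` of those attempts
    (a stand-in for a protective rate limit). Rejected devices retry next time.
    """
    if capacity_per_tick <= 0:
        raise ValueError("capacity_per_tick must be positive")

    remaining = devices
    history: list[int] = []
    while remaining > 0 and len(history) < max_ticks:
        attempts = remaining                      # all unregistered devices retry
        accepted = min(attempts, capacity_per_tick)
        remaining -= accepted                     # the rest stay in the backlog
        history.append(remaining)
    return history


if __name__ == "__main__":
    # Illustrative numbers only, not figures from the incident.
    print(drain_backlog(10, 4))
```

In this simplified model the backlog can only shrink at the admitted rate, so a large number of devices retrying together takes many intervals to clear even though the original fault has already been fixed.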

Corrective Action Taken

The initial incident occurred between 08:55 and 10:18 and impacted a subset of Horizon users on a specific server. Following a restart of that server earlier in the morning, a bug was triggered which caused a memory leak across the server. Working with the Technology Partner, we applied a patch to resolve the memory leak; however, the subsequent restart of the server resulted in device registration and subscription issues across the Access SBCs.

The outage from 10:18 until 17:45 was due to the volume of devices attempting to register or send subscription messages (handset status updates) across the SBCs concurrently. A number of devices would have registered successfully but may have experienced media-related issues throughout (one-way audio, no audio). Due to the increase in traffic volumes, the SBCs went into a protective mode, which enables them to continue operating at a level that prevents them from becoming overloaded. The devices are designed to reattempt registration on alternate servers if they cannot connect to their primary server, and this behaviour resulted in the increase in signalling traffic and a subsequent wider registration issue.

From our analysis, all pre-configured and active DR plans would have continued functioning as expected provided they were routing to off-net (i.e. non-Gamma) numbers. Users may have experienced issues logging onto the Gamma Portal to activate or amend any of the following services throughout this outage:

• Call Diverts/Forwards
• Twinning
• Remote Office
• Sequential Ring

Horizon Connect calls were unaffected, both inbound and outbound.

During this incident users will also have experienced issues logging onto and applying changes on the Horizon Portal. This was due to the volume of access requests and users attempting to make changes via the Portal. This will be reviewed as part of our major outage analysis and we will investigate how we can ensure that users can access the Portal during similar incidents.

Our supplier (Gamma) has reported the following actions since the outage:

• We have normalised all changes made during the incident to aid recovery across both handsets and SBCs.
• We have applied the patch across all Application Clusters and Network Servers to mitigate a repeat of the issue.
• Platform and SBC logs are being analysed by our vendors to ensure the issue is fully understood.
• We have seen stability throughout Thursday 15th November.
• We are investigating the portal issues and the portal's capacity to deal with situations such as this outage.

In addition to the above, we are conducting a full review of the server protection behaviours to ensure we can isolate and recover from any similar incidents without impacting the wider Horizon base. We will also review the bug with our Partner to identify how we can implement any service improvements around how we manage and deploy any patches on the network.

Customer Support

0800 206 2107 / 01273 615 600