A fire in the Netherlands, a ferry company in chaos, and a BIA that probably didn’t ask the right questions.
The Brittany Ferries booking outage is a textbook case study in third-party dependency risk, inadequate RTOs, and what happens when your recovery plan assumes someone else has already thought about it.
On the morning of Thursday 7 May 2026, a fire broke out at NorthC’s data centre in Almere, a short drive from Amsterdam. It took emergency services — five fire engines, eight support vehicles, a firefighting robot, and a drone team — twelve hours to bring it under control. No one was hurt. The physical damage to servers and data carriers was reportedly limited. By most measures, it could have been a lot worse.
By the following Tuesday, Brittany Ferries customers in Guernsey still couldn’t book a ferry. Reports suggest that tour operators were holding hotel reservations they couldn’t confirm, sending holding emails they couldn’t act on, and losing revenue by the day. Port staff were handwriting boarding cards. The company’s contact centre was offline. One tour operator described it as “beyond a joke” — and given that this came on top of a bruising year of operational disruption, the frustration is entirely understandable.
This is not a story about a catastrophic IT failure. It is a story about a perfectly foreseeable scenario that appears not to have been adequately planned for. And that, from a business continuity perspective, is the more interesting — and the more instructive — problem.
What the BIA should have identified
A Business Impact Analysis is not simply an exercise in cataloguing your systems. Its purpose is to surface critical dependencies, quantify the consequence of their loss over time, and establish the point at which that consequence becomes unacceptable. Done properly, it asks uncomfortable questions about things that sit outside your direct control.
For a ferry operator, the reservations system is not a back-office function. It is the commercial heart of the business. The ability to take bookings, amend existing ones, issue boarding cards, and manage capacity is operationally critical from the moment the booking window opens. It has a Maximum Tolerable Period of Disruption — the point at which the impact becomes existential rather than merely inconvenient — that is probably measured in hours, not days.
The BIA should have established, clearly and specifically:
- The Recovery Time Objective for the reservations system — the maximum acceptable downtime before operational and reputational damage becomes unmanageable. For a consumer-facing booking platform, that figure should arguably be four hours or less.
- The Recovery Point Objective — how much transactional data could be lost without creating downstream chaos in port operations, customer records, and revenue reconciliation.
- The full dependency chain behind the reservations system, including third-party suppliers such as Carus (which manages the reservations platform) and the infrastructure providers those suppliers rely upon — in this case, IBM Cloud, which in turn was a tenant of NorthC’s Almere site. A simple model of such a chain is sketched after this list.
- The cumulative financial and reputational impact of an extended outage: lost bookings, stranded tour operators, degraded customer confidence, and the compounding effect of a second major disruption within twelve months.
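To make the dependency point concrete, here is a minimal sketch, in Python, of a serial recovery chain of the kind described above. Every name pairing and every figure is hypothetical, and the model is deliberately crude: it assumes each tier can only begin recovering once the tier beneath it has been restored, so the achievable recovery time is the sum of the tiers.

```python
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    recovery_hours: float  # hypothetical: time this tier needs once the tier below it is back

# A four-tier chain of the kind described in this incident, bottom tier first.
# All recovery figures are invented for illustration.
chain = [
    Tier("Data centre (facility power)", 96.0),
    Tier("Cloud provider (hosted region)", 12.0),
    Tier("Platform supplier (reservations system)", 6.0),
    Tier("Operator (booking front end)", 2.0),
]

TARGET_RTO_HOURS = 4.0  # the kind of figure a BIA should set for a consumer-facing booking platform

# If recovery is strictly serial, the achievable recovery time is the sum of the tiers.
effective = sum(t.recovery_hours for t in chain)
print(f"Effective recovery across the chain: {effective:.0f}h (target RTO: {TARGET_RTO_HOURS:.0f}h)")
if effective > TARGET_RTO_HOURS:
    print("The RTO is not credible: recovery is gated by tiers outside your control.")
```

The numbers are invented, but the structure is the point: any tier whose recovery time you cannot influence caps the RTO you can honestly commit to, unless redundancy lets you bypass the failed tier altogether.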
The fact that the system remained unavailable for the best part of a week suggests one of two things: either the RTO was never formally set, or it was set but the recovery architecture was never tested against it. Neither is a comfortable conclusion.
What the risk assessment should have caught
Third-party dependency risk is one of the most consistently underassessed categories in operational risk management, particularly when the dependency is several tiers removed from the organisation’s direct supplier relationships.
In this case, the chain appears to look something like this: Brittany Ferries relies on Carus for reservations system management. Carus relies on IBM Cloud infrastructure. IBM Cloud is a tenant at NorthC’s Almere facility. A fire at NorthC — a company that most Brittany Ferries customers will never have heard of — takes out the booking system for a week. This is a four-tier dependency chain, and the vulnerability sits at the bottom.
A thorough risk assessment of the reservations function should have probed the following:
- Geographic concentration of critical infrastructure. The Almere site is one of fourteen NorthC data centres in the Netherlands. That concentration — and the fact that the fire caused nationwide outages affecting GP practices, Utrecht University, and transport operators simultaneously — points to systemic vulnerability in the hosting geography rather than a quirk of this particular site.
- Physical risk at the data centre layer. The Almere facility markets itself as having a chilled water redundant cooling system and an aspiration fire detection system. The fire nonetheless destroyed the backup power infrastructure and took the power supply offline for days. The gap between marketed resilience specifications and actual recovery performance is itself a risk that should be stress-tested.
- Single points of failure in the supplier’s cloud architecture. IBM’s Amsterdam 03 region was the only region affected — which raises the question of whether Carus, or Brittany Ferries as their client, had ever asked whether the reservations system was architected with geographic redundancy across IBM’s cloud regions, or simply hosted in the nearest available data centre. A minimal concentration check of this kind is sketched after this list.
- Contractual recovery obligations with the supplier chain. What RTOs are Carus contractually committed to with Brittany Ferries? What are IBM’s obligations to Carus? What happens when neither of those SLAs can be met because the failure is two steps further down the chain?
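One way to surface the redundancy question early is a simple concentration check across a system inventory. The sketch below is illustrative only: the system names and region identifiers are invented, and a real check would draw on the cloud provider’s own inventory tooling rather than a hard-coded dictionary.

```python
# Hypothetical inventory mapping critical systems to the cloud regions that host them.
critical_systems = {
    "reservations": ["eu-ams"],                 # single region: no geographic redundancy
    "port-operations": ["eu-ams", "eu-lon"],
    "finance": ["eu-fra", "eu-par"],
}

for system, regions in sorted(critical_systems.items()):
    distinct = set(regions)
    if len(distinct) < 2:
        print(f"WARNING: {system} is concentrated in a single region ({regions[0]})")
    else:
        print(f"OK: {system} spans {len(distinct)} regions")
```

A check like this does not prove resilience, but it forces the question that apparently went unasked here: is the critical system genuinely distributed, or merely hosted somewhere convenient?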
These are not exotic or theoretical risk scenarios. Data centre fires happen. Power infrastructure fails. The risk exists, it is foreseeable, and it has a known impact profile. If it wasn’t on the risk register — or was on the register but assessed as low likelihood without scrutiny of the dependency chain — that is precisely the kind of gap a structured risk assessment is supposed to close.
The response: what good looks like, and what this wasn’t
It is worth being fair to Brittany Ferries here. The ferries kept running. Existing bookings were honoured. Passengers with confirmed travel were able to board, albeit with handwritten boarding cards on what should have been a busy Liberation Day weekend. That is not nothing — operational continuity in the face of system failure requires real coordination.
What failed was the commercial continuity layer: the ability to take new money, amend existing bookings, and support the tour operators and travel agents whose businesses depend on real-time access to availability and capacity. A business continuity plan for a transport operator should have included specific provisions for exactly this scenario.
A robust response plan for reservations system failure would typically include:
- A manual or semi-manual fallback for booking intake, even if limited in capacity. The observation from one tour operator that Condor under previous ownership accepted fax or email bookings at the start of a season is telling — it implies the operational knowledge and the workaround existed previously and have since been retired without a replacement.
- A pre-agreed communications cascade to trade partners, not a generic website notice. Tour operators carry significant commercial exposure during system outages: they are holding hotel inventory, managing customer expectations, and making real-time decisions about whether to redirect bookings to alternative destinations. A blanket email from an unnamed member of the trade support team is not a crisis communications response. A sketch of what a pre-agreed cascade can look like follows this list.
- Honest, time-bound recovery estimates. Brittany Ferries told customers the system would be restored by end of play Monday. It wasn’t. Then Tuesday morning. It wasn’t. Then Thursday afternoon. The progressive slippage of recovery estimates — driven, it appears, by NorthC’s own dependency on a delayed critical component for its power restoration — is precisely what a realistic recovery timeline exercise should surface in advance. If your RTO depends on your supplier’s supplier receiving a delivery on time, you do not have a credible RTO.
- A compensation and goodwill framework, activated without waiting to be asked. The absence of any mention of compensation in communications to affected operators is a reputational risk in its own right.
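As a sketch of what “pre-agreed” can mean in practice, the cascade below is expressed as plain data: who hears what, through which channel, and by when. The audiences, channels, owners, and timings are all hypothetical placeholders; the point is that the decisions are made before the incident, not during it.

```python
# Hypothetical communications cascade for an extended reservations outage.
# Each entry: (audience, channel, deadline in hours after invocation, owner)
cascade = [
    ("operational teams", "incident bridge",        0.5, "duty manager"),
    ("trade partners",    "named account contact",  2.0, "trade support lead"),
    ("customers",         "website and email",      4.0, "communications lead"),
]

def due_by(hours_since_invocation: float) -> list[str]:
    """Return the notifications that should already have gone out by this point."""
    return [
        f"{audience} via {channel} (owner: {owner})"
        for audience, channel, deadline, owner in cascade
        if hours_since_invocation >= deadline
    ]

# Three hours into the incident, the bridge and the trade partners should have been told.
for item in due_by(3.0):
    print(item)
```

Even a structure this simple differentiates between customers, trade partners, and operational teams, and names an owner for each message: the two things most obviously missing from the response described above.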
The wider lesson: your resilience is only as strong as your supply chain
This incident will not make the national headlines. A data centre fire in Almere is not the kind of story that breaks through into mainstream coverage, even when the downstream effects stretch from Dutch GP practices to Channel Islands ferry terminals. That, in itself, is part of the problem.
Organisations routinely assess the resilience of their own operations and of their direct suppliers. They far less commonly map the resilience of their suppliers’ suppliers, or ask what the failure of an infrastructure provider in another country — one they have never contracted with and may never have heard of — would do to their core commercial capability.
ISO 22301 is clear that business continuity management should encompass the organisation’s dependencies, not just its own internal processes. That extends to understanding the concentration risk, geographic risk, and recovery architecture of the technology platforms on which you depend — even where those platforms are delivered “as a service” and the underlying infrastructure is invisible.
The questions every organisation should be asking after this incident:
- Which of your critical systems are hosted in cloud environments, and in which geographic regions?
- Do those cloud environments have genuine geographic redundancy, or are they concentrated in a single data centre or region?
- What are the contractual RTOs in your supplier agreements, and have you ever tested whether those RTOs are achievable?
- What manual workarounds exist for your critical systems — and when did you last check that those workarounds still function in practice?
- What is your communications plan for extended system failure, and does it differentiate between your customers, your trade partners, and your operational teams?
One final thought
NorthC’s Almere facility, the data centre at the root of this incident, describes itself as having a “double knock” aspiration fire detection system and a redundant cooling architecture. The fire destroyed the backup power systems and kept the facility offline for the better part of a week. This is not an attack on NorthC — fires happen, and the fact that no one was hurt and that physical data loss appears to have been limited is genuinely significant.
But it is a reminder that resilience specifications on a data sheet and resilience in practice are not the same thing. When you outsource your critical infrastructure, you also outsource a portion of your resilience risk. The only way to manage that risk is to understand it, assess it properly, and build your recovery plans around realistic rather than optimistic assumptions.
A fire in a data centre in the Netherlands that barely made the UK press has grounded the commercial operations of a major ferry operator for a week. That is not bad luck. That is a gap in a BIA, a gap in a risk assessment, and a gap in a business continuity plan — three gaps that a structured programme of resilience management exists precisely to close.
Cambridge Risk Solutions is an award-winning resilience consultancy specialising in ISO 22301 business continuity and ISO 27001 information security management. We help organisations understand what they depend on, what could go wrong, and what they would actually do about it.
