Made Media Ltd. 105 Carver Street, Birmingham B1 3AP

mail@mademedia.co.uk +44 (0)121 200 2627


Posted in news, technology by Jake on September 17th, 2008

DNS Outage – Blue Tuesday

We had a serious network outage last night. Although all of our servers were fine, the Domain Name System (DNS) service for many of our websites, which is outside our direct control, became unavailable. We’re very sorry for the inconvenience caused to our customers. If you were affected, you can click through for a more detailed description of what happened and what we’re doing to prevent it from happening again.

What happened?

The DNS servers for many of our domain names became unavailable last night. We outsource DNS to a specialist company in the US who also have redundant servers in Europe. For some reason these didn’t work. We resolved the problem early this morning by setting up and migrating to our own DNS service. We intend to make this a permanent solution so that we do not rely on a third party in the future.

What is DNS?

When you type a domain name into your browser, your computer uses an online directory service to identify which server is hosting the website. Sometimes you may notice a small pause while this happens, but generally the process is completely transparent. Each domain name lists one or more DNS servers in its domain record, and these are the servers that provide the directory information for that domain name to the rest of the Internet. At Made Media we use a specialist company for our DNS records. DNS can get slightly complex and we thought it best to outsource to a company that deals solely with DNS.
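
As a rough illustration of that directory lookup, here is a minimal Python sketch that asks the operating system's resolver for the addresses behind a host name. The function name `resolve` and the default port are our own choices for the example, not part of any Made Media system.

```python
import socket


def resolve(hostname, port=80):
    """Return the sorted, de-duplicated IP addresses for a host name.

    This goes through the operating system's resolver, which is the same
    path a browser normally relies on.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve("www.example.com")` returns a list of IP address strings. If the lookup cannot be completed, `socket.getaddrinfo` raises `socket.gaierror`, which is roughly what a browser runs into when DNS is unavailable.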

What happens when DNS fails?

Although all of our servers were fully operational, when the DNS service went down, browsers were no longer able to identify our servers from your domain names, and this caused many websites to appear to be off-line. Many of our clients manage their own DNS records, or use an IT company to do it, and their websites continued to be available.

I thought DNS had fail-over redundancy built in?

That’s right. Generally domain names list two or more DNS servers, and if the first fails then the second one kicks in. This didn’t happen last night and we’re still trying to find out why. The truth is – we’re feeling a little let down.
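
To show what "kicks in" means, here is a simplified Python sketch of trying each listed name server in turn. It is an illustration only and does not explain last night's failure. The `query` callable, the server names and the address are hypothetical stand-ins for a real DNS query.

```python
def lookup_with_failover(name, nameservers, query):
    """Try each name server in order and return the first answer.

    `query(server, name)` is a hypothetical callable that returns an IP
    address string, or raises OSError if the server cannot be reached.
    """
    errors = {}
    for server in nameservers:
        try:
            return query(server, name)
        except OSError as exc:
            # Remember why this server failed, then move on to the next one.
            errors[server] = exc
    raise LookupError(f"all name servers failed for {name}: {errors}")


def fake_query(server, name):
    """Stand-in for a real DNS query: ns1 is 'down', any other server answers."""
    if server == "ns1.example.net":
        raise OSError("timed out")
    return "192.0.2.10"
```

With `fake_query`, a lookup against `["ns1.example.net", "ns2.example.net"]` falls through to the second server and still gets an answer. If every listed server fails, there is nothing left to fall back on and the lookup fails outright.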

So what are you doing about it?

We fixed the problem early this morning by setting up our own DNS service. We are currently solidifying this approach as a long-term solution, and setting up a secondary redundant service too. This way we will take complete control of DNS. Whilst in theory it’s best to leave hosting and network issues to specialist companies, in practice we find that this inevitably leads to us letting down our clients. Over the past few years we’ve found that we can deliver a much more reliable service by bringing as much infrastructure in-house as possible. That’s why an incident of this scale is unprecedented for us.

What about my SLA?

Technically our network retained 100% uptime last night, but we’re not going to hide behind that. If your website was unavailable last night we are happy to refund one month’s hosting to you. Just let us know. We guarantee 99.9% uptime per month and generally achieve 100%. We believe that as a result of last night’s outage some customers will have 98% uptime this month, and that is not acceptable to us.
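
To put those percentages into hours and minutes, here is a small Python sketch of the arithmetic. It assumes a 30-day month, and the function name is our own choice for the example.

```python
def allowed_downtime_minutes(uptime_percent, days_in_month=30):
    """Return how many minutes of downtime a monthly uptime figure allows."""
    total_minutes = days_in_month * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (100 - uptime_percent) / 100
```

On that assumption, a 99.9% guarantee leaves room for roughly 43 minutes of downtime in a month, while 98% uptime corresponds to about 14.4 hours.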
