What is cloud web hosting?

Wednesday, April 20th, 2011

What is Cloud Hosting? I’m not really sure anyone knows the answer to that question, but if you put it to 5 different hosting providers you will very likely get 5 different answers. Has anyone seen that Microsoft advert recently? A lady is trying to get a family photo taken but can’t get that perfect shot. So, she says, “Take it to the cloud!” As someone who works in the hosting industry, I find this a baffling thing to say. How many meanings for cloud are there at the moment? Too many, that’s for sure. It is certainly leading to a problem in consumer perception as well. Although I don’t wish to sound patronising, Cloud is not some magical device which means you will never have problems. However, is it fair to blame the consumer for that, or are we as IT professionals perhaps guilty of leading people to think this? Now that I have introduced the topic, onto my point.

So, what is Cloud Hosting? Well, if you were to ask me, and you are, I’d say Cloud Hosting is multiple servers, connected to redundant storage, via multiple (redundant) uplinks. My definition of Cloud Hosting is that no single failure can cause the environment to collapse. Whether that failure is a switch, a server or a hard drive, the environment should have enough redundancy to absorb it.

In server terms, you need to ask yourself how many servers you want to keep sat idle in case of a problem. On our “Cloud Hosting Environment”, we ensure that at least 40% of our servers could fail without customers experiencing a problem as a result. However, someone else may consider 40% to be an excessive amount to keep sat idle. After all, these servers cost thousands of pounds EACH. Would you consider a cluster of 15 servers, with one as a standby, to be redundant? By a technical definition, you have one server free, so it can be called redundancy. In real practical terms, you can’t really call it redundancy, as there are scenarios where multiple servers can fail at the same time (yes, I’ve seen it). With proper redundancy, if the server you are hosted on fails, you should only notice a single ping drop while your site moves over to the free server.
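To put numbers on the 15-server example, here is a minimal Python sketch. It is an illustration only, not a description of any real platform: it assumes load can be counted in whole servers’ worth of capacity, and the function names and figures are made up.

```python
def tolerable_failures(total_servers: int, servers_needed_at_peak: int) -> int:
    """Number of servers that can fail while the rest still carry peak load."""
    if servers_needed_at_peak > total_servers:
        raise ValueError("cluster cannot carry peak load even with no failures")
    return total_servers - servers_needed_at_peak


def survives(total_servers: int, servers_needed_at_peak: int, failed: int) -> bool:
    """True if the cluster still has enough capacity after `failed` servers die."""
    return failed <= tolerable_failures(total_servers, servers_needed_at_peak)


def spare_fraction(total_servers: int, servers_needed_at_peak: int) -> float:
    """Share of the cluster that is pure headroom (0.4 means 40% can fail)."""
    return tolerable_failures(total_servers, servers_needed_at_peak) / total_servers


if __name__ == "__main__":
    # Hypothetical cluster: 15 servers, 14 needed at peak, one standby.
    print(survives(15, 14, failed=1))  # a single failure is absorbed
    print(survives(15, 14, failed=2))  # two failures at once are not
    # Hypothetical cluster sized for 40% headroom: 15 servers, 9 needed at peak.
    print(spare_fraction(15, 9))
```

The point of the sketch is that “one spare” and “40% spare” are both redundancy on paper, but only the second survives several servers failing together.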

In storage terms, it becomes more complex. Can a single storage device be classed as redundant? I suppose it can, depending on the device and what features it has. If the device has multiple uplinks to and from it and multiple power feeds from different sources, some people would class that as redundant. If your storage array is configured in a redundant level of RAID (shall we say RAID 10?), that could be classed as redundancy. Have a hotswap available at all times? Yes, that could be called another level of redundancy. We personally have multiple storage devices configured in a cluster interconnect. It actually happened last year that one of the storage devices failed, although it shouldn’t have (how many times have we heard that?). The customers on this storage device didn’t see a problem, because it was in a cluster interconnect with another device. The storage arrays that the failed device had been taking care of automatically started being managed by the other. Magic!? No, just an extra level of redundancy. You can do this as many times as you wish, adding additional levels of redundancy, but this has a cost implication. High end storage just isn’t cheap, no matter what people say.

Software is where it gets really interesting. Of course, having multiple servers and storage devices in a cluster interconnect is all well and good, but what about the software managing all that? By software, am I talking about the Cloud software such as VMware or Xen, or am I talking about the Linux or Windows install on an individual VM residing inside the cluster? Let’s tackle the Cloud Software point first.

What happens if software fails? What happens if your failover didn’t kick in because the software monitoring it all didn’t pick up that there was a failure? What happens if the storage devices talking happily to each other in their little cluster are not aware of problems? I have seen it happen where two storage devices each thought the other was fine, as their interconnect was very healthy, but VMs were down. What happened? I’ll put it in simple terms. The storage device which was supposed to take over if the other failed was happily chatting to the other device over the interconnect and wasn’t told that the servers couldn’t reach it. The result was that customers were down and the redundancy in place couldn’t do a thing about it. In fact, even knowing now what the problem was, nothing could have been done about it. What happens if the software doesn’t detect a server as failed and switch over? Yes, I’ve seen it happen when a server was intermittently failing and the controller had no idea and didn’t switch. Such software issues are rare, but there are cases out there. Cases I have seen first hand.

OS Software is also an interesting subject to speak on. If your Windows environment becomes infected by a virus, which in turn, let’s say, stops IIS from starting, has the Cloud failed? Put simply, NO! The Cloud Environment is perfectly fine, no hardware has failed and the Cloud Software is all fine as well. Is it fair to expect that a virus infection, hack attempt or other guest OS problem will never happen in a Cloud Environment? No, I’m afraid it isn’t. I don’t mind saying that I’ve actually spoken to a customer who accidentally deleted all his web content files and then tried to complain that this counted against the uptime guarantee. Really? Yes, I’ve had this conversation, more than once actually, where someone was unable to understand why deleting their own files would cause their website to go down. I can smile about it now, but at the time I don’t mind admitting that it gave me quite a headache.

Human error is another pitfall. If you speak to an employee of an IT company, be it hosting or any other, and they say they have NEVER made a mistake, they are not being truthful. They have, everyone has. Mistakes in a cluster can mean that all that hardware and software redundancy isn’t worth the hundreds of thousands of pounds you paid for it. Human error is always going to happen and this should be taken into account when you are planning your project. Is it a customer problem if their IT Provider makes a mistake? In part, yes. It’s unfair to say that the customer is to blame and I think most decent IT companies would cover that under their SLA, but to the people who are reading this: mistakes are going to happen. Be it something tiny that you barely notice, or something huge which causes downtime after 5 years of perfect service, it will happen at some point.

When it comes to marketing, how should one market something like a Cloud Environment? I suppose that entirely comes down to how confident you are in your own infrastructure and staff members. I don’t mind saying that we offer a 100% uptime guarantee on the environment. Can I imagine scenarios where we could drop below that? Yes, although they are extremely unlikely. I’m confident enough to say we will stand by our SLA should it fail to reach that level. That’s what an uptime guarantee is, standing by your product and compensating should it fail.

When it comes to IT consultants trying to sell people things, I know Cloud is the big word at the moment. I know that even in Government, Cloud is a big thing. Cloud is going to save them money, so their consultants tell them. Let me ask you this. If you need a Dual Quad Core Server, 16 GB RAM and 300 GB of storage, how is it possibly going to be cheaper to run that in a redundant environment? You would need, at the minimum, an extra server sitting idle. That means you have to buy two servers instead of one. The only scenario in which it is going to be cheaper is if you sacrifice something to balance it out. Say I ordered a Dual Quad Core with 16 GB RAM and 300 GB Storage, but my content never goes above 75 GB, my RAM use has never gone above 8 GB and my CPUs are at 25% capacity. In that scenario, it can be cheaper to switch to a cloud and get the redundancy, while sacrificing some of the raw power. However, if you were buying servers that were too powerful to begin with, that is another problem entirely.

If a consultant is reading this they are probably thinking “Ah ha! He hasn’t taken into account the cost of downtime or failure when determining the price!” That’s fine, and I agree with it, to an extent. If you have a website which your business depends on, and downtime would lose you more money than the Cloud would cost, then yes, it does make financial sense. However, do the calculations yourself and make an informed decision. Cloud isn’t for everyone and, contrary to what some people think, it isn’t a magic thing which means you will never have a problem. You will have problems at some point, just not the same kind of problems and at nowhere near the same frequency.
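As a rough way of doing that calculation, here is a minimal Python sketch. Every figure in it is hypothetical, the function names are made up for illustration, and it assumes you can estimate yearly hours of downtime for each option and what an hour of downtime costs you.

```python
def yearly_total(hosting_cost: float, hours_down: float, loss_per_hour: float) -> float:
    """Hosting bill plus the money lost while the site is down, per year."""
    return hosting_cost + hours_down * loss_per_hour


def break_even_loss_per_hour(
    single_cost: float,
    cloud_cost: float,
    single_hours_down: float,
    cloud_hours_down: float,
) -> float:
    """Hourly loss above which the redundant option is cheaper overall."""
    hours_saved = single_hours_down - cloud_hours_down
    if hours_saved <= 0:
        raise ValueError("the redundant option must reduce downtime")
    return (cloud_cost - single_cost) / hours_saved


if __name__ == "__main__":
    # Hypothetical yearly figures in pounds: a single server against a cloud setup.
    single_cost, cloud_cost = 2000.0, 3500.0
    single_hours_down, cloud_hours_down = 10.0, 1.0

    threshold = break_even_loss_per_hour(
        single_cost, cloud_cost, single_hours_down, cloud_hours_down
    )
    print(f"Cloud pays for itself above {threshold:.2f} per hour of downtime")

    for loss_per_hour in (50.0, 500.0):
        single = yearly_total(single_cost, single_hours_down, loss_per_hour)
        cloud = yearly_total(cloud_cost, cloud_hours_down, loss_per_hour)
        print(loss_per_hour, single, cloud)
```

With these made-up figures, a site losing 50 an hour is better off on the single server, while a site losing 500 an hour is better off paying for the redundancy. Your own numbers will differ, which is the whole point of running them.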

You will note that my definition of Cloud involves redundancy. The Cloud definition used by others may not. Some may think of it as a scalable platform, which is very different from redundant. When you are looking to outsource your requirement or take it in house, ask yourself carefully what you are wanting to achieve, what you can afford to do and what the cost of failure would be. No matter what you decide to do, ALWAYS keep your own backups and inspect them as often as practically possible, as ultimately your data is worth more to you than anyone else, so you HAVE TO take that responsibility.
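One simple way to inspect a backup is to compare it, file by file, against the data it was taken from. The following is a minimal Python sketch of that idea, assuming the backup is a plain copy of a directory tree. The function names are made up for illustration, and the check only makes sense if it runs shortly after the backup was taken, before the live files change.

```python
import hashlib
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_problems(source_dir: Path, backup_dir: Path) -> List[str]:
    """Files that are missing from the backup or differ from the source."""
    problems = []
    for source_file in sorted(source_dir.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(source_dir)
        backup_file = backup_dir / relative
        if not backup_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif file_digest(source_file) != file_digest(backup_file):
            problems.append(f"differs: {relative.as_posix()}")
    return problems
```

An empty list means every source file has an identical copy in the backup. It does not prove the backup can be restored onto a working server, so an occasional test restore is still worth doing.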

Thanks for taking the time to read my views on Cloud Hosting. I hope it will be the first of many articles published here.

John Strong
Managing Director

