I have trained hundreds of cloud engineers. They work for companies with anywhere from ten to thousands of employees, and the scope of their responsibilities varies from one job to the next. Despite this, I see the same avoidable mistakes made in the cloud again and again.
Systems engineers are responsible for keeping your applications and databases responsive. While the cloud has taken over many of their hardware responsibilities, their jobs are no less difficult, so mistakes are bound to happen. Most of these problems, however, come down to a lack of attention to detail and misperceptions about the cloud.
Here are four mistakes to avoid if you want to be or stay a cloud engineer.
1. Assuming that the Cloud is Unbreakable
Hardware housed in a data centre under someone else’s control seems to promote a sense of invulnerability. It’s still the same hardware, and hardware fails. It is easy to become dependent on your cloud provider’s maintenance and built-in redundancies, which can lead to disaster. Although four nines (99.99% availability) is the standard for most cloud services, that does not mean you can go without a plan for when things go wrong.
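To put "four nines" in perspective, here is a minimal sketch that turns an availability percentage into a yearly downtime budget. The function name is my own, and the calculation assumes a 365-day year.

```python
# Sketch: translate an availability target into a yearly downtime budget.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes per year a service may be down at this availability."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


for label, pct in [("three nines", 99.9), ("four nines", 99.99)]:
    minutes = downtime_minutes_per_year(pct)
    print(f"{label} ({pct}%): about {minutes:.1f} minutes of downtime per year")
```

Even at four nines, that is close to an hour of downtime a year that you need a plan for.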
How to avoid this error
Build a disaster recovery plan that covers more than just backups of your data. Server images, responsibility charts, and regular updates and maintenance are all good ideas. Unfortunately, I have seen many outdated plans that place important responsibilities in the hands of people who haven’t worked there for years. Never stop working on the plan. Every team member should practice game day exercises until they are comfortable recovering from these failures.
Disaster recovery is not something that should be left to chance, and much of the responsibility for it rests on engineers. You should never try to manage a business-critical system without a plan. While the cloud has some inherent resilience, it is best to be prepared for any failure.
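One simple way to catch an outdated plan is to check the responsibility chart against the current staff list on a regular basis. The sketch below is only an illustration: the duties, the names, and the `stale_assignments` helper are all made up.

```python
# Sketch: flag disaster-recovery duties assigned to people who have left.
# The chart and staff list below are made-up illustration data.

def stale_assignments(chart: dict[str, str], staff: set[str]) -> dict[str, str]:
    """Return the duties whose owner is not in the current staff list."""
    return {duty: owner for duty, owner in chart.items() if owner not in staff}


responsibility_chart = {
    "restore database backups": "alice",
    "rebuild servers from images": "bob",
    "notify customers": "carol",
}
current_staff = {"alice", "carol"}

for duty, owner in stale_assignments(responsibility_chart, current_staff).items():
    print(f"Reassign '{duty}': {owner} is no longer on the team")
```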
2. Overpaying
Anyone who has worked in IT for any length of time knows what it is like to keep going back to the company’s accountants. We need money to buy servers. We need money to store our data. We need more staff. There are always more requests than there are engineers.
The cloud gives us a new opportunity to control our day-to-day costs directly, and even the refreshing ability to lower them. Cloud providers often promote the idea of “pay as you go” to explain that if something is turned off, you no longer pay for it. While this is generally true, the reverse is also true: you keep paying for whatever you leave running until you take it down.
A client of mine was frustrated by the bill his cloud provider had sent him. He was outraged that he had only used two servers but was being charged for eight. I looked at his logs and pointed out that he had launched eight servers. A quick check of his portal revealed that all eight were still running. He admitted that he had launched eight servers, but insisted that he had only used two. His logic was flawed. If it is still running, you are probably paying for it. Treat your hardware like a disposable resource. Terminate anything that isn’t currently needed.
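The client’s situation can be sketched in a few lines. This is not any provider’s billing API: the `Server` record, the hourly rate, and the 730-hour month are hypothetical values chosen only to show that the bill follows what is running, not what is used.

```python
# Sketch: you are billed for what is running, not for what you use.
# The hourly rate and server records are hypothetical illustration values.
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    running: bool
    in_use: bool


def monthly_cost(servers, hourly_rate, hours=730):
    """Cost of every running server, whether or not anyone is using it."""
    return sum(hourly_rate * hours for s in servers if s.running)


def idle_servers(servers):
    """Running servers that nobody is using: candidates for termination."""
    return [s.name for s in servers if s.running and not s.in_use]


# Eight servers launched and still running, only the first two in use.
fleet = [Server(f"web-{i}", running=True, in_use=(i < 2)) for i in range(8)]

print("Monthly bill:", monthly_cost(fleet, hourly_rate=0.10))
print("Terminate these:", idle_servers(fleet))
```

With eight servers running and two in use, the bill is four times what the client expected, and the six idle servers are the ones to terminate.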
How to avoid overpaying for cloud services
There are other ways to save money. Servers that run consistently, day in and day out, can be billed as “reserved instances.” This option can dramatically reduce your costs (by as much as 80%) in exchange for a commitment to use the server for at least one year. This is not the right choice if you are working on a six-month project. However, it is a great option for long-running infrastructure servers that won’t be going anywhere.
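Whether a reservation pays off depends on the discount you actually get and on how long the server really runs. The sketch below assumes the reservation is paid for the full one-year term even if the server is shut down early. The monthly rate and the 40% discount are hypothetical, not any provider’s published pricing.

```python
# Sketch: when does a one-year reservation beat on-demand pricing?
# Assumes the reservation is paid for the full term even if the server
# is shut down early; the rate and discount are hypothetical values.

def on_demand_cost(monthly_rate: float, months_used: float) -> float:
    """Pay only for the months the server actually runs."""
    return monthly_rate * months_used


def reserved_cost(monthly_rate: float, discount: float, term_months: int = 12) -> float:
    """Full-term cost at the discounted rate, regardless of months used."""
    return monthly_rate * (1 - discount) * term_months


def break_even_months(discount: float, term_months: int = 12) -> float:
    """Months of use after which the reservation becomes the cheaper option."""
    return term_months * (1 - discount)


rate, discount = 100.0, 0.40
for months in (6, 12):
    print(f"{months} months: on-demand {on_demand_cost(rate, months):.0f}, "
          f"reserved {reserved_cost(rate, discount):.0f}")
print(f"Break-even after about {break_even_months(discount):.1f} months")
```

Under these assumptions a six-month project is cheaper on demand, while a server that runs all year is cheaper reserved. A deeper discount moves the break-even point earlier, so run the numbers with your own provider’s rates.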
Spot instances, ma