When we talk about micro services, cloud infrastructure and ability of the infrastructure which is resilient, Netflix is one of the top companies which comes into scene. Netflix always work in the direction of making its infrastructure more resilient to make it always available cause it has to serve a lot of traffic. In this artilcle we will talk about netflix simian army.
Here I will be talking about netflix simian army which makes its infrastructure more secure and less prone to unintended errors like hardware failure etc. There are three elements in Simian Army. Chaos Monkey, Janitor monkey and conformity monkey.
Lets talk about what these does.
Chaos Monkey is a service which identifies groups of systems and randomly terminates one of the systems in a group. The service operates at a controlled time (does not run on weekends and holidays) and interval (only operates during business hours). In most cases we have designed our applications to continue working when a peer goes offline, but in those special cases we want to make sure there are people around to resolve and learn from any problems. With this in mind Chaos Monkey only runs in business hours with the intent that engineers will be alert and able to respond.
These are the words from netflix github definition.
So in short chaos monkey cause chaos to test how much resilient the infrastructure is.
Janitor Monkey determines whether a resource should be a cleanup candidate by applying a set of rules on it. If any of the rules determines that the resource is a cleanup candidate, Janitor Monkey marks the resource and schedules a time to clean it up. The design of Janitor Monkey also makes it simple to customize the set of rules or to add new ones.
In short it is kind of garbage collector for your aws resources.
Conformity Monkey determines whether an instance is nonconforming by applying a set of rules on it. If any of the rules determines that the instance is not conforming, the monkey sends an email notification to the owner of the instance.
Its keeps the infrastructure secure by making it follow certain rules, which makes the system secure.
So these are the netflix simian army. This is how they make their infrastructure more secure, resilient and clean and also save money.
Keep following the latest netflix blog for their awesome infrastructure setup and solutions.