We all want to automate our network as much as we can. We are constantly encouraged to automate trivial tasks, and I'm not here to discourage you from doing so. However, if your aim is, for example, automated service activation, there are a number of pitfalls that can be detrimental to your efforts over the long run.
Here are five reasons why your network automation project may fail if they are not properly accounted for.
1. Missing or incomplete inventory integration
You absolutely must have an inventory to have proper automation. Every resource you use in the service must be properly tracked. This ensures that you know exactly which services use which resources, when those resources are depleted, and how to alter or remove the service.
Ideally your automation should book these resources itself, so even before you write your automation, you must make sure that your inventory system has some sort of API that enables this.
Examples of such resources could be:
Ports, subinterfaces, units, etc.
Service reference numbers
The service itself
If there are characteristics of a service that you don't currently use in the configuration, it is important to track these as well, as they may matter either in service assurance or in later product development. For example, I once worked for a service provider that used leased lines exclusively. They relied on whoever they bought the lines from to do the traffic shaping, so the bandwidth of each line was not included in the inventory. At some point we decided to do our own shaping, but it was impossible to implement this for existing lines, as the information was not documented.
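To make the booking idea a bit more concrete, here is a minimal Python sketch of automation reserving its own resources. Everything in it is an assumption for illustration: the Inventory class, its method names, and the SVC-00001 reference format are made up, and the in-memory dictionaries stand in for whatever API your real inventory system offers. Note that the bandwidth is stored even though nothing in the sketch uses it.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class ResourceExhausted(Exception):
    """Raised when a node has no free ports left."""


@dataclass
class Inventory:
    # Hypothetical in-memory stand-in for a real inventory system's API.
    free_ports: dict[str, list[str]]  # node name -> free port names
    bookings: dict[str, dict] = field(default_factory=dict)  # service ref -> record
    next_ref: int = 1

    def book_service(self, node: str, bandwidth_mbps: int) -> str:
        """Reserve a port and a service reference number in one step."""
        ports = self.free_ports.get(node, [])
        if not ports:
            raise ResourceExhausted(f"no free ports on {node}")
        port = ports.pop(0)
        ref = f"SVC-{self.next_ref:05d}"
        self.next_ref += 1
        # Record the bandwidth even if the configuration does not use it yet.
        self.bookings[ref] = {
            "node": node,
            "port": port,
            "bandwidth_mbps": bandwidth_mbps,
        }
        return ref

    def release_service(self, ref: str) -> None:
        """Hand the resources of a decommissioned service back to the pool."""
        record = self.bookings.pop(ref)
        self.free_ports[record["node"]].append(record["port"])


inventory = Inventory(free_ports={"pe1": ["ge-0/0/1", "ge-0/0/2"]})
ref = inventory.book_service("pe1", bandwidth_mbps=100)
```

Because the booking record holds everything the service consumes, removing the service later is a single lookup rather than a hunt through router configurations.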
2. Not automating the full life cycle of the service
This is quite a common problem, and it is mostly not a trivial one to solve. These days it is not that hard to use Ansible or something similar to activate a service, but the problems start when you need to change anything or decommission the service.
Decommissioning is in some ways not the worst problem here. You can probably go through your inventory relatively easily, find the resources that belong to a service, and then clean them up, both in the network and in the inventory. It will probably be a semi-manual task, but it is doable for smaller networks.
However, it is much more difficult if you need to modify a service, because you often end up with a network that goes out of sync with your inventory. There are many modification scenarios, and you need to make sure that you account for all of them. These could include:
Add/remove IP networks from a service
Change prefix-list, routing parameters, etc.
Move a service to a different node
Change bandwidth parameters
Do bulk changes to a number of services
Some networks implement their automation so that each transaction to the router is a replace operation, which partially solves this problem, as the router configuration is completely overwritten with whatever is generated from the automation. This is a valid approach, but it can also be a dangerous operation, so extra care should be taken to ensure that the generated configuration is properly validated.
It also means that your automation must be able to generate the full configuration of your devices, including both service and non-service configuration alike.
It is also why I happen to like NSO: you only provision the intended target state to it, and it works out by itself which operations are required to reach that state.
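The general idea of deriving operations from a target state can be sketched in a few lines of Python. This is not how NSO or any other product is implemented; the plan_changes function and the dictionary layout are assumptions made up for the example. It simply compares what is currently provisioned with what should be provisioned and lists the creates, deletes, and updates needed to close the gap.

```python
from __future__ import annotations


def plan_changes(
    current: dict[str, dict], intended: dict[str, dict]
) -> list[tuple[str, str]]:
    """Return (operation, service_ref) pairs that turn current into intended."""
    ops: list[tuple[str, str]] = []
    # Services that only exist in the intended state must be created.
    for ref in sorted(intended.keys() - current.keys()):
        ops.append(("create", ref))
    # Services that only exist in the current state must be removed.
    for ref in sorted(current.keys() - intended.keys()):
        ops.append(("delete", ref))
    # Services present in both are updated only if their parameters differ.
    for ref in sorted(current.keys() & intended.keys()):
        if current[ref] != intended[ref]:
            ops.append(("update", ref))
    return ops


current = {
    "SVC-00001": {"node": "pe1", "bandwidth_mbps": 100},
    "SVC-00002": {"node": "pe1", "bandwidth_mbps": 50},
}
intended = {
    "SVC-00001": {"node": "pe1", "bandwidth_mbps": 200},  # bandwidth change
    "SVC-00003": {"node": "pe2", "bandwidth_mbps": 50},   # new service
}
plan = plan_changes(current, intended)
```

The point of the design is that modification scenarios no longer need one hand-written procedure each: a bandwidth change, a move, or a bulk change is just a different intended state fed into the same comparison.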
3. Only automating a reduced feature-set
When working with automation projects, it may be desirable from a project management standpoint to aim for a first release with only a reduced subset of features. That is probably not always a bad idea, but it certainly can be.
First of all, it is a burden for any engineering department to split its work between an automation system and manual processes for configuring whatever is not covered by the reduced service. Secondly, it can cause your inventory to go out of sync.
Indeed, network engineers may just decide to provision everything by hand because it is not clear which features are handled, or your initial assumptions about what the majority of your services include may turn out to be incorrect.
Similarly, you may have fancy new routers that are easy to automate, and a number of legacy devices that are difficult.
If those legacy devices are left out of the automation, they will cause a host of problems: inconsistency, errors, and poor documentation. If you really can't handle these devices from whatever automation framework you have, you should get rid of them as fast as possible, and if that is not possible, consider one of the following options:
If they are reachable by CLI, some sort of expect-like script can probably be written
Consider if they can be pre-provisioned for future services (e.g. all VLANs configured on a switch)
Check whether they can be provisioned using SNMP or some other legacy method. Perhaps they support configuration file uploading via TFTP.
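As an example of the pre-provisioning option, the following Python sketch generates the configuration lines for a whole range of VLANs in one go, so that the legacy switch never has to be touched per service. The preprovision_vlans function and the PREPROV naming scheme are made up for illustration, and the two-line "vlan" / "name" syntax is only one common style; your device will likely need something different.

```python
from __future__ import annotations


def preprovision_vlans(first: int, last: int) -> list[str]:
    """Generate config lines for every VLAN in the inclusive range first..last."""
    # 802.1Q reserves VLAN IDs 0 and 4095, leaving 1-4094 usable.
    if not 1 <= first <= last <= 4094:
        raise ValueError("VLAN IDs must be within 1-4094 and first <= last")
    lines: list[str] = []
    for vlan_id in range(first, last + 1):
        lines.append(f"vlan {vlan_id}")
        # Hypothetical naming scheme; illustrative syntax only.
        lines.append(f" name PREPROV-{vlan_id}")
    return lines


config = preprovision_vlans(100, 102)
```

The resulting lines can then be pushed once, by whatever method the device supports, and after that the device simply stays out of the per-service workflow.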
4. Allowing engineers to provision manually
This is a controversial topic, but it has to be addressed. If your organization is not used to automation, the project may face some resistance, especially from senior staff who have been provisioning services by hand forever.
Therefore, the automation must both work and have good UX design, so that it feels like an improvement to everyone. Provisioning a service must not be more cumbersome or take more time than before, otherwise adoption may suffer.
You may face situations where a first-phase release is wanted even though it does not improve much on the legacy process, or is even slower. Consider whether you want to present this to your users, as it may give them a negative first impression of the system, which will linger even after the automation has reached the desired state.
5. Not anchoring the automation in the entire organization
Automating your network may seem like something that is internal to your engineering teams, but in fact it should be a company-wide effort.
The BSS systems should at least in some manner be integrated with the inventory, to avoid manual input. If you are using a process engine, the process should start from the sale.
If you need to deploy field engineers during service activation, that could be part of the process engine as well.
Really, it is much easier to get adequate resources for your automation project if the whole organization is part of it, and every part feels that it too stands to gain from the project.
Do you agree with my above points? Leave me a comment if you feel like it.
Happy automating in 2020!