Gathering Data
Whether we are talking about e-commerce or any other movement of data, we need to capture data points to ensure proper monitoring and reliability. There is an overall strategy to this, and three types of data points that should be monitored: points that are convenient, points that are helpful, and points that are necessary. With these points monitored, and the data that passes through them recorded and aggregated properly, we create a reliable and supportable system.
Convenient Points of Monitoring
Some systems and technologies lend themselves to easy monitoring. These points should be monitored and their data included in the aggregated monitoring data. They will vary from technology to technology, but any point where data is easy to gather, well documented, and in a friendly format should be collected and aggregated.
For example, it may not be easy or even possible to monitor when an item in a queue becomes stuck. But it is possible, and may even be easy, to monitor how many items are in a queue. By monitoring this easy data point, we can find evidence that the queue “may” have stuck items when the count grows beyond a threshold that we have determined.
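The queue example can be sketched in a few lines. This is only an illustration: the function name, the sample depths, and the threshold of 50 are assumptions made here, and a real check would read the depth from whatever interface your queue technology provides.

```python
def queue_may_be_stuck(depth: int, threshold: int) -> bool:
    """Return True when the queue depth has grown beyond the chosen threshold."""
    return depth > threshold


# Hypothetical depth samples, oldest first (illustrative values only).
samples = [12, 15, 14, 48, 97]
THRESHOLD = 50  # assumed value; pick one that fits your normal traffic

# Keep only the samples that would raise a "may be stuck" warning.
alerts = [depth for depth in samples if queue_may_be_stuck(depth, THRESHOLD)]
print(alerts)
```

Note that this only flags evidence of a problem. A large count can also mean a burst of legitimate traffic.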
Helpful Points of Monitoring
Some points are helpful in supporting a system. Having a supportable system should be a requirement, so any point where monitoring would be helpful should be monitored. Only the most extreme obstacles should prevent this monitoring from being in place.
If you have chosen good technologies to work with, or are lucky enough to have inherited technologies with convenient points of monitoring in locations that are helpful to your process, then there should be quite a bit of overlap with the previous group of monitoring points. Where there is no overlap, implementing a passive monitoring option is a good and recommended strategy.
In the example from the convenient points, the number of items in the queue becomes helpful in that it alerts us to a condition that may exist. Combined with other factors, such as duration, it becomes more helpful still. If we extend the example to check not only the size of the queue but also how long it has been at that size, we know more and are in a better position to alert that a “stuck” condition exists.
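A minimal sketch of combining size with duration follows. The class name, its fields, and the sample timestamps are hypothetical and chosen only for illustration. The idea is to remember when the queue first went above the threshold and to alert only if it stays there too long.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StuckQueueDetector:
    threshold: int       # depth above which the queue looks suspicious
    max_seconds: float   # how long it may stay above the threshold
    _since: Optional[float] = None  # when the depth first exceeded the threshold

    def observe(self, depth: int, now: float) -> bool:
        """Record one depth sample; return True if a stuck alert is warranted."""
        if depth <= self.threshold:
            self._since = None  # queue drained, so reset the timer
            return False
        if self._since is None:
            self._since = now   # first sample above the threshold
        return now - self._since >= self.max_seconds


# Hypothetical (timestamp in seconds, depth) samples.
detector = StuckQueueDetector(threshold=50, max_seconds=300)
readings = [(0, 60), (120, 70), (300, 80), (360, 10), (420, 90)]
results = [detector.observe(depth, now) for now, depth in readings]
print(results)
```

Here the alert fires only at the third reading, after the queue has been above the threshold for the full 300 seconds. The short spike at the end does not alert, because the timer was reset when the queue drained.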
Necessary Points of Monitoring
Some points are more than helpful; they are absolutely necessary. These are points where the data is mission critical. What is at stake is not mere supportability but whether the data transfer is achieved at all. Any point that has no other point backing it up can fall into this category.
Sometimes these critical points can only be seen when we expand upon the data that we can see. Suppose we need to know that our invoice files have been delivered to the accounting application, but the method of transport is so passive that we cannot tell during transport whether it succeeded. We may have to build an audit monitor into the system: something on both the delivery side and the accounting application side that reports how many invoice documents were sent and how many were received.
Then we can compare these numbers to know whether something was lost. We may even be able to build document identification into the audit, so that we know which files were lost and resend only those. Being creative and building monitors that cover the critical areas of data flow is essential. Even if you have to use primitive methods like file counts, putting these things together can achieve a robust and stable environment.
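The audit comparison can be sketched as follows, assuming each side can report the identifiers of the documents it handled. The function name and the invoice identifiers are made up for illustration.

```python
def reconcile(sent_ids, received_ids):
    """Compare what the delivery side sent with what the receiving side got."""
    sent, received = set(sent_ids), set(received_ids)
    return {
        "sent_count": len(sent),
        "received_count": len(received),
        "missing": sorted(sent - received),      # sent but never received
        "unexpected": sorted(received - sent),   # received but never reported as sent
    }


# Hypothetical identifiers reported by each side of the transfer.
sent = ["INV-001", "INV-002", "INV-003"]
received = ["INV-001", "INV-003"]

report = reconcile(sent, received)
print(report["missing"])  # the only documents that need to be resent
```

With counts alone we would know only that one document was lost. With identifiers, the "missing" list tells us which one to resend.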
Monitoring the Monitors
One can get carried away with this thinking: we monitor the data, then we monitor the monitoring system, then we monitor the system that monitors the monitors, and so forth. This strategy creates complexity where simplicity is needed. It is also prone to false positives, and thus is hard on the support personnel who need to respond to these alerts.
Instead of this strategy of cascading points of failure, I recommend redundant points of monitoring, so that monitoring points support each other. If the first point fails but the next point in the path completes, then there is no need to alert on the monitoring point that failed. Using this strategy, we can take monitoring methods with inherent flaws that cause them to fail regularly and build them into a reliable, robust, fault-tolerant system. An example of this is RFID tags for monitoring the movement of materials.
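One way to picture redundant monitoring points is a path of checkpoints, such as RFID readers along a route. The sketch below is an assumption-laden illustration: the reader names and the function are hypothetical. A missed reading is treated as covered whenever any later checkpoint recorded the item, so only misses past the last successful reading are worth an alert.

```python
def uncovered_misses(path, seen):
    """Return the checkpoints that missed the item with no later checkpoint covering them.

    path: checkpoint names in the order an item passes them.
    seen: set of checkpoint names that actually recorded the item.
    """
    # Index of the last checkpoint that recorded the item (-1 if none did).
    last_seen = max((i for i, name in enumerate(path) if name in seen), default=-1)
    # Everything up to last_seen is covered by that later reading.
    return path[last_seen + 1:]


# Hypothetical readers along a material-handling route.
path = ["dock", "conveyor", "warehouse"]

covered = uncovered_misses(path, {"dock", "warehouse"})  # conveyor reader missed the tag
stalled = uncovered_misses(path, {"dock"})               # nothing seen after the dock
print(covered, stalled)
```

In the first case the conveyor reader failed, but the warehouse reading proves the item arrived, so nothing is reported. In the second case no later point backs up the misses, so both remaining checkpoints are reported.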
Applying this strategy makes it possible to cover the critical monitoring needs without having to go to extremes of auditing or make expensive system changes.

