We introduce the technique of Software Monitoring with Controllable Overhead (SMCO), which is based on a novel combination of supervisory control theory of discrete event systems and PID-control theory of discrete time systems. SMCO controls monitoring overhead by temporarily disabling monitoring of selected events for as short a time as possible under the constraint of a user-supplied target overhead o_t. This strategy is optimal in the sense that it allows SMCO to monitor as many events as possible, within the confines of o_t. SMCO is a general monitoring technique that can be applied to any system interface or API.
We have applied SMCO to a variety of monitoring problems, including two highlighted in this paper: integer range analysis, which determines upper and lower bounds on integer variable values; and Non-Accessed Period (NAP) detection, which detects stale or underutilized memory allocations. We benchmarked SMCO extensively, using both CPU- and I/O-intensive workloads, which often exhibited highly bursty behavior. We demonstrate that SMCO successfully controls overhead across a wide range of target-overhead levels, that its accuracy increases monotonically with the target overhead, and that it can be configured to distribute monitoring overhead fairly across multiple instrumentation points.
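As a rough illustration of the feedback idea behind overhead control, the following Python sketch drives a simulated monitoring overhead toward a target using a textbook discrete-time PID controller. It is not the SMCO implementation: the plant model (overhead proportional to the fraction of events monitored), the controlled quantity (`monitored_fraction`), the gains, and the names `PIDController` and `simulate` are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class PIDController:
    """Textbook discrete-time PID controller in positional form."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, error: float) -> float:
        # Accumulate the error (I term) and take its first difference (D term).
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(target_overhead: float, full_cost: float = 0.5,
             steps: int = 50) -> list[float]:
    """Return the overhead observed in each control interval.

    Assumed toy plant (not from the paper): monitoring every event costs
    `full_cost` overhead, and overhead scales linearly with the fraction
    of events that are monitored.
    """
    pid = PIDController(kp=0.2, ki=1.0, kd=0.05)  # hypothetical gains
    monitored_fraction = 0.0  # start with monitoring fully disabled
    history = []
    for _ in range(steps):
        observed = full_cost * monitored_fraction
        history.append(observed)
        # Positive error means there is spare budget, so monitor more.
        output = pid.update(target_overhead - observed)
        # The fraction of events monitored must stay within [0, 1].
        monitored_fraction = min(1.0, max(0.0, output))
    return history


if __name__ == "__main__":
    trace = simulate(target_overhead=0.10)
    print(f"final overhead: {trace[-1]:.4f}")
```

In this toy setting the controller raises the monitored fraction while the observed overhead is below the target and lowers it otherwise, so the overhead settles near the target while as many events as the budget allows are still monitored.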