This article is Part 2 of a three-part series that we are writing about work that Manifold did with one of our clients, Cortex Building Intelligence. In our previous post, we talked about finding edges in sensor signals so we could use them to help us estimate a building’s start time.
We want to find sharp transitions in the various HVAC sensors—rising edges for electricity, steam, and static pressure and falling edges for supply air temperature (SAT). It’s easy for a human to pick out edges—but how do we teach a computer to do it?
Turns out we’re not the first ones to have this problem. Edge finding is a common problem in image processing, where it is a pre-processing step to many computer vision techniques. We decided to use a traditional technique called the Canny edge finder. Even better, scikit-image in Python already has a built-in function for it.
After some fiddling, though, we found that we couldn’t use the scikit-image implementation out of the box. The primary problem was that the Canny edge detector is tuned to work on images rather than time series. Specifically, the Canny algorithm finds the center of an edge, but we want to find the beginning of an edge, where the signal first starts to transition. This is often the case with textbook algorithms: they have to be adapted to your needs. That’s okay; this is what we do (and why data scientists have jobs).
After some trial and error, we developed what I call the "adapted Canny" algorithm. The flowchart below shows the basic steps of the algorithm.
The figure below illustrates the intermediate steps of this algorithm on some example data. The discussion that follows is easier to read with this diagram at hand.
The input, x[n], is the portion of the sensor’s time series in which we want to find edges. This is typically the period from 8:00 PM to 8:00 AM. It goes through a series of steps.
- Normalization. We normalize the time series by the historical min and max values from training. This is shown in blue on the plot.
- Gaussian Smoothing. We smooth the noise in the signal by convolving with a Gaussian kernel. This ensures that we don’t find false edges on noise. See here for more details. In the smoothing expression, * represents convolution and h[n] is the Gaussian kernel. This is shown in green on the plot.
- First difference. We take the discrete time equivalent of a derivative. This is shown as red on the plot.
- Thresholding. We mark portions where x_diff[n] is over (or under) a threshold, T_edge. Contiguous portions where x_diff[n] exceeds the threshold are edges in the original signal. The threshold is computed in the training step of the algorithm. The threshold value is shown in purple on the plot.
- Edge extraction. We use x_edge[n] and the original signal, x[n], to extract relevant edge information. The final output is a set of edges with the following information. (Note: there could be multiple edges in a single signal or none at all.)
- Time the edge begins. This is shown as a black vertical dotted line on the plot.
- Value of x[n] when the edge begins.
- Time the edge ends. This is shown as a grey vertical dotted line on the plot.
- Value of x[n] when the edge ends.
- Edge strength, which is computed as the max x_diff[n] value during the edge. This measures the “sharpness” of the transition.
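The detection steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not our production code: the function names, the dictionary keys, the edge-value padding at the ends of the signal, and the choice to handle falling edges by negating x_diff[n] are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=4.0):
    """Discrete Gaussian kernel h[n], normalized so it sums to 1."""
    radius = max(1, int(truncate * sigma + 0.5))
    n = np.arange(-radius, radius + 1)
    h = np.exp(-0.5 * (n / sigma) ** 2)
    return h / h.sum()

def detect_edges(x, x_min, x_max, t_edge, sigma=2.0, rising=True):
    """Return a list of edges found in x (hypothetical helper; assumes x_max > x_min)."""
    x = np.asarray(x, dtype=float)

    # 1. Normalization by the historical min and max from training.
    x_norm = (x - x_min) / (x_max - x_min)

    # 2. Gaussian smoothing. Padding with the end values (an assumption)
    #    keeps the signal boundaries from looking like edges.
    h = gaussian_kernel(sigma)
    pad = len(h) // 2
    x_smooth = np.convolve(np.pad(x_norm, pad, mode="edge"), h, mode="valid")

    # 3. First difference, the discrete-time equivalent of a derivative.
    x_diff = np.diff(x_smooth, prepend=x_smooth[0])
    if not rising:
        x_diff = -x_diff  # assumption: treat falling edges as rising edges of -x

    # 4. Thresholding: x_edge[n] is True wherever the difference exceeds T_edge.
    x_edge = x_diff > t_edge

    # 5. Edge extraction: each contiguous run of True values is one edge.
    padded = np.concatenate(([False], x_edge, [False]))
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = changes[0::2], changes[1::2] - 1

    edges = []
    for s, e in zip(starts, ends):
        edges.append({
            "start": int(s),                 # index where the edge begins
            "start_value": float(x[s]),      # value of x[n] when the edge begins
            "end": int(e),                   # index where the edge ends
            "end_value": float(x[e]),        # value of x[n] when the edge ends
            "strength": float(x_diff[s:e + 1].max()),  # peak of x_diff[n]
        })
    return edges
```

Because an edge is reported as the whole run above the threshold, its start index lands where the smoothed signal first begins to move, rather than at the center of the transition.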
What we’ve described so far is the detection algorithm. It is what is run every day to find edges in all the sensor signals. It requires three parameters (x_min, x_max, and T_edge), which must be learned from historical sensor data. The algorithm that automatically learns these parameters is called the training algorithm. It is run infrequently, once a month or less, whenever we want to retune the parameters. Note that this split of the method into a detection algorithm and a training algorithm is extremely common in the data science world. Most machine learning algorithms, from support vector machines to deep convolutional nets, have this pattern.
The flowchart below shows the basic steps of the training algorithm.
The input, x[n], is as much historical data as we want to put into the training. It can be all of the data, or some recent history. This input data is fed into a series of steps:
- Min. Find the minimum value of the historical data. This is x_min.
- Max. Find the maximum value of the historical data. This is x_max.
- Normalization. As in the detection algorithm, we normalize the time series, this time by the newly calculated x_min and x_max.
- Gaussian Smoothing. As in the detection algorithm, we smooth the noise in the signal by convolving with a Gaussian kernel.
- First difference. As in the detection algorithm, we take the discrete time equivalent of a derivative.
- Clipping. Depending on whether we’re looking for rising edges or falling edges, we zero out the negative or positive values of x_diff[n].
- Otsu’s Method. This figures out the threshold to be used in the detection algorithm. It finds the threshold by finding a binary split of x_diff[n] that maximizes the inter-class variance. This is a technical way of saying that it finds the best threshold that differentiates the peaks in the signal from the valleys. Read here for more information about Otsu’s method if you’re interested.
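The training steps above can be sketched the same way. This is an illustrative version under stated assumptions: the function names are hypothetical, Otsu’s method is written out by hand over a 256-bin histogram (scikit-image also provides one as skimage.filters.threshold_otsu), and for falling edges the difference is negated after clipping so that the learned threshold is a positive magnitude.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=4.0):
    """Discrete Gaussian kernel h[n], normalized so it sums to 1."""
    radius = max(1, int(truncate * sigma + 0.5))
    n = np.arange(-radius, radius + 1)
    h = np.exp(-0.5 * (n / sigma) ** 2)
    return h / h.sum()

def otsu_threshold(values, n_bins=256):
    """Pick the histogram split that maximizes the inter-class variance."""
    hist, bin_edges = np.histogram(values, bins=n_bins)
    hist = hist.astype(float)
    centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

    # Weights and means of the two classes for every candidate split.
    w0 = np.cumsum(hist)                      # class at or below the split
    w1 = np.cumsum(hist[::-1])[::-1]          # class above the split
    m0 = np.cumsum(hist * centers) / np.maximum(w0, 1e-12)
    m1 = np.cumsum((hist * centers)[::-1])[::-1] / np.maximum(w1, 1e-12)

    var_between = w0[:-1] * w1[1:] * (m0[:-1] - m1[1:]) ** 2
    return float(centers[np.argmax(var_between)])

def train(x_hist, sigma=2.0, rising=True):
    """Learn (x_min, x_max, T_edge) from historical data (hypothetical helper)."""
    x = np.asarray(x_hist, dtype=float)

    # Min and max of the historical data.
    x_min, x_max = float(x.min()), float(x.max())

    # Normalization, smoothing, and first difference, as in detection.
    x_norm = (x - x_min) / (x_max - x_min)
    h = gaussian_kernel(sigma)
    pad = len(h) // 2
    x_smooth = np.convolve(np.pad(x_norm, pad, mode="edge"), h, mode="valid")
    x_diff = np.diff(x_smooth, prepend=x_smooth[0])

    # Clipping: keep only the sign of change we are looking for.
    x_diff = np.clip(x_diff if rising else -x_diff, 0.0, None)

    # Otsu's method separates the edge peaks from the near-zero background.
    t_edge = otsu_threshold(x_diff)
    return x_min, x_max, t_edge
```

The three returned values are what the detection step needs as its learned parameters; sigma is passed in rather than learned.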
Note that sigma, which controls the amount of smoothing in the Gaussian filter, is not an automatically learned parameter; it is tuned by hand. We typically choose the smallest value that still smooths out the noise, so that the edges themselves are not blurred. It can be different for every sensor, but for the most part we’ve found that it can be the same across sensors of the same type no matter what building they are in.
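To see why a larger sigma blurs edges, here is a small sketch (the helper names and the sigma values are chosen only for illustration). It smooths an ideal unit step and measures the peak of its first difference, which is the edge strength defined earlier. For an ideal step that peak is roughly 1 / (sigma * sqrt(2 * pi)), so doubling sigma roughly halves it.

```python
import numpy as np

def gaussian_kernel(sigma, truncate=4.0):
    """Discrete Gaussian kernel h[n], normalized so it sums to 1."""
    radius = max(1, int(truncate * sigma + 0.5))
    n = np.arange(-radius, radius + 1)
    h = np.exp(-0.5 * (n / sigma) ** 2)
    return h / h.sum()

def smoothed_step_peak(sigma, n=400):
    """Peak first difference of a unit step after Gaussian smoothing."""
    step = np.concatenate((np.zeros(n // 2), np.ones(n // 2)))
    h = gaussian_kernel(sigma)
    pad = len(h) // 2
    smooth = np.convolve(np.pad(step, pad, mode="edge"), h, mode="valid")
    return float(np.diff(smooth).max())

# Edge strength of the same ideal step under increasing amounts of smoothing.
peaks = {sigma: smoothed_step_peak(sigma) for sigma in (1.0, 2.0, 4.0, 8.0)}
for sigma, peak in peaks.items():
    print(f"sigma={sigma:>4}: peak x_diff = {peak:.4f}")
```

One consequence, under these assumptions, is that a threshold learned with one sigma is unlikely to carry over to another, so it makes sense to rerun training whenever sigma changes.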
So that’s it: our edge finder. The first step toward our start-time estimator.
In the next post, we’ll take the output of this edge finder and synthesize it into an estimated HVAC start time for a building. Before I go, I want to emphasize that this algorithm didn’t just spring to life fully formed. Rather it was an iterative process driven by testing on real data. We’ll talk much more about this process in the next post.
Editor's Note: This blog post was originally published on Medium. It has been lightly edited for this space.