Data Smoothing Functions in Mathcad
by Mathcad Staff
|
A data smoother takes a set of data and returns a new set that contains less noise than the original set but retains the basic shape and important properties of the original data. Mathcad provides three data smoothing functions: medsmooth, ksmooth and supsmooth. |
|
Here's an example provided by Stuart Bruff of what the smoothers do. The matrix contains five sets of stress-strain data from one-dimensional compression tests on different specimens. |
|
Each column of the matrix is plotted below. The data are very noisy, but the trend is apparent. |
|

Now see what each of Mathcad's smoothing functions does to smooth out the curves in the plots: medsmooth, ksmooth, and supsmooth.
|
|
To see how each of these smoothers works, use the following set of data. |
|
The rnd function adds random noise to every other point in the data set. You get a different set of data each time this worksheet is calculated. |
|
Median Smoothing: The medsmooth function |
|
The medsmooth function performs moving-window smoothing with a symmetric window. Rather than using a mean or a polynomial fit, it uses the median of the window as the smoothed value. Median smoothing is particularly useful when the data contain sudden bursts of noise or isolated corrupted values. |
|
The medsmooth function takes two arguments: |
|
y - a real vector of data |
|
n - the window size (number of points considered when Mathcad modifies each point) |
|
Medsmooth returns a modified vector of the same size as the original data. |
|
For almost all points in a data set, Medsmooth takes a window of data around a given data point and replaces that point with the median of the values in the window. The window size, n, must be an odd number so that there are (n-1)/2 points on either side of the point being replaced. |
|
Apply a window size of 5 to the Q data to see what happens: |
|
Notice that elements 1 and 2 of the smoothed data remain the same as the original data. This is because there are too few elements on either side of these points for Mathcad to evaluate the entire window. For a window size n, the first (n-1)/2 data points are unchanged. |
|
Element 3 has changed. Where does this new value come from? |
|
Since the window size is 5, Mathcad replaces element 3 by the median of elements 1, 2, 3, 4, and 5. |
|
It then finds a new value for Q4 using this same technique: |
|
This process continues until too few points remain at the end of the data to fill the window. As with the beginning of the data, the last (n-1)/2 points are also unchanged. |
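The procedure described above can be sketched in Python. This is a minimal illustration rather than Mathcad's implementation: the function below only mirrors the behaviour described in the text, and the sample values in `q` are hypothetical, not the worksheet's Q data. Note that Python lists are indexed from 0, while the elements in the text are counted from 1.

```python
from statistics import median

def medsmooth(y, n):
    """Moving-median smoother with a symmetric window of odd size n.

    The first and last (n-1)/2 points are copied through unchanged,
    as described in the text.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("window size n must be a positive odd number")
    half = (n - 1) // 2
    out = list(y)  # start from a copy so the end points stay as they are
    for i in range(half, len(y) - half):
        # Medians are always taken from the original data, not from out.
        out[i] = median(y[i - half:i + half + 1])
    return out

q = [1.0, 9.0, 2.0, 3.0, 8.0, 4.0, 5.0]  # hypothetical sample data
smoothed = medsmooth(q, 5)
print(smoothed)
```

With a window size of 5, the first two and last two values pass through unchanged and only the three interior values are replaced by window medians.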
|
The new data set is much smoother than the original. However, there still are several bumpy areas. |
|
A larger window may help, but then a larger number of values at the endpoints are not smoothed. |
|
Here's a Mathcad program that uses a smaller window size when necessary.
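The same idea can also be sketched in Python. This is a hedged illustration of one way to shrink the window near the ends, not a transcription of the worksheet's program. The function name `medsmooth_shrinking` and the sample data are hypothetical.

```python
from statistics import median

def medsmooth_shrinking(y, n):
    """Moving-median smoother that shrinks its window near the ends.

    At each position the largest symmetric window that fits
    (up to size n) is used, so every point is smoothed.
    """
    if n < 1 or n % 2 == 0:
        raise ValueError("window size n must be a positive odd number")
    half = (n - 1) // 2
    out = []
    for i in range(len(y)):
        # Half-width is limited by the distance to the nearer end of the data.
        h = min(half, i, len(y) - 1 - i)
        out.append(median(y[i - h:i + h + 1]))
    return out

q = [1.0, 9.0, 2.0, 3.0, 8.0, 4.0, 5.0]  # hypothetical sample data
print(medsmooth_shrinking(q, 5))
```

The very first and last points still see a window of size 1 and so keep their original values, but the second and second-to-last points are now smoothed with a window of size 3.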

|
|
Kernel Smoothing: the ksmooth function |
|
Like medsmooth, ksmooth replaces each point in the data set with a modified version of itself based on the values of the surrounding points. |
|
The ksmooth function takes 3 arguments: |
|
x - vector of real numbers, must be in ascending order |
|
y - vector of real numbers |
|
bandwidth - size of the smoothing window. Typically a few times the spacing between the x data points |
|
There are several factors involved with kernel smoothing: |
|
Weighted averaging - the current point has the most impact on the final value. The farther a point is from the central point, the smaller its contribution. The weights can be considered the fraction each point contributes to the total, so the weights should sum to 1. |
|
Consider a small sample set of data: |
|
Look at the point at x = 3, and use the following relative weights: |
|
The point at x = 3 is weighted three times more heavily than the points at x = 1 and x = 5. |
|
Since the weights are actually percentages of the total contribution, divide by the sum of the weights. |
|
The weighted average of element 3 is: |
|
See how this differs from the median method used in medsmooth: |
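As a rough numerical illustration of the weighted average and how it compares with the median, consider the Python sketch below. The data values and the relative weights `[1, 2, 3, 2, 1]` are assumptions chosen so that the centre point is weighted three times more heavily than the end points. They are not the worksheet's values.

```python
from statistics import median

x = [1, 2, 3, 4, 5]
y = [2.0, 7.0, 3.0, 4.0, 6.0]   # hypothetical sample data
weights = [1, 2, 3, 2, 1]       # hypothetical relative weights, centred on x = 3

# Divide by the sum of the weights so they become fractions that add to 1.
total = sum(weights)
normalized = [w / total for w in weights]

# Weighted average for the point at x = 3.
weighted_avg = sum(w * v for w, v in zip(normalized, y))

# Median of the same five values, as medsmooth would use with n = 5.
median_value = median(y)

print(weighted_avg, median_value)
```

Here the weighted average is 39/9, about 4.33, while the median is 4.0. The weighted average is pulled by every value in the window, whereas the median ignores how large the outlying values are.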
|
Kernels - The distance to each point affects how much of an impact an element has on the final result. In the example above, elements 1 and 5 have the same weight because they are each 2 elements away from element 3. However, if element 1 is farther away in x from the point being modified (element 3), it should have a smaller effect on the result. |
|
Manually changing the vector of weights is not efficient because when you modify other elements, they may not fall into this same pattern. |
|
A kernel function can be used to provide a dynamic weighting mechanism that adjusts itself automatically for each point. |
|
Kernel functions must satisfy the following criteria: |
|
It doesn't matter if you are on the left side (+d) or right side (-d) of a point, only the distance matters. |
|
ksmooth uses a Gaussian (Normal) kernel. |
|
Bandwidth controls how many points are considered. Unlike the window size used in medsmooth, it is not a count of points. Rather, it is the actual width of the window on the x-axis. This value is typically a few times the spacing between values on the x-axis so that several points are considered. |
|
The kernel function used in ksmooth can be manually defined as follows: |
|
xj - the point being replaced |
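A Gaussian kernel smoother of this kind can be sketched in Python as follows. This is an illustration under stated assumptions, not a reproduction of the worksheet's definition: the kernel is taken to be a normal density evaluated at the distance divided by the bandwidth, and the scale constant `sigma = 0.37` is an assumption that is not stated in the text above. The function names and sample data are hypothetical.

```python
import math

def gaussian_kernel(t, sigma=0.37):
    """Normal density with standard deviation sigma (sigma is an assumed constant)."""
    return math.exp(-t * t / (2.0 * sigma * sigma)) / (math.sqrt(2.0 * math.pi) * sigma)

def kernel_smooth(x, y, bandwidth, sigma=0.37):
    """Replace each y value with a kernel-weighted average of all y values."""
    out = []
    for xj in x:  # xj is the point being replaced
        # Weight of every point depends only on its scaled distance from xj.
        w = [gaussian_kernel((xj - xi) / bandwidth, sigma) for xi in x]
        # Dividing by the sum of the weights makes them add to 1.
        out.append(sum(wi * yi for wi, yi in zip(w, y)) / sum(w))
    return out

x = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical, ascending
y = [1.0, 3.0, 2.0, 5.0, 4.0]   # hypothetical
print(kernel_smooth(x, y, bandwidth=2.0))
```

Because the weights are computed from the x distances themselves, unevenly spaced data are handled automatically. A larger bandwidth spreads the weights over more neighbours and gives a smoother result.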
|
Below is a plot of the smoothed Q data for several different bandwidths. |
|
Super Smoothing: the supsmooth function |
|
The supsmooth function takes two arguments: |
|
x - vector of real numbers, must be in strictly increasing order (no repeating x values) |
|
y - vector of real numbers |
|
The supsmooth algorithm uses a local smoother that does a localized linear fit. |
|
Consider the first five points in Q and t. The line function can be used to find a best-fit line through these points. Essentially, just as medsmooth replaced the point with the median value of the 5 points, supsmooth replaces the point with the value of the best-fit line evaluated at the same x value. |
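The local linear fit on a five-point window can be sketched in Python as below. This is a simplified illustration with a fixed window, not the supsmooth algorithm itself. The helper `fit_line` is a hypothetical stand-in for a least-squares line fit, and the values in `t` and `q` are made up for the example.

```python
def fit_line(x, y):
    """Ordinary least-squares line through the points; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

t = [0.0, 1.0, 2.0, 3.0, 4.0]   # hypothetical x values (first five points)
q = [1.0, 3.5, 2.0, 4.5, 4.0]   # hypothetical y values (first five points)

intercept, slope = fit_line(t, q)

# Replace the middle point with the fitted line evaluated at the same x value.
smoothed_mid = intercept + slope * t[2]
print(intercept, slope, smoothed_mid)
```

For these values the fitted line has intercept 1.6 and slope 0.7, so the middle point at t = 2 moves from 2.0 to 3.0.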
|
Unlike the other smoothers, supsmooth does not take an argument such as window size or bandwidth. The supsmooth function adaptively adjusts the size of the window based on the behavior of the data. |
|