## Property Control Charts

Control charts were originally developed in the 1920s as a quality assurance tool for the control of manufactured products. Although there are many types of control charts, the most common in a quality assessment program is a property control chart in which we record single measurements.

A property control chart is a sequence of points, each representing a single determination of the property we are monitoring. To construct the control chart, we analyze a minimum of 7–15 samples while the system is under statistical control. The center line (CL) of the control chart is the average of these n samples. Boundary lines around the center line are determined by the standard deviation, S, of the n points; the upper and lower warning limits (UWL and LWL) and the upper and lower control limits (UCL and LCL) are given by the following equations

UWL = CL + 2S LWL = CL – 2S UCL = CL + 3S LCL = CL – 3S
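As a sketch of this construction, the following snippet computes the center line and boundary lines from a set of baseline determinations, assuming the conventional placement of warning limits at ±2S and control limits at ±3S; the 15 baseline values are hypothetical.

```python
import statistics

def control_limits(baseline):
    """Compute the center line, warning limits, and control limits
    from baseline measurements collected while the system is in
    statistical control."""
    cl = statistics.mean(baseline)
    s = statistics.stdev(baseline)  # sample standard deviation
    return {
        "CL": cl,
        "UWL": cl + 2 * s, "LWL": cl - 2 * s,
        "UCL": cl + 3 * s, "LCL": cl - 3 * s,
    }

# 15 hypothetical baseline determinations
baseline = [9.9, 10.1, 10.0, 9.8, 10.2, 10.0, 9.9, 10.1,
            10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0]
limits = control_limits(baseline)
```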

An example of a property control chart is illustrated here. The position of the data points relative to the boundary lines determines whether the analysis is in statistical control, based on a set of rules:

1. An analysis is no longer under statistical control if any single point exceeds either the UCL or the LCL.
2. An analysis is no longer under statistical control if two out of three consecutive points are between the UWL and the UCL or between the LWL and the LCL.
3. An analysis is no longer under statistical control if seven consecutive results fall completely above or completely below the center line.
4. An analysis is no longer under statistical control if six consecutive results increase or decrease in value.
5. An analysis is no longer under statistical control if 14 consecutive results alternate up and down in value.
6. An analysis is no longer under statistical control if there is any obvious nonrandom pattern to the results.
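Several of these rules are straightforward to automate. The sketch below checks rules 1, 3, and 4 for a sequence of results; the function signature is an assumption, and rule 4 is interpreted as five successive increases or decreases across six consecutive results.

```python
def out_of_control(results, cl, ucl, lcl):
    """Check rules 1, 3, and 4 above; return True if the analysis
    appears to be out of statistical control."""
    # rule 1: any single point beyond either control limit
    if any(r > ucl or r < lcl for r in results):
        return True
    # rule 3: seven consecutive results entirely above or below CL
    for i in range(len(results) - 6):
        w = results[i:i + 7]
        if all(r > cl for r in w) or all(r < cl for r in w):
            return True
    # rule 4: six consecutive results that steadily rise or fall
    for i in range(len(results) - 5):
        w = results[i:i + 6]
        if all(x < y for x, y in zip(w, w[1:])) or \
           all(x > y for x, y in zip(w, w[1:])):
            return True
    return False
```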

The first two rules are based on the assumption that results are normally distributed; if true, then only 0.26% of results will fall outside of the UCL or the LCL, and only 5% of results will fall outside of the UWL or the LWL. The remaining four rules are based on the expectation that the distribution of results is random and that the presence of a pattern in the data indicates that the analysis is no longer under statistical control.

The illustration below provides three examples of control charts in which the results show an analysis that has fallen out of statistical control. The highlighted areas show violations of: (a) rule 3; (b) rule 4; and (c) rule 5.

## Prescriptive Approach to Quality Assurance

Illustrated here is an example of a prescriptive approach to quality assurance for laboratories monitoring waters and wastewaters, adapted from Environmental Monitoring and Support Laboratory, U. S. Environmental Protection Agency, “Handbook for Analytical Quality Control in Water and Wastewater Laboratories,” March 1979. Two samples, A and B, are collected at the sample site. Sample A is split into two equal-volume samples, A1 and A2. Sample B is also split into two equal-volume samples, one of which, BSF, is spiked in the field with a known amount of analyte. A field blank, DF, also is spiked with the same amount of analyte. All five samples (A1, A2, B, BSF, and DF) are preserved if necessary and transported to the laboratory for analysis.

After returning to the lab, the first sample that is analyzed is the field blank. If its spike recovery is unacceptable—an indication of a systematic error in the field or in the lab—then a laboratory method blank, DL, is prepared and analyzed. If the spike recovery for the method blank is unsatisfactory, then the systematic error originated in the laboratory; this is something we can find and correct before proceeding with the analysis. An acceptable spike recovery for the method blank, however, indicates that the systematic error occurred in the field or during transport to the laboratory, casting uncertainty on the quality of the samples. The only recourse is to discard the samples and return to the field to collect new samples.

If the field blank is satisfactory, then sample B is analyzed. If the result for B is above the method’s detection limit and within the range of 0.1 to 10 times the amount of analyte spiked into BSF, then a spike recovery for BSF is determined. An unacceptable spike recovery for BSF indicates the presence of a systematic error involving the sample. To determine the source of the systematic error, a laboratory spike, BSL, is prepared using sample B and analyzed. If the spike recovery for BSL is acceptable, then the systematic error requires a long time to have a noticeable effect on the spike recovery. One possible explanation is that the analyte has not been properly preserved or that it has been held beyond the acceptable holding time. An unacceptable spike recovery for BSL suggests an immediate systematic error, such as one due to the influence of the sample’s matrix. In either case the systematic errors are fatal and must be corrected before the sample is reanalyzed.

If the spike recovery for BSF is acceptable, or if the result for sample B is below the method’s detection limit, or outside the range of 0.1 to 10 times the amount of analyte spiked in BSF, then the duplicate samples A1 and A2 are analyzed. The results for A1 and A2 are discarded if the difference between their values is excessive. If the difference between the results for A1 and A2 is within the accepted limits, then the results for samples A1 and B are compared. Because samples collected from the same sampling site at the same time should be identical in composition, the results are discarded if the difference between their values is unsatisfactory, and accepted if the difference is satisfactory.
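Because the protocol is a decision tree, its logic can be sketched as a function. The boolean arguments stand in for the laboratory's actual acceptance tests (spike recoveries, duplicate agreement), and the verdict strings are illustrative only, not part of the EPA protocol.

```python
def evaluate_sample(field_blank_ok, method_blank_ok, b_in_spike_range,
                    bsf_recovery_ok, bsl_recovery_ok,
                    duplicates_agree, a1_b_agree):
    """Walk the prescriptive decision tree and return a verdict."""
    if not field_blank_ok:
        # unacceptable field blank: a method blank, DL, locates the error
        if method_blank_ok:
            return "discard samples; collect new samples in the field"
        return "correct the laboratory error, then repeat the analysis"
    if b_in_spike_range and not bsf_recovery_ok:
        # unacceptable BSF recovery: a laboratory spike, BSL, locates it
        if bsl_recovery_ok:
            return "time-dependent error; check preservation and holding time"
        return "immediate error; check for matrix effects"
    if not duplicates_agree:
        return "discard results for A1 and A2"
    if not a1_b_agree:
        return "discard results"
    return "accept results"
```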

In total, this protocol requires four to five evaluations of quality assessment data before the result for a single sample is accepted, a process we must repeat for each analyte and for each sample. Clearly this is a lengthy and time-consuming process.

## Collaborative Testing and Two-Sample Plots

When an analyst performs a single analysis on a sample, the difference between the experimentally determined value and the expected value is influenced by three sources of error: random errors, systematic errors inherent to the method, and systematic errors unique to the analyst. If the analyst performs enough replicate analyses, then we can plot a distribution of results, as shown here in (a). The width of this distribution is described by a standard deviation, which provides an estimate of the random errors affecting the analysis. The position of the distribution’s mean relative to the sample’s true value is determined both by systematic errors inherent to the method and by systematic errors unique to the analyst. For a single analyst there is no way to separate the total systematic error into its component parts. The goal of a collaborative test is to determine the magnitude of all three sources of error. If several analysts each analyze the same sample one time, the variation in their collective results, as shown above in (b), includes contributions from random errors and from those systematic errors (biases) unique to each analyst. Without additional information, we cannot separate the standard deviation for this pooled data into the precision of the analysis and the systematic errors introduced by the analysts. We can, however, use the position of the distribution to detect the presence of a systematic error in the method.

The design of a collaborative test must provide the additional information we need to separate random errors from the systematic errors introduced by the analysts. One simple approach—accepted by the Association of Official Analytical Chemists—is to have each analyst analyze two samples that are similar in both their matrix and in their concentration of analyte. To analyze their results we represent each analyst as a single point on a two-sample chart, using the result for one sample as the x-coordinate and the result for the other sample as the y-coordinate. As illustrated above, a two-sample chart divides the results into four quadrants, which we identify as (+, +), (–, +), (–, –) and (+, –), where a plus sign indicates that the analyst’s result exceeds the mean for all analysts and a minus sign indicates that the analyst’s result is smaller than the mean for all analysts. The quadrant (+, –), for example, contains results for analysts that exceeded the mean for sample X and that undershot the mean for sample Y. If the variation in results is dominated by random errors, then we expect the points to be distributed randomly in all four quadrants, with an equal number of points in each quadrant. Furthermore, as shown in (a), the points will cluster in a circular pattern whose center is the mean values for the two samples. When systematic errors are significantly larger than random errors, then the points occur primarily in the (+, +) and the (–, –) quadrants, forming an elliptical pattern around a line that bisects these quadrants at a 45° angle, as seen in (b).

A visual inspection of a two-sample chart is an effective method for qualitatively evaluating the results of analysts and the capabilities of a proposed standard method. If random errors are insignificant, then the points fall on the 45° line. As illustrated here, the length of a perpendicular line from any point to the 45° line, shown in red, is proportional to the effect of random error on that analyst’s results. The distance from the intersection of the axes—corresponding to the mean values for samples X and Y—to the perpendicular projection of a point on the 45° line is shown in green and is proportional to the analyst’s systematic error. An ideal standard method has small random errors and small systematic errors due to the analysts, and has a compact clustering of points that is more circular than elliptical.
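These quantities follow from simple geometry. In the sketch below, for an analyst's results (x, y) and the grand means for the two samples, the perpendicular distance to the 45° line is |Δx − Δy|/√2 and the distance along the line is |Δx + Δy|/√2; the function and its return values are illustrative.

```python
import math

def two_sample_stats(x, y, x_mean, y_mean):
    """For one analyst's results on samples X and Y, return the
    quadrant label, a quantity proportional to the analyst's random
    error (perpendicular distance to the 45-degree line), and one
    proportional to the systematic error (distance along the line)."""
    dx, dy = x - x_mean, y - y_mean
    quadrant = ("+" if dx > 0 else "-", "+" if dy > 0 else "-")
    random_err = abs(dx - dy) / math.sqrt(2)
    systematic_err = abs(dx + dy) / math.sqrt(2)
    return quadrant, random_err, systematic_err
```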

## Modeling Response Surfaces Using Factorial Designs

In many cases the underlying theoretical relationship between the response and its factors is unknown. We can still develop a model of the response surface if we make some reasonable assumptions about the underlying relationship between the factors and the response. For example, if we believe that factors A and B are independent and that each has only a first-order effect on the response, then the following equation is a suitable model.

R = β0 + βaA + βbB

where R is the response, A and B are the factor levels, and β0, βa, and βb are adjustable parameters whose values are determined by a linear regression analysis. We call this equation an empirical model of the response surface because it has no basis in a theoretical understanding of the relationship between the response and its factors. Although an empirical model may provide an excellent description of the response surface over a limited range of factor levels, we cannot reliably extend it to unexplored parts of the response surface.
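As a minimal sketch of fitting this model, the following uses a coded two-level design; because the ±1 coded levels are orthogonal, each β reduces to a simple average and no regression library is needed. The responses are generated from a hypothetical noiseless model so that the recovered parameters are easy to check.

```python
# coded levels (the plus and minus signs in the tables) for a
# two-factor, two-level design
A = [-1, +1, -1, +1]
B = [-1, -1, +1, +1]
# hypothetical noiseless responses from R = 5 + 2A + 3B
R = [5 + 2 * a + 3 * b for a, b in zip(A, B)]

n = len(R)
# the +/-1 coded columns are orthogonal, so each parameter is a
# simple average rather than a full regression calculation
b0 = sum(R) / n                                # intercept
ba = sum(a * r for a, r in zip(A, R)) / n      # first-order effect of A
bb = sum(b * r for b, r in zip(B, R)) / n      # first-order effect of B
```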

As shown here, we build an empirical model by measuring the response for at least two levels for each factor—indicated by the plus and minus signs in the tables—and then completing a simple regression analysis. This is known as a 2^k factorial design because it requires 2^k experiments, where k is the number of factors. A 2^k factorial design can model only first-order effects, including first-order interactions between factors. A 2^2 factorial design, for example, includes each factor’s first-order effect (βa and βb), a first-order interaction between the factors (βab), and an intercept (β0); with four experiments we have just enough information to calculate the four β values.

R = β0 + βaA + βbB + βabAB
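Extending the same sketch to include the interaction term: the four runs of a 2^2 design provide exactly enough information to solve for all four β values. The responses used here are hypothetical.

```python
# the four experiments of a 2^2 factorial design (coded levels)
runs = [(-1, -1), (+1, -1), (-1, +1), (+1, +1)]
# hypothetical responses, one per run
R = [3.0, 7.0, 5.0, 13.0]

n = len(runs)
# orthogonal coded columns again reduce each parameter to an average
b0  = sum(R) / n
ba  = sum(a * r for (a, _), r in zip(runs, R)) / n
bb  = sum(b * r for (_, b), r in zip(runs, R)) / n
bab = sum(a * b * r for (a, b), r in zip(runs, R)) / n

def predict(a, b):
    """Evaluate the fitted model R = b0 + ba*A + bb*B + bab*A*B."""
    return b0 + ba * a + bb * b + bab * a * b
```

With four parameters and four experiments the model reproduces the four responses exactly; detecting lack of fit requires additional experiments.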

A 2^k factorial design cannot model higher-order effects because there is insufficient information. Here is a simple example that illustrates the problem. Suppose we need to model a system in which the response is a function of a single factor. As illustrated here, a 2^1 factorial design has but two responses, which means we can fit only a straight line to the data. To see evidence of curvature we must measure the response for at least three levels for each factor, as shown in (b). If we cannot fit a first-order empirical model to our data, we may be able to model it using a full second-order polynomial equation, such as that shown here for two factors.

R = β0 + βaA + βbB + βabAB + βaaA^2 + βbbB^2

We can accomplish this using the 3^k factorial design shown here. One limitation to a 3^k factorial design is the number of trials we need to run. As illustrated above, a 3^2 factorial design requires 9 trials. This number increases to 27 for three factors and to 81 for four factors. A more efficient experimental design for systems that contain more than two factors is a central composite design, two examples of which are shown here. The central composite design consists of a 2^k factorial design, which provides data for estimating each factor’s first-order effect and the interactions between the factors, and a star design consisting of 2k + 1 points, which provides data for estimating second-order effects. Although a central composite design for two factors requires the same number of trials, 9, as a 3^2 factorial design, it requires only 15 trials for three factors and 25 trials for four factors.
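The trial counts quoted above follow directly from the design definitions: 3^k for the full three-level factorial design, and 2^k factorial points plus a star of 2k + 1 points for the central composite design. A short check:

```python
def factorial_3k(k):
    """Number of trials in a 3^k factorial design."""
    return 3 ** k

def central_composite(k):
    """Trials in a central composite design: a 2^k factorial design
    plus a star design of 2k + 1 points."""
    return 2 ** k + (2 * k + 1)
```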

## Simplex Optimization

One strategy for improving the efficiency of a searching algorithm is to change more than one factor at a time. A convenient way to accomplish this when there are two factors is to begin with three sets of initial factor levels, which form the vertices of a triangle. After measuring the response for each set of factor levels, we identify the combination giving the worst response and replace it with a new set of factor levels obtained by reflecting the worst vertex through the midpoint of the remaining two vertices, as illustrated here. This process continues until we reach the global optimum or until no further optimization is possible. The set of factor levels is called a simplex; in general, for k factors a simplex is a geometric figure with k + 1 vertices.

An example showing the progress of a simplex optimization is shown here. The green dot marks the optimum response of (3, 7). Optimization ends when the simplexes begin to circle around a single vertex.
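A single move of the simplex algorithm is easy to sketch: find the worst vertex and reflect it through the midpoint of the remaining vertices. The initial simplex and its responses below are hypothetical.

```python
def reflect_worst(simplex, responses):
    """One move of a two-factor simplex optimization: reflect the
    worst vertex through the midpoint of the other two vertices.
    Returns the index of the worst vertex and its replacement."""
    worst = responses.index(min(responses))
    others = [v for i, v in enumerate(simplex) if i != worst]
    # midpoint of the remaining vertices
    mx = sum(v[0] for v in others) / len(others)
    my = sum(v[1] for v in others) / len(others)
    wx, wy = simplex[worst]
    # reflection: new vertex = midpoint + (midpoint - worst)
    return worst, (2 * mx - wx, 2 * my - wy)

# hypothetical initial simplex and measured responses
simplex = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.0)]
responses = [2.0, 3.5, 3.0]   # vertex (0, 0) gives the worst response
worst, new_vertex = reflect_worst(simplex, responses)
```

Repeating this move, replacing the worst vertex and re-measuring, drives the simplex across the response surface toward the optimum.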

## One-Factor-at-a-Time Searching Algorithm

A simple algorithm for optimizing the response for a system is to adjust each factor independently. Consider a response that depends on two factors. We begin by optimizing the response for one factor, increasing or decreasing its value while holding constant the value of the second factor. We then vary the value of the second factor, holding the value of the first factor at its previously determined optimum value. We can stop this process, which we call a one-factor-at-a-time optimization, after a single cycle or run additional cycles until we reach the optimum response or until the response exceeds an acceptable threshold value.

A one-factor-at-a-time optimization is an effective, although not necessarily an efficient experimental design when the factors are independent. Two factors are independent when changing the level of one factor does not influence the effect of changing the other factor’s level, as illustrated here where the parallel lines show that the level of factor B does not influence factor A’s effect on the response. Mathematically, two factors are independent if they do not appear in the same term in the equation describing the response surface. The response surface below, for example, is for the equation

R = 2.0 + 1.2A + 0.48B – 0.03A^2 – 0.03B^2

For independent factors, a one-factor-at-a-time optimization quickly and efficiently finds the global optimum, as illustrated by the orange lines in (b).

Unfortunately, factors usually do not behave independently. Consider, for example, the figure below, which shows a dependent relationship between the factors. Dependent factors are said to interact, and the response surface’s equation includes an interaction term that contains both factors A and B. For example, the final term in the following equation accounts for the interaction between factors A and B.

R = 5.5 + 1.5A + 0.6B – 0.15A^2 – 0.0245B^2 – 0.0857AB

This equation yields the response surface shown here. The progress of a one-factor-at-a-time optimization for dependent factors is illustrated by the orange line in (b) above. Although the optimization is effective, in that it finds the global optimum, it is less efficient than that for independent factors. In this case it takes four cycles to reach the optimum response of (3, 7) if we begin at (0, 0).
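As an illustrative sketch, the following runs a one-factor-at-a-time search on this dependent-factor surface, scanning each factor over levels from 0 to 10 in steps of 0.1 (an assumed step size) while holding the other factor fixed. Because the factors interact, the search needs several cycles to work its way toward the optimum near (3, 7), and on a discrete grid it settles close to, rather than exactly on, that point.

```python
def response(a, b):
    """The dependent-factor response surface from the equation above."""
    return 5.5 + 1.5*a + 0.6*b - 0.15*a**2 - 0.0245*b**2 - 0.0857*a*b

def best_level(f):
    """Return the level in 0.0, 0.1, ..., 10.0 that maximizes f."""
    levels = [i / 10 for i in range(101)]
    return max(levels, key=f)

# one-factor-at-a-time search starting at (0, 0)
a, b = 0.0, 0.0
for _ in range(10):   # enough cycles for the search to settle
    a = best_level(lambda x: response(x, b))  # vary A, hold B fixed
    b = best_level(lambda x: response(a, x))  # vary B, hold A fixed
```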

## Finding the Optimum Response Using a Searching Algorithm

If we know the equation for a response surface, then it is relatively easy to find the optimum response. Unfortunately, we rarely know any useful details about the response surface. Instead, we must determine the response surface’s shape and locate the optimum response by running appropriate experiments. One approach for finding the optimum is a searching algorithm.

Shown below is a portion of the South Dakota Badlands, a landscape that includes many narrow ridges formed through erosion. Suppose you wish to climb to the ridge’s highest point. Because the shortest path to the summit is not obvious, you might adopt the following simple rule—look around you and take a step in the direction that has the greatest change in elevation. The route you follow is the result of a systematic search using a searching algorithm. Of course there are as many possible routes as there are starting points, three examples of which are shown by the white paths. Note that some routes do not reach the highest point—what we call the global optimum. Instead, many routes reach a local optimum from which further movement is impossible. A searching algorithm is characterized by its effectiveness and its efficiency. To be effective, a searching algorithm must find the response surface’s global optimum, or at least reach a point near the global optimum. A searching algorithm may fail to find the global optimum for several reasons, including a poorly designed algorithm, noise affecting the response, or the presence of local optima.

A poorly designed algorithm may prematurely end the search before it reaches the response surface’s global optimum. As shown in the illustration below, an algorithm for climbing a ridge that slopes to the northeast is likely to fail if it allows you to take steps only to the north, south, east, or west.

All measurements contain uncertainty, or noise, that affects our ability to characterize the underlying signal. When the noise is greater than the local change in the signal, then a searching algorithm is likely to end before it reaches the global optimum. The photo below provides a different view of the photo at the top of this post, showing us that the relatively flat terrain leading up to the ridge is heavily weathered and uneven. Because the variation in local height exceeds the slope, our searching algorithm stops the first time we step up onto a less weathered surface.

Finally, a response surface may contain several local optima, only one of which is the global optimum. If we begin the search near a local optimum, our searching algorithm may not be capable of reaching the global optimum. The ridge in the photo above, for example, has many peaks. Only those searches beginning at the far right will reach the highest point on the ridge. Ideally, a searching algorithm should reach the global optimum regardless of where it starts.

## Response Surfaces

One of the most effective ways to think about an optimization is to visualize how a system’s response changes when we increase or decrease the levels of one or more of its factors. We call a plot of the system’s response as a function of factor levels a response surface.

The simplest response surface has a single factor, which we represent graphically in two dimensions by placing the response on the y-axis and the factor’s levels on the x-axis. The calibration curve below is an example of a one-factor response surface, which is represented by the equation

A = 0.008 + 0.00896C_A

where A is the absorbance and C_A is the analyte’s concentration in ppm.
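Because a one-factor response surface is just a line, we can evaluate it in either direction: predicting absorbance from concentration, or inverting the calibration equation to recover a concentration from a measured absorbance. A minimal sketch:

```python
def absorbance(c_ppm):
    """The one-factor response surface: absorbance as a function
    of the analyte's concentration in ppm."""
    return 0.008 + 0.00896 * c_ppm

def concentration(measured_a):
    """Invert the calibration line to recover the concentration."""
    return (measured_a - 0.008) / 0.00896
```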

For a two-factor system, the response surface is a flat or curved surface in three dimensions in which we place the response on the z-axis and the factor levels on the x-axis and the y-axis. In the illustration below, (a) shows a pseudo-three-dimensional wireframe plot for a system obeying the equation

R = 3.0 – 0.3A + 0.020AB

where R is the response, and A and B are the factors. We can also represent a two-factor response surface using the two-dimensional level plot in (b), which uses a color gradient to show the response on a two-dimensional grid, or using the two-dimensional contour plot in (c), which uses contour lines to display the response surface.
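A level or contour plot is built by tabulating the response over a grid of factor levels. The sketch below evaluates the equation above on an 11 × 11 grid of integer levels (an arbitrary choice), producing the kind of table a plotting routine would color or contour.

```python
def response(a, b):
    """The two-factor response surface from the equation above."""
    return 3.0 - 0.3 * a + 0.020 * a * b

# tabulate the response on an 11 x 11 grid, indexed as grid[b][a]
grid = [[response(a, b) for a in range(11)] for b in range(11)]
```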

## Incorporating a Separation Into a Flow Injection Analysis

Dialysis and gaseous diffusion are accomplished by placing a semipermeable membrane between the carrier stream containing the sample and an acceptor stream. Shown here is a flow injection manifold incorporating a semipermeable membrane. The smaller green solutes can pass through the semipermeable membrane and enter the acceptor stream, but the larger blue solutes cannot. Although the separation is not complete—note that some of the green solute remains in the sample stream and exits as waste—it is reproducible if we do not change the experimental conditions.

Liquid–liquid extractions are accomplished by merging together two immiscible fluids, each carried in a separate channel. The result is a segmented flow through the separation module, consisting of alternating portions of the two phases. At the outlet of the separation module the two fluids are separated by taking advantage of the difference in their densities. The illustration below shows a typical configuration for a separation module in which the sample is injected into an aqueous phase and extracted into a less dense organic phase that passes through the detector.

The inset shows the equilibrium reaction. As the sample moves through the equilibration zone, the analyte, A, is extracted from the aqueous phase into the organic phase.

## Manifolds for Flow Injection Analysis

The heart of a flow injection analyzer is the transport system that brings together the carrier stream, the sample, and any reagents that react with the sample. Each reagent stream is considered a separate channel, and all channels must merge before the carrier stream reaches the detector. The complete transport system is called a manifold.

The simplest manifold includes only a single channel, the basic outline of which is illustrated here. This type of manifold is commonly used for direct analyses that do not require a chemical reaction. In this case the carrier stream serves only as a means for rapidly and reproducibly transporting the sample to the detector. For example, this manifold design has been used for sample introduction in atomic absorption spectroscopy, achieving sampling rates as high as 700 samples/h. A single-channel manifold also is used for determining a sample’s pH or determining the concentration of metal ions using an ion selective electrode. We can also use this single-channel manifold for systems in which we monitor the product of a chemical reaction between the sample and a reactant. In this case the carrier stream both transports the sample to the detector and reacts with the sample.

Most flow injection analyses that include a chemical reaction use a manifold with two or more channels. Including additional channels provides more control over the mixing of reagents and the interaction between the reagents and the sample. Two possible configurations for a dual-channel system are illustrated here: (a) injection of the sample after the mixing of the reagent streams; and (b) injection of the sample into one reagent stream prior to mixing with the second reagent stream. The choice of manifold depends on the chemistry of the reactions between the analyte and the reagents.

More complex manifolds involving three or more channels are common, but the possible combination of designs is too numerous to discuss. One example of a four-channel manifold is shown here. 