It is now commonplace for engineers to build mathematical models of the systems they are designing, building, or testing. And, it is nearly universally accepted that phenomenological models of physical systems must be validated prior to use for prediction in consequential scenarios. Yet, there are certain situations in which testing only or no testing and no modeling may be economically viable alternatives to modeling and its associated testing. This paper develops an economic framework within which benefit–cost can be evaluated for modeling and model validation relative to other options. The development is presented in terms of a challenge problem. We provide a numerical example that quantifies when modeling, calibration, and validation yield higher benefit–cost than a testing only or no modeling and no testing option.

# Economic Analysis of Model Validation for a Challenge Problem

**Paul J. Paez**

**Thomas L. Paez**

**Timothy K. Hasselman**

Manuscript received February 11, 2015; final manuscript received December 16, 2015; published online February 19, 2016. Guest Editor: Kenneth Hu.

*J. Verif. Valid. Uncert.* 1(1), 011007 (Feb 19, 2016) (13 pages). Paper No: VVUQ-15-1013; doi: 10.1115/1.4032370

Businesses provide products and services and maintain their operations through the development of profit equilibria. A business scenario has been specified in another paper in this issue, entitled “The 2014 Sandia Verification and Validation (V&V) Challenge Problem: A Case Study in Simulation, Analysis, and Decision Support,” [1]; the scenario will be referred to as the “challenge problem.” The challenge problem describes a business, the MysteryLiquid, Co., a hypothetical operation, which provides a service, namely, the storage of nonhazardous chemicals.

During a periodic, random inspection, one tank exceeded the limits of a safety test, and the challenge problem poses two questions: (1) Given data from experiments on the failed tank and neighboring tanks describing material and geometric characteristics, can one estimate the probability of failure of a tank under “nominal” operating conditions, along with the uncertainty on the estimate? (2) Given the same information, can one estimate the limits of safe operation of a tank at which the probability of failure is $10^{-3}$ or less? We address another question that should have been asked before the project started: What course of action would be most economically advantageous to the company, given the importance of the company's reputation and its profit objectives? Answering the latter question might naturally precede the performance of tests and the construction of models in a profit-motivated company. We seek to establish the optimum economics-driven course of action for the MysteryLiquid, Co., to take. We pose the problem in terms of costs and benefits and perform an economic benefit–cost analysis for the MysteryLiquid, Co. The value chain underlying modeling and V&V is introduced in Ref. [2].

Many options might be considered for economic benefit–cost analysis. For purposes of demonstrating our approach, the following three are considered:

(1) *No additional testing or modeling*: Under normal conditions, continuing to monitor the results of periodic random safety testing would incur no additional cost unless the frequency of tanks not passing the test increases. The additional cost in this case would be that of replacing tanks if they fail or exceed the safety margin. A more costly consequence might be the cost to the company's reputation, which could result in the cancelation of their contract.

(2) *Additional testing but no additional modeling*: In this option, we consider a program to conduct additional testing with no additional modeling. (“No additional modeling” refers to no additional finite-element (FE) modeling.) The benefits associated with this alternative would be derived by identifying and replacing tanks that do not pass the new tests.

(3) *Modeling, calibration, and validation*: This option follows the basic construct of the challenge problem. If this alternative is adopted, a more elaborate FE model will be developed and then calibrated and validated. Destructive and nondestructive calibration tests might be performed on structures in the ensemble; it is assumed that as the number of tests increases, given a satisfactory model and appropriate calibration, the quality of the tank material and geometry models improves. Inclusion of this option permits economic consideration of modeling, calibration, and validation.

Assumptions required for performance of the economic analysis were not made in Ref. [1]. Therefore, we propose assumptions, in this paper, that provide a plausible augmentation of the original problem definition.

The paper proceeds as follows: Section 2 discusses some fundamental ideas of economics and benefit–cost analysis. Section 3 discusses the benefits, values, and costs of testing and phenomenological modeling. Section 4 reviews the probabilistic framework of classical reliability related to the challenge problem. Section 5 develops some probability models and assumptions required to perform optimization of benefit–cost of the three options for action by the MysteryLiquid, Co. Section 6 summarizes the numerical results for the benefit–cost analyses listed above. The paper ends with conclusions about benefit–cost analysis in general and courses of action for the MysteryLiquid, Co., based on the analysis in this paper.

Economics is about choices. Given scarce resources and insatiable needs and wants, people must choose what to produce, how to produce it, and who gets what we produce. There is more than one way to answer these questions, but most modern societies rely mostly on free-market capitalism. Individual workers are allowed to choose their own careers, individual businesses choose what to make and how to make it, and individual consumers are free to choose what types of goods and services to buy and how many of each. This free-market system, however, is often supplemented and/or regulated by a central authority (government) that oversees and sometimes restricts market activity.

Economics, then, is the study of how producers, consumers, and governments make decisions in a free-market economy. Although specific situations lead to a variety of different answers, most economists agree that rational economic agents attempt to maximize net benefits—that is, total benefits minus total costs.

In theory, the concept of benefit–cost analysis seems so intuitive that it hardly bears scrutiny—if the benefits of some course of action outweigh the costs, then take that action. In practice, however, many issues arise that require clarification. Often, there are multiple alternatives among which a rational agent may choose. In the current example, these are options 1, 2, and 3. The one that provides the largest net benefit (total benefits − total costs) would normally be the optimum choice.

More important, though, is how to measure benefits and costs. To clarify this issue, economists refer to “explicit” and “implicit” costs (and benefits). Explicit costs are out-of-pocket, money expenses. For example, the MysteryLiquid, Co., may have to pay for office supplies, rent, utilities, etc. The company also faces implicit costs that are not quite as obvious because they do not involve out-of-pocket expenses. In this example, if the company owns equipment used in the chemical storage operation, it should consider the implicit cost of not using the equipment for another purpose. When measuring implicit costs, economists use the concept of “opportunity costs.” Opportunity costs are measured as the dollar value of the next best alternative use of a limited resource. When making rational decisions, economic agents should consider total costs (both explicit and implicit) [3]. Similarly, decisions should be based on total benefits. The explicit benefit to the company in our example is the money revenue earned from leasing chemical storage. The company may also acquire an implicit benefit from possession of a model of its storage tanks. While the company does not actually receive any money for this benefit, it is still valuable and could be included in total benefits. For purposes of the analysis to follow, only the explicit benefit of the revenue earned from chemical storage is included in the benefit–cost analysis.

Just as there are some costs that should be included, even if they are not obvious, there are some costs that should be ignored. Sunk costs involve expenses that cannot be recovered and should not affect rational choices [3]. For example, the MysteryLiquid, Co., has invested in tests and a simple FE model; this is not a recurring cost that should be considered each month when deciding whether to continue doing business. Because the money is spent and cannot be recouped, it is irrelevant to decision making moving forward.

Another issue that decision makers often confront is comparing costs and benefits over time. People tend to be impatient and would prefer to experience benefits today rather than later. On the other hand, people would rather postpone costs when they can. Neither desire can usually be satisfied. When a project requires up-front costs to put into motion activities that might not yield benefits until a future date, economists discount all amounts to their present value when conducting benefit–cost analysis [4]. Because the quantity we are interested in is the net benefit (positive, negative, or zero) in each time period, the present value of net benefits $(PVNB)_j$ of a future value $(FVNB)_j$, $j$ time periods into the future, with a discount rate of $i$ is

$$(PVNB)_j = \frac{(FVNB)_j}{(1+i)^j} \tag{1}$$

Using these present values to conduct a benefit–cost analysis for a project that lasts for $J$ periods into the future, $j = 0, 1, \dots, J$, the total present value of net benefit, $(PVNB)_{tot}$, of the project is

$$(PVNB)_{tot} = \sum_{j=0}^{J} (PVNB)_j = \sum_{j=0}^{J} \frac{(FVNB)_j}{(1+i)^j} \tag{2}$$

Because the example to follow (Sec. 6) analyzes benefit–cost over a 1 year period, discounting is not used. However, discounting should be used for any analysis that covers multiple years.
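The discounting step can be illustrated with a minimal Python sketch. The function names and the cash-flow values are hypothetical, chosen only to show how a stream of per-period net benefits is reduced to a single present value; they are not taken from the challenge problem.

```python
def present_value(future_value, rate, periods):
    """Discount a net benefit realized `periods` time periods into the future."""
    return future_value / (1.0 + rate) ** periods


def total_present_value(net_benefits, rate):
    """Sum the discounted net benefits for periods j = 0, 1, ..., J."""
    return sum(present_value(fv, rate, j) for j, fv in enumerate(net_benefits))


# Hypothetical cash flow: an up-front cost followed by three periods of benefit.
flows = [-100.0, 40.0, 40.0, 40.0]

print(total_present_value(flows, 0.0))   # undiscounted sum of net benefits
print(total_present_value(flows, 0.05))  # same stream discounted at 5% per period
```

With a zero discount rate the total is simply the sum of the net benefits; any positive rate reduces the weight of the later periods, which is why a project with delayed benefits can look less attractive once discounting is applied.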

A problem that arises when considering the government's use of benefit–cost analysis is whose benefits and costs to count. Economists often make the assumption that public policy makers should take into account the effects of a project or action on all of the citizens they represent. (It is widely accepted, however, that elected officials may only care about the costs and benefits experienced by potential or likely voters or campaign contributors. In fact, this is one reason why benefit–cost analysis was first mandated by the federal government [5].) In this case, then, the appropriate measure of the total PVNB for a whole society, $(PVNB)_{soc}$, is

$$(PVNB)_{soc} = \sum_{k=1}^{K} \sum_{j=0}^{J} (PVNB)_{kj} \tag{3}$$

where $(PVNB)_{kj},\ k=1,\dots,K,\ j=0,\dots,J$, is the present value of net benefit to the $k$th citizen of a future benefit realized $j$ time periods into the future, and $K$ is the number of citizens.

Finally, there is the issue of placing a dollar value on goods and services not traded in markets. In particular, many of the goods and services produced by the government are provided precisely because they are not provided in private markets. National defense, law enforcement, maintenance of public lands, and the protection of clean air and water are just a few examples of what economists call “public goods.” These goods and services are not typically supplied by private citizens because there is no incentive for producers to make them or for consumers to pay for them once produced. Economists, however, have developed ways to estimate the net benefits of these goods [6].

In summary, economics is the study of how consumers, producers, and governments make choices in a world of scarcity and nonsatiation. And while the many different market characteristics might lead to very different answers to what to produce, how to produce it, and who gets what is produced, there is one underlying assumption: rational economic agents attempt to maximize net benefits, i.e., benefits minus costs. Paez et al. [7] provided a more detailed development of the ideas underlying economic analysis.

The remainder of this paper examines how this basic framework of benefit–cost analysis can be applied to making a decision whether or not to employ computer simulation, calibration testing, and validation testing in practical applications. It is assumed that the benefit to a business is the revenues it expects to collect for providing a service and the costs are the costs of providing the service. Both expected benefits and costs are estimated for the three scenarios of options 1, 2, and 3. Benefit–cost is optimized in each case to establish an economically logical course of action.

This section discusses the values of testing and modeling in a qualitative sense and then introduces the quantitative ideas of benefit–cost as they will be used to investigate the challenge problem.

Prior to the advent of digital computers, closed-form analysis and physical testing of engineering systems provided the only means for exploring system characteristics. When possible, prototypes or scale models were built and tested. Test results were used to assess the chances of survival of production systems and one-of-a-kind structures. Many organizations would not consider releasing a system for production, sale, or use without first demonstrating on physical exemplars that the system behaves and operates as intended. The value of a well-planned and executed test is that it reflects the precise behavior of a physical system.

In recent times, the potential accuracy of FE models of systems has improved; the possibility of using computational models to predict system behavior has become attractive because modeling and model prediction may prove less expensive than testing. The value of a computational model is that it relates phenomenology to the behavior of a system; once a model of a system is developed, it might be modified (to a limited extent) to represent the behavior of other systems, excited using many different realizations of a random input and excited using many different forms of input. These things cannot be done, simply and inexpensively, with physical experiments.

Testing and modeling are both valuable and cost money, but possession of test results or possession of a model does not necessarily translate into revenues for the MysteryLiquid, Co. The company is private and it derives its benefit from the revenues it collects for the rental of chemical storage; all else is a cost of doing business. So a pertinent question is: In the sense of benefit–cost, why would a company perform tests or construct and use phenomenological models? The answer is that focused testing and satisfactorily accurate models can diminish the other costs that must be paid by a company.

The specific mathematical expressions for costs and benefits of options 1, 2, and 3 are developed for the Challenge Problem in Sec. 5. Here, we develop the general benefit–cost expressions for the three options that show why performance of tests or creation of a model along with its V&V might prove to be feasible economic alternatives. First, we consider benefits. During the course of providing a product or service, a company might establish a revenue stream that yields a benefit, $B_1$, during a pre-established time period (say, 1 month), given that provision of the product or service is satisfactory to the customer. As long as the customer remains satisfied, the benefit continues; after $j$ months of providing satisfactory service, the benefit is $jB_1$. We are interested in analyzing revenue status for a pre-established period, say $N_{mo}$ months. If the company fails to provide satisfactory service after month $j$, and $j<N_{mo}$, then the customer might cancel the contract, in which case the revenue stream would drop to zero. (We assume the customer has that option.) Total income for the company during the $N_{mo}$ period is then $jB_1$; the loss of benefit is $B_1(N_{mo}-j)$. (We focus this analysis on a single customer and assume that if a contract is canceled, it cannot be immediately reinstated.)

Let $T_c$ be a random variable denoting the first month during which the contract is canceled, and let $p_{T_c,j}=P(T_c=j),\ j=0,1,\dots$ be its probability mass function (PMF); the $p_{T_c,j}$ will typically be small, especially for early months where $j$ is small. The probabilistic expectation of benefit to the company, $E[B]$, during the $N_{mo}$ period equals $N_{mo}B_1$ minus the expected loss of benefit

$$E[B] = N_{mo}B_1 - B_1 \sum_{j=0}^{N_{mo}-1} (N_{mo}-j)\,p_{T_c,j} \tag{4}$$
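As a rough illustration of this expectation, the sketch below computes the expected benefit as the full-period revenue minus the probability-weighted loss $B_1(N_{mo}-j)$ for a cancellation in month $j$. The function name, the 12-month horizon, and the constant 1% monthly cancellation probability are assumptions made for the example; the PMF entries need not sum to one, because the remaining probability corresponds to no cancellation within the horizon.

```python
def expected_benefit(b1, n_mo, pmf_cancel):
    """Expected revenue over n_mo months.

    pmf_cancel[j] is the assumed probability that the contract is first
    canceled in month j; a cancellation in month j < n_mo loses b1 * (n_mo - j).
    """
    expected_loss = sum(
        b1 * (n_mo - j) * p for j, p in enumerate(pmf_cancel) if j < n_mo
    )
    return n_mo * b1 - expected_loss


# Hypothetical inputs: unit monthly revenue, 12 months, 1% cancellation
# probability in each month.
pmf = [0.01] * 12
print(expected_benefit(1.0, 12, pmf))
```

Because early cancellations forfeit more months of revenue, probability mass at small $j$ is weighted most heavily in the expected loss.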

During field operation, any product or service might fail, and that raises the possibility that the company that provides the product or service will incur costs. We assume that failure of a system in field use will incur two costs: cleanup, $C_{cu}$, and replacement, $C_{ta}$. Let $N_{ta}$ be the number of systems (tanks) deployed in the field, and let $N_{fF}$ denote the random number of those systems that fail in the “field” during a pre-established time period (say, 1 month). Let $p_{fF,j}=P(N_{fF}=j),\ j=0,\dots,N_{ta}$ denote the PMF of the random variable $N_{fF}$. The expected cost, $E[C_{fF,1}]$, of failures during one time period is

$$E[C_{fF,1}] = (C_{cu}+C_{ta}) \sum_{j=0}^{N_{ta}} j\,p_{fF,j} \tag{5}$$

When failures occurring during separate time periods are independent, the expected cost of failures, $E[C_{fF}(N_{mo})]$, during $N_{mo}$ time periods is

$$E[C_{fF}(N_{mo})] = N_{mo}\,E[C_{fF,1}] \tag{6}$$

This cost applies to options 1, 2, and 3. Any action that causes the probabilities $p_{fF,j}$ to be greatest near $j=0$ will decrease the expected cost $E[C_{fF}(N_{mo})]$. We assert that testing alone or modeling with calibration and validation testing, when done appropriately, will have that effect.

When we choose to pretest systems in the field to establish units that are vulnerable to failure, or to model systems in the field in order to predict vulnerability to failure, we incur costs. Those costs for option 2 are testing costs and the cost of replacement of units that are vulnerable to failure. With $N_{fT}$ denoting the random number of systems in the field that fail during a pretest, and $p_{fT,j}=P(N_{fT}=j),\ j=0,\dots,N_{ta}$ denoting the PMF of the random variable $N_{fT}$, the expected cost, $E[C_{fT}]$, of failures due to pretest is

$$E[C_{fT}] = C_{ta} \sum_{j=0}^{N_{ta}} j\,p_{fT,j} \tag{7}$$

Similarly, the costs associated with option 3 are the costs of modeling, performing calibration and validation tests, calibrating and validating the model, running the model to make predictions of which units will fail, and then replacing all units deemed vulnerable. Let $N_{fM}$ denote the random number of systems in the field that fail during a pre-analysis. (Pre-analysis is the name we give to the process of modeling a system and then using the model to judge system vulnerability.) Let $p_{fM,j}=P(N_{fM}=j),\ j=0,\dots,N_{ta}$ denote the PMF of the random variable $N_{fM}$. The expected cost, $E[C_{fM}]$, of failures due to pre-analysis is

$$E[C_{fM}] = C_{ta} \sum_{j=0}^{N_{ta}} j\,p_{fM,j} \tag{8}$$

This cost applies to option 3. In addition, both options 2 and 3 incur costs for testing; for now, simply denote those extra costs as $C_{ex}$.

The expected value of benefit–cost (net benefit) is

$$B_{net} = E[B] - E[C_{fF}(N_{mo})] - E[C_{fO}] - C_{ex} \tag{9}$$

where $C_{fO}=C_{fT}$ for option 2, and $C_{fO}=C_{fM}$ for option 3. For option 1, $E[C_{fO}]=0$ and $C_{ex}=0$; this establishes a baseline. Both options 2 and 3 tend to increase $E[B]$ and decrease $E[C_{fF}(N_{mo})]$; this effect yields an increase in $B_{net}$. However, both options 2 and 3 have associated costs, $E[C_{fO}]$ and $C_{ex}$. When the improvement in $B_{net}$ that results from testing or modeling outweighs the cost $E[C_{fO}]+C_{ex}$, the testing or modeling option is the logical economic alternative. The remainder of this paper derives the probabilities used above and then presents some numerical examples connected to operations of the MysteryLiquid, Co. Normalized net benefit is introduced in the numerical example.
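The comparison of options can be sketched numerically. The example below is a minimal illustration under several assumptions that are not in the source: the number of failures follows a binomial PMF (independent tanks with a common failure probability), the field-failure cost grows linearly with the number of months, pretest or pre-analysis failures incur only the replacement cost, and every number is hypothetical. For simplicity the expected benefit is held equal for both options.

```python
from math import comb


def binomial_pmf(n_tanks, p_fail):
    """PMF of the number of failures among n_tanks independent tanks (assumed model)."""
    return [
        comb(n_tanks, j) * p_fail**j * (1.0 - p_fail) ** (n_tanks - j)
        for j in range(n_tanks + 1)
    ]


def expected_count(pmf):
    """Expected number of failures, sum of j * P(N = j)."""
    return sum(j * p for j, p in enumerate(pmf))


def net_benefit(exp_benefit, n_mo, c_cu, c_ta, pmf_field, pmf_option=None, c_ex=0.0):
    """Expected benefit minus field-failure, option-failure, and extra costs."""
    field_cost = n_mo * (c_cu + c_ta) * expected_count(pmf_field)
    option_cost = c_ta * expected_count(pmf_option) if pmf_option is not None else 0.0
    return exp_benefit - field_cost - option_cost - c_ex


# Hypothetical option 1: no pretest, monthly field failure probability 1% per tank.
option_1 = net_benefit(100.0, 12, 5.0, 1.0, binomial_pmf(10, 0.01))

# Hypothetical option 2: pretest rejects 5% of tanks, lowers the field failure
# probability to 0.1%, and costs 2.0 in testing.
option_2 = net_benefit(
    100.0, 12, 5.0, 1.0,
    binomial_pmf(10, 0.001),
    pmf_option=binomial_pmf(10, 0.05),
    c_ex=2.0,
)
print(option_1, option_2)
```

With these made-up inputs the pretest option comes out ahead because the avoided cleanup and replacement costs exceed the testing and pretest replacement costs; different cost ratios can reverse the ranking, which is the point of carrying out the analysis.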

The probability of failure of tanks in the field and during pretesting or pre-analysis was shown, in Sec. 3, to be a critical factor in the performance of benefit–cost analysis. This section develops the classical approach to analysis of the probability of failure and shows how probability of failure analysis can be performed for tanks owned by the MysteryLiquid, Co. The influence of testing and analysis of items in the ensemble of a system are also explored.

The challenge problem [1] requires the prediction of tank probability of failure and that quantity enters into the economic analysis of this paper. We propose to use the classical reliability approach as the means for estimating the probability of failure.

The classical reliability approach [8,9] for establishing the probability of structural failure is based on two probability distributions: (1) the applied load distribution for random variable $S$ and (2) the corresponding resistance distribution for random variable $R$. In the simplest framework, the distributions are viewed as univariate, where both are functions of a common variable such as stress or force. The load may be an externally applied load; the resistance is the load at which the structure fails. The Challenge Problem defines failure as yielding of the tank material at a point.

The probability density functions (PDF) of $S$ and $R$ are denoted as $f_S(s),\ -\infty<s<\infty$ and $f_R(r),\ -\infty<r<\infty$. We assume the random variables $S$ and $R$ to be statistically independent. Classical reliability analysis defines failure as $R<S$ and seeks its probability. The probability of failure is the integral of the joint PDF of $R$ and $S$, $f_R(r)f_S(s)$, over the region where $r<s$. This yields the integral of the product of the PDF of the $S$ distribution and the cumulative distribution function (CDF) of the $R$ distribution, $F_R$, as the probability of failure, $p_f$

$$p_f = P(R<S) = \int_{-\infty}^{\infty} f_S(s)\,F_R(s)\,ds \tag{10}$$

When we assume that the random variables $R$ and $S$ are normally distributed with means and variances $(\mu_R,\sigma_R^2)$ and $(\mu_S,\sigma_S^2)$, respectively, then the probability of failure is

$$p_f = \Phi\left(\frac{\mu_S-\mu_R}{\sqrt{\sigma_R^2+\sigma_S^2}}\right) \tag{11}$$

where $\Phi(\cdot)$ is the standard normal CDF. The probabilities of negative values of $R$ and $S$ are negligible for $R$ and $S$ with realistic means and variances. We will use this form for the probability of failure in option 1.
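The relation between the general load-resistance integral and its closed form for normal variables can be checked numerically. The sketch below uses only the Python standard library; the function names, the trapezoidal integration settings, and the example moments are assumptions chosen for illustration.

```python
from math import erfc, exp, pi, sqrt


def phi(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * erfc(-z / sqrt(2.0))


def normal_pdf(x, mu, sigma):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


def pf_closed_form(mu_r, sigma_r, mu_s, sigma_s):
    """P(R < S) for independent normal resistance R and load S."""
    return phi((mu_s - mu_r) / sqrt(sigma_r**2 + sigma_s**2))


def pf_integral(mu_r, sigma_r, mu_s, sigma_s, n=20000, width=10.0):
    """Trapezoidal estimate of the integral of f_S(s) * F_R(s) over the load range."""
    lo = mu_s - width * sigma_s
    hi = mu_s + width * sigma_s
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        s = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * normal_pdf(s, mu_s, sigma_s) * phi((s - mu_r) / sigma_r)
    return total * h


# Hypothetical moments: resistance three units above the load, unit standard deviations.
print(pf_closed_form(10.0, 1.0, 7.0, 1.0))
print(pf_integral(10.0, 1.0, 7.0, 1.0))
```

The two values should agree closely; the closed form is what makes the normal assumption convenient, while the integral form remains available when other distributions are assumed.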

No time frame has been referenced in the probability of failure analysis performed above. In order to explicitly consider time, a failure analysis involving load and resistance random processes would be required. Such an analysis will not be performed here; rather, we will make the conservative assumption that the probability of failure developed here, and those developed later in this section, will be “monthly” probabilities of failure as in Eqs. (5) and (6).

The Challenge Problem provides two items that enable the computation of storage tank probability of failure: (1) experimental data that characterize tank material properties and dimensions and (2) a computer model to compute tank responses given input data.

First consider the data. Experimental data from a destructive test on one tank and nondestructive tests on other tanks are provided [1]. The destructive test used ten samples (from one tank) to characterize modulus of elasticity, $E$ (psi), Poisson's ratio, $\nu$, yield stress, $\sigma_Y$ (psi), and wall thickness, $t$ (in.). Two other tanks were used to measure tank lengths, $L_t$ (in.), and radii, $R_t$ (in.). Although the data are limited, they can be used to estimate means and variances of the measured quantities, and the matrix of correlation coefficients, where appropriate, between pairs of simultaneously measured quantities. Let $Y_1=(E,\nu,\sigma_Y,t)^T$ denote the material properties. Based on the data provided with the Challenge Problem, the sample means and variances and sample correlation coefficient matrix of the material data, $Y_1$, are

The lengths and radii are essentially uncorrelated and were measured on structures different from the one used to measure material data; therefore, they are characterized by their sample means and variances alone. They are

The tanks may be analyzed with these data; however, analyses involving critical designs seek a level of conservatism via use of confidence limits, especially when data are limited. In view of this, we develop statistics upon which failure probability analysis will be based, using the bootstrap [10,11].

All the material parameters ultimately play an important role in establishing conservatism; however, the role played by some of the parameters is simpler to visualize. Greater variability in yield stresses and tank wall thicknesses will tend to make a probabilistic analysis more conservative. We normalize the data so that the thickness will have equal importance with yield stress and choose the critical variable in the bootstrap analysis as

where $Y_1(3)$ and $Y_1(4)$ are the third and fourth elements, respectively, in $Y_1$. (Choice of the variability measure, $Q$, for bootstrap analysis is arbitrary; however, the quantity used here does make the analysis conservative.) Two thousand bootstrap replicates, $q_b,\ b=1,\dots,2000$, of $Q$ were computed. The 95th percentile value of the $q_b$ was identified (larger values of the statistic indicate analyses with more variability), and the associated bootstrap sample was used for further analyses. The material model is 95% conservative, in some sense. The sample means, variances, and correlation coefficient matrix for the 95th percentile bootstrap analysis are
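The bootstrap selection described above can be sketched as follows. Everything specific in this example is an assumption: the data are synthetic stand-ins for the ten yield stress and wall thickness measurements, and the variability measure is a hypothetical choice (the sum of the squared coefficients of variation of the two quantities), not necessarily the $Q$ used in the paper.

```python
import random
from statistics import mean, variance


def variability_measure(sample):
    """Hypothetical Q: squared coefficients of variation of yield stress and thickness."""
    yield_stress = [row[0] for row in sample]
    thickness = [row[1] for row in sample]
    return (variance(yield_stress) / mean(yield_stress) ** 2
            + variance(thickness) / mean(thickness) ** 2)


def bootstrap_percentile_sample(data, n_boot=2000, pct=0.95, seed=0):
    """Return (q, resample) for the bootstrap replicate at the given percentile of Q."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_boot):
        # Resample rows with replacement so paired measurements stay together.
        resample = [rng.choice(data) for _ in data]
        replicates.append((variability_measure(resample), resample))
    replicates.sort(key=lambda pair: pair[0])
    index = min(int(pct * n_boot), n_boot - 1)
    return replicates[index]


# Synthetic (yield stress in psi, wall thickness in in.) pairs, for illustration only.
data_rng = random.Random(1)
data = [(data_rng.gauss(4.3e4, 2.5e3), data_rng.gauss(0.25, 0.01)) for _ in range(10)]

q95, sample95 = bootstrap_percentile_sample(data)
print(q95, mean(row[0] for row in sample95))
```

Selecting the resample at a high percentile of the variability measure, rather than the original sample, is what builds conservatism into the statistics used downstream.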

Next, consider the computer model provided by the Challenge Problem for analysis of the response of a generic liquid storage tank. It is used with the bootstrap-generated statistics to develop the probability distribution of the generalized load random variable as explained below. It was meant to play a critical part in the Challenge Problem solution. We need it to approximate the probability of failure of a tank without resorting to an elaborate FE model. With respect to the economic benefit–cost analysis, the simple FE model is critical. We use it to compute the relative economy of a more elaborate FE model.

Operation of the Challenge Problem preliminary FE model can be symbolized

Output, $S$, is computed when the model is provided with the input $Y_2$, a vector of variables including: (1) The elements of $Y_1$ (excluding yield stress). (2) Tank length and radius, $L_t$ and $R_t$. (3) Specific weight of liquid in the tank, $\gamma$, height of liquid in the tank, $H$, and pressure in the tank, $P$. (Other input variables identify the location where tank response is computed, and other quantities taken as constants during the present analysis.) Output, $S$, is stress in the tank at its bottom, along the tank centerline. We interpret $S$ as the load on the system.

We interpret yield stress, $Y_1(3)$, the third element in the vector $Y_1$, as the tank resistance, $R$.

The Challenge Problem provides no guidance regarding the probability distributions of $\gamma$, $H$, and $P$, only intended operational limits. Yet, it informs us that a single tank has failed to pass a standard safety test. We assume it likely that one or more of the quantities surpassed its intended limits, but do not know how. We assume that $H=55$ in. is a constant, and that $\gamma$ and $P$ are normally distributed, statistically independent random variables with means and variances $(3.3, 0.33^2)$ and $(75, 15^2)$, respectively. The remaining input variables are assumed jointly normal as well. The normality assumption is made for simplicity in this preliminary analysis. A more detailed analysis would take one of two actions: (1) Perform statistical analysis on the Challenge Problem data to explore the most accurate representation of probability distribution; however, the Challenge Problem data are very limited and this alternative may prove unrealistic. (2) Use sensitivity analysis to investigate levels of conservatism associated with various assumptions of distribution form. These comments hold for all the distributions to follow.

One hundred joint realizations of the input variables with the moments in Eq. (15) were generated, using standard procedures, and the stress response, $S$, to each set of inputs was computed. The mean and variance of $S$ are $(2.25\times10^4, (4.20\times10^3)^2)$. The mean and variance of $R$ are $(4.31\times10^4, (2.63\times10^3)^2)$.

These statistics were used, along with the assumption of normality to compute the probability of failure. Based on the normal assumption for load and resistance, it is

Because of the effects of aging, stress concentrations, corrosion, etc., an actual tank failure distribution, i.e., the $R$-distribution, is likely to have both a lower mean and a larger standard deviation than the simple distribution of measured yield stress used here as the “failure model.” Nevertheless, given the information provided in the Challenge Problem, we have no choice but to use the probability of failure approximation of Eq. (17). We shall keep these caveats in mind as we interpret the results of calculations based on the information given in the Challenge Problem.
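As a quick numerical check, the moments reported above for $S$ and $R$ can be inserted into the normal load-resistance formula. The sketch below does that and also illustrates the caveat about degradation by lowering the mean resistance and enlarging its standard deviation; the 10% and 50% factors are purely hypothetical, and the computed values are a check under the stated normality assumption rather than the paper's reported result.

```python
from math import erfc, sqrt


def pf_normal(mu_r, sigma_r, mu_s, sigma_s):
    """P(R < S) for independent normal R and S, via the reliability index beta."""
    beta = (mu_r - mu_s) / sqrt(sigma_r**2 + sigma_s**2)
    return 0.5 * erfc(beta / sqrt(2.0))


# Moments quoted in the text: load S and resistance R (psi).
mu_s, sigma_s = 2.25e4, 4.20e3
mu_r, sigma_r = 4.31e4, 2.63e3

baseline = pf_normal(mu_r, sigma_r, mu_s, sigma_s)

# Hypothetical degradation: 10% lower mean resistance, 50% larger standard deviation.
degraded = pf_normal(0.9 * mu_r, 1.5 * sigma_r, mu_s, sigma_s)

print(baseline, degraded)
```

Under these assumptions the baseline probability is on the order of $10^{-5}$, and even a modest hypothetical degradation of the resistance distribution raises it by roughly two orders of magnitude, which is why the caveats matter when interpreting the results.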

An FE model should provide a much better estimate of the probability of failure through enhanced fidelity; it can consider conditions like the ones described above, including, for example, variability in tank wall thickness and structural discontinuities. It can also evaluate stresses at multiple locations on the tank structure.

One economic analysis alternative considered in this paper, option 2, is to test all the storage tanks. When an ensemble of structures has been pretested to a threshold level of $r_{min}$, we are guaranteed that random failures will not occur at strengths below $r_{min}$.

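Consider the performance of proof tests on an ensemble of structures. Each structure is loaded to the level $r_{min}$. When a structure fails, we replace it. When a structure survives the proof test, we retain it; we are certain that its strength is greater than $r_{min}$. However, the strength is still random. We assumed that both random load, $S$, and random strength, $R$, are normally distributed random variables with means and variances $(\mu_S,\sigma_S^2)$ and $(\mu_R,\sigma_R^2)$. The strength of the proof-tested structures is therefore “truncated” normal with CDF

$$F_{R_t}(r) = \frac{\Phi\left(\frac{r-\mu_R}{\sigma_R}\right) - \Phi\left(\frac{r_{min}-\mu_R}{\sigma_R}\right)}{1 - \Phi\left(\frac{r_{min}-\mu_R}{\sigma_R}\right)},\quad r \ge r_{min} \tag{18}$$

and $F_{R_t}(r)=0$ for $r<r_{min}$, where $\Phi(\cdot)$ is the standard normal CDF.

The probability of failure of the pretested structure (option 2) is

$$p_{ft} = \int_{r_{min}}^{\infty} f_S(s)\,F_{R_t}(s)\,ds \tag{19}$$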

To understand the effect of pretesting structures, consider Fig. 1 and recognize that when $r_{min}=-\infty$, i.e., when no pretesting is performed, the probability of failure is Eq. (10). Figure 1 plots $p_{ft}/p_f$ as a function of $(r_{min}-\mu_R)/\sigma_R$ for three different values of $(\mu_S-\mu_R)/\sqrt{\sigma_R^2+\sigma_S^2}$, where $\sigma_R=\sigma_S$. (The labels on the curves are the values of $(\mu_S-\mu_R)/\sqrt{\sigma_R^2+\sigma_S^2}$.) Pretesting to establish a lower cutoff for strength always reduces the probability of failure. The effect is more pronounced when the load and strength PDFs are spread further apart.

Pretesting storage tanks has two effects. First, it eliminates storage tanks that fail to pass the pretest where $r<r_{min}$, thereby diminishing the probability of failure, $p_{ft}$, during use of the remaining storage tanks; $p_{ft}$ decreases as $r_{min}$ increases. Second, it incurs the cost of tank replacements during the testing process for those tanks that fail the pretest. Denote the probability of failure of the pretest $p_{pt}$. As $p_{pt}$ increases, $p_{ft}$ decreases. Both effects are related to the value chosen for $r_{min}$. Because of the normal assumption made previously, the probability, $p_{pt}$, is

$$p_{pt} = \Phi\left(\frac{r_{min}-\mu_R}{\sigma_R}\right) \tag{20}$$

for the curve in Fig. 2. As the stringency of pretesting increases, the chance of failure of tanks in the field decreases.
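The trade-off between pretest failures and field failures can be explored with a short numerical sketch. It assumes normal load and strength, a strength distribution truncated from below at the proof level, and trapezoidal integration; the function names and the example moments are hypothetical.

```python
from math import erfc, exp, pi, sqrt


def phi(z):
    """Standard normal CDF."""
    return 0.5 * erfc(-z / sqrt(2.0))


def normal_pdf(x, mu, sigma):
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2.0 * pi))


def p_pretest_fail(r_min, mu_r, sigma_r):
    """Probability that a tank's strength is below the proof level r_min."""
    return phi((r_min - mu_r) / sigma_r)


def truncated_cdf(r, r_min, mu_r, sigma_r):
    """CDF of strength for tanks that survived the proof test (strength > r_min)."""
    if r <= r_min:
        return 0.0
    cut = phi((r_min - mu_r) / sigma_r)
    return (phi((r - mu_r) / sigma_r) - cut) / (1.0 - cut)


def p_field_fail(r_min, mu_r, sigma_r, mu_s, sigma_s, n=20000, width=10.0):
    """Trapezoidal estimate of P(R < S) for proof-tested tanks."""
    lo = mu_s - width * sigma_s
    hi = mu_s + width * sigma_s
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        s = lo + k * h
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * normal_pdf(s, mu_s, sigma_s) * truncated_cdf(s, r_min, mu_r, sigma_r)
    return total * h


# Hypothetical moments: strength N(10, 1), load N(7, 1); sweep the proof level.
for r_min in (-30.0, 7.0, 8.0, 9.0):
    print(r_min, p_pretest_fail(r_min, 10.0, 1.0), p_field_fail(r_min, 10.0, 1.0, 7.0, 1.0))
```

A very low proof level reproduces the untested probability of failure, while raising the proof level lowers the field failure probability at the price of rejecting more tanks during the pretest; the economic analysis weighs those two effects against each other.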

There are several reasons why the MysteryLiquid, Co., may wish to develop an FE model for analyzing the behavior of its tanks in lieu of testing. Among others, they may wish to possess a tool that informs them of the general behavior of a tank, a tool that may be useful in establishing a margin of failure for the ensemble of structures. Moreover, they may seek a tool which, upon calibration with data from an individual tank, can approximately predict the vulnerability of that tank to failure. Such a tool may guide tank testing and replacement of vulnerable tanks. These things affect the probability of failure of the ensemble.

The use of an FE model may mimic the use of pretesting, i.e., an FE model might be used to pre-analyze a tank (prior to continued service in the field) to determine whether or not the tank is vulnerable. To accomplish this, the FE model must be developed in a certain framework:

(1) An FE model must be constructed. The model should enable the simulation of tank behavior including stochastic characteristics of tank material properties and geometry. If feasible, quantities like tank thickness should be modeled as random fields. The FE model should have the capacity to:
    (a) Simulate and predict the probabilistic behavior of the ensemble of tanks.
    (b) Simulate and predict the probabilistic behavior of individual tanks following recalibration with some of the characteristics, like wall thicknesses, of the tank.

(2) Calibration tests on a sample of tanks must be performed to generate data to calibrate the ensemble FE model.
    (a) Destructive tests might be run to generate data on material modulus of elasticity, $E$, Poisson's ratio, $\nu$, yield stress, $\sigma_Y$, and tank wall thickness, $t$. Multiple coupons and their corresponding data might be obtained from each tank used for these purposes.
    (b) Nondestructive tests might be run to generate tank geometric data, including lengths, $L_t$, radii, $R_t$, and thicknesses, $t$. (Thicknesses might be obtained using ultrasonic measurements.)

(3) Validation tests must be run on a sample of tanks to generate data to be used as the basis for comparison in tests of the predictive accuracy of the ensemble FE model. During those tests, response quantities must be measured, including strains and displacements; in addition, test conditions, including boundary conditions, liquid specific gravities, liquid depths, tank pressures, and tank thicknesses, must be measured for transmission to FE modelers during simulations.

(4) The FE model can be used to make predictions about the probabilistic behavior of the ensemble of tanks, given appropriate data from the validation tests.

(5) Validation comparisons between the predictions from the FE model and the validation test outputs must be conducted to assess the predictive accuracy of the FE model. Steps 1–5 must be repeated until the FE model is validated to stringent standards.

(6) Upon recalibration of the FE model with the characteristics of individual tanks, the FE model will be capable of predicting the vulnerability of each tank. Tanks that are vulnerable to failure during use in the field must be replaced with new tanks.

When a model proves to be accurate it is because it has the proper form and resolution and it has been sufficiently and appropriately calibrated. However, the value of validation in step 5 is that it provides the evidence that the model is accurate and can be used in place of physical experiments. Through the completion of step 5, we have developed a better FE model than the one we started with. Here is what we mean by “better.” The original, simple, FE model was based on limited data and had (presumably) lower resolution than the current FE model. The current FE model is based on additional calibration data. For that reason, the variance of the sampling distribution of the probability of failure estimate is higher in the former model than in the current model. The mean of the probability of failure estimate obtained using the current model may be higher or lower than that obtained using the simple model, but the current estimate is certainly more accurate and the current model can certainly be used, confidently, for screening the tanks used by the MysteryLiquid, Co. (step 6).

The process enumerated above is similar to the pretesting process. However, the information obtained during pre-analysis is anticipated to be more comprehensive. During physical pretesting, we establish whether or not a tank can survive loading with a pressure that raises the response stress in an average tank to the level $r_{\min}$. We do not know the response stress in the tank being tested because we do not know the wall thickness and other material parameters at the location of measurement. During FE analysis, we can predict stresses at any desired set of locations on the structure. Because the FE model has been recalibrated with information about an individual structure, such as thicknesses at critical locations, the variability of critical stress predictions will be lower than it would be using the general stochastic model. The results of FE analysis will lead to a well-informed assessment of the vulnerability of a tank. That assessment should be better than the assessment obtained with a simple pretest.

When criteria for the assessment of tank vulnerability and tank replacement are established, every tank in the field can be analyzed, and tanks that do not satisfy the response criteria can be replaced. The number of tanks to be replaced depends on the assessment criteria; $p_{pm}$ is the probability of tank replacement (analogous to $p_{pt}$). The process outlined here will yield a probability of tank failure (option 3) in the ensemble, $p_{fm}$ (analogous to $p_{ft}$). The probabilities $p_{fm}$ and $p_{pm}$ will obey a relation like the relation between $p_{ft}$ and $p_{pt}$, the one shown in Fig. 2. However, it is anticipated that a high-quality model will yield a relation characterized by a curve that is lower than the one shown in Fig. 2, that is, $p_{pm}<g(p_{fm}/p_f)$. In Sec. 5, we propose a model to characterize that difference.

Through careful specification of assessment criteria, we can establish tank replacement criteria that result in any desired level of probability of failure in the ensemble of tanks.

The objective in this paper is to optimize benefit–cost for three alternative courses of action (options 1, 2, and 3 in the “Introduction” section) for the MysteryLiquid, Co. As specified in Secs. 2 and 3, the business benefit, $B$, to the MysteryLiquid, Co., of chemical storage is the revenue resulting from the fees collected for the storage. The costs, $C$ (in this simplified analysis) come from the following:

(A) tank replacement and site cleanup when a failure (accident) occurs during the course of liquid storage

(B) tank replacement when, following pretest or pre-analysis, it is anticipated that a tank is “near” the failure threshold and a decision is made to replace the tank

(C) contract cancelation if a customer decides to move his chemical storage to another service supplier following one or more tank failures

(D) construction of an FE model; if that option is adopted, the associated costs are:
    (a) Performance of calibration and validation tests on a sampling of tanks in the field.
    (b) Performance of calibration on the ensemble model.
    (c) Prediction analysis and validation assessment.
    (d) Performance of a test on each tank to recalibrate the FE model for simulation of each tank. Running the FE model to gauge the fitness of each tank. Replacing tanks deemed insufficiently reliable for continued use.

Option 1 (no additional testing or modeling) is subject to costs arising from A and C. Options 2 (testing only) and 3 (modeling and testing) are subject to potential costs associated with A, B, and C. Option 3 requires, in addition, the costs associated with D. This section develops probability models for the events that lead to the costs associated with A, B, C, and D. The criterion for optimization of net benefit is that the expected value of net benefit be maximized. Section 6 presents the results of a numerical example that compares the three alternatives.

Tank failures may occur in connection with all three options. Physical failures or failures to satisfy response criteria may occur during pretesting; failures to satisfy response criteria may occur during pre-analysis; physical failures may occur later during tank use in the field. Any random experiment that establishes whether or not a tank fails is a Bernoulli trial [12,13] with probability of failure $p_1$. (It is assumed that the probability of failure of a tank at one site is the same as the probability of failure of a tank at any other location. The $p_1$ used here stands for any of $p_f$, $p_{ft}$, $p_{fm}$, $p_{pt}$, and $p_{pm}$ developed in Sec. 4. Tank experiments are assumed to be statistically independent.) The worldwide number of tanks is $N_{ta}=N_{lo}\times N_{tp}$, where $N_{lo}$ is the number of storage locations and $N_{tp}$ is the number of tanks per location. The number of tank failures, $N$, is a random variable with a binomial distribution. Its PMF is

$$p_N(n)=\binom{N_{ta}}{n}\,p_1^{\,n}\,(1-p_1)^{N_{ta}-n},\qquad n=0,1,\ldots,N_{ta}\qquad(22)$$

During pretesting and pre-analysis, each tank is tested (or analyzed) once; the binomial PMF provides a model of experiment outcomes. During tank use, the survival/failure of each tank is observed over a period of months, $N_{mo}$. Tank failures should be modeled as a random counting process with temporal parameter. A conservative simplification assumes that tank survival/failure can be observed on a monthly basis and then combined to obtain failures over the analysis period; the simplification is adopted here. The mean and variance of the random variable $N$ are $(\mu_N,\sigma_N^2)=(N_{ta}p_1,\ N_{ta}p_1(1-p_1))$.
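The binomial model for the number of failures can be evaluated directly. The sketch below is illustrative only: the function names are hypothetical, and the inputs (four locations of 113 tanks and a failure probability of $1.61\times10^{-5}$) are values quoted later in the paper, used here simply as sample numbers.

```python
import math

def binomial_pmf(n_fail, n_tanks, p1):
    """P(N = n_fail) for n_tanks independent Bernoulli trials with failure probability p1."""
    return math.comb(n_tanks, n_fail) * p1**n_fail * (1.0 - p1)**(n_tanks - n_fail)

def binomial_mean_var(n_tanks, p1):
    """Mean and variance of the binomial count of failures."""
    return n_tanks * p1, n_tanks * p1 * (1.0 - p1)

# Sample inputs (illustrative): 4 locations x 113 tanks, p1 = 1.61e-5.
n_tanks = 4 * 113
p1 = 1.61e-5
pmf = [binomial_pmf(k, n_tanks, p1) for k in range(n_tanks + 1)]
mean, var = binomial_mean_var(n_tanks, p1)
prob_at_least_one = 1.0 - pmf[0]
```

Because $p_1$ is small, almost all of the probability mass sits at zero failures, which is why the expected in-service failure costs are low for the baseline problem.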

All options, 1, 2, and 3, are subject to the possibility that a storage tank will fail during use. Section 4 used a probability and statistics development to obtain estimates for three probabilities: (a) $p_f$, the probability of failure of a tank that is not pretested or pre-analyzed (Eq. (17)); (b) $p_{ft}$, the probability of failure of a tank that has been pretested (Eq. (19)); and (c) $p_{fm}$, the probability of failure of a tank that has been pre-analyzed with an FE model (Sec. 4.4 and Eq. (31)). When each of these three probabilities is used in place of $p_1$ in Eq. (22), the PMF of the number of failures is obtained; it has the mean and variance listed following the equation and is the PMF referred to in Eq. (5).

When pretesting is performed on the storage tanks forming the ensemble of tanks used by the MysteryLiquid, Co., failures may occur in tanks that are marginal or substandard. The probability, $p_{pt}$, of those failures was analyzed in Sec. 4.3. Likewise, when a tank is pre-analyzed using a calibrated FE model, the tank may be judged marginal or substandard. The probability, $p_{pm}$, of a tank being taken out of service as the result of pre-analysis was analyzed in Sec. 4.4. When $p_{pt}$ or $p_{pm}$ is used in place of $p_1$ in Eq. (22), the random variable $N$ counts the number of tanks that will fail the pretest or pre-analysis. For option 2, this provides the PMF used in Eq. (7); for option 3, this provides the PMF used in Eq. (8).

A factor introduced in Eq. (4) that may lead to loss-of-benefit for the MysteryLiquid, Co., is the potential for contract cancelation when one or more accidents occur either “on-site” (site currently under consideration) or “off-site” (another site). We assume that the probability of contract cancelation incorporates memory of accidents that occurred during previous months. A Markov chain (MC) [14] is used, in general, to incorporate finite-duration memory; therefore, we use it here with reference to accidents. The form of MC used here has vector states.

Let $\{X_j,\ j=0,1,\ldots\}$ be a stationary, discrete-state, discrete-parameter indicator random process [15] identifying contract continuity at a storage site; the random process is an MC. The discrete index, $j$, is a temporal parameter that counts months. Each random variable, $X_j$, has states $0$ and $1$ indicating contract status. $X_j=1$ indicates contract continuity during month $j$; $X_j=0$ indicates cancelation during month $j$. A canceled contract at a given site cannot be reinstated in the short term (reinstatement constraint). Therefore, $P(X_j=0,\ X_{j+m}=1)=0,\ j=0,1,\ldots,\ m>0$.

Two quantities are required to define an MC model: the state transition probability matrix and the vector of initial state probabilities. For example, consider a two-step (month) memory. (The development is easily generalized.) Four events (because of the reinstatement constraint) identify allowable temporal transitions from one contract state to another

The transition probabilities between contract states depend upon whether or not an accident occurs on-site and/or off-site during the $j$th month; therefore, we also define the random process $\{N_j(m),\ j=0,1,\ldots,\ m=0,1\}$. The random variable $N_j(m)$ indicates whether accidents occur during month $j$, on-site ($m=0$) or off-site ($m=1$). The random process is assumed independent, identically distributed. The probability of an accident is always very small; we limit the range of realizations of the random variables, $N_j(m)$, to $\{0,1\}$. $N_j(m)=0$ when zero accidents occur during month $j$. $N_j(m)=1$ when one or more accidents occur during month $j$.

Matrices of state transition probabilities are required to solve the current problem, both for on-site and off-site accidents. Denote the generic transition probability matrix $P(\ell,m),\ \ell=0,1,\ m=0,1$. It has dimension $4\times4$ with elements, $P_{ik}(\ell,m),\ i,k=1,\ldots,4,\ \ell=0,1,\ m=0,1$. Index $\ell$ encodes the nonoccurrence ($\ell=0$) or occurrence ($\ell=1$) of an accident. The $(i,k)$th element of $P(\ell,m)$ is

The transition probabilities establish conditional probabilities of going from one state, $E_k(j)$, to another state, $E_i(j+1)$. The transition probability matrix always takes the form

Ones in the transition probability matrix indicate transitions that are certain without regard for the values of $\ell$ and $m$; zeros in the transition probability matrix indicate transitions that are impossible without regard for the values of $\ell$ and $m$ because of the reinstatement constraint. The quantity $p_0(\ell,m)$ is the conditional probability that the on-site contract is canceled due to the occurrence of $\ell$ accidents at the location indexed $m$. We take $p_0(0,0)=p_0(0,1)=0$ and $0<p_0(1,1)<p_0(1,0)<1$. That is, when no accident occurs, the contract will not be canceled; the probability of contract cancelation at a given site is greater when an on-site accident occurs than when an off-site accident occurs.

The second probability structure required to define the MC model is the PMF of the events $E_i(0),\ i=1,\ldots,4$. These are included in the $4\times1$ vector $p_0(m),\ m=0,1$. The event $E_i(j),\ i=1,\ldots,4,\ j=0,1,\ldots$ is statistically independent of the accident random process, $\{N_j(m),\ j=0,1,\ldots,\ m=0,1\}$; therefore, $P(E_k(j)\cap N_j(m)=\ell)=P(E_k(j))\times P(N_j(m)=\ell)$. The PMF of the contract states is propagated with the formula

The marginal PMF of the events $E_i(j),\ i=1,\ldots,4,\ j=0,1,\ldots$, is a function of the PMFs, $p_j(m),\ j=0,1,\ldots,\ m=0,1$, associated with on-site and off-site accidents. Because it was assumed that all storage sites have the same number of tanks and failure in all tanks is equally probable, the relative frequencies of on-site and off-site accidents are $1/N_{loc}$ and $(N_{loc}-1)/N_{loc}$, respectively. In view of that, the marginal PMF, $p_j,\ j=0,1,\ldots$, of the events $E_i(j),\ i=1,\ldots,4,\ j=0,1,\ldots$, is

Let $T_c$ denote the month during which contract cancelation first occurs. Then, the CDF of $T_c$ is

where $(\cdot)_4$ is the fourth element in the vector. Contract cancelation during or before month $j$ is the complement of $E_4(j)$. The PMF of $T_c$ is established by differencing its CDF
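The propagation of contract-state probabilities and the differencing of the cancelation CDF can be sketched as follows. This is a minimal illustration under loudly stated assumptions: the state ordering, the placement of ones and zeros in the matrix, the way on-site and off-site matrices are mixed with weights $1/N_{loc}$ and $(N_{loc}-1)/N_{loc}$, and all function and parameter names are hypothetical choices made here, not the paper's equations.

```python
def transition_matrix(p_cancel):
    """Column-stochastic 4x4 matrix; entry [i][k] = P(next state i | current state k).

    Assumed state order (index 3 plays the role of E4, contract in force):
    3 -> 3 with probability 1 - p_cancel, 3 -> 2 with probability p_cancel,
    2 -> 1 and 1 -> 0 with certainty, and 0 is absorbing (no reinstatement).
    """
    return [
        [1.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, p_cancel],
        [0.0, 0.0, 0.0, 1.0 - p_cancel],
    ]

def monthly_matrix(n_loc, q_on, q_off, p0_on, p0_off):
    """Mix accident/no-accident matrices for on-site and off-site events.

    q_on, q_off: assumed monthly probabilities of one or more accidents.
    p0_on, p0_off: cancelation probabilities given an accident (p0_off < p0_on).
    """
    weights = (1.0 / n_loc, (n_loc - 1.0) / n_loc)
    cases = ((q_on, p0_on), (q_off, p0_off))
    mixed = [[0.0] * 4 for _ in range(4)]
    for w, (q, p0) in zip(weights, cases):
        no_accident = transition_matrix(0.0)  # no accident: no cancelation
        accident = transition_matrix(p0)
        for i in range(4):
            for k in range(4):
                mixed[i][k] += w * ((1.0 - q) * no_accident[i][k] + q * accident[i][k])
    return mixed

def cancelation_cdf(matrix, n_months):
    """CDF of the cancelation month: one minus the probability of the in-force state."""
    state = [0.0, 0.0, 0.0, 1.0]  # contract in force at month 0
    cdf = [0.0]
    for _ in range(n_months):
        state = [sum(matrix[i][k] * state[k] for k in range(4)) for i in range(4)]
        cdf.append(1.0 - state[3])
    return cdf

def cancelation_pmf(cdf):
    """PMF of the cancelation month, obtained by differencing the CDF."""
    return [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]
```

With this assumed structure, the in-force state decays geometrically, so the sketch mainly shows how the accident probabilities and the two cancelation probabilities combine into a single monthly hazard.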

Performance of benefit–cost analysis requires prediction of a model's utility to accurately predict system behaviors and reduce probability of ensemble failure, as discussed in Sec. 4.4. This subsection develops an approach to express the predictive accuracy of a model in terms of its calibration and validation.

Numerous factors affect the predictive accuracy (quality) of an FE model. Among those are model resolution, the experimental data available to calibrate the model, and the accuracy of the calibrated model. Good predictive accuracy may be obtained by appropriate calibration and confirmed via validation with stringent requirements. Economic benefit–cost analysis is performed prior to creation of the model; therefore, the effects of modeling must themselves be modeled. This requires assumptions about predictive accuracy of FE models. The relationships developed here bear directly on the economics of modeling, calibration, and validation.

We assume, for simplicity, that model quality, $qmod$, is a function of the number of calibration and validation tests, only, and can be expressed

where $n_i,\ i=1,2,3$ denote the numbers of nondestructive calibration tests, destructive calibration tests, and validation tests, respectively. The $b_i,\ i=1,2,3$ are positive-valued scale parameters. The $a_i,\ i=1,2,3$ are positive-valued shape parameters. $q\in[0,1)$ and the model reflects diminishing returns [16]; increasing the $n_i$ yields $q_{mod}\to1$.

Following the discussion in Sec. 4.4, we assume that if model quality is high, then the probability of failure predictions produced by the model will be superior to those obtained through pretesting. Until a model is constructed, calibrated, and validated, we cannot quantify the superiority of the FE model over simple pretesting. To accommodate this hoped-for superiority, we use Eq. (21) to relate $p_{pm}$ to $p_{fm}$, but with an additional factor (fractional) to obtain an effective probability of failure, $p_{fm,eff}$, of a member of the ensemble. The model takes the form

where $\alpha\in(0,1)$ is a parameter, and $p_{fm}$ is the probability of failure of a tank defined in Sec. 4.4. As $q_{mod}$ varies from zero to one, the coefficient on the right varies from one down to $(1-\alpha)$. Therefore, the parameter $\alpha$ is a (complementary) measure of the best improvement in performance of the ensemble of tanks that is possible when an FE model is used to eliminate potentially defective tanks. The quantity $p_{pm}$ appears as $p_1$ in Eq. (22), and the binomial PMF in Eq. (22) is used to model the PMF in Eq. (8).
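As an illustration only, suppose the coefficient described here is linear in $q_{mod}$. That linear form is an assumption made for this sketch; it is simply one expression that equals one at $q_{mod}=0$ and $(1-\alpha)$ at $q_{mod}=1$:

$$p_{fm,eff}=\left(1-\alpha\,q_{mod}\right)p_{fm}$$

Under that assumed form, the values reported in the paper's final example ($p_{fm}=2.07\times10^{-3}$, $p_{fm,eff}=2.42\times10^{-4}$, $q_{mod}=0.96$) would imply

$$1-\alpha\,q_{mod}=\frac{2.42\times10^{-4}}{2.07\times10^{-3}}\approx0.117,\qquad \alpha\approx\frac{1-0.117}{0.96}\approx0.92$$

that is, a value of $\alpha$ near one, which corresponds to an optimistic view of how much a high-quality model can reduce the ensemble failure probability.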

The cost of model construction enters into the benefit–cost optimization as a component in $C_{ex}$ defined following Eq. (8). Denote that quantity $C_{mc}$.

Experimental data similar to those described in Ref. [1] might be used to calibrate the FE model parameters. Data from nondestructive and destructive tests are used for calibration. Let $C_{nd}$ and $C_{dc}$ denote the costs of a nondestructive and a destructive calibration test, respectively, including the cost of model calibration. The total cost, $C_{ca}$, of calibration is

$$C_{ca}=n_1C_{nd}+n_2C_{dc}\qquad(32)$$

The expression assumes one type, each, of destructive and nondestructive calibration tests; if multiple test types are specified, then the numbers and costs of all types must be included in Eq. (32). The numbers $n_1$ and $n_2$ influence both the cost of calibration testing and the probability of failure of a tank following FE analysis and failure prediction. $C_{ca}$ is a component of $C_{ex}$.

Pressure vessel loads different from those used for calibration of the FE model should be used during validation tests. Let $C_{v1}$ denote the cost of one validation test including the cost of a validation comparison. The total cost, $C_{va}$, of validation is

$$C_{va}=n_3C_{v1}\qquad(33)$$

One type of validation test is assumed here. The number $n_3$ influences both the cost of validation testing and the probability of failure of a tank following FE analysis and failure prediction. $C_{va}$ is a component of $C_{ex}$.

This step would not be included in the usual validation process because an FE model is not usually meant to simulate the behavior of every structure in an ensemble. In the present application of option 3, the step is required because every tank used by the MysteryLiquid, Co., must be screened for its failure potential; tanks with performance predicted as marginal or substandard are replaced. If this step were not taken, then FE analyses could certainly be used to improve the estimate of the probability of failure of tanks in the ensemble, but they could not be used to change the probability of failure of the ensemble through removal of tanks near the failure threshold.

There is a cost associated with testing all tanks to obtain data for recalibration of the validated FE model and a cost for running the recalibrated model to establish whether or not a specific tank needs replacement. Let $C_{c2}$ denote the cost of performing a recalibration test on each tank to obtain data to calibrate the validated model to the behavior of that tank, and let $C_{mr}$ denote the cost of running the phenomenological model to establish the vulnerability of each tank. The total cost, $C_{pm}$, of performing the tests and running the model is

$$C_{pm}=N_{ta}\left(C_{c2}+C_{mr}\right)\qquad(34)$$

Equation (9) lists, explicitly, the diminishment of benefit due to customer loss of confidence, the cost of failures of tanks in the field, and the cost of failures of tanks during pretesting (or premodeling). Another cost, $C_{ex}$, is a catch-all for modeling and/or testing costs. We summarize these latter costs here.

For option 2, there is an additional cost of performing a pretest on each tank in the field. That cost is

Option 3 includes several components in $C_{ex}$ associated with developing a predictive FE model. They are (a) the cost of calibration, $C_{ca}$; (b) the cost of validation, $C_{va}$; (c) the cost of performing recalibration tests on all tanks to obtain data to calibrate the validated FE model to the behavior of individual tanks and the cost of running the recalibrated FE models to predict vulnerability of individual tanks, $C_{pm}$; and (d) the cost of model construction, $C_{mc}$. Total extra costs are

$$C_{ex}=C_{ca}+C_{va}+C_{pm}+C_{mc}\qquad(36)$$

Some of the terms on the right side, above, are functions of $n_i,\ i=1,2,3$, the numbers of calibration and validation tests.
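The option 3 cost components described in this subsection can be rolled up in a few lines. The sketch below assumes each component is a simple count times a unit cost, as the text describes; the function and argument names are hypothetical, and any dollar amounts passed in are placeholders rather than values from the paper's tables.

```python
def calibration_cost(n1, n2, c_nd, c_dc):
    """Calibration cost: n1 nondestructive tests plus n2 destructive tests."""
    return n1 * c_nd + n2 * c_dc

def validation_cost(n3, c_v1):
    """Validation cost: n3 validation tests of a single type."""
    return n3 * c_v1

def screening_cost(n_tanks, c_c2, c_mr):
    """Per-tank recalibration test plus model run, applied to every tank."""
    return n_tanks * (c_c2 + c_mr)

def option3_extra_cost(n1, n2, n3, n_tanks, c_nd, c_dc, c_v1, c_c2, c_mr, c_mc):
    """Total extra cost for option 3: calibration, validation, screening, construction."""
    return (calibration_cost(n1, n2, c_nd, c_dc)
            + validation_cost(n3, c_v1)
            + screening_cost(n_tanks, c_c2, c_mr)
            + c_mc)
```

Only the screening term scales with the number of tanks; the calibration, validation, and construction terms are fixed once the test counts are chosen, which is why their share of total cost shrinks as the tank population grows.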

Equation (9) provides the generic form of net benefit for economic analysis of options 1, 2, and 3. The specific form for net benefit with option 1 is

We take this quantity to be the baseline against which net benefit of the two other options will be compared.

The specific form for net benefit with option 2 is

Because we are free to choose (within limits) the pretest level (and the implied values of $p_{ft}$ and $p_{pt}$), $B_{net}(2)$ can be maximized with respect to that level.

The specific form for net benefit with option 3 is

$B_{net}(3)$ can also be maximized with respect to the failure criterion used during pre-analysis (and the implied values of $p_{fm}$ and $p_{pm}$), and, in addition, it can simultaneously be maximized with respect to $n_i,\ i=1,2,3$, the numbers of calibration and validation tests.

Once $B_{net}(2)$ and $B_{net}(3)$ are optimized (maximized), they can be compared to $B_{net}(1)$ and to each other. The greatest value indicates the economically logical course of action for the MysteryLiquid, Co.

Numerical results for the Challenge Problem, following options 1, 2, and 3, are provided. The probabilities, costs for hardware, software, tests, etc., and FE modeling quality parameters are specified here; the critical issue connected to costs is their relative values in dollars.

The parameter values for the preliminary example are listed in Tables 1–4. Model probabilities are listed in Table 1. The probability definitions refer to Eq. (25). Numerical quantities required during analyses are listed in Table 2. Costs are listed in Table 3. FE model quality parameters are listed in Table 4.

The costs of the recalibration tests, $C_{c2}$, and the FE model runs, $C_{mr}$, may seem low, but they represent the costs per tank. All the FE model runs may be performed together, and when the number of tanks is high, the costs may be very substantial. The calibration tests may be quite simple, involving, for example, measurements of tank thicknesses via ultrasonic measurements.

The number of tanks specified in the Challenge Problem is $450$. We analyze the problem where $113$ tanks are operated at each of the four locations ($452$ tanks). The maximum gross benefit is $2.71\times10^{6}$ dollars. When the benefit–cost analysis is run as specified in Sec. 5.4, the following results are obtained:

where the latter two quantities are the results of optimization. $B_{net}(2)$ and $B_{net}(3)$ are within 10% of the highest net benefit. Both $B_{net}(2)$ and $B_{net}(3)$ are the results of optimizations on desired probability of failure, and $B_{net}(3)$ is also the result of optimization on numbers of calibration and validation tests. The optimizations on desired probability of failure evaluate $B_{net}$ for $[0.2,0.3,\ldots,1.0]\,p_f$, where $p_f$ is given in Eq. (17). The optimization on numbers of calibration and validation tests performs an exhaustive evaluation of $B_{net}$ for $n_1\in[1,\ldots,12],\ n_2\in[1,\ldots,8],\ n_3\in[1,\ldots,8]$.
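The exhaustive evaluation described here amounts to a grid search over the failure-probability fraction and the three test counts. The following sketch shows only the search loop; `net_benefit` is a hypothetical callable standing in for the option 3 net-benefit expression, and the toy objective in the usage lines is an invented placeholder, not the paper's model.

```python
from itertools import product

def exhaustive_search(net_benefit, fractions, n1_values, n2_values, n3_values):
    """Evaluate net_benefit on every grid point and return the best one.

    Returns (best_value, fraction, n1, n2, n3) and the number of points evaluated.
    """
    best = None
    evaluated = 0
    for frac, n1, n2, n3 in product(fractions, n1_values, n2_values, n3_values):
        value = net_benefit(frac, n1, n2, n3)
        evaluated += 1
        if best is None or value > best[0]:
            best = (value, frac, n1, n2, n3)
    return best, evaluated

# Grid quoted in the text: fractions 0.2, 0.3, ..., 1.0 of p_f and the test-count ranges.
fractions = [k / 10 for k in range(2, 11)]
n1_values, n2_values, n3_values = range(1, 13), range(1, 9), range(1, 9)

# Toy objective (placeholder only) with a known maximum, to exercise the loop.
def toy_net_benefit(frac, n1, n2, n3):
    return -((frac - 0.7) ** 2) - (n1 - 8) ** 2 - (n2 - 4) ** 2 - (n3 - 4) ** 2

best, evaluated = exhaustive_search(
    toy_net_benefit, fractions, n1_values, n2_values, n3_values)
```

The grid has $9\times12\times8\times8=6912$ points, so brute-force evaluation is cheap as long as each net-benefit evaluation is inexpensive.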

Some results of the analysis that help explain net benefit values of the options are provided in Table 5.

The probabilities of failure in row 1 of Table 5 are very low; therefore, the two costs that rely directly upon these values (the expected cost of in-service failures and the expected cost of contract cancelation) are also very low. The expected costs of tank replacement for tanks that fail the pretest (option 2) and tanks that fail the pre-analysis (option 3) are relatively low; however, the corresponding cost is zero for option 1. These factors combine to yield net benefits for options 2 and 3 that are lower than the net benefit of option 1. For option 3, the costs for model calibration and validation tests account for one nondestructive calibration test, one destructive calibration test, and one validation test, because these are the minima allowed and the optimum net benefit calls for the minimum number of tests. Correspondingly, the model quality, $q_{mod}$, is low ($5.15\times10^{-3}$). There is little economic advantage in having an FE model in this case, but also, there is little cost. By lowering its expected net benefit by about 10%, the MysteryLiquid, Co., could have an FE model.

There is a rationale for these results. When the number of tanks is small, and more importantly, when the probability of failure is low, the cost consequences of option 1 (no testing and no modeling) are insignificant compared to the benefit of doing business. As long as option 2 incurs a cost for pretesting all tanks and option 3 incurs all the costs associated with testing and constructing an FE model, these latter two options cannot compete with the net benefit of option 1, when failure probability is low and the population of tanks is small.

With reference to an issue raised in Sec. 4.4, it is interesting to note that the benefits and costs that lead to net benefit are simply estimates based on limited data, that is, the probabilities of failure in the top row of Table 5 are simply estimates that reflect uncertainty. In light of that fact, net benefit estimates include random and epistemic uncertainty and that uncertainty may be substantial. In the present case, the variability in net benefits is likely 10% (the difference between the first and second or third net benefits) or higher. If variability is as high as 10%, then the decision to implement option 2 or option 3 might be made in spite of the expected values in Eqs. (40)–(42).

There are conditions under which it is advantageous to either (1) pretest all the tanks in the field or to (2) create an FE model, calibrate it, validate it, and then pre-analyze all the tanks in the field. To understand those conditions, consider Fig. 3. The four figures show the normalized net benefit, i.e., $B_{net}/B_{max}$, of all three options over a spectrum of failure probabilities and for different numbers of tanks. $B_{max}$ is the highest benefit that could potentially be collected. The number of tanks increases through the four figures as specified in the insets of Fig. 3. The failure probabilities at which net benefit is analyzed in all four figures are $1.61\times10^{-5}$, $1.61\times10^{-4}$, $1.61\times10^{-3}$, and $1.61\times10^{-2}$. The normalized net benefits of options 1 and 2 remain the same in all four graphs. However, the normalized net benefit of option 3 increases with $N_{ta}$ because the fraction of overall costs required for model construction, calibration, and validation decreases as the number of tanks grows. The net benefit of option 3 surpasses the net benefit of option 1 for failure probability, $p_f$, slightly greater than $2\times10^{-3}$ when $N_{ta}=4000$.

These latter results can be understood in light of two observations. First, when the failure probability, $p_f$, increases, the expected cost of tank replacement due to in-service failures increases, and the expected cost of contract cancelation increases. By choosing the appropriate probabilities of failure, $p_{ft}$ and $p_{fm}$, options 2 and 3 can diminish those costs without incurring too much added cost. Second, as the number of tanks owned by the MysteryLiquid, Co., increases, the costs of pretesting all tanks in option 2, or building an FE model and calibrating and validating it in option 3, become insignificant relative to the net benefit. These conclusions are likely to hold for all similar private businesses.

The final analysis summarized is one in which $p_f=2.30\times10^{-3}$ and $N_{tank}=4000$. During optimization in option 2, the ensemble failure probability is $p_{ft}=1.61\times10^{-3}=0.7p_f$. During optimization in option 3, the ensemble failure probability is $p_{fm}=2.07\times10^{-3}=0.9p_f$ and the effective failure probability is $p_{fm,eff}=2.42\times10^{-4}=0.11p_f$ (Eq. (31)). The optimum numbers of nondestructive and destructive calibration tests and validation tests are, respectively, $n_1=8,\ n_2=4,\ n_3=4$. These yield a model quality of $q_{mod}=0.96$. The net benefit results are

Although a comprehensive investigation of sensitivity of the net benefit and other outcomes to variation of parameters has not been conducted, a few additional analyses lead to the following conclusions for the problem described in the previous paragraph.

When $\alpha $ in Eq. (31) is reduced by 20%, net benefit is reduced by about 1%. (Parameter $\alpha $ near one reflects optimism that the model can be used to accurately reduce the tank probability of failure.)

When any of the model parameters $b_i,\ i=1,2,3$ is increased to the next higher integer, the number of calibration tests, $n_i$, increases, but the net benefit remains essentially unchanged. (The reason is that the costs of calibration and validation tests are a small fraction of the net benefit.)

This paper has explored techniques for decision analysis using economic benefit–cost analysis of model validation and nonmodeling alternatives. The development was performed in the framework of the Challenge Problem defined in Ref. [1]. The net benefit for FE model development along with the calibration and validation testing that support it was compared to the net benefits for a “no additional testing and no additional modeling” option, and a “testing and no additional modeling” option.

When the benefit–cost problem parameters are those specified in the Challenge Problem, i.e., 450 tanks with ensemble probability of failure of $1.61\times10^{-5}$ (inferred during our analysis of the Challenge Problem data), then the optimal alternative is option 1, no additional testing and no additional modeling. That is, economics principles dictate that the course of action for the MysteryLiquid, Co., with greatest net benefit is to continue business without additional testing and without an FE model with its associated testing. However, as the probability of failure increases and the number of tanks increases, the FE model with calibration and validation testing becomes the preferred alternative. This conclusion hinges on our assumptions about the accuracy potential of an FE model discussed in Sec. 5 and Eq. (31) specifically.

Although the benefit–cost analyses performed in this paper consider a single scenario and, even then, a very limited set of costs, they yield some conclusions that are generally applicable:

(1) There are scenarios where neither testing nor validation should be done (based on economic principles).

(2) As the number of nominally identical units maintained by a business increases, the cost of development of a model for the behavior of those units becomes more economical.

(3) As the probability of failure of an ensemble of nominally identical units maintained by a business increases, the cost of development of a model for the behavior of those units becomes more economical.

The FE model described in this paper is one that would be calibrated using data from tanks in the ensemble of tanks owned by the MysteryLiquid, Co. Then, the FE model would be validated against the behavior of a sample of tanks drawn from the same ensemble. Following calibration and validation, the model would be suitable to make probabilistic predictions of tank responses. In order for the FE model to be an effective tool for the improvement of the reliability of the ensemble of tanks, though, we require more. We require that the FE model be suitable for a second calibration (i.e., recalibration), one that permits prediction of the behavior of a specific tank. That prediction would be used to gauge the vulnerability of tanks in the field and replace tanks that fail to satisfy some criteria. This action would decrease the probability of failure of the ensemble of tanks. This reduction in probability of failure would be a direct effect of use of the FE model and would diminish costs to the MysteryLiquid, Co. The requirement of an FE model described here may be one that extends to all companies that might benefit from possession and use of an “in-house” FE model of an ensemble of systems that are nominally identical.

We have demonstrated that it is relatively easy to assess the beneficial effect of pretesting on the probability of failure of an ensemble of structures; it is much more difficult to assess that same effect of pre-analysis with an FE model, especially when the FE model does not yet exist and must first be constructed, calibrated, validated, and recalibrated. In order to develop the net benefit connected with construction and use of an FE model, some measure of model quality and the improved capability of an FE model to predict vulnerability must be specified. That clearly requires knowledgeable personnel experienced in modeling, data analysis, and stochastic model analysis.

- $a_i,\ i=1,2,3$ =
shape parameters of the model quality model

- $b_i,\ i=1,2,3$ =
scale parameters of the model quality model

- $B$ =
benefit

- $B_{net}$ =
net benefit

- $C_{ca}$ =
total cost of calibration

- $C_{cu}$ =
cost of cleanup

- $C_{dc}, C_{nd}$ =
cost of a destructive, nondestructive calibration test

- $C_{ex}$ =
extra costs

- $C_{fF,1}, C_{fF}(N_{mo})$ =
cost of failures during one, $N_{mo}$ time periods

- $C_{fM}, C_{fT}$ =
cost of failures due to pre-analysis, pretest

- $C_{mc}, C_{mr}$ =
cost of model construction, model run

- $C_{pm}$ =
total costs of recalibration and model runs

- $C_{ta}$ =
cost of tank replacement

- $C_{c2}$ =
cost of performing a recalibration test

- $C_{v1}, C_{va}$ =
cost of one validation test and a prediction-test comparison, total cost of validation

- $Ei(j)$ =
event describing consecutive states of $$