What is calibration in IGW-NET?

Calibration is the process of adjusting model parameters so that the simulation matches observed data as closely as possible. In IGW-NET, the most common calibration target is Static Water Levels (SWLs) — measured water-table elevations from wells. The model starts with reasonable initial parameters (K, recharge, leakance values from the Data Center or user input), and calibration adjusts those parameters — typically via multipliers (K × 0.3, Recharge × 1.1, etc.) — until simulated heads at well locations match observed heads within acceptable tolerance. Calibration can be done manually (you try values, look at the calibration chart, iterate) or automatically via the Automatic Calibration tool powered by UCODE parameter estimation. Either way, the goal is the same: produce a model whose parameters are consistent with observed reality, not just with theoretical defaults.

What is a Static Water Level (SWL)?

A Static Water Level is a measurement of the water-table elevation at a well, taken when the well is not being pumped (so the water level has returned to its equilibrium or 'static' position). SWLs are recorded routinely when water wells are drilled — drillers measure and report the water level after drilling is complete. Across North America, state and provincial databases (Michigan Wellogic, USGS NWIS, Canadian provincial records) contain millions of SWL records. These records are the primary calibration target for regional groundwater models — they tell you what the water-table elevation actually is at thousands of points across the domain, and your simulated model should reproduce those values to within acceptable tolerance. IGW-NET accesses SWL records through the IGWServer data source; the Well Data Processing Tool lets you filter which SWLs to use as calibration observations.

What are the K multiplier and Recharge multiplier in IGW-NET?

The K multiplier and Recharge multiplier are global scalar adjustments applied to the entire K field and entire recharge field, respectively. They are IGW-NET's simplest calibration knobs. K multiplier 0.3 means every K value in the model is multiplied by 0.3 (the aquifer becomes 3.3× tighter); Recharge multiplier 1.1 means the recharge input is multiplied by 1.1 (10% more water entering the aquifer). The multipliers are configured in DomainAttr → Aquifer Attributes → Multiplying Factors with 'Calib' checkboxes that flag them as calibration parameters. You can set the multipliers manually (try 0.3, simulate, evaluate; try 0.5, simulate, evaluate) or have the Automatic Calibration tool search for optimal values. Critically, K multiplier and Recharge multiplier are not independently identifiable from head observations alone — the head field responds to the ratio K/R, so changing K and R proportionally produces the same heads (K-recharge non-uniqueness). To separate them, you need baseflow observations, lake stage observations, or other data that constrains the absolute magnitudes.

How do I read the calibration chart?

The calibration chart is a scatter plot of observed water levels (x-axis) versus simulated water levels (y-axis) at the observation well locations. A well-calibrated model has its scatter cloud centered on the 45-degree line where observed equals simulated, with symmetric scatter around that line. Signs to look for: (1) If the cloud is centered on the 45° line → model is unbiased; the parameters are approximately correct on average. (2) If the cloud is shifted above the 45° line → simulated heads are systematically higher than observed; model is over-predicting water levels; typical fix is to increase K multiplier or decrease Recharge multiplier. (3) If shifted below the 45° line → simulated is systematically lower; decrease K or increase Recharge. (4) If the cloud is tight around the line → low scatter, good fit. (5) If the cloud is very spread out → high scatter; the parameters don't explain the observations well, indicating either wrong parameter values, missing processes, or observation errors. The customizable confidence band overlay on the chart shows the statistical spread; a well-calibrated model has most of its data points inside the band.
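The chart-reading rules above can be sketched as a small diagnostic. This is an illustrative sketch only, not IGW-NET code: the function name `diagnose_chart`, the band half-width, the bias tolerance, and the sample heads are all assumptions made for the example.

```python
from statistics import mean

def diagnose_chart(observed, simulated, band=1.0, bias_tol=0.25):
    """Summarize a calibration scatter: bias direction and share of points in the band.

    band and bias_tol are in head units (e.g. meters); both are illustrative values.
    """
    residuals = [s - o for o, s in zip(observed, simulated)]  # simulated minus observed
    bias = mean(residuals)
    inside = sum(1 for r in residuals if abs(r) <= band) / len(residuals)
    if bias > bias_tol:
        hint = "cloud above 45-degree line: try increasing K or decreasing Recharge"
    elif bias < -bias_tol:
        hint = "cloud below 45-degree line: try decreasing K or increasing Recharge"
    else:
        hint = "cloud centered on 45-degree line: no systematic bias"
    return bias, inside, hint

# Made-up heads (m) in which every simulated value sits above its observation.
obs = [250.0, 252.5, 255.0, 257.0]
sim = [251.2, 253.4, 256.3, 258.1]
bias, inside, hint = diagnose_chart(obs, sim)
print(round(bias, 3), inside, hint)
```

In this made-up case the mean residual is positive and only one of four points falls inside the band, so the hint points toward a higher K multiplier or a lower Recharge multiplier.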

What is Automatic Calibration in IGW-NET?

Automatic Calibration is an IGW-NET tool that runs parameter estimation automatically using UCODE — the USGS universal inverse code. Instead of manually trying parameter values and evaluating the calibration chart, you specify which parameters to calibrate (check the Calib boxes in Multiplying Factors: K multiplier, Recharge multiplier, etc.), provide observations (SWLs from the IGWServer or user-uploaded), and launch Automatic Calibration. UCODE runs many forward simulations with different parameter values, computes the objective function (sum of squared residuals between simulated and observed heads) for each, and uses a gradient-based optimization algorithm to find the parameter set that minimizes the objective. The output is the optimized multipliers plus diagnostic statistics (parameter uncertainty, sensitivity, correlation). You can run Automatic Calibration on the basic K and Recharge multipliers, or extend it to additional parameters (leakance multipliers, per-zone K factors for T-PROGS zones, etc.) if the observational data supports the added complexity. UCODE handles the optimization math; the modeler handles the problem setup and interpretation.
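To make the idea of gradient-based parameter estimation concrete, here is a minimal sketch of a one-parameter Gauss-Newton search with perturbation sensitivities. It is not UCODE and does not reproduce UCODE's algorithm or interface; the toy forward model (steady heads in a 1D strip aquifer between two fixed-head boundaries), the function names, and all numeric values are assumptions chosen for illustration.

```python
def forward(r_mult, xs, K=10.0, b=20.0, R0=0.001, L=1000.0, h_bc=100.0):
    """Toy forward model: steady heads in a 1D strip aquifer between two fixed heads.

    r_mult is a Recharge multiplier applied to the base recharge R0 (m/d).
    """
    T = K * b  # transmissivity, m^2/d
    return [h_bc + r_mult * R0 / (2.0 * T) * x * (L - x) for x in xs]

def sse(r_mult, xs, observed):
    """Objective function: sum of squared residuals (simulated - observed)."""
    return sum((s - o) ** 2 for s, o in zip(forward(r_mult, xs), observed))

def gauss_newton(r_mult, xs, observed, iterations=5, rel_step=0.01):
    """Estimate one multiplier with Gauss-Newton and forward-difference sensitivities."""
    for _ in range(iterations):
        base = forward(r_mult, xs)
        bumped = forward(r_mult * (1.0 + rel_step), xs)
        # Sensitivity of each simulated head to the multiplier.
        sens = [(p - s) / (r_mult * rel_step) for p, s in zip(bumped, base)]
        resid = [o - s for o, s in zip(observed, base)]
        # Least-squares update for a single parameter.
        r_mult += sum(j * r for j, r in zip(sens, resid)) / sum(j * j for j in sens)
    return r_mult

wells_x = [100.0, 300.0, 500.0, 700.0, 900.0]   # observation well positions (m)
observed = forward(1.3, wells_x)                # synthetic "observations"
best = gauss_newton(1.0, wells_x, observed)     # start from multiplier 1.0
print(best, sse(best, wells_x, observed))
```

Each iteration costs two forward runs here (base plus one perturbation); with more parameters, a perturbation-based scheme needs roughly one extra forward run per parameter per iteration, which is why the number of calibrated parameters drives run time.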

Why can't I calibrate K and Recharge independently from head observations alone?

Because steady-state heads depend on K and R only through their ratio, not on K and R separately. Dividing the steady-state flow equation by K leaves R/K as the only place either parameter appears. So for a steady-state model with only head observations as calibration targets, any parameter set (K, R) that produces the correct heads can be scaled: (α·K, α·R) produces identical heads for any positive α (with boundary heads held fixed and any leakance terms scaled by the same factor). This is K-recharge non-uniqueness: an infinite family of (K, R) pairs all fit the same head observations equally well. Head data alone cannot distinguish them. To break this non-uniqueness, you need observations that depend on the absolute magnitudes, not just the ratio: (1) Baseflow observations from stream gauges: the flux of water discharging to streams scales with K·(head gradient), so given heads, baseflow tells you K independently. (2) Coupled lake or wetland stage observations: the lake water balance depends on fluxes, which fix K. (3) Pump test observations: aquifer responses to known pumping rates constrain absolute K. (4) Specific yield from transient data: storage responses constrain K independently of R. Without any of these, calibration can only nail down the K/R ratio, not K and R separately. This is why the Barron Lake coupled model calibrates against both well heads AND lake stages: the lake stages are what pin down the absolute K (and lake leakance) rather than just the K/R ratio.
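The scaling argument above can be checked numerically. The sketch below uses the textbook analytical solution for a 1D strip aquifer with uniform recharge between two equal fixed-head boundaries; the geometry, parameter values, and function names are assumptions for illustration and are not taken from IGW-NET.

```python
import math

def steady_heads(K, R, xs, b=20.0, L=1000.0, h_bc=100.0):
    """Steady heads between two fixed-head boundaries with uniform recharge R."""
    T = K * b  # transmissivity, m^2/d
    return [h_bc + R / (2.0 * T) * x * (L - x) for x in xs]

def baseflow(K, R, b=20.0, L=1000.0):
    """Discharge to the two boundary streams per unit width, from Darcy's law."""
    T = K * b
    gradient_at_edge = R * L / (2.0 * T)   # |dh/dx| at x = 0 and at x = L
    return 2.0 * T * gradient_at_edge      # equals R * L: all recharge leaves as baseflow

xs = [0.0, 250.0, 500.0, 750.0, 1000.0]
heads_a = steady_heads(K=10.0, R=0.001, xs=xs)   # base parameter set
heads_b = steady_heads(K=30.0, R=0.003, xs=xs)   # both scaled by alpha = 3

same_heads = all(math.isclose(a, b) for a, b in zip(heads_a, heads_b))
flow_a = baseflow(K=10.0, R=0.001)
flow_b = baseflow(K=30.0, R=0.003)
print(same_heads, flow_a, flow_b)
```

The two parameter sets give the same heads but baseflow that differs by the scaling factor, which is why a flux observation can separate K from R where heads cannot.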

What does multi-target calibration mean in IGW-NET?

Multi-target calibration fits the model to multiple different types of observations simultaneously — not just head SWLs at wells, but also lake stages (if Ch. 15 coupled lakes are active), baseflow rates, observed spring discharges, or other hydrologic data. The calibration objective function becomes a weighted sum of residuals across all observation types. Multi-target calibration is essential for coupled models (Ch. 15) because single-target calibration against heads alone suffers from K-recharge non-uniqueness; adding lake stages (which depend on fluxes, not just heads) breaks the non-uniqueness. The Barron Lake calibration is a canonical example: the UCODE run simultaneously fits 11 private well head observations AND the observed lake-level time series from LakeLevel2013.csv. Two observation types, different sensitivities to different parameters: well heads constrain aquifer K distribution; lake stages constrain lake leakance and the absolute-magnitude K (breaking K-R non-uniqueness). The jointly-calibrated model is far more trustworthy than either target alone would allow. Multi-target is the default approach for any coupled model or any model where the question depends on absolute rates rather than just heads.
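A weighted multi-target objective can be sketched as follows. Weighting each residual by 1/σ, where σ is the assumed measurement uncertainty of that observation type, is a common choice but is an assumption here, not a statement of how IGW-NET or UCODE assign weights. The group names and all numbers are made up for illustration and are not Barron Lake data.

```python
def weighted_sse(groups):
    """Weighted sum of squared residuals across observation groups.

    groups maps a label to (observed, simulated, sigma); each residual is
    divided by sigma, so the weight on its square is 1 / sigma**2.
    """
    total = 0.0
    contributions = {}
    for name, (observed, simulated, sigma) in groups.items():
        part = sum(((s - o) / sigma) ** 2 for o, s in zip(observed, simulated))
        contributions[name] = part
        total += part
    return total, contributions

groups = {
    # (observed, simulated, sigma); heads in m, sigma = assumed uncertainty in m
    "well_heads": ([250.0, 252.0, 255.0], [250.6, 251.5, 255.4], 0.5),
    "lake_stage": ([248.10, 248.15, 248.05], [248.12, 248.11, 248.08], 0.02),
}
total, parts = weighted_sse(groups)
print(total, parts)
```

With these illustrative numbers the centimeter-scale lake-stage misfits contribute more to the objective than the decimeter-scale well misfits, because lake stage is assumed to be measured far more precisely.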

What are common calibration pitfalls?

Several patterns repeatedly cause trouble: (1) Over-fitting with too many parameters. If you calibrate 15 parameters with only 20 observations, the calibration is poorly constrained: many parameter sets fit the data almost equally well, and the 'best' one is arbitrary. Parameter count should be ≤ 10-20% of observation count. (2) Using bad observations. SWLs from 1950 may not reflect current conditions, and raw databases also contain wells that were pumping when measured, wells in the wrong aquifer layer, and transcription errors in driller reports. Filter ruthlessly through the Well Data Processing Tool. (3) K-recharge non-uniqueness. Calibrating K and R against heads only and then claiming the specific K value is physical is a mistake: you have only constrained the ratio. Use multi-target calibration with baseflow or lake stages. (4) Local minima in auto-calibration. UCODE finds a local minimum of the objective function, which may not be the global minimum. Try different starting points; watch for unphysical parameter values. (5) Systematic residual patterns. If all wells in one region are biased the same way, the parameter adjustment isn't addressing the real cause (could be unmodeled recharge variation, local heterogeneity, BC issues). Look at residuals on the map, not just in aggregate. (6) Trusting a perfect fit. A model that matches observations perfectly is suspicious: either the observations are sparse enough to be trivially fit, or the model has been over-fit. Some residual scatter is normal and healthy.

Chapter 18 — Calibration | IGW-NET Users' Manual

Q: Is a calibrated model always a correct model?

No — and this is the single most important thing to understand about calibration. A calibrated model matches observations, but if the conceptual structure is wrong (wrong boundary conditions, missing features, wrong layering, wrong framework for the aquifer system), calibration will push parameters to unphysical values that compensate for the structural errors. The model fits the calibration data by coincidence of compensating errors; it matches the past but fails under prediction because the compensations don't carry over to new conditions. In the worst case, a carefully-calibrated structurally-wrong model is more dangerous than an un-calibrated one because the good calibration statistics give false confidence. The right workflow is: structure first, calibration second. Get the conceptual model right (layering, BCs, features, zonation) before turning to calibration. IGW-NET's real-time modeling, analysis, and visualization are designed around this — users are expected to spend most of their modeling time iterating on structure with immediate visual feedback, and to calibrate only after the conceptual model is defensible. Calibration refines a sound model; it cannot rescue a flawed one.

The quick read — 90 seconds

Structure first, calibration second. A calibrated model is not necessarily a correct model. If the conceptual structure is wrong (wrong BCs, missing features, wrong layering, wrong aquifer-system framework), calibration will compensate by pushing parameters to unphysical values that fit the observations — the model matches data but fails to predict. IGW-NET's real-time modeling, analysis, and visualization are designed so users spend most of their time building a structurally sound conceptual model; calibration refines a sound model, it doesn't rescue a flawed one.
Calibration means adjusting parameters so simulated heads match observed heads. Before calibration: reasonable guesses. After calibration: parameters consistent with observed reality. The model becomes defensible by being fit to data — provided the structure was sound to begin with.
Primary observation type: Static Water Levels (SWLs) — water-table elevations measured at wells, available from regional well databases (Michigan Wellogic, USGS NWIS, Canadian provincial records) through the IGWServer connection. Millions of SWL records exist across North America; IGW-NET gives you access via the Well Data Processing Tool.
Filter observations ruthlessly. Bad SWLs make bad calibrations. Remove: wells being pumped when measured, wells in the wrong aquifer layer, transcription errors, decades-old records if current conditions differ. Use the Well Data Processing Tool to sample, spatial-filter, and quality-filter before calibrating.
Primary parameters: the Multiplying Factors. DomainAttr → Aquifer Attributes → Multiplying Factors has checkboxes (K, Recharge, and others) marked "Calib". Each activates a scalar multiplier applied globally to that field. K × 0.3 tightens the aquifer 3.3×; Recharge × 1.1 adds 10% more water. Simple but powerful global adjustments.
The calibration chart is your primary diagnostic. Scatter plot of observed (x-axis) vs. simulated (y-axis). Well-calibrated = cloud centered on 45° line with symmetric scatter. Bias above/below = parameter adjustments needed. Wide scatter = model doesn't explain observations well.
Manual calibration loop: try multiplier values → simulate → view chart → adjust → repeat. Useful for building intuition and handling simple cases.
Automatic Calibration uses UCODE (USGS universal inverse code) for gradient-based parameter estimation. You flag parameters with Calib checkboxes, provide observations, launch Auto-Calibration; UCODE searches for the parameter set that minimizes the objective function (sum of squared residuals). Outputs optimized multipliers plus sensitivity/uncertainty diagnostics.
K-recharge non-uniqueness: head observations alone cannot separate K from Recharge. The head field depends on K/R, not on K and R individually. To fix the absolute magnitudes, you need additional observations that depend on fluxes — baseflow rates, lake stages, pump test responses.
Multi-target calibration fits multiple observation types simultaneously — well heads AND lake stages, heads AND baseflow. Essential for coupled models; breaks K-recharge non-uniqueness. The Barron Lake case calibrates against 11 private well SWLs plus observed lake level time series jointly.
Pitfalls: overfitting with too many parameters; bad observations passed through; claiming K is physical when only K/R is constrained; local minima in auto-cal; systematic residual patterns that signal real model problems; trusting a "perfect" fit.

18.1 What Calibration Is — and What "Calibrated" Means

Every groundwater model makes assumptions about parameters that are not directly measured everywhere in the domain. Calibration is the process of constraining those assumptions with observations, producing a model whose parameters are demonstrably consistent with observed reality rather than just with initial guesses.

18.1.1 Why calibrate

Consider a model at the start of a project. The aquifer K comes from the Data Center default (or from T-PROGS class K values from Ch. 17); the recharge comes from a climate raster; the stream leakance is the IGW-NET default for that stream order (Ch. 14). None of these values were measured at your specific site. They are starting points drawn from typical values and regional defaults — educated guesses that may be off by factors of 2, 5, or 10.

Without calibration, the model's output is an educated guess about the aquifer's behavior, built from educated guesses about the parameters. With calibration, the parameters have been adjusted until the model's simulated heads match observations at specific well locations. The model now has empirical grounding: the parameters are consistent with measured reality at those locations, and by extension (with appropriate caution) at locations between them.

18.1.2 What "calibrated" really means

A calibrated model is not a perfect model. It is a model whose parameters have been adjusted so that simulated values match observations within acceptable tolerance, where the tolerance depends on the question being asked.

For a regional water-supply assessment, matching heads to within a few meters may be adequate — regional-scale decisions don't depend on local-scale precision.
For a wellhead protection analysis, local head gradients matter more; tolerance might be tenths of meters.
For a contaminant transport study, local velocities are critical; tolerance should match the local gradient structure.
For a coupled lake-aquifer model (Ch. 15), lake stages need to match to within centimeters to constrain the water balance meaningfully.

"Calibrated" is a statement about fit quality relative to the problem's requirements, not a binary yes/no. Two calibrations can both be legitimate: one matching heads to 5 m tolerance for regional work, another matching heads to 0.2 m for a specific well-capture question. Both are calibrated — to different levels of precision, for different purposes.

The calibration question is a scope question

Before calibrating, ask: what question does this model need to answer, and what tolerance does that question require? Then calibrate to that tolerance. Chasing arbitrary precision — matching every well to within a centimeter — is usually overfitting or wasted effort. Matching only regional averages when local details matter is under-calibration. The right target is set by the problem, not by the calibration algorithm's capabilities.

18.1.3 Structure first, calibration second — why conceptual soundness matters more

A calibrated model is not necessarily a correct model. This is the single most important thing to understand about calibration, and the reason IGW-NET's whole interaction paradigm is designed the way it is.

Here is the failure mode. Suppose the conceptual model is structurally wrong — the layering scheme is inappropriate, a critical stream or lake is missing, the boundary conditions are misplaced, the fundamental framework for the aquifer system doesn't match reality. You run the model; simulated heads don't match observed heads. You turn to calibration. You adjust multipliers; you let UCODE search; eventually you find a parameter set that produces a well-behaved calibration chart with the scatter cloud centered on the 45° line. The RMSE looks good; the bias is gone.

What actually happened: the calibration found parameter values that compensate for the structural flaws. K was pushed unphysically low to make up for a missing constant-head boundary; recharge was inflated to balance a missing lake; leakance was tuned to mask inadequate layering. The parameters no longer represent real aquifer properties — they represent the combined effects of real properties plus structural error. The model fits the calibration data by coincidence of compensating errors.

The consequence: when you use this "calibrated" model for prediction — a new pumping scenario, a future climate condition, a contaminant spill not represented in the calibration observations — the prediction is wrong. The compensating errors that made the calibration fit don't necessarily compensate correctly under the new conditions. The model was fit, not understood; it matched the past, but cannot forecast the future. In the worst cases, a carefully-calibrated structurally-wrong model is more dangerously wrong than an un-calibrated one because the good calibration statistics give false confidence in its predictions.

Sound conceptual model, then calibration

Calibration refines a structurally sound model; it cannot rescue a structurally flawed one. Parameters tuned to compensate for structural errors match the calibration data by coincidence and fail under prediction. Investing heavily in structural soundness before calibration is not a luxury — it is a prerequisite for a predictive model.

The right order: get the conceptual model right first. Only then calibrate. Time spent on structural understanding pays back multiple times over during calibration and prediction; time spent on calibration before structural understanding is often wasted or worse than wasted.

Why IGW-NET is designed the way it is

IGW-NET's core interaction design — real-time modeling, interactive analysis, instantaneous visualization — is built around this principle. Most of the user's modeling time is intended to be spent building a structurally sound conceptual model, not on parameter-fitting mechanics. The platform supports this by:

Real-time solution response. Change the layering, change a boundary condition, add a stream, move a zone boundary — and the flow field updates in real time. You see immediately whether the change matters, and in what way. This is conceptual-model iteration at the speed of thought.
Interactive analysis tools. Cross-sections, particle tracks, water-budget reports, flow-net visualizations — all available inline, not after a batch post-processing step. The modeler can interrogate the model's structural choices rapidly.
Visualization that reveals structure. Head contours and velocity vectors show where the model is doing what. Anomalies (unexpected flow direction, unphysical head gradients) surface quickly and point back to structural issues that need addressing.
The Data Center as structural scaffolding. Authoritative elevation, hydrography, climate, geology, and borehole data are available in a click. Structural choices (where do streams go? what are the layer depths? what's the recharge pattern?) start from defensible Data Center defaults rather than rough guesses.
Calibration arriving late in the workflow. The calibration tools (Calib checkboxes, Automatic Calibration, UCODE) are part of the standard feature set but are introduced only after the structural tools. The workflow's natural sequence is: build, explore, refine structure; then calibrate.

Contrast this with traditional batch-style modeling workflows, where setting up the model takes days or weeks of file preparation, running it takes hours, and visualization requires separate post-processing tools. In that environment, the temptation is to freeze the structure early (so the files don't have to be rebuilt) and spend most of the project's time on parameter calibration — precisely the workflow that produces well-calibrated, structurally-questionable models. IGW-NET inverts this: real-time interactivity collapses the cost of structural iteration, so structural exploration is cheap and calibration becomes the refinement step it should be.

Throughout the rest of this chapter, this principle is in the background. When §18.9 warns about "systematic residual patterns," what it really means is: residual patterns often signal structural problems that calibration can't fix. When §18.9 warns about "trusting a perfect fit," what it really means is: a perfect fit on a structurally-flawed model is a more dangerous thing than an imperfect fit on a sound one. Structure first; calibration second.

18.1.4 The anatomy of a calibration

Every calibration workflow, whether manual or automatic, has the same structure:

1. Identify observations

What data does the model need to match? Most often: SWLs from wells. Sometimes also: lake stages, baseflow rates, spring discharges, pump-test drawdowns.

2. Identify parameters to adjust

Which model inputs will be changed in calibration? Typically the K multiplier and Recharge multiplier (IGW-NET's primary calibration knobs). For multi-zone T-PROGS models (§17.5), the per-zone class K values. For coupled lakes, the lake leakance.

3. Run the simulation with current parameter values

Produce simulated heads (and other outputs) at the observation locations.

4. Compute residuals

Residual = simulated − observed at each observation location. Positive residual = model over-predicts; negative = under-predicts.

5. Evaluate fit quality

Statistics (RMSE, R², mean bias) and visualizations (the calibration chart) show how well the model fits overall, and whether there are systematic patterns in the residuals.

6. Adjust parameters

Either manually (try a different K multiplier) or automatically (let UCODE search). Go back to step 3. Iterate until fit is acceptable.

The rest of this chapter covers these steps in detail.
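The residual and fit-quality steps can be sketched in a few lines. This is a minimal illustration, not IGW-NET's reporting code: the function name is hypothetical, the heads are made-up numbers, and R² is computed here as the coefficient of determination (1 − SS_res/SS_tot), which is one common definition and is assumed rather than confirmed for IGW-NET.

```python
import math

def fit_statistics(observed, simulated):
    """Residuals (simulated - observed), mean bias, RMSE, and R^2 for one run."""
    n = len(observed)
    residuals = [s - o for o, s in zip(observed, simulated)]
    bias = sum(residuals) / n                      # > 0 means over-prediction on average
    ss_res = sum(r * r for r in residuals)
    rmse = math.sqrt(ss_res / n)
    obs_mean = sum(observed) / n
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot                     # coefficient of determination
    return {"residuals": residuals, "bias": bias, "rmse": rmse, "r2": r2}

observed_heads = [10.0, 12.0, 14.0, 16.0]    # illustrative heads (m)
simulated_heads = [11.0, 12.0, 13.0, 17.0]
stats = fit_statistics(observed_heads, simulated_heads)
print(stats)
```

Aggregate statistics like these hide spatial structure, so they complement rather than replace a look at where the positive and negative residuals fall on the map.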

18.2 Observations — Static Water Levels and Beyond

Calibration needs observations. The primary observation type for regional groundwater modeling is the Static Water Level (SWL) — a measurement of the water-table elevation at a well when the well is not being pumped. IGW-NET provides access to millions of SWL records through the IGWServer connection, plus support for other observation types for specific needs.

18.2.1 What a Static Water Level is

When a water well is drilled, the driller measures the water level in the newly-completed well — the elevation to which water rises in the well when it is not being pumped. This is the Static Water Level, recorded along with the well's coordinates, depth, screen interval, and lithology log. SWLs are a required reporting element in most jurisdictions; regional databases accumulate them as tens of thousands to millions of records per state or province over decades of well drilling.

For groundwater modeling, SWLs are the most abundant, geographically distributed, and directly comparable type of observation available. They directly measure what the model computes — water-table elevation — at point locations across the domain. A typical regional model might use hundreds to thousands of SWLs scattered across the model domain as the calibration target set.

18.2.2 SWL sources in IGW-NET

The IGWServer connection provides access to regional SWL databases:

United States: state water-well databases (Michigan Wellogic and similar databases for the other covered states, approximately 10 states in total) plus USGS NWIS records
Canada: all provincial databases
Other regions: users supply SWL data in CSV format with coordinates, elevations, and dates

The same regional databases that provide borehole lithology for T-PROGS (Ch. 17 §17.6) also provide SWLs for calibration. A single data request typically returns both the lithology logs and the SWL measurements for the same wells.

18.2.3 Other observation types

SWLs are primary, but not the only calibration target. Other observation types used in IGW-NET calibration:

Observation type	What it constrains	Typical source
Lake stage time series	Lake leakance; absolute-magnitude K (breaks K-R non-uniqueness); lake water balance components	Gauge records, LakeLevel CSVs (like LakeLevel2013.csv in the Barron Lake case)
Baseflow / stream discharge	Absolute-magnitude K; stream leakance; recharge rate	USGS stream gauges; baseflow separation from flow records
Pump test drawdown	Local K and storage parameters; anisotropy	Site-specific aquifer tests; published regional pump tests
Spring discharge	Discharge rate from specific springs; regional flow components	Site measurements; hydrologic monitoring programs
Transient head time series	Specific storage and specific yield; seasonal response; stress response	Monitoring well hydrographs; agency water-level networks

Multi-target calibration (§18.8) uses several of these simultaneously. A coupled lake model calibrates against both SWLs and lake stages; a regional water-budget model might combine SWLs, baseflow, and spring discharge.

18.3 Filtering Observations — The Well Data Processing Tool

Not every SWL is a good calibration observation. Raw databases contain records of varying quality — wells that were being pumped when measured, wells screened in the wrong aquifer layer, transcription errors, historical records that don't reflect current conditions. Uncritical use of raw SWL data produces bad calibrations. The Well Data Processing Tool lets you filter aggressively before calibrating.

18.3.1 The setup dialog

[Figure: Calibration Chart setup with IGWServer data source selected, Server Data Filters dialog open, and the Well Data Processing Tool dialog showing random sampling of 10,000 wells configured.]

Figure 18.1. Setting up the calibration observations. IGWServer as the data source; Server Data Filters dialog for query-level filtering (by region, by date range, by well type); Well Data Processing Tool for post-query filtering (random sampling, spatial density, quality filters). This example shows a random sample of 10,000 wells being configured as calibration observations.

18.3.2 What to filter, and why

Sensible observation filtering removes or adjusts records in these categories:

Wells being pumped when measured. A pumped well's water level is drawn down and does not represent the static regional water table. Unless the driller's report confirms quiescent conditions, default to excluding clearly-active wells.
Wells screened in the wrong aquifer layer. In multi-layer models (Ch. 10), a deep well in a confined aquifer should be compared against simulated heads in the appropriate layer, not the unconfined surface layer. Depth filtering ensures each SWL is compared against the right simulated head.
Very old records. An SWL measured in 1965 may not reflect current conditions if the aquifer has been substantially pumped or climate has shifted. For current-condition modeling, filter to recent records (last 10-20 years, typically). For historical modeling, use date-matched observations.
Obvious errors. SWLs reported as -99, 0, or exactly the land surface elevation are usually placeholder codes or data-entry problems. Filter out records with implausible values.
Duplicate or clustered wells. A dense cluster of wells at one location doesn't provide independent calibration information — they all measure the same local head. Random sampling (sampling tool options) or spatial thinning spreads observations across the domain.
Wells in known-anomalous conditions. Wells near heavy pumping centers, near injection wells, or in specifically-disturbed hydrologic settings may not represent the model's assumed conditions. Exclude these from regional calibration; consider them separately.
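The filter categories above can be expressed as a simple a-priori screening function. This is a sketch under stated assumptions: the record field names (`swl_elev`, `ground_elev`, `pumping_when_measured`, `measured_on`, `screen_elev`), the cutoff date, and the layer elevations are all hypothetical and do not reflect the Well Data Processing Tool's actual fields or options.

```python
from datetime import date

def keep_swl(record, min_date=date(2005, 1, 1), layer=None):
    """Return True if an SWL record passes basic a-priori quality checks.

    layer is an optional (bottom_elev, top_elev) tuple for the target model layer.
    """
    swl = record["swl_elev"]
    if swl is None or swl in (-99, 0):
        return False                  # missing-value or placeholder codes
    if swl == record["ground_elev"]:
        return False                  # SWL identical to land surface: likely a default entry
    if record["pumping_when_measured"]:
        return False                  # drawn-down level, not a static level
    if record["measured_on"] < min_date:
        return False                  # too old for current-condition modeling
    if layer is not None:
        bottom, top = layer
        if not (bottom <= record["screen_elev"] <= top):
            return False              # screened outside the layer being compared
    return True

def make(rec_id, swl, pumping=False, when=date(2014, 6, 2), screen=240.0, ground=260.0):
    """Build one hypothetical well record for the demonstration."""
    return {"id": rec_id, "swl_elev": swl, "ground_elev": ground,
            "pumping_when_measured": pumping, "measured_on": when,
            "screen_elev": screen}

records = [
    make("A", 251.3),
    make("B", -99),
    make("C", 249.0, pumping=True),
    make("D", 255.0, when=date(1965, 8, 1)),
    make("E", 230.0, screen=150.0),
    make("F", 260.0),
]
kept = [r["id"] for r in records if keep_swl(r, layer=(200.0, 270.0))]
print(kept)
```

Every test in this sketch uses information known about the well before any simulation is run; none of them looks at how well the model fits the record.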

18.3.3 The random sampling option

A common filtering approach: from a dense regional database with 50,000+ SWL records, randomly sample 5,000–10,000 for calibration use. This:

Spreads observations approximately uniformly across the domain
Avoids over-representing areas with historically heavier drilling
Keeps the calibration computation tractable (10,000 observations × many parameter sets is manageable; 50,000 × many is slow)
Provides plenty of observations to constrain modest parameter counts (10 parameters against 10,000 observations is 0.1%, far inside the 10-20% rule of thumb)

The random sample should be reproducible — IGW-NET uses a seed so re-running the same configuration produces the same sample. This lets you iterate the calibration meaningfully.
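Seeded sampling of this kind can be illustrated with Python's standard library. This is only a sketch of the general idea; the function name, well ID format, and seed value are assumptions, and IGW-NET's own sampling procedure may differ.

```python
import random

def sample_wells(well_ids, n, seed=12345):
    """Draw a reproducible random sample of well IDs without replacement."""
    rng = random.Random(seed)          # private generator: global random state is untouched
    ordered = sorted(well_ids)         # fixed input order so the seed fully determines the draw
    return rng.sample(ordered, min(n, len(ordered)))

all_ids = [f"W{i:05d}" for i in range(50000)]   # hypothetical well identifiers
first = sample_wells(all_ids, 10000)
second = sample_wells(all_ids, 10000)           # same seed, same sample
other = sample_wells(all_ids, 10000, seed=1)    # different seed, different sample
print(first == second, first == other)
```

Keeping the seed with the calibration configuration means a change in fit between two runs can be attributed to the parameters rather than to a different draw of observations.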

Filter, don't cheat

There's a temptation to filter out SWLs that "disagree" with the model, keeping only observations that the model fits well. This is cheating, not filtering. Legitimate filtering removes observations that are known-bad (pumped wells, wrong layer, errors). Systematically removing observations the model can't match biases the calibration toward whatever the model already says; it doesn't improve the model, it hides the problem.

Use geographic or depth filtering based on a priori information about the well. Don't use residual-magnitude filtering except to spot-check specific outliers that have a non-model-related cause.

18.4 Parameters — The Multipliers

IGW-NET's primary calibration knobs are the Multiplying Factors — scalar adjustments applied globally to specific model fields. This section covers which multipliers exist, where to configure them, and how to activate them for calibration.

18.4.1 What a multiplier is

A multiplier is a global scalar applied to an entire field. K multiplier 0.3 means every K value in the model — whether from a single bulk K, from scattered-point interpolation, from T-PROGS class values, or from T-PROGS zones — is multiplied by 0.3. The spatial pattern is preserved; only the absolute magnitude changes.

Multipliers are chosen rather than per-cell parameters because:

Parametric simplicity. One multiplier per field means few calibration parameters. A model with K multiplier and Recharge multiplier calibrated has 2 parameters; calibrating every cell's K independently would be thousands of parameters. The multiplier approach is tractable; the per-cell approach is almost always under-constrained.
The spatial pattern is often known better than the magnitude. T-PROGS gives you the relative K distribution; Data Center recharge gives you the spatial recharge pattern; what's uncertain is the overall magnitude. A global multiplier captures exactly that.
Calibration uniqueness. Fewer parameters means more identifiable parameters. Each multiplier can be well-constrained by the calibration data.
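As a small illustration of "the spatial pattern is preserved; only the absolute magnitude changes", the sketch below scales a tiny K grid by a multiplier. The grid values and the helper `apply_multiplier` are hypothetical; IGW-NET applies the multiplier internally.

```python
def apply_multiplier(field, multiplier):
    """Scale every cell of a 2-D field by one global multiplier."""
    return [[value * multiplier for value in row] for row in field]

base_k = [[10.0, 50.0],
          [2.0, 25.0]]                    # hypothetical K values, m/day

scaled_k = apply_multiplier(base_k, 0.3)  # K multiplier 0.3

# Ratios between cells are unchanged, so the spatial pattern is preserved.
ratio_before = base_k[0][1] / base_k[0][0]
ratio_after = scaled_k[0][1] / scaled_k[0][0]
print(scaled_k)
print(ratio_before, ratio_after)
```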

18.4.2 The standard multipliers

IGW-NET's primary multipliers, configured in DomainAttr → Aquifer Attributes → Multiplying Factors:

Multiplier	What it scales	Typical calibration range
K multiplier	Hydraulic conductivity field (bulk K, scattered-point K, T-PROGS class K values — everywhere K is specified)	0.1 to 10 (factor of 100 spread); values outside this range suggest the base K is substantially wrong
Recharge multiplier	Groundwater recharge rate (uniform value or raster from Data Center)	0.5 to 2 (factor of 4 spread); recharge is better-constrained by climate data so large adjustments are suspicious
Surface Drain Leakance multiplier	The Level-1 DEM-drain leakance (Ch. 5 §5.8, Ch. 14 §14.2.1)	0.1 to 10; adjusts how easily the aquifer drains to the land surface
Stream / Lake leakance multipliers	Level-2 explicit feature leakances (for models with drawn streams and lakes)	0.1 to 10; adjusts SW-GW exchange strength

18.4.3 Activating multipliers with Calib checkboxes

Each multiplier has an associated Calib checkbox in the Multiplying Factors section. Checking the box flags that parameter as one the user wants to calibrate:

Calib checked, manual workflow: You adjust the multiplier value yourself, simulate, view the calibration chart, iterate. The checkbox marks it as a "calibration parameter" for reporting but doesn't automatically vary it.
Calib checked, Automatic Calibration workflow: UCODE varies this multiplier as part of the parameter search. All Calib-checked multipliers are jointly optimized; unchecked multipliers stay at their current values.
Calib unchecked: The multiplier keeps its current value (typically 1.0) and is not varied by Automatic Calibration. Use for parameters you want to fix.

Start with K and Recharge only

For most groundwater modeling, start by calibrating just the K multiplier and Recharge multiplier. Two parameters are easy to understand, easy to calibrate, and capture the dominant magnitude uncertainty in most models. Only add more calibration parameters (leakances, per-zone K) when the two-parameter calibration demonstrably can't fit the observations and you have evidence that the additional parameters are constrained by the data.

Adding parameters doesn't improve calibration quality; it just spreads the fit error around differently. Each new parameter needs supporting observations to be justified.

18.5 The Calibration Chart — Reading Observed vs. Simulated

The calibration chart is the primary visual diagnostic for calibration work. A scatter plot of observed versus simulated values, it shows immediately whether the model fits the data and, if not, in what systematic ways it's off.

18.5.1 Reading the chart

Figure 18.2. The calibration chart before calibration, shown as a scatter cloud of observed versus simulated head values. Observed heads on the x-axis; simulated heads on the y-axis. The 45° line (observed = simulated) represents a perfect fit. Left panel shows default chart settings. Right panel uses customized settings that add a confidence band overlay, a statistical spread showing the expected variability. This model's scatter cloud is visibly offset above the 45° line, indicating the model systematically over-predicts water levels. The bias pattern points toward parameter adjustments (raise K, lower Recharge).

18.5.2 What the chart patterns mean

Pattern	Interpretation	Typical fix
Scatter cloud centered on 45° line with moderate spread	Well-calibrated model; parameters are correct on average	No adjustment needed
Cloud shifted above 45° line	Model over-predicts heads; water is accumulating in the model more than observed	Increase K multiplier (lets water flow out more easily) OR decrease Recharge multiplier (less water entering)
Cloud shifted below 45° line	Model under-predicts heads; water is draining too readily	Decrease K multiplier OR increase Recharge multiplier
Tight scatter (low spread around the cloud)	Model explains the observation-to-observation variability well; head field structure is correct	Good sign
Wide scatter (high spread)	Model doesn't explain the observation-to-observation variability; head field structure is off, or observations have errors	Investigate: observation quality? Spatial patterns in residuals? Unmodeled heterogeneity?
Curved or tilted cloud shape	Non-linear mismatch; the model systematically deviates from the 1:1 line at high and low head values	Often a sign of wrong K at one end of the spectrum, or wrong boundary conditions affecting one region
Bimodal or clustered cloud	Different well populations being fit differently; possibly a regional difference the single-zone model can't capture	Consider regional zonation (Ch. 17 §17.5); investigate whether clusters correspond to geologic boundaries

18.5.3 After calibration

Figure 18.3. The calibration chart after calibration. With K multiplier 0.3 and Recharge multiplier 1.1, the data cloud is now centered on the 45° line, and the systematic bias from Figure 18.2 is removed. The confidence band is tight and encompasses most observations; some scatter remains (individual observations don't all fall on the line), but the mean is unbiased and the statistical spread is consistent with expected observation uncertainty. This is what "calibrated" looks like visually.

18.5.4 Summary statistics

In addition to the visual chart, IGW-NET reports summary statistics:

RMSE (Root Mean Square Error) — the square root of the mean squared residual; a single-number summary of fit quality. Lower is better; report what "good" means for your problem's scale.
Mean bias — mean(simulated − observed). Zero bias means no systematic over- or under-prediction; non-zero bias signals the cloud is off the 45° line.
R² — fraction of observation variance explained by the model. High R² means the model captures the observation-to-observation variability; low R² means it doesn't.
Observation count — how many observations contributed to the statistics. More observations = more trustworthy statistics.
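The statistics above can be reproduced by hand from any list of observed and simulated heads. The sketch below is illustrative: the head values are made up, and R² is computed as 1 − SS_res/SS_tot, which is one common definition; the source does not state which formula IGW-NET uses.

```python
import math

def calibration_stats(observed, simulated):
    """Return RMSE, mean bias, R^2 and observation count for paired heads."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("observed and simulated must be non-empty and equal length")
    n = len(observed)
    residuals = [s - o for o, s in zip(observed, simulated)]  # simulated - observed
    ss_res = sum(r * r for r in residuals)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return {
        "rmse": math.sqrt(ss_res / n),
        "bias": sum(residuals) / n,
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "n": n,
    }

observed = [200.0, 205.0, 210.0, 215.0]    # hypothetical SWLs, m
simulated = [201.0, 206.0, 211.0, 216.0]   # hypothetical simulated heads, m
stats = calibration_stats(observed, simulated)
print(stats)
```

In this example every well is over-predicted by exactly 1 m, so the RMSE and the mean bias are both 1 m while R² stays high: the head pattern is right and only the level is off, which is the "cloud shifted above the 45° line" case from §18.5.2.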

18.6 The Manual Calibration Loop

Manual calibration is the traditional approach — the modeler adjusts parameters, simulates, evaluates the chart, iterates. It's slower than Automatic Calibration but builds intuition and handles cases that automated tools don't handle well. For many projects, a round of manual calibration before switching to Automatic is the right sequence.

18.6.1 The loop structure

Start with defaults

Initial K multiplier = 1.0, Recharge multiplier = 1.0. Run the simulation; view the calibration chart.

Observe the systematic bias

Is the cloud above or below the 45° line? How far? Use §18.5.2 to identify the likely parameter adjustment.

Make a substantial adjustment

Don't nudge multipliers by 5%; the resulting change in heads is usually too small to distinguish from the scatter. Try K multiplier 0.5 or 2.0, Recharge multiplier 1.5 or 0.7. Re-simulate.

Re-evaluate and refine

Did the bias go down? If yes, you went in the right direction. If the bias reversed (now below the line instead of above), you overshot. Back off halfway. Re-simulate.

Iterate to convergence

Usually 3–8 iterations get you to a visually unbiased calibration chart. Further refinement yields diminishing returns.

Check residual patterns

A calibrated chart shows the cloud centered on 45°, but does the residual pattern on the map make sense? Systematic spatial patterns in residuals indicate real model issues that multiplier adjustments can't fix. Investigate.
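The loop above can be sketched as a small search routine. Everything here is an assumption made for illustration: `toy_mean_bias` is a hypothetical stand-in for "run the simulation and read the mean bias off the chart" (its zero is placed at K multiplier 0.3), and "back off halfway" is interpreted as halving the step in log space.

```python
import math

def toy_mean_bias(k_mult):
    """Hypothetical stand-in for a simulation run: mean(simulated - observed), m."""
    return 5.0 * (0.3 / k_mult - 1.0)

def manual_style_search(bias_fn, k_start=1.0, factor=2.0, tol=0.1, max_runs=20):
    """Adjust the K multiplier by large steps, shrinking the step after an overshoot."""
    k = k_start
    previous_sign = 0
    history = []
    for _ in range(max_runs):
        bias = bias_fn(k)
        history.append((k, bias))
        if abs(bias) < tol:               # visually unbiased: stop
            break
        sign = 1 if bias > 0 else -1
        if previous_sign and sign != previous_sign:
            factor = math.sqrt(factor)    # bias reversed: back off halfway (log space)
        # Over-prediction (bias > 0): raise K. Under-prediction: lower K.
        k = k * factor if sign > 0 else k / factor
        previous_sign = sign
    return k, history

k_final, runs = manual_style_search(toy_mean_bias)
for k, bias in runs:
    print(f"K multiplier {k:.4f}  mean bias {bias:+.3f} m")
```

With this toy response the search needs five simulation runs, which sits inside the 3–8 iterations quoted above. A real model responds less cleanly, so treat the step rule as a starting habit rather than a recipe.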

18.6.2 When manual calibration is enough

For simple models with 1–3 calibration parameters and well-behaved observations, manual calibration is efficient and adequate. Cases:

Regional flow models with K and Recharge multipliers
First-pass calibration of a new model (get to "approximately right" before Automatic Calibration refines)
Models where the modeler has strong physical intuition and wants to see how parameter changes affect the solution

18.6.3 When to escalate to Automatic

Move to Automatic Calibration (§18.7) when:

More than 3–4 parameters are being adjusted jointly — manual search becomes combinatorially painful
Parameter interactions are strong and non-intuitive
You need formal uncertainty estimates (UCODE produces parameter standard errors)
You want to use the parameter estimation output as input to stochastic analysis (Ch. 19)

18.7 Automatic Calibration — UCODE Parameter Estimation

Automatic Calibration uses UCODE (USGS Universal inverse COde) to search for the parameter set that minimizes the objective function — the sum of squared residuals between simulated and observed values. The Automatic Calibration tool wraps UCODE in an IGW-NET workflow so the modeler doesn't have to manage UCODE input files directly.

18.7.1 What UCODE does

UCODE is the USGS-supported universal inverse code for parameter estimation in environmental models. Given:

A forward model that takes parameters as input and produces simulated values (IGW-NET's groundwater solve)
A set of observations at specific locations
A list of parameters to estimate (the Calib-checked multipliers and any per-zone K values)

UCODE runs many forward simulations, each with slightly perturbed parameter values, to compute how the simulated-minus-observed residuals change with each parameter. Using this gradient information, it iteratively searches for the parameter set that minimizes the objective function (the sum of squared residuals).
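To make the idea of a gradient-based search concrete, here is a minimal one-parameter sketch in the Gauss-Newton style with perturbation sensitivities. It is not UCODE's code or interface: `toy_forward` is a hypothetical forward model, the log-transform of the multiplier is a choice made in this sketch to keep it positive, and the "observations" are synthetic heads generated with a K multiplier of 0.3.

```python
import math

def toy_forward(k_mult, mounds=(1.0, 2.0, 3.0), base=200.0):
    """Hypothetical forward model: head at three wells for a given K multiplier."""
    return [base + m / k_mult for m in mounds]

def estimate_k_multiplier(forward_fn, observed, k_start=1.0,
                          rel_step=0.01, max_iter=25, tol=1e-10):
    """One-parameter Gauss-Newton search on log(K multiplier)."""
    log_k = math.log(k_start)
    for _ in range(max_iter):
        sim = forward_fn(math.exp(log_k))
        sim_pert = forward_fn(math.exp(log_k + rel_step))    # perturbed run
        sens = [(b - a) / rel_step for a, b in zip(sim, sim_pert)]
        resid = [o - s for o, s in zip(observed, sim)]       # observed - simulated
        jtj = sum(j * j for j in sens)
        if jtj == 0.0:                                       # insensitive parameter
            break
        step = sum(j * r for j, r in zip(sens, resid)) / jtj
        log_k += step
        if abs(step) < tol:
            break
    return math.exp(log_k)

observed_heads = toy_forward(0.3)        # synthetic observations, no noise
k_estimate = estimate_k_multiplier(toy_forward, observed_heads)
print(round(k_estimate, 6))
```

Each pass costs two forward runs here (one base run, one perturbed run); with more parameters the number of perturbed runs per iteration grows with the parameter count, which is why run time scales with the number of Calib-checked multipliers.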

18.7.2 The IGW-NET Automatic Calibration workflow

Check Calib boxes for the parameters to calibrate

DomainAttr → Aquifer Attributes → Multiplying Factors → check K, Recharge, and/or other multipliers. UCODE will vary the checked ones.

Set up observations via Well Data Processing Tool

Filter SWLs as described in §18.3. Confirm the observation set is in place before launching calibration.

Launch Automatic Calibration

Open the Automatic Calibration tool in the Simulation Tools area. The exact button or menu location depends on your IGW-NET version; it is typically labeled "Automatic Calibration" or "Auto-Cal". Provide initial parameter values and expected ranges.

UCODE runs iterations

Each iteration is a forward simulation plus gradient computation. Typical runs take 10–50 iterations depending on parameter count and observation count. Progress is displayed.

Review the results

UCODE outputs optimized parameter values, parameter standard errors (uncertainty), parameter correlations, and residual statistics. You also get the calibration chart as it looks with the optimized parameters.

Accept, reject, or refine

If the optimized parameters are physically reasonable and the fit is acceptable, accept them as the new model defaults. If parameters hit arbitrary bounds or look unphysical (K multiplier = 100), investigate — you may have a non-uniqueness issue or a bad observation. Possibly re-run with different starting points or constrained bounds.

18.7.3 What UCODE gives you beyond parameter values

The key diagnostics UCODE produces, beyond the optimized parameters themselves:

Parameter standard errors. Uncertainty estimate for each calibrated parameter. A K multiplier of 0.3 ± 0.05 is tightly constrained; 0.3 ± 0.5 is barely constrained. The standard errors tell you which parameters the data really pins down.
Parameter correlations. Pairwise correlations between calibrated parameters. High correlation (> 0.9 or < −0.9) means the two parameters can't be separated by the data — the classic case is K multiplier and Recharge multiplier, which are typically strongly correlated when only head observations are used (K-recharge non-uniqueness, §18.8).
Parameter sensitivity. How much the objective function changes per unit change in each parameter. Low sensitivity = the parameter doesn't affect the fit much = data doesn't constrain it well.
Observation residuals. Per-observation residuals at the optimized parameter values. Large residuals at specific wells flag observations that the calibrated model can't match — investigate whether the observation is bad or the model is missing local structure.

Auto-cal output: not just parameter values, but the whole inverse problem

Manual calibration gives you "parameters that make the chart look good." Automatic Calibration gives you parameters plus the whole inverse problem structure: which parameters are well-constrained, which are not, which are correlated, which observations are fit well, which are not. This is substantially richer information and often reveals that the fit obtained manually was only one of many possible fits. When you move from manual to automatic, you gain not just better parameter values but better understanding of the calibration problem itself.

18.8 Multi-Target Calibration — Fitting Multiple Observation Types

A single observation type (usually head SWLs) often can't fully constrain the model. Head observations suffer from K-recharge non-uniqueness; lake stages or baseflow add the flux-based information needed to separate them. Multi-target calibration fits multiple observation types simultaneously, producing more identifiable and trustworthy parameters.

18.8.1 K-recharge non-uniqueness — the canonical case

In the steady-state groundwater flow equation, K and R affect the head solution only through their ratio. Given a fixed spatial pattern of K and R, scaling both by the same factor α produces identical head fields: if (K, R) produces head h(x), then (αK, αR) produces exactly the same h(x). Head observations alone cannot distinguish them.

Consequence: you can "calibrate" a model against SWLs and match them perfectly with K multiplier 0.3 and Recharge multiplier 1.0, OR with K multiplier 0.6 and Recharge multiplier 2.0, OR with K multiplier 0.15 and Recharge multiplier 0.5. All three fit heads equally well; they differ in how much water is actually moving through the aquifer, which is exactly what you usually want to know.
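The three multiplier pairs above can be checked against a textbook case. The sketch assumes a 1-D confined aquifer of length L with uniform recharge and fixed heads at both ends, for which h(x) = h_left + (h_right − h_left)·x/L + R/(2T)·x·(L − x) with T = K × thickness. The base K, base recharge, and geometry are hypothetical values chosen for illustration.

```python
def head_profile(k, recharge, xs, length=1000.0, thickness=20.0,
                 h_left=200.0, h_right=200.0):
    """Analytical steady-state heads between two fixed-head boundaries."""
    t = k * thickness                                   # transmissivity, m^2/day
    return [h_left + (h_right - h_left) * x / length
            + recharge / (2.0 * t) * x * (length - x) for x in xs]

def outflow_right(k, recharge, length=1000.0, thickness=20.0,
                  h_left=200.0, h_right=200.0):
    """Discharge per unit width leaving through the right boundary, m^2/day."""
    t = k * thickness
    gradient = (h_right - h_left) / length - recharge * length / (2.0 * t)
    return -t * gradient

base_k, base_r = 10.0, 0.001                            # hypothetical m/day
xs = [250.0, 500.0, 750.0]                              # observation locations, m
pairs = [(0.3, 1.0), (0.6, 2.0), (0.15, 0.5)]           # (K mult, Recharge mult)

heads = [head_profile(base_k * km, base_r * rm, xs) for km, rm in pairs]
flows = [outflow_right(base_k * km, base_r * rm) for km, rm in pairs]

for (km, rm), h, q in zip(pairs, heads, flows):
    print(f"K x{km}, R x{rm}: heads {[round(v, 3) for v in h]}  outflow {q:.3f}")
```

All three pairs give the same heads at every observation point, while the boundary outflow differs by the same factor as the multipliers. That difference in flux is exactly what head observations cannot see.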

Why head-only calibration is dangerous

A model calibrated only against head SWLs, with K and Recharge multipliers both active, has arbitrary absolute magnitudes — you've pinned down K/R but not K or R individually. The model will fit heads, but its predictions of: (1) stream baseflow, (2) well yields, (3) time to contamination, (4) lake water balance, (5) any flux-based quantity, will be wrong by whatever factor α actually is. Always add at least one flux-based observation (baseflow, lake stage, pump test) before claiming a calibrated model in absolute terms.

18.8.2 How lake stages break non-uniqueness

In a coupled lake model (Ch. 15), lake stage is governed by the lake's water balance:

dV_lake/dt = Ins − Outs

The Ins and Outs include GW-to-lake flux (gaining cells) and lake-to-GW flux (losing cells), each computed as leakance × (h_lake − h_aq). These fluxes depend on absolute heads and K values, not just the K/R ratio. Matching observed lake stage fixes the absolute magnitudes.

Specifically: if the model has too-low K, the lake-aquifer exchange is too slow, and observed lake stage dynamics (drawdown response, recharge response) are not reproduced — regardless of how well heads match. Lake stage observations provide a flux-based constraint that cannot be satisfied by arbitrary K/R scaling.

18.8.3 The Barron Lake multi-target calibration

Figure 18.4. Barron Lake final simulation results: map view, SW Lake Budget Chart showing inflow and outflow components, lake level chart comparing the simulated blue line with observed red dots for the 135-day simulation period, and cross-section. The Barron Lake calibrated model displays three simultaneously consistent result types: a lake level chart (simulated vs. observed, tracking closely); a lake water budget chart (all Ins and Outs itemized); and the aquifer head / cross-section display. The model is calibrated against both the 11 private well heads AND the observed lake-level time series; both are fit jointly, breaking the K-recharge non-uniqueness that head-only calibration would suffer from.

The Barron Lake case study (Ch. 15 §15.4) is a canonical multi-target calibration:

Observation type 1: 11 private well SWLs. Constrain the spatial distribution of aquifer heads; establish the head pattern and hydraulic gradient directions.
Observation type 2: Observed lake stage time series (LakeLevel2013.csv, 135-day record). Constrain the lake water balance; fix the absolute magnitudes of K, lake leakance, and recharge.

Without the lake stage observations, the 11-well calibration would be non-unique — many (K, Recharge) combinations would fit. Adding the lake stage pins down the absolute scale; the resulting calibrated model is trustworthy for both well-drawdown predictions and lake-level predictions.

18.8.4 Multi-target weighting

When combining observation types, the calibration objective function weights them:

Objective = w_heads × Σ(h_sim − h_obs)² + w_stage × Σ(stage_sim − stage_obs)²

The weights (w_heads, w_stage) balance how much each observation type matters. Standard practice: weight by the inverse of observation variance — observations with smaller expected errors get more weight. Or: weight to equalize the contribution of each observation type to the objective function, so 10 lake-stage observations don't get drowned out by 500 head observations.

UCODE in IGW-NET handles the weighting automatically based on observation-type defaults; advanced users can override weights per observation if specific observations are known to be especially reliable or unreliable.
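A minimal sketch of the two-term objective with inverse-variance weights. The function names, the residual values, and the assumed observation errors (1.0 m for driller-reported heads, 0.05 m for lake stage) are illustrative assumptions, not IGW-NET or UCODE defaults.

```python
def sum_of_squares(simulated, observed):
    """Sum of squared residuals for one observation type."""
    return sum((s - o) ** 2 for s, o in zip(simulated, observed))

def inverse_variance_weight(sigma):
    """Weight = 1 / variance; smaller expected error gives a larger weight."""
    return 1.0 / sigma ** 2

def multi_target_objective(h_sim, h_obs, stage_sim, stage_obs, w_heads, w_stage):
    """Objective = w_heads * SS(heads) + w_stage * SS(lake stage)."""
    return (w_heads * sum_of_squares(h_sim, h_obs)
            + w_stage * sum_of_squares(stage_sim, stage_obs))

w_heads = inverse_variance_weight(1.0)     # assumed head error, m
w_stage = inverse_variance_weight(0.05)    # assumed stage error, m

objective = multi_target_objective(
    h_sim=[201.0, 204.0], h_obs=[200.0, 205.0],          # head residuals of +/-1 m
    stage_sim=[180.1, 179.9], stage_obs=[180.0, 180.0],  # stage residuals of +/-0.1 m
    w_heads=w_heads, w_stage=w_stage,
)
print(round(objective, 6))
```

With these weights a 0.1 m stage misfit counts four times as much as a 1 m head misfit, so a short lake-stage record still influences the fit when head observations are far more numerous.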

18.9 Common Pitfalls

Calibration has failure modes that look like success — charts that look calibrated but aren't, parameter values that seem physical but aren't identifiable, fits that are "too good." This section collects the traps that repeatedly catch modelers.

18.9.1 Overfitting

If you calibrate N parameters with M observations, you need M much larger than N for the calibration to be well-posed. Rule of thumb: M ≥ 5–10 × N, and preferably more. Calibrating 15 parameters with 20 observations gives you 20 equations in 15 unknowns plus noise — effectively fitting the noise, and the "best" parameter set is arbitrary among many that fit equally well.

Signs of overfitting

Parameter values at arbitrary-looking bounds (K multiplier exactly at 10.0 or 0.1)
Very high parameter correlations in UCODE output (>0.95)
Very low parameter sensitivities (parameter doesn't affect the objective much)
Running UCODE twice from different starting points gives very different parameter values
Residuals fit exactly at every observation (zero scatter); if this happens, the calibration is over-parameterized

Fix: reduce the number of calibration parameters. Accept that some are unidentifiable with available data. Bring in additional observation types before adding more parameters.

18.9.2 Bad observations passed through

If your observation set contains many bad records (pumped wells, wrong-layer, errors), the calibration tries to fit them anyway. The result: parameters that fit bad data. The apparent calibration quality looks fine, but the parameters are wrong.

Signs: unusually high scatter on the calibration chart, systematic residual patterns that don't match any plausible geologic or hydrologic feature, parameter values that conflict with literature expectations. Fix: revisit the observation filtering (§18.3), apply stricter quality criteria, re-calibrate.

18.9.3 K-recharge non-uniqueness hidden as "calibrated"

Already covered in §18.8.1. A model calibrated with head observations only and both K and Recharge multipliers active is not a fully-calibrated model — only K/R is fixed. Claiming specific K values from such a calibration is misleading. Always add flux-based observations before publishing absolute K values.

18.9.4 Local minima in Automatic Calibration

UCODE finds a local minimum of the objective function — possibly not the global minimum. If the objective function is non-convex (has multiple basins), the starting parameter values determine which basin you end up in.

Fix: run Automatic Calibration from several different starting points (K multiplier = 0.3, 1.0, 3.0, etc.) and compare results. If they all converge to the same parameter values, you've likely found the global minimum. If they converge to very different values, the problem is non-unique or has multiple local minima and you need more data to disambiguate.
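The multi-start check can be illustrated with a toy objective that has two basins. The objective f(q) = (q² − 1)² + 0.3q, with q = log10(K multiplier), is purely hypothetical, and plain gradient descent stands in for the real optimizer; the point is only that the starting value decides which minimum is reached.

```python
import math

def toy_objective_slope(q):
    """Derivative of the hypothetical objective f(q) = (q**2 - 1)**2 + 0.3*q."""
    return 4.0 * q * (q * q - 1.0) + 0.3

def descend(q_start, step=0.05, n_steps=2000):
    """Plain gradient descent on q = log10(K multiplier)."""
    q = q_start
    for _ in range(n_steps):
        q -= step * toy_objective_slope(q)
    return q

starts = [0.3, 1.0, 3.0]                                   # starting K multipliers
results = [10 ** descend(math.log10(k)) for k in starts]   # optimized multipliers

for k0, k_opt in zip(starts, results):
    print(f"start {k0:>4}: converged to K multiplier {k_opt:.3f}")

spread = max(results) / min(results)
print("starts agree" if spread < 1.1 else "starts disagree: multiple minima")
```

Here the starts at 0.3 and 1.0 land in one basin and the start at 3.0 lands in another, roughly two orders of magnitude away. In a real calibration that outcome is the signal to bring in more data before trusting either answer.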

18.9.5 Systematic residual patterns

A calibration chart centered on 45° is necessary but not sufficient. Check residual patterns spatially:

Plot residuals on the model map. Are they spatially random, or do wells in one region cluster to positive residuals and wells in another to negative?
Systematic regional bias indicates the model is missing something in that region — unmodeled recharge variation, a geologic boundary the single-zone model doesn't capture, boundary condition issues, or unrepresented pumping.
The multiplier adjustment that fixes the aggregate bias doesn't fix the regional pattern; that requires either Ch. 17 regional zonation, different BCs, or better representation of whatever is regionally different.

More broadly: systematic residual patterns are usually the signature of structural problems the calibration can't fix with parameter tuning. When you see a residual pattern, revisit §18.1.3 — parameters can compensate for structural errors in an average sense, but the compensation is spatially imperfect and the residual pattern is the visible leftover. Going back to improve the conceptual model is almost always the right response.
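A quick numerical version of the spatial check is to average residuals by region. The region labels and head values below are hypothetical; in practice the grouping would come from well coordinates or mapped geologic units.

```python
from collections import defaultdict

def mean_residual_by_region(records):
    """records: iterable of (region, observed_head, simulated_head) tuples."""
    totals = defaultdict(lambda: [0.0, 0])       # region -> [residual sum, count]
    for region, observed, simulated in records:
        totals[region][0] += simulated - observed
        totals[region][1] += 1
    return {region: total / count for region, (total, count) in totals.items()}

wells = [                                        # hypothetical wells, heads in m
    ("north", 210.0, 212.0),
    ("north", 208.0, 210.0),
    ("south", 195.0, 193.0),
    ("south", 197.0, 195.0),
]

overall_bias = sum(sim - obs for _, obs, sim in wells) / len(wells)
regional_bias = mean_residual_by_region(wells)

print(overall_bias)      # aggregate bias looks fine
print(regional_bias)     # regional biases do not
```

The aggregate bias is zero, so the chart would look centered, yet the north is over-predicted by 2 m and the south under-predicted by 2 m. No global multiplier can remove that pattern.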

18.9.6 Trusting a perfect fit

A calibration chart where all observations fall exactly on the 45° line (zero scatter) should make you suspicious. Real SWL observations have measurement uncertainty of 0.3–1 m (driller-reported), and the real aquifer has heterogeneity that a model can't capture perfectly. Zero scatter means either (a) so few observations that almost any parameter set fits them, (b) observations that are actually model-produced or trivially consistent, or (c) over-fitting.

Good calibrations have 1–3 m RMSE for regional models, some residual scatter, and a few outliers that the model can't quite match. That looks realistic. Zero-scatter fits look suspiciously clean; investigate. Remember from §18.1.3: a "perfect fit" on a structurally-flawed model is a more dangerous thing than an imperfect fit on a sound one — the good calibration statistics give false confidence in predictions that may still be very wrong.

18.9.7 Calibration without model-purpose awareness

Calibrating to the tightest possible RMSE regardless of purpose is a trap. A model calibrated to match heads within 0.1 m everywhere may still be wrong for transport predictions if the velocity field is not properly constrained. A regional water-supply model calibrated to match specific well hydrographs may be over-specified for the regional-averaging question it's meant to answer.

Always revisit §18.1.2: what question is this model answering, what tolerance does it require? Calibrate to the answer's needs, not to an abstract "best fit" target.

18.9.8 Calibrating before structural soundness

This is the meta-pitfall that subsumes several of the others (§18.9.3, §18.9.5, §18.9.6). The pattern: jump to calibration while the conceptual model still has unresolved structural issues — layering not fully worked out, boundary conditions not yet defensible, features not yet at the right resolution, zones not yet aligned with geology. Calibration "succeeds" because multipliers absorb the structural errors; you get a calibrated-looking model. Then a new scenario, a new stress, or a new observation exposes that the model was never structurally correct, and all the calibration effort has to be redone.

The signature symptoms: calibrated parameters that drift far from physically reasonable values (K multiplier > 5 or < 0.1 when the base K already reflects literature values, so the multiplier should stay near 1); parameters that swing widely between successive calibration runs as you add or remove observations; residuals with obvious spatial structure; calibration that "works" for one scenario but fails when evaluated on any other data.

The fix: go back to conceptual modeling. Use IGW-NET's real-time interactivity to try different structural configurations; use visualization to build intuition about what the aquifer is actually doing; confirm the structure is defensible on hydrogeologic grounds before letting UCODE search for parameter values. The time investment in conceptual soundness pays back multiple times during calibration and prediction. This is the entire reason IGW-NET is designed around real-time interaction (§18.1.3) — to make conceptual-model iteration cheap, so the structure is right before calibration is asked to do things it can't.

To go deeper

Chapter 17 §17.5 — Regional zonation — per-zone class K values as additional calibration parameters; when systematic residual patterns warrant zonation.
Chapter 15 §15.4 — Barron Lake case study — the canonical multi-target calibration with lake stages + well heads jointly.
Chapter 19: Stochastic Modeling (next chapter). Monte Carlo parameter sampling; uncertainty quantification; UCODE's parameter uncertainty output as input to stochastic analysis.
Case Study: Mancelona TCE Plume — regional calibration example with K multiplier 0.3 + Recharge multiplier 1.1 producing the calibrated chart in Figure 18.3.
Case Study: Barron Lake Coupled Model — multi-target calibration with well heads and lake stages.
Realtime help: Automatic Calibration — the operational reference for the Auto-Calibration tool.