AI-Driven QA/QC: Catching Geochemistry Data Errors Before They Cost You a Resource

March 12, 2026 · 8 min read

If you ask a senior exploration geologist what kills more junior projects than bad geology, the answer is bad data. Specifically: assay databases that look fine until the resource consultant gets in there, finds a systematic standard bias in the 2024 sample batches, and tells you the western half of your maiden resource is going to need a redo. That conversation has happened thousands of times across the industry, and it happens because QA/QC review on geochemical data is one of those jobs that everyone agrees is essential and nobody has time to do thoroughly.

The pattern is consistent. The QP signs off on the QA/QC plan. The field crew inserts standards, blanks, and duplicates at the prescribed rates. The lab returns the batches. A junior geologist or database administrator spot-checks them. Months later, when someone finally builds a proper QA/QC summary for a technical report, the systematic issues surface, and by then a lot of decisions have been built on top of the bad data. AI tooling doesn't solve the underlying organizational problem, but it makes every-batch review cheap and thorough enough that systematic issues surface in the same week the batch returns, not the same quarter.

What QA/QC Actually Needs to Catch

A modern exploration QA/QC program is checking for a small number of failure modes, repeatedly, across every batch. Certified Reference Materials (CRMs, or "standards") test whether the lab's calibration is returning the certified value within accepted tolerance. Blanks test whether sample-to-sample contamination is occurring in the prep stream. Field duplicates test whether the sampling itself is repeatable. Pulp duplicates test whether the lab's subsampling is repeatable. Lab repeats — if you can get them — test whether the lab's instrument is repeatable on the same pulp.

Each of these checks has a defined statistical signature. A standard fails when it returns a value more than two or three standard deviations from its certified mean. A blank fails when it returns a value well above the lab's detection limit, typically judged against a fixed multiple of that limit. A duplicate pair fails when its relative error exceeds a threshold set by element and grade range (commonly 30% half-absolute relative difference, that is |a - b| / (a + b), for grades well above detection, with looser limits near detection). The math is well established and has been unchanged for decades.
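The three signatures above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the function names, the 2 SD warning and 3 SD failure bands, the 5x detection-limit blank threshold, and the 10x detection-limit cutoff for applying the duplicate test are all assumptions that a real QA/QC plan would set for itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Crm:
    """Certified mean and standard deviation for one reference material."""
    certified_mean: float
    certified_sd: float


def crm_status(value: float, crm: Crm, warn_sd: float = 2.0, fail_sd: float = 3.0) -> str:
    """Classify a CRM result by its distance from the certified mean, in SD units."""
    z = abs(value - crm.certified_mean) / crm.certified_sd
    if z > fail_sd:
        return "fail"
    if z > warn_sd:
        return "warn"
    return "pass"


def blank_fails(value: float, detection_limit: float, multiple: float = 5.0) -> bool:
    """A blank fails when it exceeds an assumed multiple of the detection limit."""
    return value > multiple * detection_limit


def hard_percent(original: float, duplicate: float) -> float:
    """Half-absolute relative difference: 100 * |a - b| / (a + b)."""
    total = original + duplicate
    if total <= 0:
        return 0.0
    return 100.0 * abs(original - duplicate) / total


def duplicate_fails(original: float, duplicate: float, detection_limit: float,
                    hard_limit: float = 30.0, min_multiple: float = 10.0) -> bool:
    """Apply the HARD limit only to pairs well above detection (assumed 10x DL)."""
    if (original + duplicate) / 2 < min_multiple * detection_limit:
        return False  # too close to detection for a fixed percentage to be meaningful
    return hard_percent(original, duplicate) > hard_limit


# Made-up example values, ppm Au
std_a = Crm(certified_mean=1.50, certified_sd=0.05)
print(crm_status(1.62, std_a))
print(blank_fails(0.03, detection_limit=0.005))
print(duplicate_fails(1.0, 3.0, detection_limit=0.005))
```

Keeping each check as a small pure function makes it easy to log exactly which rule, with which parameters, produced a given flag.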

What's hard isn't the math. What's hard is doing the math consistently, every batch, while also tracking long-term drift, batch-to-batch consistency, and the relationship between QA/QC failures and the production drillholes interleaved with them. The job is not one statistical test — it's hundreds of small tests with audit trails connecting them to specific samples, batches, drill holes, and decisions.

Where AI Earns Its Keep

The clearest ROI is automated batch-level review. A simple pipeline ingests the lab's results file as soon as it arrives, joins it to the sampling database, flags every standard that's out of tolerance, every blank that's elevated, every duplicate pair that's out of acceptable range, and produces a one-page batch report. This is not machine learning in any meaningful sense — it's rule-based automation. But it eliminates the gap between batch return and batch review, which is where most QA/QC failures hide.
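As a rough sketch of that pipeline, the snippet below parses a lab results file, joins it to a sampling database by sample ID, and produces a short batch report. The CSV layout, the `SAMPLE_DB` structure, the sample IDs, and every threshold are hypothetical; a real lab export and database schema will look different.

```python
import csv
import io

# Hypothetical lab export and sampling database; real layouts will differ.
LAB_CSV = """sample_id,Au_ppm
S001,0.42
S002,1.71
S003,0.04
S004,0.40
S999,0.10
"""
SAMPLE_DB = {
    "S001": {"type": "core", "hole": "DH-01"},
    "S002": {"type": "crm", "crm_id": "STD-A"},
    "S003": {"type": "blank"},
    "S004": {"type": "field_dup", "parent": "S001"},
}
CRMS = {"STD-A": (1.50, 0.05)}  # certified mean, certified SD (made-up values)
DETECTION_LIMIT = 0.005         # ppm Au, assumed


def review_batch(lab_csv, sample_db, crms, dl, hard_limit=30.0):
    """Join lab results to the sampling database and return (sample_id, reason) flags."""
    reader = csv.DictReader(io.StringIO(lab_csv))
    values = {row["sample_id"]: float(row["Au_ppm"]) for row in reader}
    flags = []
    for sid, value in values.items():
        meta = sample_db.get(sid)
        if meta is None:
            flags.append((sid, "not in sampling database"))
        elif meta["type"] == "crm":
            mean, sd = crms[meta["crm_id"]]
            if abs(value - mean) > 3 * sd:
                flags.append((sid, f"CRM {meta['crm_id']} outside 3 SD"))
        elif meta["type"] == "blank":
            if value > 5 * dl:
                flags.append((sid, "blank above 5x detection limit"))
        elif meta["type"] == "field_dup":
            parent = values.get(meta["parent"])
            if parent is None:
                flags.append((sid, "duplicate parent missing from batch"))
            elif parent + value > 0:
                hard = 100 * abs(parent - value) / (parent + value)
                if hard > hard_limit:
                    flags.append((sid, f"duplicate HARD {hard:.0f}%"))
    return flags


def batch_report(batch_id, flags):
    """Format the flags as a plain-text batch report."""
    lines = [f"Batch {batch_id}: {len(flags)} QA/QC flag(s)"]
    lines += [f"  {sid}: {reason}" for sid, reason in flags]
    return "\n".join(lines)


flags = review_batch(LAB_CSV, SAMPLE_DB, CRMS, DETECTION_LIMIT)
print(batch_report("B-0001", flags))
```

Flagging samples that appear in the lab file but not in the sampling database is a deliberate choice here: a failed join is itself a data-quality event worth surfacing.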

A more interesting layer sits on top: pattern detection across batches. ML methods are good at finding things humans miss in large multi-batch datasets. Three applications are now well established. First, anomaly detection on the standards time series: a clustering or isolation forest method applied to standard returns can detect drift earlier than a hard two-sigma rule, because it picks up the pattern of drift rather than waiting for an individual failure. Second, duplicate-pair regression: plotting each duplicate against its original on log-log axes and flagging pairs that fall outside the data's own empirical trend, rather than outside a fixed percentage. Third, batch fingerprinting: clustering batches by multi-element distribution to detect prep or instrument contamination events that don't show up in any single standard or blank but distort the geochemistry of the production samples.
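To illustrate the first idea, drift that never trips a single-point rule, here is a small sketch. It deliberately swaps in a classical tabular CUSUM chart rather than an isolation forest, because CUSUM needs only the standard library and shows the same principle: accumulating small, same-direction deviations. The allowance `k`, the decision threshold `h`, and the CRM values are illustrative assumptions.

```python
def cusum_drift(values, certified_mean, certified_sd, k=0.5, h=4.0):
    """Return the index of the first CUSUM alarm on a CRM series, or None.

    values: CRM results in the order they were analysed
    k:      allowance in SD units; deviations smaller than this do not accumulate
    h:      decision threshold in SD units
    """
    high = low = 0.0
    for i, value in enumerate(values):
        z = (value - certified_mean) / certified_sd
        high = max(0.0, high + z - k)  # accumulates sustained positive bias
        low = max(0.0, low - z - k)    # accumulates sustained negative bias
        if high > h or low > h:
            return i
    return None


# Made-up CRM series (ppm Au): stable at first, then reading consistently low.
series = [1.51, 1.49, 1.50, 1.52, 1.48, 1.44, 1.43, 1.44, 1.42, 1.43, 1.44]
mean, sd = 1.50, 0.05

within_two_sd = all(abs(v - mean) / sd <= 2.0 for v in series)
alarm_index = cusum_drift(series, mean, sd)
print("every result within 2 SD:", within_two_sd)
print("CUSUM alarm at index:", alarm_index)
```

In this made-up series every result passes a two-sigma check on its own, yet the cumulative sum raises an alarm a few results into the low run. An isolation forest or clustering approach would slot into the same place in the pipeline, taking the same series as input.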

None of these is novel statistics. They've all appeared in published studies for at least a decade. What's new is that the tooling to run them on every incoming batch — within hours of receipt, with no PhD-level statistical work required — is now accessible to a junior with a Python-literate geologist or a one-time consultancy engagement.

The "Hidden Bias" Problem

The most expensive QA/QC failures aren't the obvious ones. They're the systematic biases that pass per-batch checks but produce a meaningful skew in the aggregated data. Lab A's gold assays for a particular drill program might consistently return 5% lower than a second lab's check assays of the same pulps. That kind of bias rarely triggers a per-batch flag, because every individual standard returns within tolerance, but it skews the resource estimate, and the later it is discovered the more of that estimate has to be revisited. ML-based comparison of lab-to-lab umpire samples, or paired analysis of pulp duplicates between labs, surfaces these biases early.
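A paired comparison of this kind does not need heavy machinery to get started. The sketch below computes the mean relative difference between a primary lab and an umpire lab on the same pulps and adds a two-sided sign test to show whether the direction of the difference is consistent. The function name and the assay values are invented for illustration, and the sign test is one simple choice among several paired tests.

```python
import math
from statistics import mean


def paired_bias(primary, umpire):
    """Return (mean relative difference in %, two-sided sign-test p-value).

    Relative difference is taken against the pair mean, so a negative result
    means the primary lab reads lower than the umpire lab.
    """
    rel = [100.0 * (p - u) / ((p + u) / 2)
           for p, u in zip(primary, umpire) if p + u > 0]
    if not rel:
        raise ValueError("no usable pairs")
    below = sum(1 for r in rel if r < 0)
    above = sum(1 for r in rel if r > 0)
    n = below + above              # ties carry no directional information
    k = min(below, above)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return mean(rel), min(1.0, 2 * tail)


# Made-up umpire pairs (ppm Au) in which the primary lab reads about 5% low.
umpire = [1.00, 2.00, 0.50, 3.00, 0.80, 1.20, 4.00, 0.60, 2.50, 1.50]
primary = [0.95, 1.91, 0.47, 2.86, 0.77, 1.13, 3.80, 0.57, 2.36, 1.43]

bias_pct, p_value = paired_bias(primary, umpire)
print(f"mean relative difference: {bias_pct:.1f}%  sign-test p = {p_value:.4f}")
```

Each pair here differs by only a few percent, which would pass most per-pair tolerances, but every pair differs in the same direction, and that consistency is what the sign test picks up.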

The second class of hidden bias is sample-stream contamination from one drill hole carrying over to the next. This shows up as elevated baseline values in an entire batch following a high-grade interval, and a sharp drop in the next batch. It's not flagged by individual blanks if the blank position in the sample stream happens to miss the contamination event. Multi-element pattern recognition across consecutive batches catches it because the pattern of contamination is multi-element, not just elevated gold.
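One simple way to approximate this multi-element check, offered as a heuristic stand-in for fuller pattern recognition, is to compare each batch's baseline (the median of its routine, low-grade samples) against the other batches, element by element, and flag a batch only when several elements are elevated together. The function name, the 2x ratio, the two-element minimum, and the data are assumptions for illustration.

```python
from statistics import median


def flag_elevated_batches(batch_medians, ratio=2.0, min_elements=2):
    """Flag batches whose baseline is elevated in several elements at once.

    batch_medians: dict of batch_id -> {element: median of routine samples}
    Each batch is compared with the median of all other batches (leave-one-out).
    """
    flagged = []
    for batch_id, current in batch_medians.items():
        elevated = []
        for element, value in current.items():
            others = [m[element] for b, m in batch_medians.items()
                      if b != batch_id and element in m]
            if others and value > ratio * median(others):
                elevated.append(element)
        if len(elevated) >= min_elements:
            flagged.append((batch_id, sorted(elevated)))
    return flagged


# Made-up per-batch baseline medians (ppm); B3 follows a high-grade interval.
baselines = {
    "B1": {"Au": 0.010, "Ag": 0.20, "Cu": 20},
    "B2": {"Au": 0.012, "Ag": 0.22, "Cu": 21},
    "B3": {"Au": 0.050, "Ag": 0.90, "Cu": 60},
    "B4": {"Au": 0.011, "Ag": 0.19, "Cu": 19},
    "B5": {"Au": 0.010, "Ag": 0.21, "Cu": 22},
}
print(flag_elevated_batches(baselines))
```

Requiring more than one elevated element is what separates a contamination signature from a batch that simply happens to sit in a gold-bearing interval.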

The third is post-acquisition reconciliation. When a junior acquires a project with historic drilling, the geochemical data is rarely directly comparable to recently collected data: different labs, different decades of analytical methods, different reporting protocols. A QA/QC pipeline that explicitly accounts for those mismatches is the difference between a usable historic dataset and one that gets thrown out and replaced by re-drilling at significant expense.

What a Real Implementation Looks Like

The minimum useful version of this is three things wired together. One: a webhook or scheduled poll that ingests new lab results files from the lab's portal, parses them, and writes to a structured database. Two: a rules engine that runs the standard QA/QC checks on every incoming batch and writes the results to that same database. Three: a daily or per-batch email summary that goes to the project geologist, the database manager, and the QP, flagging anything outside tolerance with the relevant sample numbers, drill holes, and assay values.
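The three pieces can be sketched against a single SQLite database. Everything here is a simplified assumption: the table and column names are invented, ingestion is a function call standing in for the webhook or poll, only two rules are shown, and the summary is returned as text rather than emailed.

```python
import sqlite3

SCHEMA = """
CREATE TABLE crms     (crm_id TEXT PRIMARY KEY, mean REAL, sd REAL);
CREATE TABLE assays   (batch_id TEXT, sample_id TEXT PRIMARY KEY,
                       hole_id TEXT, qc_type TEXT, crm_id TEXT, au_ppm REAL);
CREATE TABLE qc_flags (batch_id TEXT, sample_id TEXT, reason TEXT);
"""


def ingest(conn, batch_id, rows):
    """Step one: write parsed lab rows (sample, hole, type, crm, value) to the database."""
    conn.executemany(
        "INSERT INTO assays VALUES (?, ?, ?, ?, ?, ?)",
        [(batch_id, *row) for row in rows],
    )


def run_rules(conn, batch_id, blank_limit):
    """Step two: run the checks and store the flags in the same database."""
    conn.execute(
        """INSERT INTO qc_flags
           SELECT a.batch_id, a.sample_id, 'CRM outside 3 SD'
           FROM assays a JOIN crms c ON c.crm_id = a.crm_id
           WHERE a.batch_id = ? AND a.qc_type = 'crm'
             AND ABS(a.au_ppm - c.mean) > 3 * c.sd""",
        (batch_id,),
    )
    conn.execute(
        """INSERT INTO qc_flags
           SELECT batch_id, sample_id, 'blank above limit'
           FROM assays
           WHERE batch_id = ? AND qc_type = 'blank' AND au_ppm > ?""",
        (batch_id, blank_limit),
    )


def summary(conn, batch_id):
    """Step three: build the text that would go into the per-batch email."""
    rows = conn.execute(
        """SELECT f.sample_id, f.reason, a.au_ppm
           FROM qc_flags f JOIN assays a ON a.sample_id = f.sample_id
           WHERE f.batch_id = ? ORDER BY f.sample_id""",
        (batch_id,),
    ).fetchall()
    header = f"QA/QC summary for batch {batch_id}: {len(rows)} flag(s)"
    return "\n".join([header] + [f"  {s}: {r} (Au {v} ppm)" for s, r, v in rows])


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO crms VALUES ('STD-A', 1.50, 0.05)")  # made-up CRM
ingest(conn, "B-0001", [
    ("S001", "DH-01", "core", None, 0.42),
    ("S002", None, "crm", "STD-A", 1.71),
    ("S003", None, "blank", None, 0.04),
])
run_rules(conn, "B-0001", blank_limit=0.025)
report = summary(conn, "B-0001")
print(report)
```

Writing flags back to the same database as the assays is the part that pays off later: every subsequent layer can query flags and production samples together.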

That stack, built once, delivers roughly 80% of the operational value. Everything beyond it (anomaly detection on time series, multi-batch pattern detection, automated bias analysis between labs) is incremental work layered on top of the same database. The mistake teams make is trying to build the sophisticated layer before the basic layer is reliable. Get the rules engine running first. Add ML where the rules engine misses things, not as a replacement for it.

What This Doesn't Solve

The fundamental QA/QC problem isn't analytical; it's organizational. A perfect automated pipeline still requires that field crews insert standards, blanks, and duplicates at the prescribed rates; that QPs review and respond to flags rather than ignoring them; and that the lab is selected on quality rather than on turnaround time and price alone. Software can flag failures, but it cannot enforce program design or compel a response. The most sophisticated QA/QC stack in the industry is worthless if its alerts go to an inbox no one reads.

The second thing it doesn't solve is data lineage in legacy datasets. If your project has thirty years of drill data from multiple operators, multiple labs, and multiple analytical protocols, the assay database is a compilation of varying quality whether or not you have modern tooling. AI-assisted reconstruction of legacy data quality is possible but is itself a substantial project — and the result is a confidence-weighted dataset rather than a clean one.

A Practical Starting Point

If your shop has no automated QA/QC review today, the right first move is not a sophisticated ML pipeline. It is the rules engine described above, built around your specific QA/QC plan and your lab's specific output format. Two weeks of focused work by someone with database and Python skills, working with the project geologist and QP to encode the existing manual checks, gets you to per-batch automated review. That alone catches most operational failures.

From there, the next step is multi-batch summary reporting — automated production of the QA/QC summary tables that you'll need for the next technical report, refreshed continuously rather than as a quarter-end scramble. That second layer takes the same database and adds visualizations and reporting on top.
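A continuously refreshed summary table can be as simple as an aggregation over the stored CRM results. The sketch below assumes results are available as `(crm_id, value)` pairs and reports count, observed mean, percent bias against the certified value, and the number of 3 SD failures per standard. The column choices and the data are illustrative, not a reporting template.

```python
from statistics import mean


def crm_summary(results, certified, fail_sd=3.0):
    """Summarize CRM performance per standard across all batches.

    results:   iterable of (crm_id, value) pairs
    certified: crm_id -> (certified mean, certified SD)
    """
    grouped = {}
    for crm_id, value in results:
        grouped.setdefault(crm_id, []).append(value)

    table = []
    for crm_id, values in sorted(grouped.items()):
        cert_mean, cert_sd = certified[crm_id]
        observed = mean(values)
        failures = sum(1 for v in values if abs(v - cert_mean) > fail_sd * cert_sd)
        table.append({
            "crm_id": crm_id,
            "n": len(values),
            "mean": round(observed, 3),
            "bias_pct": round(100 * (observed - cert_mean) / cert_mean, 1),
            "failures": failures,
        })
    return table


# Made-up certified values and results (ppm Au)
certified = {"STD-A": (1.50, 0.05), "STD-B": (0.50, 0.02)}
results = [
    ("STD-A", 1.48), ("STD-A", 1.52), ("STD-A", 1.47), ("STD-A", 1.71),
    ("STD-B", 0.49), ("STD-B", 0.51),
]
table = crm_summary(results, certified)
for row in table:
    print(row)
```

Because the table is recomputed from the database on demand, the figures in a technical report draft stay aligned with the latest batch.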

The ML layer — anomaly detection on standards, regression-based duplicate review, batch fingerprinting — is the third step. It is genuinely valuable, but the order matters. Build the rules engine first, automate the reporting second, add ML pattern detection third. Teams that skip the first two steps spend money on machine learning systems that have nothing to detect anomalies against, because the underlying QA/QC compliance was already broken.

Conclusion: QA/QC as Infrastructure

The best version of QA/QC is invisible. It runs continuously in the background, flags issues within hours of batch receipt, produces report-ready summaries on demand, and quietly improves the credibility of every downstream decision the company makes. That version is achievable today with modest investment — well under the cost of a single drill hole — and the payoff compounds across every subsequent program.

The version most juniors live with today is the opposite: a manual process that lags batch returns by weeks, surfaces issues at the worst possible time, and quietly erodes confidence in the database. The technical gap between the two versions is small. The organizational gap — actually committing to QA/QC as infrastructure rather than as a compliance checkbox — is the harder one. If your team is ready to close that gap, our free workflow audit covers exploration data workflows, or book a call to discuss what a per-project pilot looks like.
