The volume of data created by ground investigation and materials testing can be huge but, when used correctly, the AGS data format can harness the power of this information. The British Drilling Association has worked with GE to highlight some common errors that prevent this from happening.
We live in an increasingly data-centric world and we are consuming data at an ever-increasing rate, yet the way we use the data we generate does not always realise its true potential. To do this, data needs to be standardised, readable and easily accessible, and the ground investigation industry has long had a method for doing just that.
The AGS (Association of Geotechnical and Geoenvironmental Specialists) data format was implemented in 1992 and is used to consolidate information generated by laboratories, engineers, drillers, technicians, designers and geologists into one single data file system.
Today the AGS data format is the standard not just in the UK but in the US, Australia, Singapore and many other countries around the world. This is testament to the dedicated volunteers from across the industry who have worked hard to promote the format and also to the ease of use of the format and its accessibility.
The primary purpose of the AGS data format is to transfer information from one geotechnical system to another. The most frequently used ground investigation systems are data management programs, which translate ground investigation data to produce logs and 3D models and to analyse geotechnical and geochemical data. Each data management system has its own pros and cons but, whichever one you may be familiar with, it should always automatically check the AGS data before importing and exporting data.
However, there are still occasions when the format or the data goes wrong. Finding where the problem lies may not be immediately obvious, but there are some common errors that can be easily dealt with. The types of errors encountered can be divided into two categories: data errors and data format errors.
Data errors
The AGS format is designed as a transfer medium from one system to another. AGS checkers (detailed on the AGS data format website) are available to validate the structure of an AGS file. However, although these can confirm that a file is structurally correct, the data it contains could still be complete gobbledegook. It is therefore the responsibility of data managers, engineers, drillers, laboratories and technicians to do everything they can to prevent bad data being included in their datasets.
These errors may be as simple as spelling a colour incorrectly, or as serious as a misplaced decimal point in an in-situ test result that jeopardises an entire project, leading to unnecessary costs and programme delays. Samples taken from below the final depth of the hole, core recoveries of 1,000%, boreholes plotted somewhere in space around the north pole and holes drilled in the year 2119 are all examples of situations that are unlikely to happen, yet it is these kinds of errors that users are most likely to encounter.
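As a rough illustration of the kind of sanity checks that catch these mistakes, the Python sketch below tests a hole, its samples and its core runs against some simple limits. The dictionary keys and the function name are hypothetical and are not part of the AGS format or of any data management package. The 0% to 100% recovery range is an assumed limit.

```python
from datetime import date


def plausibility_problems(hole, samples, cores, today):
    """Return warnings for values that are physically implausible."""
    problems = []
    # A hole cannot have been started after the day the data is checked.
    if hole["start_date"] > today:
        problems.append(f"{hole['id']}: start date {hole['start_date']} is in the future")
    # A sample cannot come from below the final depth of the hole.
    for sample in samples:
        if sample["top"] > hole["final_depth"]:
            problems.append(f"{hole['id']}: sample at {sample['top']} m is below the hole base")
    # Core recovery is a percentage, assumed here to lie between 0 and 100.
    for core in cores:
        if not 0 <= core["recovery_pct"] <= 100:
            problems.append(f"{hole['id']}: core recovery of {core['recovery_pct']}% at {core['top']} m")
    return problems


# Illustrative values only, echoing the examples in the text.
hole = {"id": "BH01", "final_depth": 10.0, "start_date": date(2119, 3, 1)}
samples = [{"top": 1.2}, {"top": 12.5}]
cores = [{"top": 3.0, "recovery_pct": 1000.0}, {"top": 4.5, "recovery_pct": 95.0}]

problems = plausibility_problems(hole, samples, cores, today=date(2024, 1, 1))
for problem in problems:
    print(problem)
```

Checks like these do not prove the data is right, but they flag the values that cannot possibly be right before they reach a log or a model.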
Humans are all susceptible to making these kinds of mistakes, whether through an accidental keystroke, a misinterpretation of handwriting or simply pressure from time constraints. Technology can help minimise these errors as well as save time and paperwork by reducing the double handling (rewriting) of data. Software already exists to aid with primary data collection, but the industry can expect to see a seismic shift in the coming years when it comes to moving to a digital system over paper techniques. The British Drilling Association has previously stated the benefits of switching some tasks to digital systems and development continues into other roles.
How many of us are guilty of “as above” or “see previous” on logsheets when it comes to monotonous information such as serial numbers, dates, staff, units and methods etc? This seemingly redundant information is known as metadata and is invaluable not just when errors occur (by tracing data back to its origin) but also in allowing statistics to be generated quickly and accurately. It also substantially increases confidence in the data generated as it removes any ambiguity about specifics. Yet, many metadata fields in AGS files are left empty. Digital systems will also help collect more data than ever before by automatically filling out these data fields.
The software used to manage AGS data will determine how data is validated. If users are unsure of what validation protocols are available, then the software developer can usually provide tips and advice on how stricter controls can look for these types of data errors.
Understanding the data format
To understand format errors, it is necessary to look at how the data is presented and how it is structured. The AGS version 4.0 data format is written in plain text, readable using any text editor and compatible across all operating systems.
Data is separated into groups which are represented by four-letter codes, signifying the table the data is to be stored in, ie SAMP for samples, LOCA for locations and so on.
Next, there are the identifiers HEADING, UNIT and TYPE. These indicate, respectively, which group field the data should be stored in, whether it has a unit associated with it (metres, degrees etc) and what format it should be stored in (text, decimal places, from a list of values etc). The main chunk of the data follows on after the identifiers. All the data is separated using a comma (,) and encased in double quotes ("DATA").
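To make the layout concrete, the Python sketch below reads a small, made-up AGS 4.0 fragment into a dictionary keyed by group code. It assumes each line begins with a descriptor (GROUP, HEADING, UNIT, TYPE or DATA) and relies on the standard csv module to handle the commas and double quotes. The function name parse_ags and the sample values are illustrative only, and this is not a substitute for a proper AGS checker.

```python
import csv
import io

# A made-up fragment; the values are for illustration only.
SAMPLE = '''"GROUP","LOCA"
"HEADING","LOCA_ID","LOCA_GL","LOCA_FDEP"
"UNIT","","m","m"
"TYPE","ID","2DP","2DP"
"DATA","BH01","12.50","10.00"

"GROUP","SAMP"
"HEADING","LOCA_ID","SAMP_TOP","SAMP_TYPE"
"UNIT","","m",""
"TYPE","ID","2DP","PA"
"DATA","BH01","1.20","B"
'''


def parse_ags(text):
    """Split AGS text into {group: {"HEADING", "UNIT", "TYPE", "DATA"}}."""
    groups = {}
    current = None
    for row in csv.reader(io.StringIO(text)):
        if not row:  # blank line between groups
            continue
        descriptor, values = row[0], row[1:]
        if descriptor == "GROUP":
            current = {"HEADING": [], "UNIT": [], "TYPE": [], "DATA": []}
            groups[values[0]] = current
        elif current is None:
            continue  # ignore anything that appears before the first GROUP line
        elif descriptor == "DATA":
            current["DATA"].append(values)
        elif descriptor in ("HEADING", "UNIT", "TYPE"):
            current[descriptor] = values
    return groups


groups = parse_ags(SAMPLE)
print(groups["SAMP"]["HEADING"])
print(groups["LOCA"]["DATA"])
```

Each group ends up with its headings, units and types held as parallel lists, so the unit and type for any data value can be found by position.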
Data format errors
Data format errors usually arise from codes missing from the abbreviation (ABBR), type (TYPE), unit (UNIT) and dictionary (DICT) groups. The ABBR, TYPE and UNIT groups are a summary of all codes, units and data types used within the AGS file, and the DICT group defines custom groups and fields outside of the AGS standard. If a code or group is used in the data but not correctly assigned in the ABBR, UNIT, TYPE and DICT groups, then this will result in an error.
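A simplified version of this check can be sketched in Python. The example below assumes the file has already been read into a dictionary of groups, each holding HEADING, UNIT, TYPE and DATA lists, and it looks only at units and data types (not ABBR or DICT). The function name and the sample groups, including the cone resistance field in MPa, are illustrative assumptions.

```python
def undeclared_codes(groups):
    """List (group, kind, code) for units and types not declared in UNIT/TYPE."""
    unit_col = groups["UNIT"]["HEADING"].index("UNIT_UNIT")
    type_col = groups["TYPE"]["HEADING"].index("TYPE_TYPE")
    declared_units = {row[unit_col] for row in groups["UNIT"]["DATA"]}
    declared_types = {row[type_col] for row in groups["TYPE"]["DATA"]}

    missing = set()
    for name, group in groups.items():
        for unit in group["UNIT"]:
            if unit and unit not in declared_units:  # empty means "no unit"
                missing.add((name, "UNIT", unit))
        for data_type in group["TYPE"]:
            if data_type and data_type not in declared_types:
                missing.add((name, "TYPE", data_type))
    return sorted(missing)


# Illustrative groups: SCPT uses a unit and a type that are never declared.
groups = {
    "SCPT": {"HEADING": ["LOCA_ID", "SCPT_DPTH", "SCPT_RES"],
             "UNIT": ["", "m", "MPa"], "TYPE": ["ID", "2DP", "3DP"],
             "DATA": [["CPT01", "1.00", "4.215"]]},
    "UNIT": {"HEADING": ["UNIT_UNIT", "UNIT_DESC"],
             "UNIT": ["", ""], "TYPE": ["X", "X"],
             "DATA": [["m", "metre"]]},
    "TYPE": {"HEADING": ["TYPE_TYPE", "TYPE_DESC"],
             "UNIT": ["", ""], "TYPE": ["X", "X"],
             "DATA": [["ID", "Unique identifier"], ["2DP", "Two decimal places"], ["X", "Text"]]},
}

missing = undeclared_codes(groups)
print(missing)
```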
This error is particularly common with laboratories and cone penetration testing (CPT), where innovative test methods are developed that use custom codes and tables not yet approved by the AGS. New techniques may also record measurements with greater precision, with the result that the data presented in the file does not match the format stipulated in the group identifiers or the properties in the data management software (if a conversion is not applied automatically on import/export).
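The precision mismatch can be illustrated with a short Python sketch. It assumes the decimal-place data types are written as a number followed by DP (for example 2DP) and that a value must then carry exactly that many decimal places. The function name is hypothetical, and other data types are simply passed over rather than validated.

```python
import re


def matches_decimal_places(value, type_code):
    """Check a text value against an 'nDP' type; other types are not checked."""
    match = re.fullmatch(r"(\d+)DP", type_code)
    if match is None or value == "":
        return True  # not a decimal-places type, or no value reported
    places = int(match.group(1))
    if places == 0:
        pattern = r"-?\d+"
    else:
        pattern = r"-?\d+\.\d{%d}" % places
    return re.fullmatch(pattern, value) is not None


print(matches_decimal_places("4.21", "2DP"))   # value reported to two places
print(matches_decimal_places("4.215", "2DP"))  # instrument reported three places
```

In this sketch a value reported to three decimal places fails against a field declared as 2DP, which is the kind of mismatch described above.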
Errors can also occur if the number of data fields presented does not exactly match the number of fields defined in the heading, unit and type. This can be caused by the addition or deletion of a comma, double quote or other special character, which will result in a malformation of the data file. The data management software will be unable to discern exactly which field the data relates to and will throw an error.
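A field count check of this kind might look like the Python sketch below, which compares every UNIT, TYPE and DATA line with the HEADING line of its group. The sample text is made up, with a stray comma added to the last line, and the function name is hypothetical.

```python
import csv
import io

# Made-up fragment: the final DATA line contains one comma too many.
SAMPLE = '''"GROUP","SAMP"
"HEADING","LOCA_ID","SAMP_TOP","SAMP_TYPE"
"UNIT","","m",""
"TYPE","ID","2DP","PA"
"DATA","BH01","1.20","B"
"DATA","BH02","2.50",,"D"
'''


def field_count_errors(text):
    """Return (line number, descriptor, fields found, fields expected)."""
    errors = []
    expected = None
    reader = csv.reader(io.StringIO(text))
    for row in reader:
        if not row:
            continue
        if row[0] == "GROUP":
            expected = None  # wait for this group's HEADING line
        elif row[0] == "HEADING":
            expected = len(row)
        elif row[0] in ("UNIT", "TYPE", "DATA") and expected is not None:
            if len(row) != expected:
                errors.append((reader.line_num, row[0], len(row), expected))
    return errors


errors = field_count_errors(SAMPLE)
print(errors)
```

Reporting the line number alongside the counts makes it much quicker to find the offending character in a large file.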
Groups also require key fields to be populated, and these key fields must exactly match the data in the associated group. If the associated key field is missing, an error is encountered. For example, you cannot have a geotechnical result without the sample information populated and you cannot have a sample without the location information group populated.
Errors can also occur where the versions of AGS differ, because identifier fields have been altered between versions to allow for more detailed information. This is particularly noticeable when converting geochemical data from AGS 3.1 to AGS 4.0.
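The sample and location example can be sketched in Python as a simple lookup. The code assumes each group is held as a dictionary with HEADING and DATA lists and that LOCA_ID is the key field linking SAMP to LOCA. The function name and the data values are illustrative.

```python
def orphaned_rows(parent, child, key):
    """Return child rows whose key value has no match in the parent group."""
    parent_col = parent["HEADING"].index(key)
    child_col = child["HEADING"].index(key)
    known = {row[parent_col] for row in parent["DATA"]}
    return [row for row in child["DATA"] if row[child_col] not in known]


# Illustrative data: BH02 has a sample but no entry in the location group.
loca = {"HEADING": ["LOCA_ID", "LOCA_FDEP"],
        "DATA": [["BH01", "10.00"]]}
samp = {"HEADING": ["LOCA_ID", "SAMP_TOP", "SAMP_TYPE"],
        "DATA": [["BH01", "1.20", "B"], ["BH02", "2.50", "D"]]}

orphans = orphaned_rows(loca, samp, "LOCA_ID")
print(orphans)
```

The match is exact, so "BH01" and "bh01" or "BH1" would be treated as different locations, which is how most of these key field errors arise in practice.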
Understanding how these errors can occur at the early stages can help ensure that data stored in the AGS format can be used to its full potential, which is becoming increasingly important as the sector moves further into the digital age.
- This article was written by British Drilling Association technical standards sub-committee member Ben Swallow and WYG data manager Paul Hadlum.
To demonstrate these errors some examples can be seen in the AGS data below with problems highlighted in red.
[Image: AGS table 2]