How the U.S. Army’s ERDC Uses DataOps in R&D
July 27, 2021
The U.S. Army’s Engineer Research and Development Center (ERDC) “helps solve our nation’s most challenging problems in civil and military engineering, geospatial sciences, water resources, and environmental sciences.” The center’s experts, researchers, scientists, and engineers conduct R&D at the leading edge of these fields.
The level of research conducted at ERDC requires massive amounts of data and technical resources. One facility that supports this research is the Department of Defense Supercomputing Resource Center (DSRC), home to Cray XT3 and XT4 supercomputers, among the fastest and most powerful systems connected to the Defense Research and Engineering Network (DREN).
With thousands of researchers creating data, accessing it, building models with it, and running simulations on it, discovery and governance were big challenges. Finding datasets relevant to a research area was hard unless you happened to be the one who created them.
Organizing that volume of data was also a challenge. By the nature of the work at ERDC, there was a wide range of data types: hydrology, geospatial, test structures, equipment performance, and more. And with security of the utmost importance, user management and data access were top priorities.
Previously, the entire data management process, from importing and validating to cleaning and storing, was manual. ERDC wanted to set data policies, automate as much as possible, and streamline data access—all while conserving valuable supercomputer resources.
Managing Big Data in a Supercomputer Environment
Having worked on a NASA project with similar goals, Geocent already had experience laying out a design pattern for an architecture that would help ERDC use its data to the fullest. We took time to understand their specific goals and put a system in place, using a collection of tools to implement DataOps practices.
The central tool was EROSTAT (Engineered Resiliency on STAT), Geocent’s open, hybrid, and multi-cloud enabled solution that intelligently secures, unifies, and governs data.
Installing EROSTAT at each data location virtualizes the data and abstracts away the limits of physical storage. The resulting data grid acts like one big data store, connecting all of ERDC’s data, regardless of storage location, and making management and access as streamlined as possible.
This infrastructure means ERDC researchers can query across all locations for the data they need, and the platform takes care of accessing the data wherever it’s stored. Administrators can determine where to store the data—which location, server, and database—without having to worry about access. Researchers can use the data without having to think about where the data is stored.
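To make this concrete, here’s a rough sketch of what that experience might look like from the researcher’s side. The `DataGridClient` below is a hypothetical stand-in invented for illustration; it is not EROSTAT’s actual API.

```python
# Hypothetical sketch of a data-virtualization client; DataGridClient
# and its catalog are invented for illustration, not EROSTAT's API.
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    location: str  # resolved by the grid, not by the researcher


class DataGridClient:
    """Toy stand-in for a virtualized data grid: one logical catalog
    backed by many physical stores."""

    def __init__(self, catalog: dict[str, str]):
        self._catalog = catalog  # logical name -> physical location

    def query(self, keyword: str) -> list[Dataset]:
        # Researchers search by topic; the grid resolves where each
        # matching dataset physically lives.
        return [Dataset(name, loc)
                for name, loc in self._catalog.items()
                if keyword in name]


grid = DataGridClient({
    "hydrology/river-stage-2020": "dsrc-hot-tier",
    "geospatial/levee-survey-2019": "archive-store",
})

for ds in grid.query("hydrology"):
    print(f"{ds.name} (stored at {ds.location}, invisible to the user)")
```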
Data Access, Optimization, and Automation
To take advantage of the advanced simulations, models, artificial intelligence (AI), and machine learning (ML) that are so important to ERDC’s research, the data feeding those algorithms has to rest on a solid foundation. These tools are only as good as the data that goes into them, and DataOps is the way to ensure that foundation is strong.
When new data is created at ERDC, it goes through a repeatable process:
- Ingest
- Validate (e.g., does this look like other familiar datasets?)
- Transform (e.g., unstructured/binary to SQL)
- Cleanse (e.g., remove sensitive data)
- Share/Organize to make more discoverable for downstream users
This pipeline at ERDC used to be manual and time-consuming. Now every step is automated, and the process is instrumented end to end, including audit events. The happy path, where everything goes right, requires zero human intervention.
When errors occur, data is off-ramped into quarantine for a subject matter expert (SME) to review, diagnose, and correct the issue. Experts can spend their valuable time solving problems rather than doing routine data movement.
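As a rough illustration of this pattern (not ERDC’s actual implementation), the sketch below chains the five stages and off-ramps any record that fails a stage into a quarantine queue for SME review:

```python
# Illustrative pipeline with a quarantine off-ramp; the stage logic
# is placeholder code invented for this sketch.
quarantine: list[tuple[dict, str]] = []  # (record, reason) pairs for SME review


def ingest(record: dict) -> dict:
    return record


def validate(record: dict) -> dict:
    # e.g., does this look like other familiar datasets?
    if "measurements" not in record:
        raise ValueError("missing expected 'measurements' field")
    return record


def transform(record: dict) -> dict:
    # e.g., unstructured values to a SQL-friendly shape
    record["measurements"] = [float(m) for m in record["measurements"]]
    return record


def cleanse(record: dict) -> dict:
    # e.g., strip sensitive fields before sharing
    record.pop("operator_id", None)
    return record


def share(record: dict) -> dict:
    # e.g., register in a catalog so downstream users can discover it
    print(f"published: {sorted(record)}")
    return record


def run_pipeline(record: dict) -> None:
    for stage in (ingest, validate, transform, cleanse, share):
        try:
            record = stage(record)
        except Exception as exc:
            # Off-ramp: park the record for an SME instead of failing
            # silently or blocking the rest of the pipeline.
            quarantine.append((record, f"{stage.__name__}: {exc}"))
            return


run_pipeline({"measurements": ["1.5", "2.0"], "operator_id": "A-17"})  # happy path
run_pipeline({"operator_id": "A-17"})  # fails validation, gets quarantined
print(f"{len(quarantine)} record(s) awaiting SME review")
```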
Automation with auditing also creates data of its own. ERDC administrators can build rich dashboards to understand how data moves through the pipeline, where issues are most likely to occur, and how healthy the data is overall. No one has to spend time hunting for issues; problems surface as they happen.
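The audit trail is itself just data. As a toy illustration (the event records below are invented), a few lines of aggregation are enough to show where in the pipeline failures cluster:

```python
# Invented audit events; real ones would come from the pipeline's
# instrumentation.
from collections import Counter

audit_events = [
    {"stage": "validate", "status": "error"},
    {"stage": "transform", "status": "ok"},
    {"stage": "validate", "status": "error"},
    {"stage": "cleanse", "status": "ok"},
]

failures = Counter(e["stage"] for e in audit_events if e["status"] == "error")
print(failures.most_common())  # [('validate', 2)] -> validation is the hot spot
```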
Built-in Policy Engine
In government work, policies change, including data governance policies. The ERDC infrastructure now has a built-in policy engine that automates processes based on policies and can easily change as policies do.
The policy engine is embedded at the data layer, so even as rules run, access to data by users, integrations, and APIs is preserved. This structure also forces integrated tools to adhere to the established policies without the policies having to be built into each tool, which provides built-in security and helps with governance.
An example ERDC policy: the supercomputer’s storage offers extremely low-latency access, so low-priority data shouldn’t occupy that premium resource. If data on the supercomputer hasn’t been accessed in the past 30 days, it is moved to lower-priority storage. That way, the most accessed data stays on the hot tiers.
Without automation, implementing this policy would be tedious. Because of the built-in policy engine, administrators can instruct the system to automatically move data after 30 days to save resources.
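To give a feel for how such a rule might be expressed, here’s a minimal sketch; the policy structure is invented for illustration and is not the engine’s actual format.

```python
# Invented tiering rule modeled on the 30-day policy described above.
from datetime import datetime, timedelta, timezone

POLICY = {
    "name": "demote-cold-data",
    "condition": lambda meta: (
        datetime.now(timezone.utc) - meta["last_accessed"] > timedelta(days=30)
    ),
    "action": "move-to-lower-priority-storage",
}


def apply_policy(datasets: list[dict]) -> None:
    for meta in datasets:
        if POLICY["condition"](meta):
            # A real engine would trigger the move while keeping the
            # dataset's logical address unchanged.
            print(f"{meta['name']}: {POLICY['action']}")


apply_policy([
    {"name": "hydrology/river-stage-2020",
     "last_accessed": datetime.now(timezone.utc) - timedelta(days=45)},
    {"name": "geospatial/levee-survey-2019",
     "last_accessed": datetime.now(timezone.utc) - timedelta(days=3)},
])
```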
With the open data infrastructure, even when data is moved, integrations and connections are maintained. End users, the researchers, don’t know (or, frankly, care) where exactly the data is stored because they can still access it, same as before.
Implementing DataOps has helped ERDC:
- Improve data access, security, and discoverability
- Automate data processes and policies
- Focus SME time on high-value activities
- Optimize supercomputer resources
We want to learn more about your DataOps challenges and how we can help you achieve your modernization mission. Contact us to start a conversation.