Kathryn Ball didn’t move from geology to data engineering on a whim.
“It’s called survival. With all the changes in the industry, you have to reinvent yourself,” she said.
Ball grew up scouting for minerals in western North Carolina, earned a degree in geology and is now subchapter head of data engineering for Chevron in Houston.
Along the way, she has worked as lead data scientist for Aramco-Americas and manager of advanced analytics for Devon Energy.
She called that switch from geoscience “a left turn in my career” and described her formative experiences in data engineering as “a painful time. But I got to learn about systems, about processes, and how data has to be to do analytics.”
Computing and data skills are now the most in-demand skills in industry worldwide, Ball noted, and that should be a wake-up call for petroleum geoscientists.
“It’s no longer good enough to be proficient in geology, in reservoir. You also have to be proficient in data and how to speak about data,” she said.
Data and Decision-Making
That’s a challenge for many geoscientists. Like other specialties, data science has its own jargon, its own terminology, its own basic concepts. Learning how to communicate effectively about geology in terms of data, to talk with computing and data application experts, is an acquired skill.
Ball sees the effects in her own job when she talks data to petroleum geoscientists and engineers.
“I call it ‘the blank stare.’ I’m getting better at social cues, like, ‘They don’t understand a word I just said,’” she noted.
The challenge runs both ways: data scientists in oil and gas have to absorb some fundamentals of geology, geophysics and petroleum engineering. The disciplines can seem widely different, especially in heavily quantitative, data-oriented areas.
On both sides, “people don’t understand the discipline, and that’s where the weakness is,” Ball observed.
She has no doubt that data skills will be increasingly important for petroleum professionals. One reason is the sheer amount of data being captured and used in oil and gas today, in almost every part of the industry.
Ball said she’s sometimes amused when other professionals refer to their own industries as “data heavy,” because they use so much less data than the oil industry. A single seismic reading can require a terabyte of data, she noted.
Proliferation of automation throughout the industry, in everything from automated drilling rigs to intelligent oil fields, will also drive increasing demand for computer and data skills, as will evaluating masses of data generated by sensors linked to the Internet of Things, Ball observed.
“The other part is that there’s so much written resource out there. You can train the computer to read that for you,” to digest and extract and condense information, she said.
The ultimate skill is using data to inform decision-making, which means knowing how to assess the value in data and apply it toward results, Ball said.
“At the end of the day, it’s the value in the data. People say data is the new oil. That’s not really true. It’s understanding how to make decisions based on data that’s the new oil,” she said.
Learn the Art of Data Wrangling
Geoscientists who want to improve their data skills can draw on industry training courses as well as a variety of free online resources, according to Ball. The trend is toward making computing and data tools easier to use for everyone, she noted.
“I’m always suggesting Python or R (computing languages). There’s even Orange, which is drag-and-drop and you don’t have to code. Learn how to wrangle data,” Ball recommended.
Most of all, “get beyond the world of Excel,” she said.
Applying data to decisions involves understanding both the nature of the data available and the nature of the question involved. Is it a linear or non-linear problem, a network or decision-tree solution?
“Something that drives me nuts is, don’t do something just because it’s cool, like using a neural network,” Ball said.
“You can always get an answer. But it might not be the right answer,” she added.
A future challenge for data science in the oil and gas industry involves gathering high-quality data and making it available for analysis in a reliable and consistent way – a job for the data engineer. The term “data wrangling” gets used a lot.
“The biggest challenge is the data engineering. So much attention has been paid to the data science side. Data engineering is kind of the pipeline that gives you the data for data science,” Ball noted.
“Who owns that data and how can it be used to make a decision? If the data’s bad, who’s going to fix it?” she said.
A Natural Fit
For both geoscientists and data scientists, Ball emphasized the word “passion.” She said her own career grew out of her dual passions – first, a fascination with rocks and minerals, and second, a strong drive to understand how the world works, especially by using mathematics and data.
Ball said North Carolina “was heaven for finding minerals.”
“My dad was gold panning and I got fired from panning because I was looking at the rocks too much. Just a love of rocks – I love mineralogy,” she explained.
Her view of data in geoscience might seem a bit contrarian, because she sees data science as a natural fit with exploration geology. It’s all about understanding how the world is put together.
“I think it’s a gift on how we’re wired. And geologists should take more advantage of it, because geologists are detectives,” she said.
Ball thinks data skills will become a must-have for petroleum geoscientists working in the energy industry of the future. In fact, those skills are increasingly vital in the industry today.
“All of the petro-technical professionals will have that as part of their training,” she predicted.
“It’s not something to be feared,” Ball said. “It’s something to have fun with.”