Data compression
for geophysics never lived up to its early promise, because it promised
more than it could ever deliver.
Since that time, however, improved techniques have
made practical data compression a reality. It now helps the petroleum
industry manage today's enormous seismic datasets.
"A 10-terabyte dataset, these days, is very easy
to acquire. To manage 100 terabytes of data isn't so easy," said
Anthony A. Vassiliou, president of GeoEnergy Inc. in Tulsa. "It's
like trying to manage a large army."
GeoEnergy provides seismic data compression, processing
and data management services for the oil and gas industry.
Founded in 1998, the company uses a state-of-the-art
data compression technology originally developed for other industries.
The same concepts used by GeoEnergy have been employed
in designing the Stealth bomber, providing images in telemedicine
and storing fingerprint files for the FBI, according to Vassiliou.
GeoEnergy could serve as a case study for small vendors.
Even though it has built a successful business by
licensing data compression software, that may not be its future.
A seismic shift in the oil and gas industry is leading
Vassiliou to plan a new direction for his company.
Reality Sets In
Data compression employs algorithms to shrink the size
of a dataset, saving space and reducing transmission times.
Here's an example of how it works: Designate the
start of a repeating sequence of numbers with the letter s, and
repetitions with the letter r. The set 123456712345671234567 then
can be written s1234567rr.
Define a numerically sequential set of integers with
the first integer, the letter t and the last integer, and the same
set can be written s1t7rr.
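The toy notation above can be sketched in a few lines of Python. This is only an illustration of the s/r/t shorthand described in the text, not a real compression format; the function names `encode_toy` and `decode_toy` are hypothetical, and the sketch assumes the input is a non-empty string of single digits made of one block repeated back to back.

```python
def encode_toy(digits: str) -> str:
    """Encode a digit string using the article's toy s/r/t notation.

    's' marks the start of the repeating block, each 'r' marks one more
    repetition, and 'first t last' stands for a run of consecutive integers.
    """
    if not digits:
        raise ValueError("expected a non-empty digit string")
    n = len(digits)
    # Find the shortest block that rebuilds the whole string when repeated.
    for size in range(1, n + 1):
        if n % size == 0 and digits[:size] * (n // size) == digits:
            block, repeats = digits[:size], n // size
            break
    # Collapse the block to "first t last" when its digits ascend by one.
    ascending = all(int(b) - int(a) == 1 for a, b in zip(block, block[1:]))
    if ascending and len(block) > 3:
        block = f"{block[0]}t{block[-1]}"
    return "s" + block + "r" * (repeats - 1)


def decode_toy(code: str) -> str:
    """Invert encode_toy and return the original digit string."""
    body = code[1:].rstrip("r")
    repeats = 1 + len(code) - len(code.rstrip("r"))
    if "t" in body:
        first, last = body.split("t")
        body = "".join(str(d) for d in range(int(first), int(last) + 1))
    return body * repeats


original = "123456712345671234567"
encoded = encode_toy(original)
restored = decode_toy(encoded)
```

Here the 21-character input shrinks to a 6-character code, and decoding returns the input unchanged.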
Algorithms in data compression may be a thousand
times more sophisticated, but the idea remains exactly the same
— to shrink the space taken up by a set of data.
Techniques that lose information during data compression
are called "lossy." Those that reproduce the original data exactly are
called "lossless."
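The difference shows up in a round trip. The sketch below uses Python's standard `zlib` module as a stand-in lossless compressor and simple rounding as a stand-in lossy step; the sample values and the `step` size are made up for illustration.

```python
import struct
import zlib

# Hypothetical amplitude samples, chosen only for illustration.
samples = [0.127, -0.731, 0.402, 0.398, -0.729, 0.131]


def pack(values):
    """Store floats as little-endian 64-bit binary."""
    return struct.pack(f"<{len(values)}d", *values)


def unpack(raw):
    """Read little-endian 64-bit floats back into a list."""
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# Lossless: compress, decompress, and get back exactly the same numbers.
restored = unpack(zlib.decompress(zlib.compress(pack(samples))))
lossless_ok = restored == samples

# Lossy: keep only the nearest multiple of `step`; fine detail is discarded.
step = 0.01
approx = [round(v / step) * step for v in samples]
max_error = max(abs(a - b) for a, b in zip(samples, approx))
```

The lossless path returns the samples bit for bit, while the lossy path returns values that differ from the originals by at most half of `step`.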
At first, seismic data compression earned a reputation
for being unacceptably lossy. It also suffered from overreaching
for high compression ratios.
"When this technology first came out, people were
talking about ratios of 100:1, 200:1," Vassiliou said. "That was
not correct."
Faced with a technology that might lose data and
add noise, at nowhere near the promised compression ratios, geophysicists
felt underwhelmed.
AAPG member Norman J. Hyne serves as geological advisor
for GeoEnergy. Hyne is a Tulsa consultant and well-known geology
instructor who teaches short courses around the world.
"The first thing that happens when you show this
to geophysicists is they say, 'We've seen this before. They're losing
data. It's not accurate.' This is the first accurate data compression
for geophysics," he said.
With accurate and efficient compression software,
a company can sharply reduce the time needed to transmit even a
very large dataset, Hyne noted.
"If you have a seismic ship off of West Africa and
you want to move the data to Houston, you are decreasing the time
literally by the compression ratio," he said, "so at a 10:1 ratio
it takes one-tenth the time."
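Hyne's arithmetic can be checked with a short calculation. The dataset size and link speed below are assumptions picked for illustration, not figures from GeoEnergy, and `transfer_hours` is a hypothetical helper that ignores protocol overhead and the time spent compressing.

```python
def transfer_hours(size_gb: float, link_mbit_per_s: float, ratio: float = 1.0) -> float:
    """Hours needed to send a dataset over a link at a given compression ratio."""
    bits_to_send = size_gb * 8e9 / ratio        # compressed size in bits
    seconds = bits_to_send / (link_mbit_per_s * 1e6)
    return seconds / 3600


# Assumed example: a 1,000-gigabyte survey over a 10 Mbit/s satellite link.
uncompressed = transfer_hours(1000, 10)
compressed = transfer_hours(1000, 10, ratio=10)
```

With these assumed numbers the uncompressed transfer takes roughly nine days and the 10:1 transfer takes under one day, a tenth of the time.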
Reality, Sandwiched
Data compression now targets realistic reduction
ratios, 5:1 or 10:1 on up. Lossless ratios are significantly lower.
GeoEnergy claims "essentially lossless compression"
at 1.5:1 to 3:1.
"I believe in the future we will get 8-bit data compressed
with a ratio of 2.25:1, lossless. We'll have zero loss up to 4:1
with 16-bit data," Vassiliou said.
Algorithms in data compression come from advanced
mathematics. GeoEnergy's technology partner is Fast Mathematical
Algorithms and Hardware (FMA&H), a company founded in 1989 by
Yale University professors and mathematicians Ronald Coifman and
Vladimir Rokhlin.
Vassiliou said GeoEnergy holds the exclusive right
to commercialize FMA&H technology for petroleum-related uses.
"This technology has been around for a long time
and became applied to the petroleum industry only after it had been
applied to several industries before," he said.
A long time, in this context, means since the 1980s,
when Coifman began to apply his wave-packet theories at Yale to
real-world problems.
An early commercial application came in 1989, when
the FBI applied wavelet quantization to condense fingerprint files,
Vassiliou said.
"It was very important because the FBI and law-enforcement
offices all over the country could exchange fingerprint information
very quickly," he said. "At that time, of course, there was no Web."
In 2000, Coifman received the U.S. National Medal of
Science for his groundbreaking work.
Real World, Real Fast
"Seismic data compression is composed of three major
steps," Vassiliou said.
- In the first step the seismic data are decorrelated using a multiresolution
transform such as wavelet or local cosine transform, which are very
efficient computationally. The result of the decorrelation of the
original data is a rather flat histogram of the data.
- In the second step the transformed data are quantized, which in
very broad terms means a conversion from analog to digital, or a
conversion from floating-point numbers to integer numbers.
- In the final step, the encoding, the redundancy achieved through
quantization is used to derive in a probabilistic manner symbols
related to the quantized numbers.
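The three steps can be sketched end to end with the Python standard library. This is a minimal illustration under stated assumptions, not GeoEnergy's or FMA&H's method: a single-level Haar transform stands in for the wavelet or local cosine transform, uniform rounding stands in for the quantizer, and `zlib` (DEFLATE) stands in for the encoder. The trace is synthetic and the `step` value is arbitrary.

```python
import math
import struct
import zlib


def haar_forward(x):
    """Step 1: one level of the orthonormal Haar transform (even length)."""
    s = math.sqrt(2.0)
    pairs = list(zip(x[0::2], x[1::2]))
    approx = [(a + b) / s for a, b in pairs]   # smooth part
    detail = [(a - b) / s for a, b in pairs]   # small for smooth signals
    return approx + detail


def haar_inverse(c):
    """Undo haar_forward."""
    s = math.sqrt(2.0)
    half = len(c) // 2
    out = []
    for a, d in zip(c[:half], c[half:]):
        out.extend(((a + d) / s, (a - d) / s))
    return out


def compress(trace, step):
    coeffs = haar_forward(trace)                       # step 1: decorrelate
    quantized = [round(c / step) for c in coeffs]      # step 2: float -> int
    raw = struct.pack(f"<{len(quantized)}i", *quantized)
    return zlib.compress(raw, 9)                       # step 3: encode


def decompress(blob, step):
    raw = zlib.decompress(blob)
    quantized = struct.unpack(f"<{len(raw) // 4}i", raw)
    return haar_inverse([q * step for q in quantized])


# Synthetic, smooth "trace" used only for illustration.
trace = [math.sin(2 * math.pi * i / 64) for i in range(256)]
step = 0.01
blob = compress(trace, step)
rebuilt = decompress(blob, step)
max_error = max(abs(a - b) for a, b in zip(trace, rebuilt))
raw_bytes = len(trace) * 8  # size of the trace stored as 64-bit floats
```

The only lossy step is the quantizer: a larger `step` gives a smaller `blob` and a larger `max_error`, which is the trade-off behind the compression ratios quoted in this article.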
"A very simple example of encoding is the Morse code,
where symbols corresponding to a very frequent quantized number
are coded with a short code, whereas symbols corresponding to a
very rarely encountered quantized number are coded with a long code,"
he said.
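Huffman coding is one textbook way to build the kind of code Vassiliou describes, with short codes for frequent values and long codes for rare ones. The article does not say which encoder GeoEnergy uses, so the sketch below is an assumption chosen for illustration, and the list of quantized values is invented.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a {symbol: bit string} table built from symbol frequencies."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries are (count, tiebreak, {symbol: code so far}).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing one more bit.
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


# Invented quantized values: zero is common, large values are rare.
quantized = [0] * 12 + [1] * 5 + [-1] * 4 + [2] * 2 + [7]
codes = huffman_code(quantized)
encoded_bits = sum(len(codes[q]) for q in quantized)
fixed_bits = 3 * len(quantized)  # 5 distinct values need 3 bits each
```

Because the most common value gets the shortest code, the encoded stream needs fewer bits than a fixed-length code for the same values.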
By comparing information in a compressed dataset
with the original set, Vassiliou can determine what information
has been lost and represent that visually on a computer screen.
At low compression ratios, the screen appears blank.
As the ratio increases, a faint dot pattern of scattered residual
data points appears.
"As you process more and more, those random residuals
get attenuated, they get thrown out," he said. "When you subtract
them from the original data, no information is lost. What we're
losing is essentially random noise."
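The comparison amounts to subtracting the decompressed data from the original and inspecting what is left. The sketch below shows that residual along with a signal-to-noise figure; the function names and sample values are hypothetical, and the article does not describe the display software itself.

```python
import math


def residual(original, reconstructed):
    """Sample-by-sample difference between original and decompressed data."""
    return [o - r for o, r in zip(original, reconstructed)]


def rms(values):
    """Root-mean-square amplitude."""
    return math.sqrt(sum(v * v for v in values) / len(values))


def snr_db(original, reconstructed):
    """Signal-to-residual ratio in decibels; infinite when nothing was lost."""
    noise = rms(residual(original, reconstructed))
    if noise == 0:
        return math.inf
    return 20 * math.log10(rms(original) / noise)


# Invented samples: the second copy has small errors in two positions.
original = [1.0, -2.0, 3.0, -4.0]
slightly_off = [1.0, -2.01, 3.0, -3.99]

exact_snr = snr_db(original, original)
lossy_snr = snr_db(original, slightly_off)
diff = residual(original, slightly_off)
```

An all-zero residual corresponds to the blank screen at low compression ratios, while scattered nonzero entries correspond to the faint dot pattern that appears as the ratio rises.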
In addition to accuracy, compression speed becomes
important when working with extremely large datasets. Vassiliou
said 16-bit data can be decompressed at 20 megabytes per second,
and 32-bit data can be decompressed at 40 megabytes per second,
faster than the write-to-drive speeds of desktop computers.
"The disc can't keep up at speeds of 40 megabytes
per second," he observed. "You are straining the limits of the hardware
at these speeds."
Vassiliou said time slices can be pulled from the
compressed data for study, and x,y,z coordinates from well information
can be entered to define and display a horizon.
"You compress the data only once," he said, "but
you decompress it many times, for different applications."
Reality: Round Two
Geophysicists now see data compression as a useful
tool for storage and transmission, with lingering skepticism about
the practicality of processing compressed data.
"In the earth sciences we drown in data. When we
look at seismic data, we have more data bits than an imaging satellite
generates," said Norm Neidell, a Houston consultant in seismic high
technology.
"One of the problems in data compression is that
compression for storage is straightforward," he said, "but the subsidiary
question is, 'Can we do the analysis on the compressed data?' That's
a much harder question."
Without a proven method of processing data in condensed
form, the data must be decompressed for analysis.
"If you have to decompress, you lose a lot of the
advantage," he said.
Sven Treitel, a geophysics consultant in Tulsa, sees
"huge advantages" coming from seismic compression.
"I think it's an idea that has finally reached maturity,"
Treitel said. "There were skeptics at first, and there still are."
Treitel said some geophysicists think of the full
seismic set as "sacred" and won't consider sacrificing even the
smallest amount of data. He doesn't share that view.
"Seismic recordings are, by their very nature, noisy,"
he said. "The compression step results in loss of signal and loss
of noise."
While "gains in signal processing have not been proven,"
data compression offers clear benefits for storage and transmission,
Treitel said.
"Actually, the original idea for wavelet transforms
came from a geophysicist named Jean Morlet, who was at that time
in geophysics with Elf Aquitaine," he noted.
"Mathematicians tend to forget that a geophysicist
first came up with this idea," he added. "They need to be reminded."
Real, Real Gone
Vassiliou came to data compression after studying crosshole
seismic tomography at Mobil Research and Development Corp. He later
worked in research at Amoco.
"In 1998 I decided to go out and see if the industry
would be interested in this (compression) technology," he said.
"And you remember what 1998 and 1999 were like in the industry."
As Vassiliou made sales calls on major oil companies
and service-and-supply companies, he discovered a problem. It was
hard to find people who had both expertise and time.
"The major issue is when the people who will be using
the software will have time to test the software," he said.
Tulsa petroleum engineer and long-time oilman Wayne
E. Swearingen serves on GeoEnergy's board of directors. He wasn't
surprised by the demands on in-house expertise.
"What we see is that these big companies have more
interest in outsourcing than they had before," he said.
"I think this is probably the way big companies are
going to operate in the future," he said. "Very few of these companies
have applied mathematicians."
That shift made Vassiliou rethink GeoEnergy's direction,
to "go into a service mode."
"We are looking to offer prestack time and prestack
depth migration service, which will provide fast turnaround at low
cost," he said. "The turnaround is very difficult to achieve due
to the difficulty of building efficiently the velocity model."
As he expands the scope of the business, Vassiliou
said he may have to move to Houston — another sign of the times.
He thinks data compression will continue to be well-received.
"We've had zero complaints during the last three
years," he said. "That's the most compelling thing."