Shrinking Seismic Not an Easy Task

Avoiding 'Lossy' Is the Key

Data compression for geophysics never lived up to its early promise, because it promised more than it could ever deliver.

Since that time, however, improved techniques have made practical data compression a reality. It now helps the petroleum industry manage today's enormous seismic datasets.

"A 10-terabyte dataset, these days, is very easy to acquire. To manage 100 terabytes of data isn't so easy," said Anthony A. Vassiliou, president of GeoEnergy Inc. in Tulsa. "It's like trying to manage a large army."

GeoEnergy provides seismic data compression, processing and data management services for the oil and gas industry.

Founded in 1998, the company uses state-of-the-art data compression technology originally developed for other industries.

The same concepts used by GeoEnergy have been employed in designing the Stealth bomber, providing images in telemedicine and storing fingerprint files for the FBI, according to Vassiliou.

GeoEnergy could serve as a case study for small vendors.

Even though it has built a successful business by licensing data compression software, that may not be its future.

A seismic shift in the oil and gas industry is leading Vassiliou to plan a new direction for his company.

Reality Sets In

Data compression employs algorithms to shrink the size of a dataset, saving space and reducing transmission times.

Here's an example of how it works: Designate the start of a repeating sequence of numbers with the letter s, and repetitions with the letter r. The set 123456712345671234567 then can be written s1234567rr.

Add a rule that a run of consecutive integers can be written as the first integer, the letter t and the last integer, and the same set shrinks further, to s1t7rr.
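
The toy notation is concrete enough to implement. Here is a minimal decoder sketch in Python; the function name and parsing rules are our own reading of the scheme above, not any real codec:

    def decode(code: str) -> str:
        # Expand the toy notation: 's' opens the repeating unit, 't'
        # abbreviates a run of consecutive digits (so "1t7" means
        # 1234567), and each trailing 'r' repeats the unit once more.
        assert code[0] == "s"
        unit, i = "", 1
        while i < len(code) and code[i] != "r":
            if code[i] == "t":
                first, last = int(unit[-1]), int(code[i + 1])
                unit += "".join(str(n) for n in range(first + 1, last + 1))
                i += 2
            else:
                unit += code[i]
                i += 1
        return unit * (1 + code.count("r"))

    print(decode("s1234567rr"))  # 123456712345671234567
    print(decode("s1t7rr"))      # 123456712345671234567

The 21-digit set compresses to the six-character code s1t7rr, a 3.5:1 ratio, and decoding recovers it exactly, which is what makes the scheme lossless.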

Algorithms in data compression may be a thousand times more sophisticated, but the idea remains exactly the same — to shrink the space taken up by a set of data.

Techniques that discard some of the original information during compression are called "lossy." Those that preserve the data exactly, so the original can be reconstructed bit for bit, are called "lossless."

At first, seismic data compression earned a reputation for being unacceptably lossy. It also suffered from overreaching claims of high compression ratios.

"When this technology first came out, people were talking about ratios of 100:1, 200:1," Vassiliou said. "That was not correct."

Image Caption

After years of promise, practical data compression is becoming a reality for geoscientists. In this example, the left image shows a migrated, stacked seismic dataset; the center image shows the result of compression and decompression at a 20:1 ratio; and the right image shows the difference between the original and the compressed/decompressed data. All three images are plotted at the same amplitude scale.

Faced with a technology that might lose data and add noise, at nowhere near the promised compression ratios, geophysicists felt underwhelmed.

AAPG member Norman J. Hyne serves as geological advisor for GeoEnergy. Hyne is a Tulsa consultant and well-known geology instructor who teaches short courses around the world.

"The first thing that happens when you show this to geophysicists is they say, 'We've seen this before. They're losing data. It's not accurate.' This is the first accurate data compression for geophysics," he said.

With accurate and efficient compression software, a company can sharply reduce the time needed to transmit even a very large dataset, Hyne noted.

"If you have a seismic ship off of West Africa and you want to move the data to Houston, you are decreasing the time literally by the compression ratio," he said, "so at a 10:1 ratio it takes one-tenth the time."

Reality, Sandwiched

Data compression now targets realistic reduction ratios of 5:1 or 10:1 and up. Lossless ratios are significantly lower.

GeoEnergy claims "essentially lossless compression" at 1.5:1 to 3:1.

"I believe in the future we will get 8-bit data compressed with a ratio of 2.25:1, lossless. We'll have zero loss up to 4:1 with 16-bit data," Vassiliou said.

Algorithms in data compression come from advanced mathematics. GeoEnergy's technology partner is Fast Mathematical Algorithms and Hardware (FMA&H), a company founded in 1989 by Yale University professors and mathematicians Ronald Coifman and Vladimir Rokhlin.

Vassiliou said GeoEnergy holds the exclusive right to commercialize FMA&H technology for petroleum-related uses.

"This technology has been around for a long time and became applied to the petroleum industry only after it had been applied to several industries before," he said.

A long time, in this context, means since the 1980s. At Yale, Coifman began to apply his wave-packet theories to real-world problems.

An early commercial application came in 1989, when the FBI applied wavelet quantization to condense fingerprint files, Vassiliou said.

"It was very important because the FBI and law-enforcement offices all over the country could exchange fingerprint information very quickly," he said. "At that time, of course, there was no Web."

In 2000, Coifman received the National Medal of Science for his groundbreaking work.

Real World, Real Fast

"Seismic data compression is composed of three major steps," Vassiliou said.

  • In the first step, the seismic data are decorrelated using a multiresolution transform, such as a wavelet or local cosine transform, both of which are computationally very efficient. Decorrelation leaves most of the transform coefficients close to zero, concentrating the information into a small number of significant values.
  • In the second step, the transformed data are quantized, which in very broad terms means converting from analog to digital, or from floating-point numbers to integers.
  • In the final step, encoding, the redundancy created by quantization is exploited: the quantized values are mapped, according to their probabilities, to symbols of varying length. (A code sketch of all three steps follows the Morse-code example below.)

"A very simple example of encoding is the Morse code, where symbols corresponding to a very frequent quantized number are coded with a short code, whereas symbols corresponding to a very rarely encountered quantized number are coded with a long code," he said.

By comparing information in a compressed dataset with the original set, Vassiliou can determine what information has been lost and represent that visually on a computer screen.

At low compression ratios, the screen appears blank. As the ratio increases, a faint dot pattern of scattered residual data points appears.

"As you process more and more, those random residuals get attenuated, they get thrown out," he said. "When you subtract them from the original data, no information is lost. What we're losing is essentially random noise."

In addition to accuracy, compression speed becomes important when working with extremely large datasets. Vassiliou said 16-bit data can be decompressed at 20 megabytes per second, and 32-bit data can be decompressed at 40 megabytes per second, faster than the write-to-drive speeds of desktop computers.

"The disc can't keep up at speeds of 40 megabytes per second," he observed. "You are straining the limits of the hardware at these speeds."

Vassiliou said time slices can be pulled from the compressed data for study, and x,y,z coordinates from well information can be entered to define and display a horizon.

"You compress the data only once," he said, "but you decompress it many times, for different applications."

Reality: Round Two

Geophysicists now see data compression as a useful tool for storage and transmission, with lingering skepticism about the practicality of processing compressed data.

"In the earth sciences we drown in data. When we look at seismic data, we have more data bits than an imaging satellite generates," said Norm Neidell, a Houston consultant in seismic high technology.

"One of the problems in data compression is that compression for storage is straightforward," he said, "but the subsidiary question is, 'Can we do the analysis on the compressed data?' That's a much harder question."

Without a proven method of processing data in condensed form, the data must be decompressed for analysis.

"If you have to decompress, you lose a lot of the advantage," he said.

Sven Treitel, a geophysics consultant in Tulsa, sees "huge advantages" coming from seismic compression.

"I think it's an idea that has finally reached maturity," Treitel said. "There were skeptics at first, and there still are."

Treitel said some geophysicists think of the full seismic set as "sacred" and won't consider sacrificing even the smallest amount of data. He doesn't share that view.

"Seismic recordings are, by their very nature, noisy," he said. "The compression step results in loss of signal and loss of noise."

While "gains in signal processing have not been proven," data compression offers clear benefits for storage and transmission, Treitel said.

"Actually, the original idea for wavelet transforms came from a geophysicist named Jean Morlet, who was at that time in geophysics with Elf Aquitane," he noted.

"Mathematicians tend to forget that a geophysicist first came up with this idea," he added. "They need to be reminded."

Real, Real Gone

Vassiliou came to data compression after studying crosshole seismic tomography at Mobil Research and Development Corp. He later worked in research at Amoco.

"In 1998 I decided to go out and see if the industry would be interested in this (compression) technology," he said. "And you remember what 1998 and 1999 were like in the industry."

As Vassiliou made sales calls on major oil companies and service-and-supply companies, he discovered a problem. It was hard to find people who had both expertise and time.

"The major issue is when the people who will be using the software will have time to test the software," he said.

Tulsa petroleum engineer and long-time oilman Wayne E. Swearingen serves on GeoEnergy's board of directors. He wasn't surprised by the demands on in-house expertise.

"What we see is that these big companies have more interest in outsourcing than they had before," he said.

"I think this is probably the way big companies are going to operate in the future," he said. "Very few of these companies have applied mathematicians."

That shift made Vassiliou rethink GeoEnergy's direction, to "go into a service mode."

"We are looking to offer prestack time and prestack depth migration service, which will provide fast turnaround at low cost," he said. "The turnaround is very difficult to achieve due to the difficulty of building efficiently the velocity model."

As he expands the scope of the business, Vassiliou said he may have to move to Houston — another sign of the times. He thinks data compression will continue to be well-received.

"We've had zero complaints during the last three years," he said. "That's the most compelling thing."
