January 9, 2014
Why GZip Doesn't Help Direct Proxy's Performance
For direct OSB proxies, passing XML content as GZip does not improve end-to-end performance: the savings in transfer time are eaten by the extra time required for gzipping and un-gzipping.
There is a service I have to deal with which is a constant headache.
The service returns a huge XML response, 1 MB or more, with all kinds of information.
The web application that calls the service experiences a very noticeable delay, which hurts the user experience. The web developers measured the average delay at about 500ms, and it grows proportionally with the response size.
The size is obviously the culprit, and I gotta do something about it!
GZip The Response! It Must Help!
The Internet solved the same problem long ago. When a web page is big, the server compresses the payload with gzip, which often reduces the size by a factor of 10.
Since SOAP runs on top of HTTP, the gzip solution is available to it as well. I just need to make OSB pass through the gzipped content without trying to decompress it. In fact, it does so by default, but it strips the “Content-Encoding: gzip” header. The missing header breaks the caller, but this is easy to fix.
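The factor-of-10 claim is easy to verify outside of OSB. Here is a minimal sketch in plain Java that gzips a repetitive XML payload of roughly the size this service returns (the element names and sizes are made up for illustration):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

public class GzipRatioDemo {
    public static void main(String[] args) throws Exception {
        // Build a repetitive ~1 MB XML payload, similar in spirit to the service response
        StringBuilder sb = new StringBuilder("<items>");
        for (int i = 0; i < 10000; i++) {
            sb.append("<item id=\"").append(i)
              .append("\"><name>Item name</name><desc>Some description text</desc></item>");
        }
        sb.append("</items>");
        byte[] xml = sb.toString().getBytes(StandardCharsets.UTF_8);

        // Compress the payload with gzip
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(xml);
        }
        byte[] compressed = bos.toByteArray();

        System.out.println("Original:   " + xml.length + " bytes");
        System.out.println("Compressed: " + compressed.length + " bytes");
        System.out.println("Ratio:      " + (xml.length / compressed.length) + "x");
    }
}
```

Highly repetitive XML like this compresses very well, usually far better than 10x.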
Here’s what I did:
I made sure I did not touch the payload, even for logging, so that OSB would not try to parse the gzipped stream
I sent the “Accept-Encoding: gzip” header to the backend service
I passed all headers from the backend service back to the caller
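The header negotiation from the steps above can be exercised end-to-end in plain Java, with the JDK's built-in `HttpServer` standing in for the backend (the path and payload are made up for the sketch). The caller sends `Accept-Encoding: gzip` and decompresses only when the `Content-Encoding: gzip` response header, the one OSB strips by default, is present:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.*;

public class GzipPassThroughDemo {
    public static void main(String[] args) throws Exception {
        String payload = "<response>a large XML body would go here</response>";

        // A stand-in backend that serves gzipped content when the client asks for it
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/svc", exchange -> {
            byte[] body = payload.getBytes(StandardCharsets.UTF_8);
            String accept = exchange.getRequestHeaders().getFirst("Accept-Encoding");
            if (accept != null && accept.contains("gzip")) {
                ByteArrayOutputStream bos = new ByteArrayOutputStream();
                try (GZIPOutputStream gz = new GZIPOutputStream(bos)) { gz.write(body); }
                body = bos.toByteArray();
                // This is the header that must reach the caller intact
                exchange.getResponseHeaders().set("Content-Encoding", "gzip");
            }
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();

        // The caller: ask for gzip, then decompress only if the header says so
        HttpURLConnection conn = (HttpURLConnection) new URL(
                "http://localhost:" + server.getAddress().getPort() + "/svc").openConnection();
        conn.setRequestProperty("Accept-Encoding", "gzip");
        InputStream in = conn.getInputStream();
        if ("gzip".equals(conn.getHeaderField("Content-Encoding"))) {
            in = new GZIPInputStream(in);
        }
        String received = new String(in.readAllBytes(), StandardCharsets.UTF_8);
        System.out.println(received);
        server.stop(0);
    }
}
```

If the `Content-Encoding` header is dropped in transit, the caller hits the raw gzip bytes with an XML parser, which is exactly the breakage mentioned above.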
Since the backend service supports gzip, it should serve me the gzipped content now. And it did!
Yay! The response size was 50 times smaller than the original. I was expecting at least a 10x improvement in the call time.
And… Nothing
The performance test did show an improvement… of about 1%. Huh?
Looking Outside of the OSB Box
Now it was obvious my mental model of where the time is spent was wrong.
I painstakingly measured all the steps the calling application and the backend had to perform to communicate. They did quite a lot of work beyond simply sending the data!
The sender had to build a DOM of the message first, then serialize it into an XML string, and only then write it to the wire.
The receiver had to parse the XML string back into a DOM.
When GZip was used, the sender also had to compress the XML string before sending it, and the recipient had to decompress it before parsing.
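These per-step costs are easy to reproduce in plain JDK code, with no OSB involved. A sketch (element names and sizes invented to roughly match the service) that times serializing a large DOM, gzipping the bytes, and un-gzipping plus parsing them back:

```java
import java.io.*;
import java.nio.charset.StandardCharsets;
import java.util.zip.*;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class XmlCostDemo {
    public static void main(String[] args) throws Exception {
        // 1. Build a DOM with many elements (a stand-in for the ~1 MB response)
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("items");
        doc.appendChild(root);
        for (int i = 0; i < 10000; i++) {
            Element item = doc.createElement("item");
            item.setAttribute("id", String.valueOf(i));
            item.setTextContent("Some description text for item " + i);
            root.appendChild(item);
        }

        // 2. Serialize the DOM into an XML string
        long t = System.nanoTime();
        StringWriter sw = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(sw));
        byte[] xml = sw.toString().getBytes(StandardCharsets.UTF_8);
        long serializeMs = (System.nanoTime() - t) / 1_000_000;

        // 3. GZip the serialized bytes
        t = System.nanoTime();
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) { gz.write(xml); }
        long gzipMs = (System.nanoTime() - t) / 1_000_000;

        // 4. Un-gzip and parse back into a DOM
        t = System.nanoTime();
        InputStream in = new GZIPInputStream(new ByteArrayInputStream(bos.toByteArray()));
        Document parsed = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().parse(in);
        long parseMs = (System.nanoTime() - t) / 1_000_000;

        System.out.println("serialize=" + serializeMs + "ms, gzip=" + gzipMs
                + "ms, unzip+parse=" + parseMs + "ms");
    }
}
```

Gzipping is cheap relative to building, serializing, and parsing the DOM, which is exactly what the measurements below show.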
Here’s how the request/response cycle looked before implementing GZip:
| Caller | In-Flight + OSB | Backend | Time |
|---|---|---|---|
| Building request DOM | | | 6ms |
| Serializing request DOM into XML | | | 15ms |
| | Sending request via OSB | | 5ms |
| | | Parsing request into DOM | 5ms |
| | | Querying database | 55ms |
| | | Building response DOM | 82ms |
| | | Serializing response DOM into XML | 237ms |
| | Sending response via OSB | | 31ms |
| Parsing response into DOM | | | 166ms |
| TOTAL | | | 602ms |
Note that the largest time hog is not the network but building, serializing, and parsing XML! Using gzip does not affect these steps, so it has little effect on the overall delay:
| Caller | In-Flight + OSB | Backend | Time |
|---|---|---|---|
| Building request DOM | | | 6ms |
| Serializing request DOM into XML | | | 15ms |
| | Sending request via OSB | | 5ms |
| | | Parsing request into DOM | 5ms |
| | | Querying database | 55ms |
| | | Building response DOM | 82ms |
| | | Serializing response DOM into XML | 237ms |
| | | GZipping the response | 18ms |
| | Sending response via OSB | | 5ms |
| Un-GZipping the response | | | 2ms |
| Parsing response into DOM | | | 166ms |
| TOTAL | | | 596ms |
Gzip did cut the transfer time about 6 times, but the backend had to spend time compressing the payload, and the caller had to spend a bit more time decompressing it, which killed almost all the savings.
The most expensive steps, though, the XML processing, were left unaffected.
Conclusion
The size was indeed the root cause, but its effect was felt outside of OSB. The optimization effort should have been focused on making the data itself smaller.
About Me
My name is Vladimir Dyuzhev, and I'm the author of GenericParallel, an OSB proxy service for making parallel calls effortlessly and MockMotor, a powerful mock server.
I've been building SOA enterprise systems for clients large and small for almost 20 years. Most of that time I've been working with the BEA (later Oracle) Weblogic platform, including OSB and other SOA systems.
Feel free to contact me if you have a SOA project to design and implement. See my profile on LinkedIn.
I live in Toronto, Ontario, Canada. Email me at info@genericparallel.com