Web Services, Output Formats and GZIP Compression
While developing musik.sendung.de, I recently read some of the texts on REST and RESTful web services. The topic interests me both as a consumer and as a (possible future) publisher of an API.
I guess one thing that should concern publishers as well as consumers of web services is the reasonable use of bandwidth.
Some web services offer their resource output in various flavors. They serve not only XML but also plain text (usually comma- or tab-separated values), JSON data structures for immediate use within JavaScript, or serialized PHP arrays or objects that serve the same purpose within PHP. Actually, I love publishers who serve plain text, not only because it’s easy to parse in almost any language, but also because it’s lightweight.
However, the chosen output format can have considerable influence on the amount of data to be transferred. I did a not-at-all-representative test with one of my own APIs, serving a list of 20 table items in four formats. The same information, encoded in different data formats, differed quite a bit in size:
- XML: 15.5 KB
- PHP: 13.3 KB
- JSON: 10.2 KB
- CSV: 5.1 KB
When compressed using gzip, the output data becomes a lot smaller. The chart illustrates how compression affects the different data formats. The large bars indicate the original size, the small bars indicate size after compression. Absolute figures aren’t important here, so there is no actual scale.

Not bad, I think. Gzip compression saves between 73% and 85% of the bandwidth. The effect is largest for XML because the tags create a lot of redundancy. CSV, on the other hand, has little redundancy, since the format contains almost no markup.
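To repeat a comparison like this on your own data, a minimal sketch in Python could look as follows. The sample items, the field names and the helper functions are made up for illustration, and serialized PHP is left out, so the numbers will not match the ones above; only the general pattern should.

```python
import csv
import gzip
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical sample data: 20 items with a few fields each.
FIELDS = ["id", "artist", "title", "duration"]
ITEMS = [
    {"id": i, "artist": f"Artist {i}", "title": f"Track title {i}", "duration": 180 + i}
    for i in range(20)
]


def to_xml(items):
    """Encode the items as a simple <items><item>...</item></items> document."""
    root = ET.Element("items")
    for item in items:
        node = ET.SubElement(root, "item")
        for key in FIELDS:
            ET.SubElement(node, key).text = str(item[key])
    return ET.tostring(root, encoding="utf-8")


def to_json(items):
    """Encode the items as a JSON array of objects."""
    return json.dumps(items).encode("utf-8")


def to_csv(items):
    """Encode the items as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(items)
    return buf.getvalue().encode("utf-8")


def size_report(items):
    """Return {format: (raw_bytes, gzipped_bytes, saving_fraction)}."""
    report = {}
    for name, encode in (("XML", to_xml), ("JSON", to_json), ("CSV", to_csv)):
        raw = encode(items)
        packed = gzip.compress(raw)
        report[name] = (len(raw), len(packed), 1 - len(packed) / len(raw))
    return report


if __name__ == "__main__":
    for name, (raw, packed, saving) in size_report(ITEMS).items():
        print(f"{name}: {raw} bytes raw, {packed} bytes gzipped, {saving:.0%} saved")
```

With real API output, replace ITEMS with an actual response and compare the saving per format.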
When the effect of output compression is so big, why isn’t it supported more widely? Which of the popular web service APIs support it? [See update below] None that I know of.
Is the bandwidth usage for e.g. Amazon’s or eBay’s API so marginal compared to everything else they serve? Probably. But that doesn’t have to be the case for every publisher.
One could argue that whoever provides an API should be able to afford the bandwidth. I’d respond: That’s a lame argument. When the whole web becomes a conglomerate of web service APIs, everybody who cares should be able to participate. And those who have a popular weblog know how bad it can get when increasing popularity means increasing bandwidth toll.
What about the additional CPU cost of compression? That could be an issue. Those who have their servers running on high CPU usage already would definitely have to test this.
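One rough way to test this is to time the compression itself at different gzip levels. The sketch below assumes a made-up JSON payload and a hypothetical helper, time_compression; real figures depend entirely on your data and hardware, so it prints measurements rather than promising any.

```python
import gzip
import json
import time

# Hypothetical payload: a JSON list large enough to time meaningfully.
PAYLOAD = json.dumps(
    [{"id": i, "title": f"Track title {i}", "duration": 180 + i % 60} for i in range(5000)]
).encode("utf-8")


def time_compression(data, level, repeats=5):
    """Return (compressed_size, best_seconds) for one gzip compression level."""
    best = float("inf")
    size = 0
    for _ in range(repeats):
        start = time.perf_counter()
        size = len(gzip.compress(data, compresslevel=level))
        best = min(best, time.perf_counter() - start)
    return size, best


if __name__ == "__main__":
    print(f"raw size: {len(PAYLOAD)} bytes")
    for level in (1, 6, 9):
        size, seconds = time_compression(PAYLOAD, level)
        print(f"level {level}: {size} bytes, {seconds * 1000:.2f} ms")
```

A lower level usually costs less CPU time for a somewhat larger result, which may be a reasonable trade on a busy server.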
On the other hand, since downloads can be handled faster, server threads can finish up faster and thus help to save server RAM. Theoretically.
For those who consume a web service, compression could be an issue when it’s not handled transparently by their client library. For example, PHP’s fopen() or file_get_contents() functions do not send the corresponding “Accept-Encoding” header by default when opening an HTTP URL.
For AJAX-based web service clients running on the latest browsers it shouldn’t be an issue, though. Negotiation and decompression are handled transparently by the browsers.
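When the client library does not do this for you, the header has to be sent and the response unpacked by hand. Here is a minimal sketch in Python, whose urllib.request likewise does not decompress responses on its own; the function names fetch and decode_body are my own, and the URL in the usage comment is a placeholder.

```python
import gzip
import urllib.request


def decode_body(body, content_encoding):
    """Decompress the body only if the server says it is gzip-encoded."""
    if (content_encoding or "").strip().lower() == "gzip":
        return gzip.decompress(body)
    return body


def fetch(url):
    """Request url, advertise gzip support, and return the decoded bytes."""
    request = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
    with urllib.request.urlopen(request) as response:
        return decode_body(response.read(), response.headers.get("Content-Encoding"))


# Usage (placeholder URL, not a real endpoint):
# data = fetch("http://api.example.com/items.csv")
```

Checking the Content-Encoding response header matters, because a server that does not support compression simply answers with the plain body.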
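The publisher's half of that negotiation is small as well. The following Python sketch is a simplified illustration with hypothetical function names: it honors "gzip" and an optional q value in the Accept-Encoding header, but ignores wildcards and other encodings, which a production server (or a module like mod_deflate) would handle.

```python
import gzip


def accepts_gzip(accept_encoding):
    """Return True if the Accept-Encoding header value allows gzip."""
    for part in (accept_encoding or "").split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower().replace(" ", "")
        if params.startswith("q="):
            try:
                return float(params[2:]) > 0  # q=0 means "not acceptable"
            except ValueError:
                return False
        return True
    return False


def negotiate(body, accept_encoding):
    """Return (headers, payload), compressing only when the client asked for it."""
    headers = {"Vary": "Accept-Encoding"}
    if accepts_gzip(accept_encoding):
        payload = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    else:
        payload = body
    headers["Content-Length"] = str(len(payload))
    return headers, payload
```

Clients that never send the header keep getting the uncompressed output, so enabling this should not break existing consumers.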
Web services are usually not called at random, but only when a user or machine requests certain data (and that data is not cached on the consumer side). Especially when large datasets are being transferred, compression could reduce the time needed to complete an action that depends on the received data. Thus compression can help to improve application performance.
Updates
Which APIs support it? As soon as I find some, I will add them to this list.
- Google Base Data API: Yes
- Amazon E-Commerce Service: No
- Yahoo! Mail Web Services: Maybe (the documentation doesn’t contain any info, but one example shown contains the Content-Encoding: gzip response header)
Some links:
- Speed Web delivery with HTTP compression – A look at the page-delivery effects of data compression in HTTP 1.1
- Webperformance.org compression resources
- mod_deflate – Compression module for Apache 2
- Squeezing SOAP – GZIP enabling Apache Axis
- Matt Chotin: Enabling gzip compression for [flex] data services
4 Comments
Just as an aside: the Basecamp API, for example, doesn’t seem to support compression either http://www.basecamphq.com/api/
Thanks Marian, your results are very interesting. I am using JSON with PHP, but without compression; I will have to enable it.