The Un-Official Proxomitron Forum

Full Version: Proposed Accept-Encoding header filter change.
Code:
[HTTP headers]
In = FALSE
Out = TRUE
Key = "Accept-Encoding: 2 GZip first|specified     10.12.30 [srl] (d.1) (Out) MOD"
URL = "^$TST(volat=*.encoded:1.*)|$TST(keyword=*.a_web.*)"
Match = "$TST(s_AccEnc=?*)|(?*,&((*, )++(gzip$SET(1=,gzip)|x-gzip$SET(2=,x-gzip)|deflate$SET(3=,deflate)))+{1,3}$SET(s_AccEnc=\1\3)$TST(s_AccEnc=,\5)$SET(s_AccEnc=\5))"
Replace = "$GET(s_AccEnc)$SET(s_AccEnc=)"

Exceptions lists entry

Code:
## Specify Accept-Encoding header     $SET(s_AccEnc=)$SET(0=s_AccEnc:.)
##
## Accept-Encoding header filter's default behaviour:
## "gzip", "x-gzip", and "deflate" methods may be decompressed by the Proxomitron.
## When any of these methods are present, other methods are removed from the header.
## When any of these methods are present, order is gzip,x-gzip,deflate
## When not present, the header is not altered by the filter.
##
## Use $SET(s_AccEnc=) to modify the filter's behaviour.
## Example:
##   Specify gzip at foobar.com/:
##   foobar.com/ $SET(s_AccEnc=gzip) $SET(0=s_AccEnc:gzip.)
## OR
## Edit "$SET(s_AccEnc=\1\3)" in the "Accept-Encoding: 2 GZip first|specified" header filter.
##
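
As a rough illustration of the default behaviour described in the comments above, here is a minimal Python sketch. It is not the Proxomitron matching engine. The function name `rewrite_accept_encoding` and its `override` parameter (standing in for a value supplied through `$SET(s_AccEnc=...)`) are hypothetical, and the sketch assumes methods are compared as whole comma-separated tokens, which is why "pack200-gzip" is not mistaken for "gzip".

```python
def rewrite_accept_encoding(value, override=None, keep=("gzip", "deflate")):
    """Sketch of the filter's default behaviour (assumed, simplified).

    value    -- the outgoing Accept-Encoding header value
    override -- hypothetical stand-in for a list entry's $SET(s_AccEnc=...)
    keep     -- methods to keep, in output order; mirrors the \\1\\3 default,
                use ("gzip", "x-gzip", "deflate") to mirror \\1\\2\\3
    """
    if override:
        # A value specified in the exceptions list wins outright.
        return override
    # Compare whole tokens only, so "pack200-gzip" does not count as "gzip".
    tokens = [t.strip().lower() for t in value.split(",")]
    kept = [method for method in keep if method in tokens]
    if not kept:
        # None of the supported methods are present: leave the header alone.
        return value
    return ",".join(kept)


print(rewrite_accept_encoding("deflate, gzip, x-gzip, identity"))
print(rewrite_accept_encoding("pack200-gzip, compress"))
```

Under these assumptions the first call yields "gzip,deflate" and the second returns the header unchanged, so an empty Accept-Encoding header is never produced from a non-empty one.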

Reason: http://prxbx.com/forums/showthread.php?t...&pid=15187
The current filter, "Accept-Encoding: 2 GZip only 07.11.16 [srl] (d.1) (Out)", causes a server at IBM, http://www-31.ibm.com/storage/cn/disk/ds5020/specs.shtml , to send Opera a chunked deflate stream that the Proxomitron can't handle. The current filter may also send an empty Accept-Encoding header.


Test and report, please.

Note: to get the header filters to order properly, I must import, save, exit the Proxomitron, start the Proxomitron, click "Config" on the Proxomitron's main screen, click "OK", and save again. When in doubt, open the config in an editor and verify.

Edit: 10.12.29 pack200-gzip issue "*(gzip" changed to "(*, )++(gzip"
Edit: 10.12.30 x-gzip removed by default "(\1\2\3)" changed to "(\1\3)"
1. Empty Accept-Encoding is permitted. See http://tools.ietf.org/html/rfc2616#section-14.3

2. The filter changes
Code:
Accept-Encoding: pack200-gzip, compress

to
Code:
Accept-Encoding: gzip

Which I think is not expected.

3. Why not remove x-gzip?

Besides the issue that Graycode has pointed out, x-gzip seems to be outdated. See http://en.wikipedia.org/wiki/HTTP_compression & http://tools.ietf.org/html/rfc2616#section-3.5
(Dec. 29, 2010 03:36 PM)whenever Wrote: [ -> ]1. Empty Accept-Encoding is permitted. See http://tools.ietf.org/html/rfc2616#section-14.3

The empty header could be sent when the User-Agent wasn't sending an empty header. Doesn't seem like the thing to do?

(Dec. 29, 2010 03:36 PM)whenever Wrote: [ -> ]2. The filter changes
Code:
Accept-Encoding: pack200-gzip, compress

to
Code:
Accept-Encoding: gzip

Which I think is not expected.

I forgot about that one. I'll plug the hole, thanks.

(Dec. 29, 2010 03:36 PM)whenever Wrote: [ -> ]3. Why not remove x-gzip?

Besides the issue that Graycode has pointed out, x-gzip seems to be outdated. See http://en.wikipedia.org/wiki/HTTP_compression & http://tools.ietf.org/html/rfc2616#section-3.5

Lack of knowledge. I thought about that. Also considered changing x-gzip to gzip when gzip is not present.
However, Opera still sends x-gzip and I found an app that added support years after its creators considered x-gzip to be deprecated. Why?

Am also considering forcing gzip only when deflate and gzip are present and/or throwing a flag for deflate compressed files.

The new filter allows the user to do as they wish.
Question is what should the default be?
(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]The empty header could be sent when the User-Agent wasn't sending an empty header. Doesn't seem like the thing to do?

Just got your point. Your fix is reasonable.

(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]However, Opera still sends x-gzip and I found an app that added support years after its creators considered x-gzip to be deprecated. Why?

Opera users suffer pain for its sending x-gzip.

I don't know why they stick with x-gzip, but rare cases shouldn't be handled by a filter's default behavior. The exception list is for that purpose.

(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]Am also considering forcing gzip only when deflate and gzip are present and/or throwing a flag for deflate compressed files.

Doesn't putting gzip first fix the issue?

(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]The new filter allows the user to do as they wish.
Question is what should the default be?

I think the default should behave as the most popular browsers like IE/FF do, because most servers/apps will try to keep compatibility with them too.
(Dec. 30, 2010 08:31 AM)whenever Wrote: [ -> ]
(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]However, Opera still sends x-gzip and I found an app that added support years after its creators considered x-gzip to be deprecated. Why?

Opera users suffer pain for its sending x-gzip.

Changed. Opera will likely argue that incorrectly configured servers aren't their problem.

(Dec. 30, 2010 08:31 AM)whenever Wrote: [ -> ]
(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]Am also considering forcing gzip only when deflate and gzip are present and/or throwing a flag for deflate compressed files.

Doesn't putting gzip first fix the issue?

Not necessarily. I think the server can still choose to send deflate.
"gzip,x-gzip,deflate" is supposed to mean "use one of these, gzip is the best choice".

(Dec. 30, 2010 08:31 AM)whenever Wrote: [ -> ]
(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]The new filter allows the user to do as they wish.
Question is what should the default be?

I think the default should behave as the most popular browsers like IE/FF do, because most servers/apps will try to keep compatibility with them too.

Perhaps, but Opera does things that IE and FF don't. I haven't any experience with Unite or the mail client and very little with Opera. The defaults shouldn't break anything important.
(Dec. 30, 2010 03:15 PM)JJoe Wrote: [ -> ]
(Dec. 30, 2010 08:31 AM)whenever Wrote: [ -> ]
(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]However, Opera still sends x-gzip and I found an app that added support years after its creators considered x-gzip to be deprecated. Why?

Opera users suffer pain for its sending x-gzip.

Changed. Opera will likely argue that incorrectly configured servers aren't their problem.

The Opera browser has some unique quirks, among them is its indication of 'deflate' before 'gzip'. That part isn't new in version 11, it's caused occasional problems over many versions. The Opera developers love to scream "Standards" in your face but I've had more than one support thread disappear after asking for "common sense" to be applied or suggesting they don't comprehend a section of some RFC. I used to be a big supporter but more recently they've made that impossible.

(Dec. 30, 2010 03:15 PM)JJoe Wrote: [ -> ]
(Dec. 30, 2010 08:31 AM)whenever Wrote: [ -> ]
(Dec. 29, 2010 05:21 PM)JJoe Wrote: [ -> ]Am also considering forcing gzip only when deflate and gzip are present and/or throwing a flag for deflate compressed files.

Doesn't putting gzip first fix the issue?

Not necessarily. I think the server can still choose to send deflate.
"gzip,x-gzip,deflate" is supposed to mean "use one of these, gzip is the best choice".

Yes, the server can still choose deflate in that situation. But most seem to use gzip, especially when it's listed first. Unfortunately there isn't a way to tell the server whether to use RFC1950 or RFC1951 deflate.

Note too that 'x-gzip' is just a synonym for 'gzip'. In general, it seems servers haven't bothered with it because most browsers (other than Opera) don't mention it in their 'Accept-Encoding'.

The problem site that triggered this thread (www-31.ibm.com/ etc) encoded some data using RFC1950 deflate specification when used by Opera. Other browsers aren't impacted because the server picks up on their having 'gzip' first. IE and other Microsoft products do not support the RFC1950 deflate format but they do accommodate the RFC1951 format.

As you mentioned in the other thread, people had apparently overcome some 'deflate' decompression problems by looking for a specific 2-byte prefix value. The technical details are in section 2.2 of RFC1950. When a server uses a 32K window size for deflation, and if there is no preset dictionary, and if it uses a common default compression level, then the first 2 bytes will be 0x78 and 0x9C. So, when encountering 'deflate' decompression issues with software like IE, some people decided to look for those particular 2 bytes and skip over them as a way to emulate RFC1951 raw inflation. That seems to be what Scott may have done with Proxo. One thing I don't understand though is what they'd do for the trailing checksum bytes (an Adler-32 value) that RFC1950 has but RFC1951 doesn't.
http://connect.microsoft.com/VisualStudi...df-streams
http://www.subbu.org/blog/2008/03/ie7-deflate-or-not

Well the particular IBM server in question uses RFC1950 but with a 4K compression window instead of the more common 32K window. Its first 2 bytes come out 0x48 and 0xC7, so "These aren't the droids you're looking for" and Proxo's decompression failed. IMHO that's not necessarily Proxo's fault since IE and other apps would also fail to decompress it if presented with that situation. I see it as more Opera's fault due to its less-than-stellar choice of header values, or somewhat the server's fault for using a compression method that even everyone's IE couldn't decipher.
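
To make the two prefixes discussed here concrete, the following Python sketch decodes the 2-byte header defined in section 2.2 of RFC1950 (the CMF and FLG bytes). The function name `parse_zlib_header` and the dictionary keys are hypothetical and chosen only for illustration. The field layout comes from the RFC.

```python
def parse_zlib_header(b0: int, b1: int) -> dict:
    """Decode the two-byte RFC1950 header: b0 is CMF, b1 is FLG."""
    return {
        "method": b0 & 0x0F,                     # CM: 8 means deflate
        "window": 1 << ((b0 >> 4) + 8),          # CINFO: log2(window size) - 8
        "preset_dict": bool(b1 & 0x20),          # FDICT bit
        "level": b1 >> 6,                        # FLEVEL hint, 0..3
        "check_ok": ((b0 << 8) | b1) % 31 == 0,  # FCHECK: CMF*256+FLG is a multiple of 31
    }


# The common prefix, then the prefix seen from the IBM server.
for prefix in ((0x78, 0x9C), (0x48, 0xC7)):
    print([hex(b) for b in prefix], parse_zlib_header(*prefix))
```

Both prefixes are valid RFC1950 headers. 0x78 decodes to a 32768-byte window and 0x48 to a 4096-byte window, so a detector that only looks for 0x78 0x9C will miss the second stream.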

A side effect of compressed streams is that they can aid in detecting the end of a server's response. When the server does not use 'chunked' and does not provide an explicit 'Content-Length', then the decompression process can provide an end-of-content signal for the stream. That can be important when there's full-bore pipelining (as only Opera does by default). That's the reason I question what happens to the 4 checksum bytes (Adler-32) on the tail when using a special 2-byte prefix to detect RFC1950, dropping those bytes, and then treating the remaining stream as RFC1951.
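
The question about the trailing bytes can be explored with Python's standard zlib module. This is only a sketch of what one inflater does, not a description of what the Proxomitron or IE does. The helper name `inflate_http_deflate` and the sample payload are made up for the example. The helper tries the RFC1950 wrapper first and falls back to raw RFC1951, which avoids depending on a particular 2-byte prefix.

```python
import zlib


def inflate_http_deflate(body: bytes) -> bytes:
    """Inflate a 'deflate' body that may be RFC1950 (zlib) or raw RFC1951."""
    try:
        return zlib.decompress(body)        # zlib wrapper, any window size
    except zlib.error:
        return zlib.decompress(body, -15)   # negative wbits selects raw deflate


payload = b"example body " * 20             # made-up sample data
wrapped = zlib.compress(payload)            # RFC1950: header + deflate + Adler-32

# Emulate "skip the 2-byte prefix, then inflate as raw RFC1951".
raw = zlib.decompressobj(-15)
out = raw.decompress(wrapped[2:])
trailer = raw.unused_data                   # bytes left after the final deflate block

print(out == payload, raw.eof, len(trailer))
print(int.from_bytes(trailer, "big") == zlib.adler32(payload))
```

With this library the raw inflater still signals the end of the stream after the final deflate block, and the 4 trailing bytes are simply left over as unread input. A proxy that skips the prefix would have to discard them itself so they are not mistaken for the start of the next pipelined response.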

(Dec. 30, 2010 03:15 PM)JJoe Wrote: [ -> ]The defaults shouldn't break anything important.

Probably not, yet it may only be useful on rare occasions. The issue identified involves a quirky browser like Opera, a server that's willing to choose 'deflate' over 'gzip' compression, and for that server to have chosen RFC1950 vs. RFC1951 deflate format, and for that server to use a compression window that doesn't yield the magic 2-byte prefix detection that's apparently already in Proxomitron.
(Dec. 30, 2010 06:22 PM)Graycode Wrote: [ -> ]Well the particular IBM server in question uses RFC1950 but with a 4K compression window instead of the more common 32K window. Its first 2 bytes come out 0x48 and 0xC7, so "These aren't the droids you're looking for" and Proxo's decompression failed.

I did find other deflate (RFC1950 and RFC1951) streams that the Proxomitron could handle and reached a similar conclusion after comparing all the streams in Wireshark.

Thanks for the details.