Content-Type Issue?
Jun. 22, 2010, 09:01 PM (This post was last modified: Jun. 24, 2010 01:56 AM by JJoe.)
Post: #31
RE: Content-Type Issue?
(Jun. 22, 2010 08:37 PM)Inferno Wrote:  This time the IncludeExclude-U.ptxt is missing. I added the following code to Exceptions-U.ptxt:

Code:
## force specified content-type            $SET(hRealCT=IncEx: cont/type)
## ----------------------------------------------------------------------------
$SET(hRealCT=IncEx: application/json)

Is that right?

Nope.

Name and format changes for the alpha.
sidki_l_2009-02-13 has IncludeExclude-U.ptxt
sidki_ 2009-05-24 has Exceptions-U.ptxt

For sidki_ 2009-05-24 something like:

Code:
prxbx.com/forums/syndication.php*type\=atom $SET(hRealCT=filter)$FILTER(1)

I think. You need to replace the prxbx url expression with your own.

EDIT:
Having quickly looked at the filters in SIDKI 2009-02-13 (UPDATE 06-06), I'm wondering if the documentation is correct?
Should it be "$SET(hRealCT=IncEx: cont/type)"?

EDIT2: Correction. See post 37 http://prxbx.com/forums/showthread.php?t...3#pid14443
for more.
Jun. 22, 2010, 09:38 PM
Post: #32
RE: Content-Type Issue?
(Jun. 22, 2010 08:51 PM)JJoe Wrote:  The browsers that I tested show the download dialog with or without the Proxomitron.
Yes, the download dialog is normal behavior. The only important thing is that Proxo should have modified the content. So if you download and open the file, every occurrence of "human" should be replaced by "alien" (only if Proxo did its job right, of course).

(Jun. 22, 2010 08:51 PM)JJoe Wrote:  For sidki_ 2009-05-24 something like:

Code:
prxbx.com/forums/syndication.php*type\=atom $SET(hRealCT=filter)$FILTER(1)

I think. You need to replace the prxbx url expression with your own
Sorry for asking all the foolish questions but I still do not exactly know what to do. I use sidki_l_2009-05-24 at the moment.

If I understand right:
Code:
abc.domain.com/request/update.rq.php* $SET(hRealCT=filter)$FILTER(1)
But how does it know to filter application/json? I'm fairly confused now.
A simple explanation would be fine, or a ready-to-use filter/expression.
Jun. 22, 2010, 10:31 PM (This post was last modified: Jun. 22, 2010 10:58 PM by JJoe.)
Post: #33
RE: Content-Type Issue?
(Jun. 22, 2010 09:38 PM)Inferno Wrote:  If I understand right:
Code:
abc.domain.com/request/update.rq.php* $SET(hRealCT=filter)$FILTER(1)
But how does it know to filter application/json? I'm fairly confused now.
A simple explanation would be fine, or a ready-to-use filter/expression.

Hmmm..

I added

Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "Content-Type: 1a Filter application/json"
Match = "( ) application/json $SET(hRealCT=FILTER)$FILTER(1)(^)"

Log window showed

Code:
+++GET 4894+++
GET /application-json.cgi HTTP/1.1
Host: perl.host-ed.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.20) Gecko/20081217 Firefox/2.0.9.9
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Referer: http://perl.host-ed.net/
Connection: keep-alive

+++RESP 4894+++
HTTP/1.1 200 OK
Date: Tue, 22 Jun 2010 22:18:52 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: application/json
Cache-Control: public, max-age=86400
Last-Modified: Tue, 22 Jun 2010 22:19:12 GMT; PrxMsg: added
Match 4894: Top All Mark: Start     04.07.11 (multi) [sd] (d.r)
Match 4894: Top All Mark: End     06.12.25 [sd] (d.r)
Match 4894: Top JS Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 4894: Top JS: Mark End     07.04.02 [sd] (d.r)
Match 4894: Top HTML Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 4894: Top HTML Mark: End     07.10.24 [sd] (d.r)
+++CLOSE 4894+++

Appears to work for me.
I'll test some more after I post and get back.

HTH

EDIT:
It works for me.

Proxo found and filtered an application/json file coming from a google module before I checked the application-json.cgi file.

At http://perl.host-ed.net/application-json.cgi human was changed to alien.
ChromePlus opened the cgi file.

Hope this helps and sorry it took so long.
Jun. 23, 2010, 12:17 AM (This post was last modified: Jun. 23, 2010 12:18 AM by Inferno.)
Post: #34
RE: Content-Type Issue?
First off, thank you very much JJoe for having the patience of a saint.

(Jun. 22, 2010 10:31 PM)JJoe Wrote:  Hope this helps and sorry it took so long.
No need to say sorry! I'm very glad someone is helping me with such a stupid problem at all.
The complexity of the whole thing is so immense that it's almost impossible to consider all factors.

It works for http://perl.host-ed.net/application-json.cgi now BUT unfortunately a new problem appeared (which could be the reason for my general problem). If I switch proxo to bypass or even if I disable the proxy server in my browser, "human" is changed to "alien". I bet this is a caching problem! And maybe that kind of caching also happens on the website I'm trying to modify. Seriously...there is no end in sight. I'm sure sooner or later I will find a solution to achieve what I want. In my opinion the problem is not worth spending another 7 days to find a solution.

Of course, further suggestions are welcome, but I give up for today.
Jun. 23, 2010, 01:25 AM
Post: #35
RE: Content-Type Issue?
I wouldn't say stupid. After all, I have evidently dealt with something like this already, and more than once.

In my haste and rust, I failed to notice that the Accept-Encoding header was not removed. So, if this was your problem, you still have it.

(Jun. 23, 2010 12:17 AM)Inferno Wrote:  In my opinion the problem is not worth spending another 7 days to find a solution.

I don't think it'll take that long and the puzzle that I can see has me hooked.
Jun. 23, 2010, 03:52 AM (This post was last modified: Jun. 23, 2010 03:56 AM by JJoe.)
Post: #36
RE: Content-Type Issue?
(Jun. 23, 2010 01:25 AM)JJoe Wrote:  I don't think it'll take that long and the puzzle that I can see has me hooked.

OK here is one way.

Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "Content-Type: 7a Filter application/json [mona]  (In)"
URL = "(^$LST(Mem-Encode))"
Match = "( ) application/json&$SET(3=$GET(uHost)$GET(uPort)\p\q)$ADDLST(Mem-Encode,$WESC(\3)(^?)&\$FILTER(1))$JUMP(\u)$LOG(RRESP $DTM(c) : re-request "\u" w/o Accept-Encoding header)(^)"

Opened http://perl.host-ed.net/application-json.cgi in ChromePlus

and the Log window shows

Code:
+++GET 1911+++
GET /application-json.cgi HTTP/1.1
Host: perl.host-ed.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/533.99
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Encoding: gzip,deflate
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Connection: keep-alive
JumpTo: http://perl.host-ed.net/application-json.cgi
RESP 1911 : re-request "http://perl.host-ed.net/application-json.cgi" w/o Accept-Encoding header

+++RESP 1911+++
HTTP/1.1 200 OK
Date: Wed, 23 Jun 2010 03:14:10 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: application/json
Cache-Control: public, max-age=86400
Last-Modified: Wed, 23 Jun 2010 03:14:29 GMT; PrxMsg: added
+++CLOSE 1911+++
BlockList 1912: in Mem-Encode, line 1
BlockList 1912: in User-Agents, line 45

+++GET 1912+++
GET /application-json.cgi HTTP/1.1
Host: perl.host-ed.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/533.99
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
Connection: keep-alive
BlockList 1912: in Mem-Encode, line 1

+++RESP 1912+++
HTTP/1.1 200 OK
Date: Wed, 23 Jun 2010 03:14:12 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: application/json
Cache-Control: public, max-age=86400
Last-Modified: Wed, 23 Jun 2010 03:14:30 GMT; PrxMsg: added
Match 1912: Top All Mark: Start     04.07.11 (multi) [sd] (d.r)
Match 1912: Top All Mark: End     06.12.25 [sd] (d.r)
Match 1912: Top JS Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 1912: Top JS: Mark End     07.04.02 [sd] (d.r)
Match 1912: Top HTML Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 1912: Top HTML Mark: End     07.10.24 [sd] (d.r)
Match 1912: Human to Alien Test
Match 1912: Human to Alien Test
Match 1912: Human to Alien Test
Match 1912: Human to Alien Test
Match 1912: Human to Alien Test
Match 1912: Human to Alien Test
+++CLOSE 1912+++

Reload and the log window shows

Code:
BlockList 1968: in Mem-Encode, line 1
GET 1968 : Cache-Control killed: max-age=0
GET 1968 : If-Modified-Since: Prox Field stripped: added
BlockList 1968: in User-Agents, line 45

+++GET 1968+++
GET /application-json.cgi HTTP/1.1
Host: perl.host-ed.net
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US) AppleWebKit/533.99
Accept: application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-US,en;q=0.8
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.3
If-Modified-Since: Wed, 23 Jun 2010 03:14:30 GMT
Connection: keep-alive
BlockList 1968: in Mem-Encode, line 1

+++RESP 1968+++
HTTP/1.1 200 OK
Date: Wed, 23 Jun 2010 03:19:40 GMT
Server: Apache
Transfer-Encoding: chunked
Content-Type: application/json
Cache-Control: public, max-age=86400
Last-Modified: Wed, 23 Jun 2010 03:19:59 GMT; PrxMsg: added
Match 1968: Top All Mark: Start     04.07.11 (multi) [sd] (d.r)
Match 1968: Top All Mark: End     06.12.25 [sd] (d.r)
Match 1968: Top JS Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 1968: Top JS: Mark End     07.04.02 [sd] (d.r)
Match 1968: Top HTML Mark: Start     09.06.12 (multi) [sd] (d.r)
Match 1968: Top HTML Mark: End     07.10.24 [sd] (d.r)
Match 1968: Human to Alien Test
Match 1968: Human to Alien Test
Match 1968: Human to Alien Test
Match 1968: Human to Alien Test
Match 1968: Human to Alien Test
Match 1968: Human to Alien Test
+++CLOSE 1968+++


Simple and quick:
Proxo checks the Mem-Encode list to see if it has seen this address before.
- Yes: 'Accept-Encoding: 1 Kill if Filter-Forced' removes the Accept-Encoding header.
- No: 'Content-Type: 7a Filter application/json' checks the Content-Type header.
-- If the type is application/json, add the address to Mem-Encode and re-request (JumpTo:) the page.
--- The $JUMP() re-request allows 'Accept-Encoding: 1 Kill if Filter-Forced' to remove the Accept-Encoding header.
-- If the type is not application/json, continue.
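
For anyone following along, the flow above can be modelled roughly in Python. This is only a sketch and not Proxomitron code: the names mem_encode, outgoing_headers and on_response are made up for illustration, and the real filters match URL expressions rather than exact strings.

```python
# Hypothetical model of the two-pass flow; none of these names exist in Proxomitron.
mem_encode = set()  # stands in for the memory-only Mem-Encode list


def outgoing_headers(url, headers):
    """Drop Accept-Encoding when the address is already in the list."""
    if url in mem_encode:
        return {k: v for k, v in headers.items() if k.lower() != "accept-encoding"}
    return dict(headers)


def on_response(url, content_type):
    """Return 'filter', 're-request', or 'pass' for an incoming response."""
    if url in mem_encode:
        # Second pass: the request went out without Accept-Encoding,
        # so the body should be plain text and can be filtered.
        return "filter"
    if content_type.split(";")[0].strip().lower() == "application/json":
        # First pass: remember the address and ask for the page again.
        mem_encode.add(url)
        return "re-request"
    return "pass"
```

The first request for a JSON address comes back as "re-request", and only the second one is filtered, which is why first-time matches are requested twice.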

Problems noted (so far):
You need another filter like sidki's 'Accept-Encoding: 1 Kill if Filter-Forced'.
If you change your mind, you have to clear Mem-Encode.
It slows things down, because all files are checked against the list and first-time matches are requested twice.
Removing the header removes some anonymity.

HTH

And...

Have fun
Jun. 23, 2010, 03:25 PM
Post: #37
RE: Content-Type Issue?
(Jun. 22, 2010 09:01 PM)JJoe Wrote:  For sidki_ 2009-05-24 something like:

Code:
prxbx.com/forums/syndication.php*type\=atom $SET(hRealCT=filter)$FILTER(1)

I think. You need to replace the prxbx url expression with your own.

EDIT:
Having quickly looked at the filters in SIDKI 2009-02-13 (UPDATE 06-06), I'm wondering if the documentation is correct?
Should it be "$SET(hRealCT=IncEx: cont/type)"?


"$SET(hRealCT=IncEx: cont/type)" is good for modifying content-types, e.g.:
Code:
myserver.com/myfile.txt $SET(hRealCT=IncEx: text/html)

I think the user-friendly fix for filtering encoded $TYPE(oth) isn't documented at all in the 2009-02-13 config. However, it is the same fix as for 2009-05-24. Good for filter-forcing specific documents, versus entire content-types.

Looks like you've solved the problem anyway. If not, something like this might work:
Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "Content-Type: 5e Filter application/json (In)"
Match = "application/json$SET(hRealCT=filter)$FILTER(1)PrxFail$TST()"

...because "Content-Type: 7 Sel. Types to Mem-Encode" is triggered by "hRealCT=filter", and "Accept-Encoding: 1 Kill if Filter-Forced" is scanning the "Mem-Encode" memory-only list.

(While "|json" is part of the "Protect" entry in Content-Types.ptxt, i don't think that this is a problem, because the new filter should override "Content-Type: 1 Manage listed Types". I'm not sure though.)
Jun. 23, 2010, 07:50 PM
Post: #38
RE: Content-Type Issue?
(Jun. 23, 2010 03:25 PM)sidki3003 Wrote:  If not, something like this might work:
Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "Content-Type: 5e Filter application/json (In)"
Match = "application/json$SET(hRealCT=filter)$FILTER(1)PrxFail$TST()"

A filter like that appeared to work but I wondered

Code:
[HTTP headers]
In = TRUE
Out = FALSE
Key = "Content-Type: 7 Sel. Types to Mem-Encode     07.10.15 [mona] (d.0) (In)"
URL = "$TST(hRealCT=filter)(^$TST(uExt=gif|ico|jpe+g|pjpeg|png|tiff+|wmf))"
Match = "$IHDR(Content-Encoding:( ) (?*)\2)&\1&($TST(volat=*.encoded:1.*)$SET(0=\1; permitted: 2nd request)|$SET(3=$GET(uHost)$GET(uPort)\p\q)$ADDLST(Mem-Encode,$WESC(\3)(^?))($TST(uProt=\4&https:)$TST(keyword=*.i_ssl_h:[12].*)$SET(5=http://https-px-.)|$SET(5=\4//))(^$TST(volat=*.post:1.*))$SET(0=\1$JUMP(\5\3))$LOG(RRESP $DTM(c) : Compressed "\1" re-requested without Accept-Encoding))($TST(volat=*.log:2*)$ADDLST(Log-Main,[$DTM(d T)]\tHDR_In CT \2 \t\0 \t\u)|)"
Replace = "\0"

whether the Content-Encoding header is always sent for compressed content.
If it is not, "Content-Type: 7 Sel. Types to Mem-Encode" would fail to match and the filters would not be able to modify the content.

An old page http://schroepl.net/projekte/mod_gzip/browser.htm says

Quote:Processing compressed content works if the browser has requested compressed content; otherwise it ignores the HTTP header Content-Encoding: gzip although it would be able to decompress the content.


And Thanks.
Jun. 24, 2010, 01:47 AM
Post: #39
RE: Content-Type Issue?
(Jun. 23, 2010 07:50 PM)JJoe Wrote:  if the Content-Encoding header is always sent for compressed content?

According to http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html:

Quote:If the content-coding of an entity is not "identity", then the response MUST include a Content-Encoding entity-header (section 14.11) that lists the non-identity content-coding(s) used.
Jun. 24, 2010, 03:26 AM
Post: #40
RE: Content-Type Issue?
(Jun. 24, 2010 01:47 AM)whenever Wrote:  
Quote:If the content-coding of an entity is not "identity", then the response MUST include a Content-Encoding entity-header (section 14.11) that lists the non-identity content-coding(s) used.

Thanks. I think I agree that it should be sent, but is it always sent, does it always survive the journey, and does Inferno's program care?

It can be hard to help when you cannot see the problem, but maybe we learned something and enjoyed it (478 views).

So Accept-Encoding: identity;q=0, gzip, deflate forces the server to compress the file, or return a 406 (Not Acceptable).

The RFCs make my brain hurt.

Have fun.
Jun. 24, 2010, 04:07 AM
Post: #41
RE: Content-Type Issue?
(Jun. 24, 2010 03:26 AM)JJoe Wrote:  I think I agree that it should be sent, but is it always sent, does it always survive the journey, and does Inferno's program care?

Inferno was referring to AJAX, so his program doesn't know and doesn't care. The decoding would be done by the browser within which the AJAX is running. Content encoding is generally end-to-end, so it usually survives the journey up to the browser, but it is decoded there before the AJAX code sees it.

In the case of my proxy I usually remove compression for filtered pages. Compression helps transmission to get it to the PC, but when the proxy is running on the same PC or LAN as the browser then re-compressing doesn't make much sense. When that happens, the proxy needs to adjust or remove the Content-Encoding header. I don't know whether Proxo re-compresses, or adjusts the header and does not re-compress.

(Jun. 24, 2010 03:26 AM)JJoe Wrote:  So Accept-Encoding: identity;q=0, gzip, deflate forces the server to compress the file, or return a 406 (Not Acceptable).

The Accept-Encoding specifications will allow but not force a server to compress content. Non-encoded content (identity) is always an option, and is what would be assumed if none of the encoding options that are acceptable to the client browser were also acceptable to the server. Usually the client browser sends every server the same set of Accept-Encoding that it knows how to handle, yet only a small percentage of all requests are actually encoded by servers.
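
To make the q-value rule concrete, here is a small Python sketch of how a server could decide whether an unencoded (identity) response is still acceptable. It follows my reading of RFC 2616 section 14.3, where identity stays acceptable unless the header excludes it with "identity;q=0", or with "*;q=0" and no explicit identity entry. The function names are made up for illustration.

```python
def parse_accept_encoding(value):
    """Map each listed content-coding to its q-value (default 1.0)."""
    codings = {}
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0
        for param in params.split(";"):
            key, _, val = param.partition("=")
            if key.strip().lower() == "q":
                q = float(val.strip())
        codings[name.strip().lower()] = q
    return codings


def identity_acceptable(value):
    """True if a response without any content-coding may still be sent."""
    codings = parse_accept_encoding(value)
    if "identity" in codings:
        return codings["identity"] > 0
    if "*" in codings:
        return codings["*"] > 0
    return True  # identity is acceptable unless explicitly excluded
```

With the usual browser header "gzip,deflate" the server is free to send plain content. With "identity;q=0, gzip, deflate" plain content is excluded, and the RFC says the server SHOULD then answer 406 if it cannot compress, which is weaker than MUST.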

RFCs make my brain hurt too.
Jun. 24, 2010, 01:24 PM (This post was last modified: Jun. 25, 2010 03:16 AM by JJoe.)
Post: #42
RE: Content-Type Issue?
(Jun. 24, 2010 04:07 AM)Graycode Wrote:  In the case of my proxy I usually remove compression for filtered pages. Compression helps transmission to get it to the PC, but when the proxy is running on the same PC or LAN as the browser then re-compressing doesn't make much sense. When that happens, the proxy needs to adjust or remove the Content-Encoding header. I don't know whether Proxo re-compresses, or adjusts the header and does not re-compress.

Proxomitron removes compression and the Content-Encoding header for filtered pages.
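
As a rough illustration of that behaviour (decompress, filter, drop the header), here is a Python sketch. It is not how Proxomitron is implemented; filter_response is a hypothetical helper, it only handles gzip, and it assumes the header names are spelled exactly as shown.

```python
import gzip


def filter_response(headers, body, old=b"human", new=b"alien"):
    """Decompress a gzip body, apply a text replacement, and fix the headers."""
    headers = dict(headers)  # work on a copy
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding == "gzip":
        body = gzip.decompress(body)
        # The body is no longer compressed, so the header must go too.
        del headers["Content-Encoding"]
    elif encoding != "identity":
        # Unknown coding: pass the response through untouched.
        return headers, body
    body = body.replace(old, new)
    if "Content-Length" in headers:
        headers["Content-Length"] = str(len(body))
    return headers, body
```

Leaving the Content-Encoding header in place after decompressing would make the browser try to gunzip plain text, so the two steps have to go together.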

I don't think any of the sites (pages) that I frequent use ajax.

(Jun. 24, 2010 04:07 AM)Graycode Wrote:  
(Jun. 24, 2010 03:26 AM)JJoe Wrote:  So Accept-Encoding: identity;q=0, gzip, deflate forces the server to compress the file, or return a 406 (Not Acceptable).

yet only a small percentage of all requests are actually encoded by servers.

I read that in a post and thought I remembered old questions from dial-up modem users that might be related.

Edit: added "dial-up" but people who pay per byte might also want to receive more compressed content.
Jun. 24, 2010, 05:47 PM
Post: #43
RE: Content-Type Issue?
(Jun. 24, 2010 01:24 PM)JJoe Wrote:  I don't think any of the sites (pages) that I frequent use ajax.

One example is to view http://sports.yahoo.com/
During that page load it will use AJAX to fetch some data from (http) sports.yahoo.com/dynamic/ticker
The headers for that are:
Code:
HTTP/1.1 200 OK
Date: Thu, 24 Jun 2010 16:58:44 GMT
P3P: policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"
Cache-Control: public, max-age=600, stale-while-revalidate=300
Vary: Accept-Encoding
Content-Type: application/json;charset=utf-8
Content-Encoding: gzip
Age: 140
Via: HTTP/1.1 ct3.ycs.mud.yahoo.net (YahooTrafficServer/1.17.23.1 [cMsSfW]), HTTP/1.1 r3.ycpi.mud.yahoo.net (YahooTrafficServer/1.18.5 [cHs f ])
Server: YTS/1.18.5
Content-Length: 1161
Connection: keep-alive

Another example is to view: http://maps.yahoo.com/
That page got some data from URL (http) maps.yahoo.com/services/bizloc/america/bizloc?q=&intl=us&mag=15&zoom=15&rn=1277399551540
The headers for that indicate both chunked and compressed:
Code:
HTTP/1.1 200 OK
Date: Thu, 24 Jun 2010 17:15:35 GMT
P3P: policyref="http://info.yahoo.com/w3c/p3p.xml", CP="CAO DSP COR CUR ADM DEV TAI PSA PSD IVAi IVDi CONi TELo OTPi OUR DELi SAMi OTRi UNRi PUBi IND PHY ONL UNI PUR FIN COM NAV INT DEM CNT STA POL HEA PRE LOC GOV"
Cache-Control: private
Connection: keep-alive, close
Vary: Accept-Encoding
Transfer-Encoding: chunked
Content-Type: application/json; charset=utf-8
Content-Encoding: gzip
Note the stupid Connection: header. Apparently RFCs hurt Yahoo's brains even more than ours.
Jun. 24, 2010, 07:57 PM
Post: #44
RE: Content-Type Issue?
i'm seeing app/json's for the login process for a client's account...
interestingly, i can't seem to "half ssl" that account without a popup certificate warning from my browser...
Jun. 25, 2010, 03:14 AM
Post: #45
RE: Content-Type Issue?
(Jun. 24, 2010 05:47 PM)Graycode Wrote:  Another example is to view: http://maps.yahoo.com/

Thanks. Lots of examples here.

Removing the Content-Encoding header at http://maps.yahoo.com/ does break some (maybe most) things but other things appear to still work. For now, I'll guess that those things that still work only need the info in the URL.

The Content-Encoding header appears to be required.
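
One plausible reason, sketched in Python: a client that relies on the header to decide whether to gunzip will end up feeding compressed bytes to its JSON parser once the header is gone. This is an assumption about how the page's scripts receive the data, not something taken from the logs, and parse_json_body is a made-up helper.

```python
import gzip
import json


def parse_json_body(headers, body):
    """Decode a JSON response, gunzipping only when the header says so."""
    if headers.get("Content-Encoding", "").lower() == "gzip":
        body = gzip.decompress(body)
    # Without the header, still-compressed bytes reach this line and
    # fail to decode as UTF-8 text (a ValueError subclass is raised).
    return json.loads(body.decode("utf-8"))
```

With the header present the payload parses normally. With the header stripped and the body still compressed, the call raises instead of returning data.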

Next question, for later, is what is in those json files that might be worth the effort to modify.