Cut: Chained Ad Path URLs
Mar. 21, 2009, 12:13 PM
(This post was last modified: Apr. 02, 2009 08:50 PM by sidki3003.)
Post: #1
Cut: Chained Ad Path URLs
One of the new filters in the 2009 configs is "<script>: Cut: Chained Ad Path URLs", required to deal with concatenated scripts, which are becoming more and more popular:

Code:
<script src="http://myserver.com/load?adscript.js,requiredscript.js,trackingscript.js"></script>

Example: http://www.spike.com/

So far so good. However, lately i also see concatenated offsite scripts:

Code:
<script src="http://myserver.com/load?http%3A//adserver.com/x.js,requiredscript.js,http%3A//trackingserver.com/y.js"></script>

Example: http://kirstenrokz.buzznet.com/user/

The filter below tests each chained component against the complete ad-list combo (AdHosts-J, AdDomains, etc.). I'm not sure whether the recursive expressions are correct and sufficiently robust, hence "WIP".

Code:
[Patterns]

The benefit of extending the filter as described becomes especially obvious if you look at the second filter hit (as well as the resulting script) on the latter example page, after adding the entry below (found via Ghostery) to AdHosts-J:

Code:
# Ads - Lotame

edit: "WIP" flag removed.
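To illustrate the idea of testing each chained component against an ad list, here is a minimal Python sketch. It is not the Proxomitron filter itself: `AD_HOSTS` is a hypothetical stand-in for the ad-list combo, the function names are made up for this example, and only comma-chained, host-qualified components are handled.

```python
from urllib.parse import unquote, urlsplit

# Hypothetical stand-in for the ad-list combo (AdHosts-J, AdDomains, etc.).
AD_HOSTS = {"adserver.com", "trackingserver.com"}


def is_ad_component(component: str) -> bool:
    """Return True if a chained component points at a listed ad host."""
    # Components may be percent-encoded ("http%3A//..."), so decode first.
    host = urlsplit(unquote(component)).hostname
    return host is not None and host in AD_HOSTS


def cut_chained_ad_paths(src: str) -> str:
    """Drop ad components from a comma-chained script URL, keep the rest."""
    base, sep, query = src.partition("?")
    if not sep:
        return src  # no query part, so nothing is chained
    kept = [c for c in query.split(",") if not is_ad_component(c)]
    return base + "?" + ",".join(kept)
```

Applied to the second example URL above, this keeps only `requiredscript.js` in the query. Relative components such as `adscript.js` carry no host, so a real filter would also need a path-based check for them.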
Mar. 21, 2009, 12:39 PM
Post: #2
RE: Cut: Chained Ad Path URLs
the above post is showing this line:
Code:
{},addBehavior:function(){},addInterest:function(){},addMedia:function(){},?

is that 8203 supposed to be there? i can't seem to find it in any HTML Code Table...
Mar. 21, 2009, 12:50 PM
Post: #3
RE: Cut: Chained Ad Path URLs
As long as the forum's code tag handles things correctly, all is fine. The real source code doesn't matter.
"&#8203;" usually triggers a word break ( http://www.quirksmode.org/oddsandends/wbr.html , 2nd para).
Mar. 21, 2009, 01:11 PM
Post: #4
RE: Cut: Chained Ad Path URLs
(Mar. 21, 2009 12:50 PM)sidki3003 Wrote: As long as the forum's code tag handles things correctly, all is fine. The real source code doesn't matter.

that seems to depend upon your OS, or more specifically, your text editor... i cut-and-paste the above via Notepad and cannot save the file, because the paste puts a "square character" in place of that 8203 and i get a "This file contains characters in Unicode format which will be LOST if you save this file as an ANSI encoded text file" warning upon attempted save... so i cancel the save and track down "why" - it's that 8203...
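The character behind "8203" is U+200B (zero-width space), which has no ANSI equivalent, hence the Notepad warning. A minimal sketch of stripping it from pasted text before saving, assuming the text is available as a Python string (the function name is made up for this example):

```python
ZERO_WIDTH_SPACE = "\u200b"  # decimal code point 8203


def strip_zero_width_spaces(text: str) -> str:
    """Remove every zero-width space; all other characters are untouched."""
    return text.replace(ZERO_WIDTH_SPACE, "")
```

Leading whitespace is left alone, so line indents in a pasted list entry survive.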
Mar. 21, 2009, 01:14 PM
Post: #5
RE: Cut: Chained Ad Path URLs
Ahh okay, i didn't know that, thanks.
You should end up with a list entry that looks exactly as posted. The line indents are especially important.
Mar. 21, 2009, 02:36 PM
(This post was last modified: Mar. 21, 2009 02:41 PM by lnminente.)
Post: #6
RE: Cut: Chained Ad Path URLs
Hi Sidki, i found this in my logf (log file) of large urls:
http://mail.yimg.com/d/combo?/mg/5_1_20/...s/fcues.js
Mar. 21, 2009, 03:01 PM
(This post was last modified: Mar. 21, 2009 03:15 PM by sidki3003.)
Post: #7
RE: Cut: Chained Ad Path URLs
Oh - thanks.
I haven't seen this script concatenation with separate query params thus far, only with a comma, once or twice with a semicolon, always within the same query param. If this method is also used for adscript/required-script mixtures, it would be interesting to see how it's embedded in the page ("&" or "&amp;", etc.).

It's important that chained ad paths are intercepted in the page code (vs. headers), because otherwise other anti-adscript filters could be triggered by a chained ad path, which would also remove required components.
Mar. 21, 2009, 09:33 PM
(This post was last modified: Mar. 21, 2009 09:33 PM by lnminente.)
Post: #8
RE: Cut: Chained Ad Path URLs
Hi Sidki, i have a more general idea: i'm thinking we could create a filter which logs the names of the functions inside a script coming from an ad source. Later we process that log file and create a list of functions to block.
In that way it wouldn't matter how the script is served to us. Also, scripts programmed to break pages if they are not loaded could be fixed by us instead of blocking the full script file. Let me know if you believe in this idea...
Mar. 21, 2009, 10:06 PM
(This post was last modified: Mar. 21, 2009 10:38 PM by sidki3003.)
Post: #9
RE: Cut: Chained Ad Path URLs
Well, as far as sidki-configs are concerned, the approach is generally multi-layered where possible.
Regarding concatenated scripts:
1 - First try to cut ad/tracking paths.
2 - Then see if the individual modules have introductory comments which match a list of known ad/tracking comments.
3 - Then see if the contained function (or argument) names match an AdKeys-J entry.
4 - Then see if the function body contains ad strings.

I don't see a way around point 1. The original reason why i wrote this filter was to prevent subsequent "block scripts by URL" filters from matching and blocking the whole enchilada, required modules included. Besides, i like that filter.

I assume that you have something like point 3 in mind. That's fine, but, personally, i doubt that it's sufficient.

Which reminds me... there's an updated version of this filter, too:

Code:
[Patterns]
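The content-based layers (points 2 to 4) can be sketched in Python as follows. This is only an illustration under stated assumptions, not how the sidki configs implement it: all three lists are hypothetical placeholders, the function name is made up, and point 1 (cutting ad paths from the URL) happens earlier and is not shown.

```python
import re
from typing import Optional

# Hypothetical placeholder lists, not entries from the real configs.
AD_COMMENTS = ("ad loader",)     # known ad/tracking intro comments
AD_KEYS = ("showAd",)            # stand-in for AdKeys-J entries
AD_STRINGS = ("adserver.com",)   # ad strings to look for in the body


def classify_module(source: str) -> Optional[str]:
    """Return the first layer that flags a script module, else None."""
    # Point 2: introductory /* ... */ comment matching a known ad comment.
    intro = re.match(r"\s*/\*(.*?)\*/", source, re.S)
    if intro and any(c in intro.group(1).lower() for c in AD_COMMENTS):
        return "comment"
    # Point 3: declared function names ("function name(...)" form only).
    names = re.findall(r"function\s+([A-Za-z_$][\w$]*)", source)
    if any(name in AD_KEYS for name in names):
        return "function name"
    # Point 4: ad strings anywhere in the module text.
    if any(s in source for s in AD_STRINGS):
        return "body string"
    return None
```

Each layer only runs if the previous one did not match, so a module is flagged by the cheapest check that catches it.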
Mar. 21, 2009, 10:24 PM
Post: #10
RE: Cut: Chained Ad Path URLs
Nice!! That was exactly the idea, a list of forbidden functions. Veeeery well
Apr. 02, 2009, 08:48 PM
Post: #11
RE: Cut: Chained Ad Path URLs
Removing "WIP" flag from discussed filter...