The Un-Official Proxomitron Forum

Full Version: Getting rid about &sid= in urls ?
Hi,

Even with cookies disabled, most forums still track visitors: when they cannot plant a cookie on your system, they append &sid=<string of chars and numerics> to their links.

Is there a header filter that could remove that part from the requested URL, please?
Here's one for Google News...
Code:
Name = "Google News remove id string from off-site links Apr16,2007 {JJoe}"
Active = TRUE
Multi = TRUE
URL = "$TYPE(htm)news.google."
Bounds = "<a\s*>"
Limit = 256
Match = "\1\sid=$AV(*)\2"
Replace = "\1\2"
(Sep. 01, 2009 07:33 PM)Toppy Wrote: [ -> ]Is there a header filter

I don't remember all the specifics of the '&' or the '<string of chars and numerics>'.
You could start with

Code:
[HTTP headers]
In = FALSE
Out = TRUE
Key = "! : JUMP:  Remove &sid=<string of chars and numerics> 090901 (out)"
URL = "\0\&sid=[^&]+\1&$JUMP(http://\0\1)"

OR

Sidki's set allows you to handle this in IncludeExclude-U.
I think you could add something like

Code:
\0\&sid=[^&]+\1&$JUMP(http://\0\1)

This assumes every URL is HTTP, and it is untested.
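For anyone who doesn't read Proxomitron matching syntax, here is a rough Python sketch of what the pattern above is meant to do: find the first "&sid=<value>" in the requested URL, drop it, and return the URL to jump to. The name strip_sid and the example.com URL are made up for illustration. Like the filter, it only catches sid when it is preceded by "&".

```python
import re

# "&sid=" followed by everything up to the next "&" (or the end of the URL),
# roughly what \&sid=[^&]+ expresses in the filter above.
SID_RE = re.compile(r"&sid=[^&]+")


def strip_sid(url: str) -> str:
    """Return url with the first '&sid=...' parameter removed (hypothetical helper)."""
    return SID_RE.sub("", url, count=1)


print(strip_sid("http://example.com/viewtopic.php?t=42&sid=abc123&start=15"))
```

A URL where sid is the first parameter (right after "?") is left alone by this sketch, so that case would need separate handling.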

HTH
(Sep. 02, 2009 04:48 AM)JJoe Wrote: [ -> ]Sidki's set allows you to handle this in IncludeExclude-U.
I think you could add something like

Code:
\0\&sid=[^&]+\1&$JUMP(http://\0\1)

COOL!
it never dawned on me to use IncExc-U that way...
Hi,
Yet another possible solution is the filter by chAlx named "Delete forum IDs from links [ALX]" (original topic, in Russian).

It's pretty smart at parsing query strings and deleting unwanted parameters such as "sid", "phpsessid", etc., whatever you want, based on a list.

What's more, if the query string contains several unwanted parameters, the filter deletes them all at once. To remove a parameter only when it has particular value(s), specify them in SIDList.txt. It also tracks whether the removed parameter was the last one in the query string and, if so, removes the trailing "?".


here it is:

Code:
[Patterns]
Name = "Delete forum IDs from links [ALX]"
Active = TRUE
Multi = TRUE
URL = ""
Bounds = "$NEST(<(a|base|form|link|embed)\s,>)"
Limit = 512
Match = "(*(\s (src|href|action)=))\0 $AVQ("
        "  ([^?#]+)\1"
        "  (\?)"
        "  ("
        "    (\&amp;|\&|;|) [^&;\'\"#]+ &&"
        "    ("
        "      (\&amp;|\&|;|) $LST(SIDList) (^?) $SET(5=1) |"
        "      ( $TST((\2)=?) | (\&amp;|\&|;|(^$TST((\5)=1))) ) \# $SET(2=?)"
        "    )"
        "  )+"
        "  $TST((\5)=1)"
        "  \8"
        ")"
        "\9"
Replace = "\r\n <ins by=delete_forum_ids></ins> \r\n \0\1\2\@\8\9 \r\n"
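If the matching expression is hard to follow, the behaviour described above (drop every listed parameter in one pass, and drop the "?" when nothing is left) can be sketched in Python. This is only an illustration, not a translation of the filter: the UNWANTED set is a hypothetical stand-in for SIDList.txt, clean_url is a made-up name, and the sketch handles only "&" as a separator, not "&amp;" or ";".

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical stand-in for SIDList.txt: parameter names to drop, in lowercase.
UNWANTED = {"sid", "phpsessid", "s"}


def clean_url(url: str) -> str:
    """Drop unwanted query parameters, keeping the rest in their original order."""
    parts = urlsplit(url)
    kept = [
        pair
        for pair in parts.query.split("&")
        # Compare names case-insensitively (an assumption of this sketch).
        if pair and pair.split("=", 1)[0].lower() not in UNWANTED
    ]
    # With an empty query, urlunsplit leaves out the "?" entirely.
    return urlunsplit(parts._replace(query="&".join(kept)))


print(clean_url("http://example.com/viewtopic.php?t=42&sid=abc&phpsessid=xyz"))
```

Matching on parameter values as well as names, as the real list allows, would need a richer structure than a plain set of names.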


content of my SIDList.txt (for the sake of example):
Code:
# List of URL params to be deleted from links (NOADDURL)
# Used by: Delete forum IDs from links [ALX]

(start_point|start|st|search|cc|fl|topicdays|all|from)=([#0]|)
(show|q)=([#0]|)
(hardset|phpsessid|sessid|sid|shmid|s|sd|sk|rndnum|rnd|highlight|hl|lighter)=*
(postdays|postorder|order|prune_day|daysprune)=*
(sort_by|sort_key|topicfilter|DokuWiki|tstart|define_user|msRange)=*
(bm)=([#1]|)
(goto)=(lastpost|)

$URL(http://(www.|)(prxbx|macdevcenter|onlamp|oreillynet|xml|ocwforums).com/)(page)=([#1]|)

(id|sort)=([#0]|)

$URL(http://(www.|help.|)(google|blogger).com/)(^$TST((\1)=*topic.py))topic=*
$URL(http://(www.|help.|)(google|blogger).com/)ctx=sibling
$URL(http://(www.|)google.com/)(sa)=(N|)

$URL(http://forum.sysinternals.com/)(^$TST((\1)=*forum_topics.asp))PN=*
$URL(http://forum.sysinternals.com/)$TST((\1)=*(search_form|registration_rules|login_user).asp)FID=*
$URL(http://forum.sysinternals.com/)(TPN|PN)=([#0:1]|)