Removing plain text elements problems
Aug. 27, 2008, 10:29 AM
Post: #1
Hi. I'm trying to create some simple filters that remove some plain text elements from specific pages. Basically I want to turn some indirect links into direct links by removing the bold parts of oldadress.com/out.php?newurl from the page source.

[Image: bfilterdc6.jpg]
(I set the URL to * for testing purposes btw)

The above does not work, but I'm totally new to this so I thought maybe there are too many characters that need to be escaped. So I changed the filter to replace the plain word "blog" (because there are lots of occurrences of that word on that page). Oddly enough, now the word blog gets replaced on all pages I visit but the one I want!

I am not sure if it's okay to name the site, as it is a pron site. If this is not ok, maybe I could ask generally: what can make a simple text replacement filter NOT work on one page when it is working on others?

Thanks!

PS: BTW, I did enable the filter, hit reload, emptied the cache, etc. :)
Aug. 27, 2008, 02:45 PM
Post: #2
RE: Removing plain text elements problems
There could be several reasons why search/replace may fail to work. The text may be composed by JavaScript, the URLs may be given in relative form, the content may be loaded from a different domain, etc.
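
To illustrate the relative-URL case, here is a rough Python sketch (not BFilter filter syntax). The page snippet and the example.org target are made up for illustration; only the oldadress.com/out.php? prefix comes from the first post. A replacement that looks for the absolute prefix finds nothing when the page writes the link in relative form, while a pattern with an optional host part matches both forms.

```python
import re

# Hypothetical page source: the redirect link is written in relative form.
page = '<a href="/out.php?http://example.org/page">link</a>'

# A plain replacement that targets the absolute prefix finds nothing here.
absolute_only = page.replace("http://oldadress.com/out.php?", "")

# Making the scheme and host optional covers both absolute and relative forms.
pattern = re.compile(r'(?:https?://oldadress\.com)?/out\.php\?')
rewritten = pattern.sub("", page)

print(absolute_only == page)  # the plain replacement left the page untouched
print(rewritten)
```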

You can send me the actual URL in a private message. You will have to register on this site to do so. You could also try to email it to [email protected], but there is a good chance it will be classified as spam.
Aug. 27, 2008, 04:49 PM
Post: #3
RE: Removing plain text elements problems
Yes, PM sent! Thanks for helping :)
Aug. 27, 2008, 05:43 PM
Post: #4
RE: Removing plain text elements problems
I have identified the problem.
Some HTML code begins with a Byte Order Mark (BOM). BFilter supports that. The HTML code on the site in question begins with two BOMs, which is very unusual. BFilter doesn't support that because I just didn't think about such a possibility. It's trivial to fix, and I'll do it right after writing this, but the release process takes a lot of effort, so don't expect a fixed version to be released soon.

However, the problem only affects content type auto-detection. It should work if you remove the _HTML_OR_XHTML_ condition. In that case it's strongly advised to filter by URL, because servers don't always report the correct content type.
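
For anyone curious what the double-BOM case looks like, here is a minimal Python sketch. It is not BFilter's actual code; the function names and the very rough sniffing rule are assumptions made up for illustration, and it only handles the UTF-8 BOM. The point is that a detector which skips exactly one BOM still sees BOM bytes where it expects markup, whereas skipping every leading BOM avoids that.

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8  # the three bytes EF BB BF


def strip_leading_boms(data: bytes) -> bytes:
    """Drop every UTF-8 BOM at the start of data, not just the first one."""
    while data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    return data


def looks_like_html(data: bytes) -> bool:
    """Very rough sniffing: after BOMs and whitespace, does markup begin?"""
    head = strip_leading_boms(data).lstrip()[:64].lower()
    return head.startswith((b"<!doctype html", b"<html", b"<?xml"))


# A page that begins with two BOMs, as described above.
doubled = UTF8_BOM * 2 + b"<html><body>blog</body></html>"
print(looks_like_html(doubled))
```

Skipping only one BOM would leave the second one at the front of the data, so a check for "<html" at the start would fail.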
Aug. 27, 2008, 05:58 PM
Post: #5
RE: Removing plain text elements problems
Good to see it wasn't my stupidity. :)

Works great now, thank you!