What irks me is that when you're looking up directions on how to do something in .htaccess most of the instructions read like an old grimoire - if you don't know most of the answer already you can't figure out what the heck their solution is about. (But then a lot of Perl advice is the same way.)
For me, it's about downloading the file offline and printing it on paper or widening the terminal/ssh window enough that I will see the whole rule set at one glance... Helps to see where it fails... All the escaping of unsafe characters makes it very long and hard to see what the rule is actually meant to do.
Posted by: Dandello Posted on: Sep 19th, 2014 at 3:38pm
What irks me is that when you're looking up directions on how to do something in .htaccess most of the instructions read like an old grimoire - if you don't know most of the answer already you can't figure out what the heck their solution is about. (But then a lot of Perl advice is the same way.)
Posted by: Monni Posted on: Sep 19th, 2014 at 3:16pm
Especially as I didn't originally write those rewrites... The old tech guy did write them, but they didn't work correctly with YaBB 2.5, so I had to update them with the new URL scheme...
Posted by: Dandello Posted on: Sep 19th, 2014 at 3:08pm
We also have to figure out how to keep Guardian from trashing those settings.
Posted by: Dandello Posted on: Sep 19th, 2014 at 3:04pm
Figuring out .htaccess rewrites is never fun.
Posted by: Monni Posted on: Sep 19th, 2014 at 2:59pm
Okay - is this something we (YaBB) can (or should) add to .htaccess during conversion for older forums?
Well... For converting forums from maybe pre-2.5 YaBB, it is something that helps reduce "false" 404 errors in the Apache error log... I haven't yet figured out how to "permanently" fix all the regexes in .htaccess to work correctly, as Attachments still require two rewrites...
I stripped out the host part of the rewrite target, which forces the redirect to be external, so it does update the crawler cache with new URL.
Posted by: Dandello Posted on: Sep 19th, 2014 at 1:45pm
Okay - is this something we (YaBB) can (or should) add to .htaccess during conversion for older forums?
Posted by: Monni Posted on: Sep 19th, 2014 at 9:57am
It all comes down to how many forums have really old posts that are linked from external websites... As the directory structure was changed around 2.5, all older links have to be rewritten in Apache at least once. If I figured it out correctly, Apache really does choke on the semicolon during a rewrite unless escaping of the rewrite result is explicitly disabled using the NE (noescape) flag. As for the issue with the "+" sign, it is more about strict standards compliance: the actual script has to check the validity of a query parameter using both the space and the plus sign, as full URL encoding of the query string is optional, not mandatory. Clients are actually permitted to URL-decode safe characters in all links on a page and return the link to the server in plain US-ASCII.
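To make the "+" versus space point a bit more concrete, here is a minimal Python sketch (not YaBB code, and not how YaBB actually looks up attachments). It only shows that the same raw query value can decode to two different names depending on whether "+" is read as a literal plus or as a space. The function name `candidate_names` and the file names are made up for illustration.

```python
from urllib.parse import unquote, unquote_plus

def candidate_names(raw_value):
    """Return the decoded forms a script might try for one raw query value."""
    strict = unquote(raw_value)           # "+" stays a literal plus sign
    form_style = unquote_plus(raw_value)  # "+" is read as a space
    # Keep order and drop the duplicate when both readings agree.
    return list(dict.fromkeys([strict, form_style]))

# Hypothetical attachment names, for illustration only.
print(candidate_names("my+file.jpg"))    # ['my+file.jpg', 'my file.jpg']
print(candidate_names("my%20file.jpg"))  # ['my file.jpg']
print(candidate_names("my%2Bfile.jpg"))  # ['my+file.jpg']
```

A script that only tries one of the two readings will find the file for some clients and not for others, which would look a lot like the inconsistent behaviour described here.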
Posted by: Dandello Posted on: Sep 19th, 2014 at 4:23am
I agree. The problem with issues like this is that they're not experimentally reproducible. And if they can't be reproduced, you don't know if it's fixed.
Posted by: Monni Posted on: Sep 19th, 2014 at 3:41am
That's not from YaBB's error log, is it? Because if it is there's something more seriously wrong with that error string than a mis-encoded semicolon.
And if all those errors are as old as the date indicates, I personally wouldn't worry about it.
Eh... in English that error string would be "This field only accepts numbers from 0-9"... But that you should know already... Didn't make sense to post the Finnish version of the string as that would be what reads in error log.
Anyways... if the hack I made in .htaccess works for most non-malicious users, I'm not too worried about the cases where it doesn't work for malicious users.
Posted by: Dandello Posted on: Sep 18th, 2014 at 8:01pm
That's not from YaBB's error log, is it? Because if it is there's something more seriously wrong with that error string than a mis-encoded semicolon.
And if all those errors are as old as the date indicates, I personally wouldn't worry about it.
Posted by: Monni Posted on: Sep 18th, 2014 at 7:18pm
Occasionally 'invisible' characters slip in that can't be seen in text editors or logs, but Apache and Perl see them as part of the encoding, and they REALLY mess things up.
A 301 is a redirect, so - since it's throwing a YaBB error eventually - the parsing error is happening after the redirect.
When you see these errors, how old is the message they're trying to get to? I'm betting they're pretty old ones.
And what's the exact error message YaBB is giving?
Posted by: Dandello Posted on: Sep 18th, 2014 at 5:53pm
Occasionally 'invisible' characters slip in that can't be seen in text editors or logs, but Apache and Perl see them as part of the encoding, and they REALLY mess things up.
A 301 is a redirect, so - since it's throwing a YaBB error eventually - the parsing error is happening after the redirect.
When you see these errors, how old is the message they're trying to get to? I'm betting they're pretty old ones.
And what's the exact error message YaBB is giving?
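One way to check for the 'invisible' characters mentioned above is to scan a suspect URL or rule line for Unicode control and format characters. This is only a rough Python sketch under my own assumptions: the helper name `find_invisible` and the sample string are hypothetical, and it will also flag ordinary tabs and newlines, since those are control characters too.

```python
import unicodedata

def find_invisible(text):
    """Return (index, code point, name) for control/format characters."""
    hits = []
    for i, ch in enumerate(text):
        # Cc = control characters, Cf = format characters
        # (for example zero-width space or a byte order mark).
        if unicodedata.category(ch) in ("Cc", "Cf"):
            name = unicodedata.name(ch, "<unnamed>")
            hits.append((i, f"U+{ord(ch):04X}", name))
    return hits

# Hypothetical URL fragment with a zero-width space hidden after "123".
sample = "num=123\u200b;start=0"
print(find_invisible(sample))  # [(7, 'U+200B', 'ZERO WIDTH SPACE')]
```

Running something like this over a copy of the .htaccess file or a logged request line would at least rule the invisible-character theory in or out.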
Posted by: Monni Posted on: Sep 18th, 2014 at 3:06pm
but it seems Apache has troubles accepting ";" in URL, even if it is URL encoded
But it's not consistent. That's what's irksome. So I don't think it's an Apache issue, I think it's a browser issue as it's encoding (or not decoding) something that shouldn't be visibly encoded at all. My suspicion is there was/is a browser or harvester that mis-encoded query strings and that bad code is still in search engines and/or bookmarks.
So if YaBB accepted attachments with '+' in them in the past it was a bug then. I doubt there's an 'in YaBB' fix for it.
1. I can see from Apache access log that the incoming URL is correct, but instead of code 200, it throws 301, which restarts the query string parsing... That's where it goes wrong. I tried to force NoEscape in Apache directory-level config, but don't know if it is the 100% working solution...
2. If I try to access attachments containing "+" with Chrome, I can see the attachment, but with clients that strictly follow the URL-encoding standard, it doesn't work.
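For point 1, the mod_rewrite documentation for the NE flag describes special characters such as ";" being escaped to "%3B" in the rewrite result when the flag is absent. The Python sketch below is only a simulation of what that could do to a query string that uses ";" as the parameter separator: `quote()` is a stand-in for the escaping, not Apache's actual code, and the parameter names and values are hypothetical.

```python
from urllib.parse import quote, unquote

def split_params(query_string):
    """Split on literal ';' or '&' first, then percent-decode each part."""
    params = {}
    for pair in query_string.replace("&", ";").split(";"):
        if pair:
            name, _, value = pair.partition("=")
            params[unquote(name)] = unquote(value)
    return params

original = "num=1411000000;start=5"   # hypothetical parameters
# Stand-in for a redirect that escapes ";" (not Apache's real routine).
escaped = quote(original, safe="=&")

print(escaped)                 # num=1411000000%3Bstart=5
print(split_params(original))  # {'num': '1411000000', 'start': '5'}
print(split_params(escaped))   # {'num': '1411000000;start=5'}
print(split_params(escaped)["num"].isdigit())  # False
```

If something like this is what happens after the 301, the script never sees a separator, the whole tail ends up inside one value, and a digits-only check on that value would fail even though the incoming URL in the access log looked correct.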
Posted by: Dandello Posted on: Sep 18th, 2014 at 2:48pm
but it seems Apache has troubles accepting ";" in URL, even if it is URL encoded
But it's not consistent. That's what's irksome. So I don't think it's an Apache issue, I think it's a browser issue as it's encoding (or not decoding) something that shouldn't be visibly encoded at all. My suspicion is there was/is a browser or harvester that mis-encoded query strings and that bad code is still in search engines and/or bookmarks.