Nothing wrong with putting them in a text file and attaching them.
That's what I usually do with longer spans of text... but fixing my own typos in text that is attached as a file takes a lot longer than fixing inlined text.
Posted by: Dandello Posted on: Mar 24th, 2015 at 5:06pm
Nothing wrong with putting them in a text file and attaching them.
Posted by: Monni Posted on: Mar 24th, 2015 at 4:30pm
And YaBB thoroughly trashes what's inside the code div when converting to e-mail text.
Tomorrow or so I'll look at true html e-mails again.
YaBB trashes what's inside the code div even before converting to e-mail text. I'm just going to stop posting regex patterns because they just don't come out right. Too many slashes for YaBB to grok.
Posted by: Dandello Posted on: Mar 24th, 2015 at 4:13pm
and so does the one from Regular Expressions Cookbook page 426.
if you leave out ? after \s, it fails on tags like <br/>, but works for <br />...
On the other hand... YaBB chokes on \s when replying and strips off the \. It's tricky because it has to match either 1 character or 2 characters, since the next rule only matches 3 characters or more, as it has to check that the start of the tag isn't mixed-case.
Posted by: Dandello Posted on: Mar 24th, 2015 at 2:47pm
Sequence (?/...) not recognized in regex; marked by <-- HERE in m/(?isx)<(([a-z]+|[A-Z]+)( ?/ <-- HERE ?|[^a-zA-Z<>][^<>]*[^/<>]/?)|/([a-z]+|[A-Z]+))>/
Code
Sequence (?...) not recognized in regex; marked by <-- HERE in m/(?isx)<(([a-z]+|[A-Z]+)( ? <-- HERE /?|[^a-zA-Z<>][^<>]*[^/<>]/?)|/([a-z]+|[A-Z]+))>/
is set to global, it does work to remove just tags from code like this:
Code (HTML)
<p class="class">>>>test<<<</p><br />
Posted by: Dandello Posted on: Mar 24th, 2015 at 1:35pm
Code
Sequence (?/...) not recognized in regex; marked by <-- HERE in m/(?isx)<(([a-z]+|[A-Z]+)( ?/ <-- HERE ?|[^a-zA-Z<>][^<>]*[^/<>]/?)|/([a-z]+|[A-Z]+))>/
Code
Sequence (?\...) not recognized in regex; marked by <-- HERE in m/(?isx)<(([a-z]+|[A-Z]+)( ?\ <-- HERE /?|[^a-zA-Z<>][^<>]*[^\/<>]\/?)|\/([a-z]+|[A-Z]+))>/
Which means that the issue with autolink urls will break it. Looking at how that section of code evolved I think we can probably remove the
Code
$thismessage =~ s/<.*?>//g;
lines in while leaving the
Code
$thismessage =~ s/\[.*?\]//g;
that's a few lines above it because the chances of an errant ']' in YaBB is smaller than an errant '>'. (Plus that's what's been used, supposedly successfully for ages, in PM notifications.)
The supposed best solution is to use something like HTML::Parser to remove the HTML tags.
It is not much trouble to fix both YaBB and HTML tag detection as long as we know which characters can appear at the start of a tag and which characters can appear at the end of a tag... This is how I changed the HTML tag detection to detect everything except comments, because those have very specific restrictions that make the regex very long...
Posted by: Dandello Posted on: Mar 23rd, 2015 at 9:01pm
Which means that the issue with autolink urls will break it. Looking at how that section of code evolved I think we can probably remove the
Code
$thismessage =~ s/<.*?>//g;
lines in while leaving the
Code
$thismessage =~ s/\[.*?\]//g;
that's a few lines above it because the chances of an errant ']' in YaBB is smaller than an errant '>'. (Plus that's what's been used, supposedly successfully for ages, in PM notifications.)
The supposed best solution is to use something like HTML::Parser to remove the HTML tags.
Posted by: Monni Posted on: Mar 23rd, 2015 at 7:53pm
That's the code everybody uses as an example of what works with very simple html tags. Plus
Code
*?
is a 'lazy' or 'non-greedy' quantifier in Perl. In this case it matches as few characters as possible after the '<', so the match stops at the first '>' it reaches.
It works when there are no stray < or >... But if there is even one stray >, it doesn't... It also fails miserably if there are no other characters between < and >, which is an alternative way to write !=.
http://regexr.com/ is the tool I use to check regex patterns for bugs...
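A minimal Perl sketch of the failure cases described above may help. The helper name and the sample strings are made up for illustration and are not taken from YaBB.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the pattern discussed in this thread.
sub strip_tags_naive {
    my ($text) = @_;
    $text =~ s/<.*?>//g;    # lazy: shortest run from a '<' to the next '>'
    return $text;
}

# Well-formed, simple HTML: only the tags are removed.
print strip_tags_naive('<b>bold</b> text'), "\n";

# A stray '<' pairs with the '>' that closes the next real tag,
# so the text in between is removed as well.
print strip_tags_naive('if a < b then <b>stop</b>'), "\n";

# '<>' used as "not equal": '.*?' may match zero characters,
# so the operator itself is removed.
print strip_tags_naive('a <> b'), "\n";
```

With these inputs the second call loses "b then" along with the tags, and the third loses the operator entirely.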
Posted by: Dandello Posted on: Mar 23rd, 2015 at 7:41pm
That's the code everybody uses as an example of what works with very simple html tags. Plus
Code
*?
is a 'lazy' or 'non-greedy' quantifier in Perl. In this case it matches as few characters as possible after the '<', so the match stops at the first '>' it reaches.
Posted by: Monni Posted on: Mar 23rd, 2015 at 6:26pm
Okay - rolled back to the 'preformatted, stripped text only' version of the e-mails. We'll come back to this when we start seriously looking at 2.6.2.
The
Code (Perl)
$thismessage =~ s/<.*?>//g;
in Post.pm should be okay in the short term because YaBB creates very simple html. So catching the extra (unmatched) sharp brackets after running through FromHTML should solve most of the current sharp brackets issues in e-mails.
The problem with that was that the .* part also matches ">", which results in everything between two html tags being stripped.
Posted by: Dandello Posted on: Mar 23rd, 2015 at 3:14pm
Okay - rolled back to the 'preformatted, stripped text only' version of the e-mails. We'll come back to this when we start seriously looking at 2.6.2.
The
Code (Perl)
$thismessage =~ s/<.*?>//g;
in Post.pm should be okay in the short term because YaBB creates very simple html. So catching the extra (unmatched) sharp brackets after running through FromHTML should solve most of the current sharp brackets issues in e-mails.
Posted by: Dandello Posted on: Mar 22nd, 2015 at 2:03pm
Clipping or truncating is bad... It should just leave out the post contents if the post is too long... Clipping or truncating can cause issues in parsing HTML when a closing tag gets clipped out but its opening tag doesn't...
Very true - the truncating function is iffy.
Posted by: Monni Posted on: Mar 22nd, 2015 at 1:59pm
But there are some html tags that are still better stripped off...
We're going to have to go through and decide which ones can and should be stripped - and whether or not it would be advisable to use clipping so YaBB isn't sending out HUGE notification emails if the post character limits are set really high.
Clipping or truncating is bad... It should just leave out the post contents if the post is too long... Clipping or truncating can cause issues in parsing HTML when a closing tag gets clipped out but its opening tag doesn't...
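A rough sketch of that suggestion, in Perl. The helper name, the sample post, and the limit are all hypothetical and do not come from YaBB's code.

```perl
use strict;
use warnings;

# Hypothetical helper and limit; neither name comes from YaBB.
sub notification_body {
    my ($post_html, $max_chars) = @_;
    # Leave the body out entirely instead of cutting it mid-tag.
    return '' if length($post_html) > $max_chars;
    return $post_html;
}

my $post = '<b>a fairly long post body</b>';

# Truncating can drop the closing tag while keeping the opening one.
print substr($post, 0, 12), "\n";

# Omitting the body avoids sending unbalanced markup.
print notification_body($post, 12) eq '' ? "body omitted\n" : "body kept\n";
```

The substr call shows the problem case: the opening tag survives the cut and the closing tag does not.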
Posted by: Dandello Posted on: Mar 22nd, 2015 at 1:46pm
But there are some html tags that are still better stripped off...
We're going to have to go through and decide which ones can and should be stripped - and whether or not it would be advisable to use clipping so YaBB isn't sending out HUGE notification emails if the post character limits are set really high.
Posted by: Monni Posted on: Mar 22nd, 2015 at 1:36pm
Since e-mails are now being sent as HTML, we should probably just use YaBBC to render the tags instead of stripping them out, and use regexes to add in inline styling for the quote and code boxes.
Expect to see some nonsense messages around here tomorrow or so as I test this idea.
I agree with converting safe html tags to YaBB tags... But there are some html tags that are still better stripped off...
Posted by: Dandello Posted on: Mar 22nd, 2015 at 1:30pm
Since e-mails are now being sent as HTML, we should probably just use YaBBC to render the tags instead of stripping them out, and use regexes to add in inline styling for the quote and code boxes.
Expect to see some nonsense messages around here tomorrow or so as I test this idea.
Posted by: Monni Posted on: Mar 22nd, 2015 at 10:57am
What was happening was that although there is a regex to strip out html tags, 'loose' pointy brackets get turned from '<' to ">" in the FromHTML routine. The new code catches them and turns them back into html entities.
The regexes above this code need to be looked at as they're still sending the notification e-mails as txt even though YaBB now sends them as html.
Like I said earlier, the regex for stripping out html tags is actually incorrect, as it doesn't handle cases when there is a second < before >. Instead of /<.*?>/ it should use
... this makes sure it doesn't remove anything between an invalid html tag and the next valid html tag, or an invalid tag after a valid html tag.
To the second point, I think the main issue is not the regexes above the FromHTML, but that we need to convert back the safe ones to html below the recent fix.
Posted by: Dandello Posted on: Mar 22nd, 2015 at 2:06am
And the odd issue of "> showing up in the source code (as rendered by FireFox), is an artifact of the autolink url function.
Posted by: Dandello Posted on: Mar 21st, 2015 at 11:36pm
Okay - Monni's test and Zathrus's test came in fine in Outlook. So the temporary fix for the notification e-mail glitch is:
What was happening was that although there is a regex to strip out html tags, 'loose' pointy brackets get turned from '<' to ">" in the FromHTML routine. The new code catches them and turns them back into html entities.
The regexes above this code need to be looked at as they're still sending the notification e-mails as txt even though YaBB now sends them as html.
Posted by: Zatthrus Posted on: Mar 21st, 2015 at 3:28pm
test 2 : XML and HTML5 have different reserved characters, so I see no problem with having inline JavaScript as long as it is validated to not contain "</", which would cause the block to terminate prematurely
Posted by: Zatthrus Posted on: Mar 21st, 2015 at 2:42pm
Without actually seeing the raw e-mail, I can only assume it doesn't try to escape it even though it is obviously a malformed html tag. So it "hides" everything until it finds the next ">"...
Looking at the raw text, that's exactly what's happening in Outlook. So I think a check on how the html is generated for the html e-mail needs a looking at.
A simple regex should work... escape "<" if it is not followed by ">" before the next "<" or the end of the string.
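One way that rule could be written in Perl is sketched below. The helper name is hypothetical and this is not the fix that went into YaBB. It uses a negative lookahead so that only a "<" with no ">" ahead of it (before another bracket) is escaped.

```perl
use strict;
use warnings;

# Hypothetical helper: escape any '<' that has no closing '>' before
# the next '<' or the end of the string.
sub escape_stray_lt {
    my ($text) = @_;
    # The '<' is left alone only when a '>' follows with no other
    # '<' or '>' in between; otherwise it becomes an entity.
    $text =~ s/<(?![^<>]*>)/&lt;/g;
    return $text;
}

print escape_stray_lt('1 < 2 and <b>bold</b>'), "\n";
print escape_stray_lt('ends with <'), "\n";
```

Text such as "a < b > c" still looks like a tag to this pattern and is left untouched, so it only covers the unmatched "<" case described above.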
Posted by: Dandello Posted on: Mar 20th, 2015 at 5:47pm
Without actually seeing the raw e-mail, I can only assume it doesn't try to escape it even though it is obviously a malformed html tag. So it "hides" everything until it finds the next ">"...
Looking at the raw text, that's exactly what's happening in Outlook. So I think a check on how the html is generated for the html e-mail needs a looking at.
Posted by: Monni Posted on: Mar 20th, 2015 at 5:31pm
Could be. I don't have time today to check it out but I'm guessing it's related to the "<" "/" getting interpolated in a weird way when being sent to Mail.
Without actually seeing the raw e-mail, I can only assume it doesn't try to escape it even though it is obviously a malformed html tag. So it "hides" everything until it finds the next ">"...
Posted by: Dandello Posted on: Mar 20th, 2015 at 4:32pm
Could be. I don't have time today to check it out but I'm guessing it's related to the "<" "/" getting interpolated in a weird way when being sent to Mail.
Posted by: Monni Posted on: Mar 20th, 2015 at 4:05pm
Methinks the notification emails need more css - and other things.
Did we discover a bug, or...
Posted by: Dandello Posted on: Mar 20th, 2015 at 3:11pm
and this:
Code
XML and HTML5 have different reserved characters, so I see no problem with having inline JavaScript as long as it is validated to not contain "</", which would cause the block to terminate prematurely.
made an interesting bit in the notification email:
An awful lot of HTML5 tutorials show inline javascript examples although I think many of the ones YaBB uses could be pushed to the bottom of the page.
XML and HTML5 have different reserved characters, so I see no problem with having inline JavaScript as long as it is validated to not contain "</", which would cause the block to terminate prematurely.
Posted by: Dandello Posted on: Mar 20th, 2015 at 2:35pm
An awful lot of HTML5 tutorials show inline javascript examples although I think many of the ones YaBB uses could be pushed to the bottom of the page.
Posted by: Monni Posted on: Mar 20th, 2015 at 2:19pm
So none of this will be an issue once we go all HTML5 in 2.6.2 (except possibly in the RSS feed)? (Except for the trailing ';' in inline styles and javascript?)
As far as I know about HTML5, those kinds of inline JavaScript blocks are forbidden completely. (Section 8.1.2.6)
Posted by: Dandello Posted on: Mar 20th, 2015 at 2:11pm
So none of this will be an issue once we go all HTML5 in 2.6.2 (except possibly in the RSS feed)? (Except for the trailing ';' in inline styles and javascript?)
Posted by: Monni Posted on: Mar 20th, 2015 at 2:07pm
On a related note: should we have all the inline javascript inside the ![CDATA] code?
It's only required if the JavaScript code contains characters that have special meaning in XML or XHTML... Chrome seems to be very picky about pages that don't use CDATA. The result is pretty unpredictable. Same goes for not having trailing ; in inline styles or JavaScript code, especially inside element attributes.
Posted by: Dandello Posted on: Mar 20th, 2015 at 1:10pm
On a related note: should we have all the inline javascript inside the ![CDATA] code?
Posted by: Monni Posted on: Mar 19th, 2015 at 10:41pm
Oh, I can imagine - I'm having to write down which install is which and what it was testing... (I only have about 20 different versions of YaBB on my home server.)
I only have 12 versions of YaBB here, so keep patching
Posted by: Dandello Posted on: Mar 19th, 2015 at 10:36pm
Oh, I can imagine - I'm having to write down which install is which and what it was testing... (I only have about 20 different versions of YaBB on my home server.)
Posted by: Monni Posted on: Mar 19th, 2015 at 10:20pm
One more diff... lol... doing four-way merge, so didn't catch all the differences at first attempt... Can't even imagine how many hours it took to compare all the revisions...
Posted by: Dandello Posted on: Mar 19th, 2015 at 7:10pm
Just went through your catches (new zip is at 2.6.11_fixb.zip).
I removed the Holly Hack from Templates/Admin/default.css - there's a problem with going between ANSI and UTF-8 char encoding with it and this bit of code isn't in Templates/Forum/default.css. We do need to check to make sure we don't have any uncleared floats triggering the newer version of the Guillotine Bug in MSIE. (But the reports on that one date from 2010 and as far as I'm aware, nobody's reported having problems with it on YaBBForum.)
Posted by: Monni Posted on: Mar 19th, 2015 at 4:19pm
This is what I got when comparing...
Posted by: Dandello Posted on: Mar 19th, 2015 at 1:48pm
The fixes for the non-Latinate attachment name transliteration may still need some work. And there may be an issue with the attachment name transliteration and multiple char encodings on the forum.
Maybe a note for non-Latinate users that the name of the file they're uploading needs to be in ASCII otherwise it will upload as '_______.*'
Posted by: Monni Posted on: Mar 19th, 2015 at 1:27pm
If someone (like Monni) would be so kind as to double check the files in the zip before they get uploaded to the SVN
There are a lot of little fixes - things like line ends, removing hard tabs, fixing hardcoded language, javascript and html validation. And of course, bug fixes. The file is at 2.6.11_fixa.zip (It's too big to just add as an attachment.)
Edited:
I forgot the dupes in the Paths.pm editor in AdminEdit.pm - that fix isn't in the zip.
I'm pretty sure it's still bad... I have to finish comparing the files, but some things just don't look good...
Posted by: Dandello Posted on: Mar 18th, 2015 at 5:38pm
If someone (like Monni) would be so kind as to double check the files in the zip before they get uploaded to the SVN
There are a lot of little fixes - things like line ends, removing hard tabs, fixing hardcoded language, javascript and html validation. And of course, bug fixes. The file is at 2.6.11_fixa.zip (It's too big to just add as an attachment.)
Edited:
I forgot the dupes in the Paths.pm editor in AdminEdit.pm - that fix isn't in the zip.