JavaScript errors
 
Monni
Language
***
Offline


Min izāmō

Posts: 413
Location: Kaarina, Finland
Joined: Jul 16th, 2014
Gender: Male
Mood: Frustrated
Zodiac sign: Pisces
Re: JavaScript errors
Reply #30 - Aug 10th, 2014 at 4:50pm
Dandello wrote on Aug 10th, 2014 at 3:47pm:
I'm not arguing that the error catch is needed - just that we can't use the shortcut until/unless we announce that X version of YaBB will require a version of Perl newer than what a lot of hosts are still using.


I know... My point was just that "we" already decided 2.6.2 will break backwards compatibility with legacy character set support. It really doesn't hurt to break more backwards compatibility, especially if a "third-world country" like Finland is already using Perl 5.10.1.

The problem with older Perl installations is not just the version of Perl: more and more commonly used modules are missing from standard installations because they were not required back when 5.8 was the latest version.
  
 
Dandello
Forum Administrator
YaBB Modder
*****
Offline


I love YaBB 2.7!

Posts: 2234
Location: The Land of YaBB
Joined: Feb 12th, 2014
Gender: Female
Mood: Annoyed
Zodiac sign: Virgo
Re: JavaScript errors
Reply #31 - Aug 10th, 2014 at 5:10pm
And I think 2.6.2 will be the point where we have to declare that YaBB needs a newer Perl to run. (2.6.0 broke the templates and mods, 2.6.2 will break darn near everything else to get us on a better track for moving forward.)

I'm keeping notes on the backwards compatibility things that can be tossed in 2.6.2. (There are more language encoding things that can be removed, and our server gurus know of some spots where we should just yank out a couple of subroutines that were written to solve problems that don't exist in modern server software. We'll also be looking at doing things like sequestering Mods away from the main code as much as possible - that may be a 2.6.1 change since we don't have that many 2.6.0 mods and they're still mostly in the alpha stage.)
  

Perfection is not possible. Excellence, however, is excellent.
 
Monni
Re: JavaScript errors
Reply #32 - Aug 10th, 2014 at 5:24pm
I'm still going to mix and match stuff even when 2.6.2 comes out as I don't think my clients are eager to switch to UTF-8.
  
 
Dandello
Re: JavaScript errors
Reply #33 - Aug 10th, 2014 at 6:01pm
One of the beauties of open source - you CAN mix and match. (And we may keep the ANSI/UTF-8 language files for quite some time. I'm not finding a good way to auto-convert old CP1251 files to UTF-8 on a forum with mixed encoding, and doing it by hand on a large forum with mixed encoding will be a major pain in the arse. With this in mind, the default character encoding setting in the Admin Center may go, as it's not working as well as we'd hoped.)
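To illustrate why the auto-conversion is awkward, here is a minimal sketch in Python (not YaBB's Perl code). It leaves a line alone if it is already valid UTF-8 and otherwise assumes a fallback encoding. The function name `to_utf8` and the choice of CP1251 as the fallback are assumptions made for this example.

```python
def to_utf8(line: bytes, fallback: str = "cp1251") -> bytes:
    """Return the line as UTF-8 bytes.

    A line that already decodes as strict UTF-8 is returned unchanged.
    Anything else is assumed (hypothetically) to be in `fallback`.
    """
    try:
        line.decode("utf-8")  # strict decoding raises on invalid sequences
        return line
    except UnicodeDecodeError:
        # Bytes the fallback codec cannot map become U+FFFD instead of raising.
        return line.decode(fallback, errors="replace").encode("utf-8")
```

The weak spot is the fallback: on a forum with mixed encoding, a Latin-1 line that is not valid UTF-8 would be silently converted as if it were Cyrillic, which is exactly the problem that needs a detector.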
  

 
Monni
Re: JavaScript errors
Reply #34 - Aug 10th, 2014 at 7:14pm
It all comes down to the probability of finding a character that is invalid in all but one encoding... In ISO 8859-1 character codes 127-159 are invalid (they are control codes, not printable characters), in CP1251 character code 152 is invalid, and in UTF-8 all character codes above 127 must follow one of the following patterns:

194-223 + 128-191
224-239 + 128-191 + 128-191
240-244 + 128-191 + 128-191 + 128-191

The second byte is restricted further after some lead bytes: 160-191 after 224, 128-159 after 237, 144-191 after 240 and 128-143 after 244. Lead bytes 245-247 and the old five- and six-byte forms (lead bytes 248-253) have not been valid UTF-8 since RFC 3629.

This leaves quite a lot of character codes and sequences invalid.
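The elimination idea above can be sketched roughly as follows. This is Python rather than Perl, and the function names are made up for the example. It leans on Python's strict UTF-8 decoder for the sequence rules and checks the other two encodings by byte value only.

```python
from __future__ import annotations


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes form well-formed UTF-8 (strict decoding)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def candidate_encodings(data: bytes) -> list[str]:
    """List the encodings that the byte values alone do not rule out."""
    candidates = []
    if is_valid_utf8(data):
        candidates.append("utf-8")
    # Codes 127-159 are not printable characters in ISO 8859-1.
    if not any(127 <= b <= 159 for b in data):
        candidates.append("iso-8859-1")
    # Code 152 is the only unassigned code in CP1251.
    if 152 not in data:
        candidates.append("cp1251")
    return candidates
```

Pure ASCII text passes all three checks, and most Latin-1 text also passes the CP1251 check, so this only narrows the field. It cannot separate ISO-8859-1 from CP1251 on its own.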
  
 
Dandello
Re: JavaScript errors
Reply #35 - Aug 10th, 2014 at 8:41pm
Where I'm having trouble is probably a combination of coding on a Windows machine and the fact that I haven't yet found a Perl-based character encoding detector that can even semi-accurately detect the difference between an ISO-8859-1 string and a CP1251 string. (But I haven't exhausted all the possible options.)

Once we get around that then we'll be able to better handle non-Latin1 data.

  

 
Monni
Re: JavaScript errors
Reply #36 - Aug 10th, 2014 at 8:54pm
Dandello wrote on Aug 10th, 2014 at 8:41pm:
Where I'm having trouble is probably a combination of coding on a Windows machine and the fact that I haven't yet found a Perl based character encoding detector that can even semi-accurately detect the difference between an ISO-8859-1 string and a CP1251 string. (But I haven't exhausted all the possible options.)

Once we get around that then we'll be able to better handle non-Latin1 data.


Like I said earlier, ISO-8859-1 has more invalid characters, but for semi-accurate detection you also need to know which language the ISO-8859-1 text is in, as some high-ASCII pairs are very unlikely in ISO-8859-1 even though they are technically valid.
With CP1251, text is more likely to contain more than one consecutive character in the range 160-191 or more than two consecutive characters in the range 192-255.
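A rough sketch of that run-length heuristic, in Python rather than Perl. The thresholds are one reading of the rule above (two or more bytes in 160-191, three or more in 192-255), and the function name is made up for the example.

```python
import re

# Two or more consecutive bytes in 160-191 (symbols in Latin-1).
SYMBOL_RUN = re.compile(rb"[\xa0-\xbf]{2,}")
# Three or more consecutive bytes in 192-255 (the Cyrillic letters in CP1251).
LETTER_RUN = re.compile(rb"[\xc0-\xff]{3,}")


def looks_like_cp1251(data: bytes) -> bool:
    """Heuristic only: True if the bytes contain a run that is rare in Latin-1."""
    return bool(SYMBOL_RUN.search(data) or LETTER_RUN.search(data))
```

Finnish "pääoma" in Latin-1 has only two high bytes in a row, so it is not flagged, while any ordinary Russian word in CP1251 is. Latin-1 text with adjacent symbols (for example two quotation marks from the 160-191 range) would be a false positive.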
  
 
Dandello
Re: JavaScript errors
Reply #37 - Aug 10th, 2014 at 10:54pm
I'm actually pondering the best way to detect the CP1251 strings by using two or three characters together and checking against the non-ISO character list. There actually wouldn't be much of a problem at all with CP1251 except that there are overlaps in the character codes so ä gets converted to something else that's not right.

For most YaBB forums the encoding will most likely be Latin1 or CP1251, as Chinese is converted internally to HTML entities. Had the guys done that with Cyrillic early on, there wouldn't be a problem now.
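The overlap is easy to see with a single byte. The sketch below is Python, not YaBB code, and `to_entities` is a made-up helper that only illustrates the numeric-entity idea. How YaBB converts Chinese internally is not shown in this thread.

```python
raw = bytes([0xE4])  # one byte, character code 228

as_latin1 = raw.decode("iso-8859-1")  # Latin small letter a with diaeresis
as_cp1251 = raw.decode("cp1251")      # Cyrillic small letter de


def to_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric HTML entity."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")
```

Once text is stored as entities it is plain ASCII, so it survives no matter which 8-bit encoding the page is later served in.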
  

 
Monni
Re: JavaScript errors
Reply #38 - Aug 11th, 2014 at 5:25pm
Dandello wrote on Aug 10th, 2014 at 10:54pm:
I'm actually pondering the best way to detect the CP1251 strings by using two or three characters together and checking against the non-ISO character list. There actually wouldn't be much of a problem at all with CP1251 except that there are overlaps in the character codes so ä gets converted to something else that's not right.

As for most YaBB forums the encoding will most likely be Latin1 or CP1251 as Chinese is converted internally to html entities. Had the guys done that with Cyrillic early on there wouldn't be a problem now.


Taking "ä" as an example, it is very likely that one of the surrounding characters is "ä", "Ä" or low ASCII... that way you can rule out Cyrillic text. Using three characters, and assuming there can't be three consecutive high-ASCII characters in Latin-1 text, the results should be pretty promising... so patterns like:
1. low-ASCII + ä + word-boundary
2. low-ASCII + ä + ä
3. low-ASCII + ä + low-ASCII
4. Ä or ä + ä + low-ASCII
5. Ä or ä + low-ASCII + ä

low-ASCII here means a-z and A-Z. word-boundary means any low-ASCII character other than a-z or A-Z, or any valid high-ASCII character that is not a letter (with or without diacritical marks), meaning character codes 32-64, 91-96, 123-126, 160-191, 215 or 247.

Care should be taken not to mistake ISO-8859-15 (Latin-9, Windows code page 28605) text for invalid Latin-1 text, because the two are often used interchangeably. ISO-8859-15 also uses character codes 166, 168, 180, 184 and 188-190 for letters.
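The five patterns can be written as one byte-level regular expression. This is a sketch in Python rather than Perl, with made-up names, and it only covers "ä" and "Ä" as in the example. A real detector would need the same treatment for the other accented letters.

```python
import re

LOW = rb"[A-Za-z]"  # low-ASCII letters
# Codes 32-64, 91-96, 123-126, 160-191, 215 and 247.
BOUNDARY = rb"[\x20-\x40\x5b-\x60\x7b-\x7e\xa0-\xbf\xd7\xf7]"
UMLAUT_SMALL = rb"\xe4"     # code 228
UMLAUT_ANY = rb"[\xc4\xe4]" # codes 196 and 228

LATIN1_TRIGRAMS = re.compile(
    rb"|".join([
        LOW + UMLAUT_SMALL + BOUNDARY,      # pattern 1
        LOW + UMLAUT_SMALL + UMLAUT_SMALL,  # pattern 2
        LOW + UMLAUT_SMALL + LOW,           # pattern 3
        UMLAUT_ANY + UMLAUT_SMALL + LOW,    # pattern 4
        UMLAUT_ANY + LOW + UMLAUT_SMALL,    # pattern 5
    ])
)


def has_latin1_umlaut_pattern(data: bytes) -> bool:
    """True if any three-byte window matches one of the five patterns."""
    return LATIN1_TRIGRAMS.search(data) is not None
```

A match suggests Latin-1 text. The absence of a match says nothing by itself, and a word ending in "ä" at the very end of the data is missed because pattern 1 needs an actual boundary character after it.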
  