Envision, Create, Share

Welcome to HBGames, a leading amateur game development forum and Discord server. All are welcome, and amongst our ranks you will find experts in their field from all aspects of video game design and development.

REXML in RMXP (too short multibyte code string)

Script Here

A w.i.p. REXML implementation for RMXP.
Throws a syntax error in a place where, well, it shouldn't.
Error is on this Regexp call:

Code:

      VALID_XML_CHARS = /^(
           [\x09\x0A\x0D\x20-\x7E]            # ASCII
         | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
         |  \xE0[\xA0-\xBF][\x80-\xBF]        # excluding overlongs
         | [\xE1-\xEC\xEE][\x80-\xBF]{2}      # straight 3-byte
         |  \xEF[\x80-\xBE]{2}                #
         |  \xEF\xBF[\x80-\xBD]               # excluding U+fffe and U+ffff
         |  \xED[\x80-\x9F][\x80-\xBF]        # excluding surrogates
         |  \xF0[\x90-\xBF][\x80-\xBF]{2}     # planes 1-3
         | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
         |  \xF4[\x80-\x8F][\x80-\xBF]{2}     # plane 16
       )*$/x;

Thanks for your help, XML in RM will probably be of use to a lot of scripters.
 

Zeriab


It appears that [] does not support \xC0-\xFF. For example, /[\xF1]/ throws a syntax error. It may just be a matter of having to escape them: /[\\xF1]/ does not throw a syntax error, but I don't know if it checks for the correct character.
Using (\xF1|\xF2|\xF3) instead of [\xF1-\xF3] works but is very tedious for large sets.
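
For large sets, the alternation could be generated instead of typed out. Here is a minimal sketch, assuming a hypothetical helper called byte_alternation (it is not part of the script in this thread) that only builds the pattern source as a string. As a side note, in /[\\xF1]/ the doubled backslash is a literal backslash, so that class matches one of the characters \, x, F or 1 rather than the byte 0xF1.

```ruby
# Hypothetical helper (not from the original script): expand an inclusive
# byte range into an alternation such as (?:\xF1|\xF2|\xF3), so that no
# high byte has to appear inside [].
def byte_alternation(first, last)
  escapes = (first..last).map { |byte| format('\\x%02X', byte) }
  "(?:#{escapes.join('|')})"
end

# Pattern source for component i ("planes 4-15") without [\xF1-\xF3].
planes_4_to_15 = byte_alternation(0xF1, 0xF3) + '[\\x80-\\xBF]{3}'
puts planes_4_to_15 # prints (?:\xF1|\xF2|\xF3)[\x80-\xBF]{3}
```

The resulting string could then be handed to Regexp.new. Whether RMXP's Ruby build accepts it has not been checked here.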

I suggest that you test each component when building large regular expressions.
To make my point clearer look at this:
[rgss]VALID_XML_CHARS = /^(
           a            # ASCII
         | b             # non-overlong 2-byte
         | c        # excluding overlongs
         | d      # straight 3-byte
         | e                #
         | f               # excluding U+fffe and U+ffff
         | g        # excluding surrogates
         | h     # planes 1-3
         | i    # planes 4-15
         | j     # plane 16
       )*$/x
[/rgss]

You can check each of the expressions a-j independently before combining them.
If you had done this, you would have caught the problem much earlier and wouldn't have been confused by the large expression.

I can tell you that it is b, d and i which cause syntax errors.
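
The piece-by-piece check can also be automated. Below is a minimal sketch, assuming a hypothetical helper called failing_components (not part of the script in this thread) that compiles each component separately with Regexp.new and records the ones that raise. The third entry is broken on purpose so that at least one failure shows up on any Ruby build; what component b does depends on the interpreter, so results elsewhere may differ from RMXP's.

```ruby
# Hypothetical helper (not from the original script): compile every
# component on its own and collect the error message of each failure.
def failing_components(components)
  failures = {}
  components.each do |label, source|
    begin
      Regexp.new(source, Regexp::EXTENDED)
    rescue RegexpError => e
      failures[label] = e.message
    end
  end
  failures
end

components = {
  'a (ASCII)'         => '[\\x09\\x0A\\x0D\\x20-\\x7E]',
  'b (2-byte)'        => '[\\xC2-\\xDF][\\x80-\\xBF]',
  'broken on purpose' => '[\\x09' # unclosed class, always an error
}

# Print one line per component that failed to compile.
failing_components(components).each do |label, message|
  puts "#{label}: #{message}"
end
```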

Having an XML parser will definitely be useful :biggrin:

*hugs*
 
XML parsing sure might come in handy... I tried working with it in Ruby, but quickly switched to YAML eventually.
Hm, that being said... when's the YAML parser coming? :D (not that it'd be a hard thing to do... XD ).

As the question of usefulness is in the original post, I feel free to reply to that (it's not actually put as a question, but the main question isn't either, so...).
I dunno if XML makes much sense for a game application. In almost all circumstances, I think YAML would prove better and less complicated to use. Not that XML is complicated, but YAML is a tad simpler really.
The purpose of XML really is publication on multiple outputs that share the same input. I don't see that many applications in which that'd be the case... however I'm interested to hear some I'm not thinking about!
 
I'm going to be using XML for G.R.I.M., Graphical RPG Interface Markup, or more descriptively, my replacement interface for RM. I already have the code side done to support an HTML-esque markup complete with select objects, windows, backgrounds, etc. I need XML because that's where the markup for this interface engine comes in; without that, it's just another interface script. The XML format I am using should be comfortable enough for anyone familiar with HTML to customize their entire GUI from title screen through battle. It supports mouse + keyboard, and the select objects feature an "onSelect=this" XML argument in which "this" is a command to be eval'ed against the current scene. But that'll have its own thread shortly. Just saying, XML is useful at least in this example of what Zeriab mentioned, "customization".

Speaking of which, holy crap! Zeriab! Didn't know you were back! (Shows how little I pay attention.) *hugs*
 
I have returned! Figured out that was an EXTENDED regexp call, and thanks to glitchfinder was able to get it to at least try to evaluate it. :cheers:

However, bad news :eek:uch: (or more descriptively, class challenge time! :thumb: )

[Screenshot: 2826regexp-error.png]


The bad part is that this is, as best the interweb can tell, a bug [we need a bug smiley] in, guess what, the build of Ruby that RM uses! Yay! So I figure the alternative is to write our own comparable Regexp query. I'm horrible at Regexp. Well, not horrible, but I can't quite understand this one. I get the hex escapes (\x) and the ranges ([]'s), but I don't understand what a range such as [\x15\x1A\x1C-\x1F] does. It's not using |, and it's only checking one character at a time, so it can't really be looking for a string...

And so, I humbly beseech any remaining gods of old scripteria to hear my humble plea and come forth! We need your help! :angel:
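
On the question about [\x15\x1A\x1C-\x1F]: a bracket expression matches exactly one character, and that character may be any member of the listed set. Here the set is 0x15, 0x1A, and everything from 0x1C through 0x1F, so it behaves like a | between single characters. A small sketch to illustrate; the constant and helper names are made up for this example:

```ruby
# The class from the question: matches ONE character that is 0x15, 0x1A,
# or anything in the range 0x1C..0x1F.
CONTROL_SUBSET = /[\x15\x1A\x1C-\x1F]/

# The same set spelled out with | (equivalent, just longer).
CONTROL_SUBSET_ALT = /\x15|\x1A|\x1C|\x1D|\x1E|\x1F/

# Hypothetical helper: does the character with this code match the pattern?
def in_subset?(pattern, code)
  !(pattern =~ code.chr).nil?
end

# 0x15 is listed, 0x1B is not, 0x1D falls inside the 0x1C-0x1F range.
[0x15, 0x1B, 0x1D].each do |code|
  puts format('0x%02X: %s', code, in_subset?(CONTROL_SUBSET, code))
end
# prints:
# 0x15: true
# 0x1B: false
# 0x1D: true
```

To match a longer string, the class is repeated with a quantifier such as * or {2}, which is what the ( ... )* wrapper around the components of VALID_XML_CHARS does.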
 
First of all I wanna ask you if you're still using Unicode escape sequences to get unusual characters for those Regexps. If not, they won't work, obviously. Just check they're not meant for ASCII encoding, because UTF-8 is the default encoding in RM.

If that REXML script is run outside RM, you can change every string encoding like you'd do it in Ruby 1.9.x.
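
For the outside-RM case, a rough sketch of what changing the encoding might look like on Ruby 1.9 or later. This is only an illustration under that assumption: String#force_encoding does not exist in the Ruby build that RMXP ships, and the variable and constant names are invented for the example. It re-tags a UTF-8 string as raw bytes and matches it against component b written as a byte-oriented pattern with the n flag.

```ruby
# Ruby 1.9+ only (assumption: run outside RMXP, whose Ruby has no
# String#force_encoding).
text  = "h\u00E9llo"                                 # UTF-8, 5 characters
bytes = text.dup.force_encoding(Encoding::BINARY)    # same data seen as 6 bytes

# Component b ("non-overlong 2-byte") as a byte-oriented pattern via /n.
TWO_BYTE = /[\xC2-\xDF][\x80-\xBF]/n

puts text.length                    # 5
puts bytes.length                   # 6
puts((bytes =~ TWO_BYTE).inspect)   # 1, the offset of the 2-byte sequence
```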
 
