Alexander Morou
Greetings,
Just curious if anyone here is interested in the silly research I've been doing lately.
For the past year or so, I've been struggling on and off with creating a very basic lexical-analysis state-machine generator. A while back I came up with something I thought was close: a radix tree to represent non-flexible state systems (such as keywords and operators, which don't repeat and have no optional elements).
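As a rough sketch of what I mean (not the generator's actual code), the fixed-keyword case can be handled with a prefix tree, where each character is one state transition and a node marks where a keyword ends:

```python
# Minimal prefix-tree sketch for the "non-flexible" case: a fixed set of
# keywords, no repetition, no optional elements.

class RadixNode:
    def __init__(self):
        self.children = {}    # char -> RadixNode (one outgoing edge per char)
        self.terminal = None  # keyword ending at this node, if any

def build_radix(keywords):
    root = RadixNode()
    for kw in keywords:
        node = root
        for ch in kw:
            node = node.children.setdefault(ch, RadixNode())
        node.terminal = kw
    return root

def match(root, text):
    """Longest keyword that is a prefix of `text`, or None."""
    node, best = root, None
    for ch in text:
        node = node.children.get(ch)
        if node is None:
            break
        if node.terminal:
            best = node.terminal
    return best
```

Walking the tree character by character is exactly a deterministic state machine for the keyword set; it just can't express Kleene operators or optional parts.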
The radix tree approach is far simpler than the final result. State-to-state transitions, by themselves, aren't very difficult; the issue I kept running into was properly expressing the problem, and I think the reason I failed so many times was that I didn't fully understand it. What I came up with is fairly straightforward. I start with each expression group and unify the expressions with one another, at first ignoring the overlaps; on each expression I concatenate each element, bearing in mind the Kleene operators and flagging or clearing the edges as necessary based upon the next element. Here's where I usually went wrong: I kept assuming it would just work, but at this point I have multiple expressions that potentially overlap. I realized this and tried, multiple times, to write a unification algorithm, but I kept making the mistake of trying to do it all in the same state.
The issue lay in the fact that to do it properly and efficiently, I needed a clean slate: a completely new instance of the state with no transitions to foul things up. It was only later that I found the working solution, and later still that I realized it's the standard concept of NFA-to-DFA conversion.
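That "clean slate" is exactly what subset construction gives you: each DFA state is a brand-new object, a set of NFA states, with no pre-existing transitions. A minimal sketch, assuming an NFA given as nested dicts (and no epsilon moves, for brevity):

```python
# Subset construction: each DFA state is a frozenset of NFA states,
# created fresh, so overlapping expressions can never foul each other up.

from collections import deque

def nfa_to_dfa(start, transitions, accepting):
    """transitions: dict[state][symbol] -> set of NFA states."""
    start_set = frozenset([start])
    dfa = {}                 # frozenset -> {symbol: frozenset}
    dfa_accepting = set()
    work = deque([start_set])
    while work:
        current = work.popleft()
        if current in dfa:
            continue
        if current & accepting:
            dfa_accepting.add(current)
        moves = {}
        for state in current:
            for sym, targets in transitions.get(state, {}).items():
                moves.setdefault(sym, set()).update(targets)
        dfa[current] = {sym: frozenset(t) for sym, t in moves.items()}
        work.extend(dfa[current].values())
    return start_set, dfa, dfa_accepting
```

For instance, an NFA for `a | ab` sends `a` to two NFA states at once; the DFA simply makes that pair one new state and carries on deterministically.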
So then I thought, after finally getting it, that I would take it further. I noticed a lot of redundancy in the generated code, so I knew pretty well how to handle the simplification process: any two states with equal sets of transitions are equivalent to one another. Therefore, for every point of duplication, remove the duplicate, keep the 'master' (first) occurrence, and update the state-to-state transitions to point at the master state.
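A sketch of that merge rule (assumed DFA shape, not the generator's own data structures): give each state a signature made of its accept status and its sorted transition table, collapse states with equal signatures into the first one seen, redirect incoming edges, and repeat until nothing changes (since a merge can make two predecessors newly identical):

```python
# Duplicate-state merge: states with identical transition tables (and
# accept status) collapse into the first "master" occurrence; incoming
# edges are redirected. Assumed DFA shape: dict[state] -> {symbol: state}.

def merge_duplicates(dfa, accepting):
    changed = True
    while changed:                      # repeat to a fixed point
        changed = False
        seen = {}                       # signature -> master state
        remap = {}                      # duplicate -> master
        for state in sorted(dfa):
            sig = (state in accepting, tuple(sorted(dfa[state].items())))
            if sig in seen:
                remap[state] = seen[sig]
            else:
                seen[sig] = state
        if remap:
            changed = True
            for state in remap:
                del dfa[state]
            for moves in dfa.values():
                for sym, tgt in moves.items():
                    moves[sym] = remap.get(tgt, tgt)
            accepting -= set(remap)
    return dfa, accepting
```

The fixed-point loop is what makes shared suffixes fold up: once two ending states merge, the states feeding them become identical and merge on the next pass, and so on back up the chain.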
I thought further and realized a second rule, using an example to simplify: C (B | AB).
For every terminal edge, match those which have equal incoming transitions, and replace them as necessary using the above method. This basically means that the two strings in the set above, CB and CAB, end in the same state.
The second rule of reduction cut complex state sets from roughly 460 states to 230, half the original working set. The only issue is that I now need a sub-state value representing the point of origin to maintain the determinism of the result (i.e., if you're parsing a keyword, which keyword was evaluated, because 'ascending' and 'descending' end on the same state and share the 8 states in 'scending').
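A deliberately tiny illustration of that sub-state idea (hypothetical, not the generator's tables): once the 8 states for "scending" are shared, the final state alone can't name the keyword, so the walker records the origin where the two paths diverge and carries it into the shared chain:

```python
# 'ascending' and 'descending' funnel into one shared suffix chain, so a
# token id captured at the branch point disambiguates the accept state.

def walk(text):
    if text[:1] == "a":
        origin, i = "ascending", 1    # sub-state value: point of origin
    elif text[:2] == "de":
        origin, i = "descending", 2
    else:
        return None
    # one shared state per character of "scending", reused by both paths
    return origin if text[i:] == "scending" else None
```

In table form this would be one extra register set on the diverging transitions, rather than duplicated suffix states per keyword.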
So the question: does anyone here have any idea how to reduce further?
The thing already reduces the machines further than I could possibly do (efficiently) by hand.
If anyone's interested in the code that it produces, PM me or post a reply. This project will be used to bootstrap the next version of OILexer (Objectified Intermediate Language, Lexical Analysis Generator).