Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions about ambiguity warnings, and working with ambiguous alternations #3

Open
matthew-dean opened this issue Aug 20, 2023 · 6 comments

Comments

@matthew-dean
Copy link

Correct me if I'm wrong, but I would think that this algorithm would be tolerant with ambiguities in OR expressions. To be fair, this lookahead strategy eliminates most ambiguity warnings, but sometimes, some persist, and IGNORE_AMBIGUITIES has no effect. Is there a workaround?

@matthew-dean
Copy link
Author

matthew-dean commented Aug 20, 2023

I had a look through the code, and AFAICT, IGNORE_AMBIGUITIES is never checked before building an ambiguity error? I think this is subverting Chevrotain's default behavior, which is that even if two alternatives match entirely (or their lookaheads do), the first one can be chosen and the warning is suppressed.

@msujew
Copy link
Member

msujew commented Aug 30, 2023

You're right, we currently don't check the flag. Note that an ambiguity in this package hints at a very serious grammar flaw and should not be disregarded. Compared to the LL(k) lookahead, an ambiguity in ALL(*) leads to very serious performance degradation, as the algorithm doesn't stop until it is absolutely sure it's ambiguous. Likely what you're experiencing in #5.

You can override the print behavior anyway with the constructor options of the strategy.

@matthew-dean
Copy link
Author

@msujew You're right. I get degraded performance when ambiguities exist. I assume that gates are applied before checking for ambiguities?

@msujew
Copy link
Member

msujew commented Sep 6, 2023

I assume that gates are applied before checking for ambiguities?

Yes, similar to the default LL(k) behavior in Chevrotain, gates are checked beforehand. In ANTLR4, gates are checked on demand, but that also leads to performance issues in ambiguous grammars - just different ones. While chevrotain-allstar struggles if the gates are expensive, ANTLR4's strategy struggles in case of long prefixes (that could be resolved by performing the gate check beforehand).

An adaptive gate evaluation mode (i.e. try lazy until encountering a long prefix - then switch to eager evaluation) would be quite interesting to investigate. Unfortunately nothing I have time for in the foreseeable future :(

@matthew-dean
Copy link
Author

matthew-dean commented Oct 11, 2023

@msujew Is there a way to tell chevrotain-allstar to ignore a certain path when testing for ambiguity?

Note that an ambiguity in this package hints at a very serious grammar flaw and should not be disregarded.

I don't necessarily disagree, but CSS's grammar is defined with ambiguities, and CSS is not a niche grammar.

To explain: in CSS, there are a number of places in the definition where it specifies a specific token pattern, but it has a "fallback" definition. For instance, for a media query, it allows a parenthetical like (min-width: 640px), or a "range" like (width > 640px) OR what it calls "general enclosed" which is ([any sequence of tokens]). Obviously, any sequence is inherently ambiguous and can match the other two patterns. CSS is intentionally ambiguous in lots of places, because the intention is that it's up to a user agent to interpret what that sequence of tokens may mean, if anything. So, there are increasingly places in CSS grammar where it basically says to "try and restart". I know Chevrotain by default doesn't handle this very well, but I'm wondering if this package may help.

So, in my case, I want it to determine if it's a declaration or a range, but only if it fully matches. Otherwise, it should just parse as a general sequence of tokens. Can you think about the best way to approach this?

@matthew-dean
Copy link
Author

I tried something like the following, because I naively thought that perhaps if I only passed on alternation into the lookahead function at a time, that would be the only one "tested".

orInternal<T>(
    altsOrOpts: Array<IOrAlt<any>> | OrMethodOpts<unknown>,
    occurrence: number
  ): T {
    let alts: Array<IOrAlt<any>>
    let laKey = this.getKeyForAutomaticLookahead(OR_IDX, occurrence)
    let optionalAlt = false

    if (isArray(altsOrOpts)) {
      alts = altsOrOpts
    } else {
      alts = altsOrOpts.DEF
      if (altsOrOpts.IGNORE_AMBIGUITIES) {
        optionalAlt = true
      }
    }

    let laFunc = this.getLaFuncFromCache(laKey)
    let altIdxToTake: any
    if (optionalAlt) {
      let altLength = alts.length
      for (let i = 0; i < altLength; i++) {
        let alt = alts[i]!
        /** Try a single alt at a time */
        let pass = laFunc.call(this, [alt])
        if (pass !== undefined) {
          altIdxToTake = i
          break
        }
      }
    } else {
      altIdxToTake = laFunc.call(this, alts)
    }
    if (altIdxToTake !== undefined) {
      const chosenAlternative: any = alts[altIdxToTake]
      return chosenAlternative.ALT.call(this)
    }
    this.raiseNoAltException(
      occurrence,
      (altsOrOpts as OrMethodOpts<unknown>).ERR_MSG
    )
  }

Unfortunately, I discovered that the lookahead functions are pre-cached for each alternation set, so this didn't work. 🤔

@matthew-dean matthew-dean changed the title Should I be getting ambiguity warnings? Questions about ambiguity warnings, and working with ambiguous alternations Oct 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants