This repository was archived by the owner on Dec 15, 2022. It is now read-only.

Look-behind issue with case-insensitive match #105

@vpetrovykh

I ran into a weird look-behind error in Atom 1.22.0 while trying to create a grammar file. After some tinkering, I reduced the offending grammar to a fairly minimal CSON file:

name: "FooGrammar"
scopeName: "source.foo"
fileTypes: [
  "foo"
]
uuid: "708acdf0-3389-41cd-80f5-44b654eee848"
patterns: [
  {
    include: "#test"
  }
]
repository:
  test:
    begin: "(?i)(?<=aff)z"
    end: "end"
    contentName: "meta.foo"

This produces the following error:

Uncaught Error: invalid pattern in look-behind /usr/lib64/atom/app.asar/node_modules/first-mate/lib/scanner.js:31 
    at Scanner.module.exports.Scanner.createScanner (/usr/lib64/atom/app.asar/node_modules/first-mate/lib/scanner.js:31)
    at Scanner.module.exports.Scanner.getScanner (/usr/lib64/atom/app.asar/node_modules/first-mate/lib/scanner.js:37)
    at Scanner.module.exports.Scanner.findNextMatch (/usr/lib64/atom/app.asar/node_modules/first-mate/lib/scanner.js:56)
    at Rule.module.exports.Rule.findNextMatch (/usr/lib64/atom/app.asar/node_modules/first-mate/lib/rule.js:98)
    at Rule.module.exports.Rule.getNextTags (/usr/lib64/atom/app.asar/node_modules/first-mate/lib/rule.js:154)
    at Grammar.module.exports.Grammar.tokenizeLine (/usr/lib64/atom/app.asar/node_modules/first-mate/lib/grammar.js:152)
    at TokenizedBuffer.buildTokenizedLineForRowWithText (/usr/lib64/atom/app.asar/src/tokenized-buffer.js:506)
    at TokenizedBuffer.buildTokenizedLineForRow (/usr/lib64/atom/app.asar/src/tokenized-buffer.js:501)
    at TokenizedBuffer.tokenizeNextChunk (/usr/lib64/atom/app.asar/src/tokenized-buffer.js:389)
    at _.defer (/usr/lib64/atom/app.asar/src/tokenized-buffer.js:373)
    at /usr/lib64/atom/app.asar/node_modules/underscore/underscore.js:666

As best I can tell, the error is triggered when ff or fi appears inside the look-behind, but only when the pattern is also case-insensitive. Here are some variations that produce the same error for me:

begin: "(?i)(?<=afi)z"
begin: "(?i)(?<=fi|wq)z"

This may be because ff and fi both exist as single-character Unicode ligatures (ﬀ, U+FB00, and ﬁ, U+FB01). The error occurs regardless of whether the file being tokenized actually contains text that matches the pattern.
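
The ligature guess can be checked outside Atom. Below is a minimal Python sketch, not first-mate's code, that assumes the regex engine applies Unicode full case folding under (?i). Under that assumption a two-letter sequence such as ff could also match the single ligature character, so the look-behind would no longer have one fixed length. The helper name possible_lengths and the FOLDINGS table are hypothetical and exist only for this illustration; only the two ligatures from this report are covered.

```python
# Ligatures from this report and what they case-fold to.
# str.casefold() implements Unicode full case folding.
FOLDINGS = {
    "\ufb00".casefold(): "\ufb00",  # "ff" <- LATIN SMALL LIGATURE FF
    "\ufb01".casefold(): "\ufb01",  # "fi" <- LATIN SMALL LIGATURE FI
}


def possible_lengths(literal):
    """Return the set of input lengths that could match `literal`
    case-insensitively, if a ligature may stand in for its two-letter
    folding. Hypothetical helper, for illustration only."""
    folded = literal.casefold()
    # lengths[i] holds the possible input lengths for folded[:i].
    lengths = [{0}]
    for i in range(1, len(folded) + 1):
        # Consume one ordinary character.
        current = {n + 1 for n in lengths[i - 1]}
        # Or consume one ligature that covers the last two letters.
        if i >= 2 and folded[i - 2:i] in FOLDINGS:
            current |= {n + 1 for n in lengths[i - 2]}
        lengths.append(current)
    return lengths[-1]


if __name__ == "__main__":
    for text in ("aff", "afi", "wq"):
        print(text, sorted(possible_lengths(text)))
```

If this assumption holds, aff and afi each come out with two possible lengths (2 and 3), while wq has only one, which would be consistent with the patterns that fail above. It does not confirm what the engine actually does internally.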
