From 2e3eb892d1bf040493b52cf5f78f86e810e71eff Mon Sep 17 00:00:00 2001
From: Jordan Harband
Date: Thu, 21 Sep 2023 16:02:23 -0700
Subject: [PATCH] [spec] first attempt at a spec rewrite

---
 .github/workflows/build.yml  | 12 ++++++++
 .github/workflows/deploy.yml | 20 +++++++++++++
 .gitignore                   | 46 ++++++++++++++++++++++++++++++
 .npmrc                       |  1 +
 package.json                 | 23 +++++++++++++++
 spec-ecmarkup.html           | 32 ---------------------
 spec.emu                     | 54 ++++++++++++++++++++++++++++++++++++
 7 files changed, 156 insertions(+), 32 deletions(-)
 create mode 100644 .github/workflows/build.yml
 create mode 100644 .github/workflows/deploy.yml
 create mode 100644 .gitignore
 create mode 100644 .npmrc
 create mode 100644 package.json
 delete mode 100644 spec-ecmarkup.html
 create mode 100644 spec.emu

diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
new file mode 100644
index 0000000..ec60271
--- /dev/null
+++ b/.github/workflows/build.yml
@@ -0,0 +1,12 @@
+name: Build spec
+
+on: [pull_request, push]
+
+jobs:
+  build:
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ljharb/actions/node/install@main
+      - run: npm run build
diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
new file mode 100644
index 0000000..0bb1859
--- /dev/null
+++ b/.github/workflows/deploy.yml
@@ -0,0 +1,20 @@
+name: Deploy gh-pages
+
+on:
+  push:
+    branches:
+      - main
+
+jobs:
+  deploy:
+    runs-on: ubuntu-latest
+
+    steps:
+      - uses: actions/checkout@v4
+      - uses: ljharb/actions/node/install@main
+      - run: npm run build
+      - uses: JamesIves/github-pages-deploy-action@v4.3.3
+        with:
+          branch: gh-pages
+          folder: build
+          clean: true
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..d3cae16
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,46 @@
+# Logs
+logs
+*.log
+npm-debug.log*
+
+# Runtime data
+pids
+*.pid
+*.seed
+
+# Directory for instrumented libs generated by jscoverage/JSCover
+lib-cov
+
+# Coverage directory used by tools like istanbul
+coverage
+
+# nyc test coverage
+.nyc_output
+
+# Grunt intermediate storage (http://gruntjs.com/creating-plugins#storing-task-files)
+.grunt
+
+# node-waf configuration
+.lock-wscript
+
+# Compiled binary addons (http://nodejs.org/api/addons.html)
+build/Release
+
+# Dependency directories
+node_modules
+jspm_packages
+
+# Optional npm cache directory
+.npm
+
+# Optional REPL history
+.node_repl_history
+
+# Only apps should have lockfiles
+yarn.lock
+package-lock.json
+npm-shrinkwrap.json
+pnpm-lock.yaml
+
+# Build directory
+build
diff --git a/.npmrc b/.npmrc
new file mode 100644
index 0000000..43c97e7
--- /dev/null
+++ b/.npmrc
@@ -0,0 +1 @@
+package-lock=false
diff --git a/package.json b/package.json
new file mode 100644
index 0000000..23a25ba
--- /dev/null
+++ b/package.json
@@ -0,0 +1,23 @@
+{
+  "private": true,
+  "name": "proposal-regex-escaping",
+  "description": "Proposal for investigating RegExp escaping for the ECMAScript standard",
+  "scripts": {
+    "start": "npm run build-loose -- --watch",
+    "build": "npm run build-loose -- --strict",
+    "build-loose": "node -e 'fs.mkdirSync(\"build\", { recursive: true })' && ecmarkup --load-biblio @tc39/ecma262-biblio --verbose spec.emu build/index.html --lint-spec"
+  },
+  "homepage": "https://github.com/tc39/proposal-regex-escaping#readme",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/tc39/proposal-regex-escaping.git"
+  },
+  "license": "MIT",
+  "devDependencies": {
+    "@tc39/ecma262-biblio": "^2.1.2632",
+    "ecmarkup": "^17.1.1"
+  },
+  "engines": {
+    "node": ">= 12"
+  }
+}
diff --git a/spec-ecmarkup.html b/spec-ecmarkup.html
deleted file mode 100644
index 18494f8..0000000
--- a/spec-ecmarkup.html
+++ /dev/null
@@ -1,32 +0,0 @@
-
-
-RegExp.escape
-
-
-
-

RegExp.escape ( S )

- -

`escape` takes a string S and returns a version of it with all control characters escaped. It escapes characters that would otherwise be treated by the regular expressions engine as special meta characters such as `^ $ \ . * + ? ( ) [ ] { } |` using an escape sequence character.

- - -

When the `escape` function is called with an argument _S_ the following steps are taken:

-
-
- 1. Let _str_ be ToString(_S_).
- 1. ReturnIfAbrupt(_str_).
- 1. Let _cpList_ be a List containing in order the code points as defined in 6.1.4 of _str_, starting at the first element of _str_.
- 1. Let _cuList_ be a new List
- 1. For each code point _c_ in _cpList_ in List order, do:
-   1. If _c_ is a SyntaxCharacter then do:
-     1. Append code unit 0x005C (REVERSE SOLIDUS) to cuList.
-   1. Append the elements of the UTF16Encoding (10.1.1) of c to cuList.
- 1. Let _L_ be a String whose elements are, in order, the elements of _cuList_.
- 1. Return _L_.
-
-
-

The `length` property of the `escape` method is *1*.

- - -

`escape` takes a string and escapes it so it can be literally represented as a pattern. In contrast EscapeRegExpPattern (as the name implies) takes a pattern and escapes it so that it can be represented as a string. While the two are related they do not share the same character escape set or perform similar actions.

-
-
diff --git a/spec.emu b/spec.emu new file mode 100644 index 0000000..50298f4 --- /dev/null +++ b/spec.emu @@ -0,0 +1,54 @@ + + + + + +
+title: RegExp.escape
+stage: 1
+contributors: Jordan Harband
+
+ + +

Text Processing

+ + +

RegExp (Regular Expression) Objects

+ + +

Properties of the RegExp Constructor

+ + + +

RegExp.escape ( _S_ )

+

This method takes a string and returns a version of it with all control characters escaped. It escapes characters that would otherwise be treated by the regular expressions engine as special meta characters using an escape sequence character.

+

It performs the following steps when called:

+

+ The phrase "the ASCII punctuators that need escaping"
+ denotes the following String value, which consists of every ASCII punctuator except U+005F (LOW LINE):
+ *"(){}[]|,.?\*+-^$=<>\/#&!%:;@~'"`"*.
+

+
+
+ 1. Let _str_ be ? ToString(_S_).
+ 1. Let _cpList_ be a List containing in order the code points as defined in 6.1.4 of _str_, starting at the first element of _str_.
+ 1. Let _toEscape_ be a CharSet containing every character in the ASCII punctuators that need escaping.
+ 1. Let _cuList_ be a new empty List.
+ 1. For each code point _c_ in _cpList_, do
+   1. If _c_ is the first code point in _cpList_ and _c_ is a DecimalDigit, then
+     1. Append code unit 0x005C (REVERSE SOLIDUS) to cuList.
+     1. Append code unit 0x0078 (LATIN SMALL LETTER X) to cuList.
+   1. Else if _c_ is a CharSetElement of _toEscape_ or is WhiteSpace, then
+     1. Append code unit 0x005C (REVERSE SOLIDUS) to cuList.
+   1. Append the elements of the UTF16Encoding (10.1.1) of c to cuList.
+ 1. Return CodePointsToString(_cuList_).
+
+
+

`escape` takes a string and escapes it so it can be literally represented as a pattern. In contrast EscapeRegExpPattern (as the name implies) takes a pattern and escapes it so that it can be represented as a string. While the two are related, they do not share the same character escape set or perform similar actions.

+
+
+
+
+
+
\ No newline at end of file
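
Reviewer's sketch, not part of the patch: the algorithm steps added in `spec.emu` can be approximated in plain JavaScript roughly as follows. Several things here are assumptions. The names `escapeSketch`, `PUNCTUATORS_TO_ESCAPE`, `isWhiteSpace`, and `isDecimalDigit` are hypothetical. The nesting of the steps is inferred from their wording. In the punctuator string, `\*` is read as an ecmarkdown escape for `*`, and `\/` as a literal backslash followed by `/`, which matches "every ASCII punctuator except U+005F". WhiteSpace is approximated by the code points of the ECMAScript WhiteSpace production (TAB, VT, FF, U+FEFF, and general category Zs).

```javascript
// Hypothetical helper names throughout; this mirrors the spec steps loosely
// and is not the proposal's reference implementation.

// The 31 ASCII punctuators named in the patch (everything except "_").
const PUNCTUATORS_TO_ESCAPE = new Set("(){}[]|,.?*+-^$=<>\\/#&!%:;@~'\"`");

// Approximation of the WhiteSpace production: TAB, VT, FF, U+FEFF, and Zs.
const isWhiteSpace = (cp) => /^[\t\v\f\uFEFF\p{Zs}]$/u.test(cp);

// DecimalDigit: one of the ASCII digits 0 through 9.
const isDecimalDigit = (cp) => /^[0-9]$/.test(cp);

function escapeSketch(S) {
  // A template literal performs ToString, so a Symbol argument throws.
  const str = `${S}`;
  // Array.from iterates a string by code point, not by code unit.
  const cpList = Array.from(str);
  let out = "";
  cpList.forEach((c, index) => {
    if (index === 0 && isDecimalDigit(c)) {
      out += "\\x"; // REVERSE SOLIDUS, then LATIN SMALL LETTER X
    } else if (PUNCTUATORS_TO_ESCAPE.has(c) || isWhiteSpace(c)) {
      out += "\\"; // REVERSE SOLIDUS
    }
    out += c; // the code point itself is always appended
  });
  return out;
}

// Usage: the escaped text matches its own source literally.
const pattern = new RegExp(escapeSketch("a.b*c"));
console.log(pattern.test("a.b*c")); // true
console.log(pattern.test("aXbbc")); // false
```

Read literally, the first branch emits `\x` followed by the digit itself, so `escapeSketch("1a")` returns `\x1a`. The sketch mirrors that behaviour rather than substituting a different encoding.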