Allow for custom regex, and clarifying the usage of anyOf() #237

Open
xRSquared opened this issue Mar 3, 2023 · 7 comments

Comments

@xRSquared
Contributor

@didavid61202 @danielroe, as I was working on issue #7, two points of clarification for the API came to mind.

Allowing for custom regex patterns

Unless there is a function that I don't know of, a user can't add a custom regex pattern to an expression unless it is already exported as one of the helpers for specific RegExp characters, such as digit, whitespace, or letter. For example, with the current API there is no way to include the pattern [1-9]: it has to be passed to exactly(), which escapes it to \[1-9\].

import { exactly } from "magic-regexp";

const test = exactly("foo").and("[1-9]"); // foo\[1-9\]

Note: the regex pattern was passed to exactly() here

When working within the package, we can create these arbitrary regex patterns using createInput(); however, this function isn't exported to end users.

Possible Solutions

  1. export an alias for createInput under one of the following names: input, rawInput, regex, or some other suggestion
  2. leave as is and don't allow users to use custom regex patterns (I don't think this is an end goal of the package)

Providing an alias to createInput would allow for patterns such as:

import { exactly, input } from "magic-regexp";

const test = exactly("foo").and(input("[1-9]")); // foo[1-9]
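
To make the difference concrete, here is a minimal self-contained sketch that models the two behaviours on plain strings. It does not use magic-regexp itself; `escapeLiteral`, `exactlySketch`, and `rawInputSketch` are hypothetical names chosen for illustration, and the real createInput may work differently.

```typescript
// Hypothetical stand-ins, not magic-regexp's real implementation.
// Escapes every character that has a special meaning in a RegExp source.
const escapeLiteral = (text: string): string =>
  text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// Models exactly(): plain strings are treated as literals and escaped.
const exactlySketch = (text: string): string => escapeLiteral(text);

// Models the proposed alias for createInput: the pattern is used verbatim.
const rawInputSketch = (pattern: string): string => pattern;

const escaped = exactlySketch("foo") + exactlySketch("[1-9]");
const raw = exactlySketch("foo") + rawInputSketch("[1-9]");

console.log(escaped); // foo\[1-9\]
console.log(raw); // foo[1-9]
console.log(new RegExp(raw).test("foo7")); // true
console.log(new RegExp(escaped).test("foo7")); // false
```

The escaped version only matches the literal text "foo[1-9]", which is why a raw input helper (or a dedicated character set API) is needed to express the range.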

anyOf

@didavid61202 @danielroe, the documentation for anyOf states that it takes an array of inputs, but it doesn't really take an array; it takes an arbitrary number of arguments. This can lead to confusion, and in fact, it confused me when I first started using this package.

Consider the following examples:

import { anyOf } from "magic-regexp";

const test1 = anyOf(...["a", "b", "c"]); //(?:a|b|c)
const test2 = anyOf("a", "b", "c"); //(?:a|b|c)
const test3 = anyOf(["a", "b", "c"]); //(?:a,b,c)
const test4 = anyOf("abc"); //(?:abc)

Possible Solutions

  1. leave as is, and change the documentation
  2. accept arrays, and do the array unpacking within the function
  3. overload the function to allow for the passing of arrays
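
As a rough illustration of solution 3, the sketch below overloads a function so that it accepts either one array or any number of arguments. It works on plain pattern strings only and is not the library's code; `anyOfSketch` is a hypothetical name, and the real anyOf also handles escaping and typed inputs.

```typescript
// Hypothetical sketch of an overloaded anyOf; operates on plain pattern strings.
function anyOfSketch(inputs: string[]): string;
function anyOfSketch(...inputs: string[]): string;
function anyOfSketch(...args: Array<string | string[]>): string {
  // Unpack any array argument; keep plain string arguments as they are.
  const inputs: string[] = [];
  for (const arg of args) {
    if (Array.isArray(arg)) inputs.push(...arg);
    else inputs.push(arg);
  }
  return `(?:${inputs.join("|")})`;
}

console.log(anyOfSketch(...["a", "b", "c"])); // (?:a|b|c)
console.log(anyOfSketch("a", "b", "c")); // (?:a|b|c)
console.log(anyOfSketch(["a", "b", "c"])); // (?:a|b|c)
console.log(anyOfSketch("abc")); // (?:abc)
```

With this shape the array call no longer produces (?:a,b,c), and existing variadic calls keep working unchanged.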
@didavid61202
Collaborator

didavid61202 commented Mar 3, 2023

Nice suggestions!
I think we could support a helper function that inserts a custom raw RegExp pattern as an escape hatch for special cases. As for naming, I think maybe rawRegExp would be less confusing? Let's discuss further.
It would also be awesome to support both an array and an arbitrary number of arguments for anyOf. 👍

I'm also thinking about simplifying a series of .and(...).and(...) calls into an array wrapped with a new helper. What do you think?

@xRSquared
Contributor Author

rawRegExp

I like rawRegExp; let's go with that unless @danielroe has a better suggestion. I will note that I don't view it as an escape hatch for special cases. I think it should be a standard function that users regularly use, at least, as long as there isn't an API to create custom character sets.

anyOf

I think we can go one of two ways:

  1. function overloading anyOf
  2. create a function anyOfArray

I'm indifferent to either choice, so we can go with whichever @didavid61202 chooses.

array wrapper for .and

This was actually on my list of things to suggest. Creating a function similar to anyOf for .and(...).and(...) should improve readability substantially.
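
A small sketch of the readability difference, modelled on plain pattern strings rather than on magic-regexp's own types. Both `chain` and `sequenceSketch` are hypothetical names used only for this illustration.

```typescript
// Minimal model of a chainable input that only supports .and().
interface Chain {
  source: string;
  and(next: string): Chain;
}

const chain = (source: string): Chain => ({
  source,
  and: (next) => chain(source + next),
});

// Hypothetical wrapper: joins already-built pattern fragments in order.
const sequenceSketch = (...parts: string[]): string => parts.join("");

// The same date-like pattern, written both ways.
const chained = chain("\\d{4}").and("-").and("\\d{2}").and("-").and("\\d{2}").source;
const wrapped = sequenceSketch("\\d{4}", "-", "\\d{2}", "-", "\\d{2}");

console.log(chained === wrapped); // true
console.log(new RegExp(`^${wrapped}$`).test("2023-03-03")); // true
```

The wrapped form lists the fragments in one call, which is the readability gain a helper similar to anyOf would bring.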

Side note: in the Python package I'm working on, I used operator overloading of + to improve the readability of adding patterns. Sadly, there is no operator overloading in TypeScript/JavaScript.

@danielroe
Member

The main issue with custom regexp is that it bypasses the type safety of this library, which is why I've hitherto intentionally made it difficult to add custom chunks of regexp.

Instead, I think it would make sense to focus on providing any missing pieces (e.g. the API to create custom character sets you mention). Ranges (meant to be implemented in #162) and greedy/non-greedy globs are two other pieces to add in, and I'd be up for thinking about more also.

@xRSquared
Contributor Author

xRSquared commented Mar 5, 2023

Agreed, now that I think about it, the primary use cases that I envisioned for the escape hatch were for character sets and the lazy quantifier. It is probably best to natively implement that functionality and not implement the escape hatch.

Edit(04/01/2023): I removed the requirements to close this Issue and explained them below in a separate comment.

@didavid61202
Collaborator

didavid61202 commented Apr 1, 2023

Hi @xRSquared, @danielroe
I've created a PR (#284) proposing an update to improve the readability of chaining multiple .and(...).and(...) by updating all input helpers to variadic functions.

Suggestions or other input are welcome! If everything looks good, maybe we can update or add some examples in the docs?

@xRSquared
Contributor Author

xRSquared commented Apr 2, 2023

List of features needed to close this Issue:

@wvffle

wvffle commented Jun 25, 2023

Hi. How far are you from implementing the custom character sets? Have you considered the API for them and for negative character sets as well?

I find myself using regexps like href="[^"]+" quite often and as much as I'd like to use magic-regexp everywhere in the projects I'm working on, I'm sometimes forced to lean towards the regular regexps.
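
For discussion, here is one possible shape such an API could take, sketched on plain strings. Everything in it is an assumption: `CharSetOptions`, `escapeInClass`, and `charSetSketch` are hypothetical names and not part of magic-regexp.

```typescript
// Hypothetical API shape for custom and negated character sets.
interface CharSetOptions {
  chars?: string; // individual characters, escaped as literals
  ranges?: [string, string][]; // inclusive ranges such as ["1", "9"]
  negate?: boolean; // true produces a negated set: [^...]
}

// Escapes the characters that are special inside a character class.
const escapeInClass = (text: string): string =>
  text.replace(/[\\\]^-]/g, "\\$&");

const charSetSketch = ({ chars = "", ranges = [], negate = false }: CharSetOptions): string => {
  const rangePart = ranges
    .map(([from, to]) => `${escapeInClass(from)}-${escapeInClass(to)}`)
    .join("");
  return `[${negate ? "^" : ""}${rangePart}${escapeInClass(chars)}]`;
};

const digit1to9 = charSetSketch({ ranges: [["1", "9"]] });
const notQuote = charSetSketch({ chars: '"', negate: true });
const href = `href="${notQuote}+"`;

console.log(digit1to9); // [1-9]
console.log(href); // href="[^"]+"
console.log(new RegExp(href).exec('<a href="/docs">')?.[0]); // href="/docs"
```

Keeping ranges and literal characters as separate options is one way to escape literals safely while still allowing ranges; a typed version would additionally need to carry the set through the library's type-level inference.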

@polar-sh polar-sh bot added the polar label Jun 27, 2023
@danielroe danielroe removed the polar label Jun 27, 2023

4 participants