tylov/regexp9

regexp9

Regular expressions based on Plan9 code.

Changes from original code by Rob Pike

  • Full UTF-8 support (replaces the old Rune type).
  • Made reentrant: moved global parser variables to stack.
  • Added support for escaped ctrl chars in expressions: tab, newline etc. \t \n \r \v \f
  • Added support for shorthand character classes and inverse: \d \D \s \S \w \W
  • Added support for word boundary meta character and inverse: \b \B
  • Added support for inline "single line"/dotAll mode (from insertion point): (?s)
  • Added support for POSIX char classes [:alnum:] [:alpha:] [:blank:] [:cntrl:] [:digit:] [:graph:] [:lower:] [:print:] [:punct:] [:space:] [:upper:] [:word:] [:xdigit:]
  • Removed obsolete rregexec9() and rregsub9(), and the rather pointless regcomplit9().
  • Constified (const char*) all references to input strings.
  • Formatting changes: tabs to spaces, etc.
  • Optimizations: less malloc usage, shorter code, fast UTF-8 decoding.
  • Compiles as C99 and as C++.
  • Reduced total source code size from about 1600 to 1200 lines.

Example

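#include "regexp9.h"
#include <stdio.h>

int main(void) {
    /* Each "." is meant to match one whole UTF-8 character, such as the emoji. */
    const char* pattern = "hell.([ \\t]w.rld)+";
    const char* input = "hell😀 w😀rld\tworld wxrld";

    enum {N=5};
    Resub rs[N] = {0};  /* receives the match and submatch positions */

    Reprog *p = regcomp9(pattern);
    if (p == NULL) {  /* assumed to signal a compile error, as in Plan 9 regcomp */
        printf("regexp9: could not compile: %s\n", pattern);
        return 1;
    }
    if (regexec9(p, input, rs, N))
        printf("regexp9: '%s' => matched: %s\n", input, pattern);
    else
        printf("regexp9: No match\n");
    regfree9(p);
    return 0;
}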
