Simple Go implementation of the Porter Stemmer algorithm with powerful features.

gwizo

Package gwizo implements the Porter Stemmer algorithm described in Porter, M. "An algorithm for suffix stripping." Program 14.3 (1980): 130-137. Martin Porter, the algorithm's inventor, maintains a web page about the algorithm at http://www.tartarus.org/~martin/PorterStemmer/

Installation

To install, simply run in a terminal:

go get github.com/kampsy/gwizo

Stem

Stem returns the stem of a word.

package main

import (
  "fmt"
  "github.com/kampsy/gwizo"
)

func main() {
  stem := gwizo.Stem("abilities")
  fmt.Printf("Stem: %s\n", stem)
}
$ go run main.go

Stem: able

Vowels, Consonants and Measure

gwizo.Parse returns a Token with two fields: VowCon, the word's vowel-consonant pattern, and Measure, the value m in Porter's form [C](VC){m}[V], where C is a run of consonants and V is a run of vowels.

package main

import (
  "fmt"
  "strings"

  "github.com/kampsy/gwizo"
)

func main() {
  word := "abilities"
  token := gwizo.Parse(word)

  // VowCon: one 'v' or 'c' per letter of the word
  fmt.Printf("%s has Pattern %s \n", word, token.VowCon)

  // Measure: the value m in [C](VC){m}[V]
  fmt.Printf("%s has Measure value %d \n", word, token.Measure)

  // Number of Vowels
  v := strings.Count(token.VowCon, "v")
  fmt.Printf("%s Has %d Vowels \n", word, v)

  // Number of Consonants
  c := strings.Count(token.VowCon, "c")
  fmt.Printf("%s Has %d Consonants\n", word, c)
}
$ go run main.go

abilities has Pattern vcvcvcvvc
abilities has Measure value 4
abilities Has 5 Vowels
abilities Has 4 Consonants
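
To make the pattern and measure concrete, here is a minimal standalone sketch that derives both from a word using Porter's published definition of a consonant. It is not gwizo's implementation: the function names isConsonant, pattern and measure are hypothetical, the input is assumed to be lowercase ASCII, and the handling of "y" follows Porter's paper, which gwizo may or may not match.

```go
package main

import (
	"fmt"
	"strings"
)

// isConsonant reports whether w[i] is a consonant under Porter's definition:
// any letter other than a, e, i, o, u, and other than y preceded by a consonant.
// Assumption: w is lowercase ASCII.
func isConsonant(w string, i int) bool {
	switch w[i] {
	case 'a', 'e', 'i', 'o', 'u':
		return false
	case 'y':
		// y is a consonant at the start of a word or after a vowel.
		return i == 0 || !isConsonant(w, i-1)
	}
	return true
}

// pattern maps each letter of w to 'v' (vowel) or 'c' (consonant).
func pattern(w string) string {
	p := make([]byte, len(w))
	for i := 0; i < len(w); i++ {
		if isConsonant(w, i) {
			p[i] = 'c'
		} else {
			p[i] = 'v'
		}
	}
	return string(p)
}

// measure returns m in [C](VC){m}[V]: the number of places where a
// vowel is immediately followed by a consonant.
func measure(p string) int {
	return strings.Count(p, "vc")
}

func main() {
	for _, w := range []string{"tree", "trouble", "abilities"} {
		p := pattern(w)
		fmt.Printf("%s: pattern %s, measure %d\n", w, p, measure(p))
	}
}
```

For "abilities" this sketch yields the pattern vcvcvcvvc and a measure of 4, which agrees with the gwizo output shown above.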

File Stem Performance

Stem each line of input.txt, write the stems to stem.txt, and report the elapsed time.

package main

import (
  "bufio"
  "fmt"
  "log"
  "os"
  "time"

  "github.com/kampsy/gwizo"
)

func main() {
  start := time.Now()
  if err := writeOut(); err != nil {
    log.Fatal(err)
  }
  elapsed := time.Since(start)
  fmt.Println("============================")
  fmt.Println("Done After:", elapsed)
  fmt.Println("============================")
}

// writeOut stems each line of input.txt and writes the result to stem.txt.
func writeOut() error {
  in, err := os.Open("input.txt")
  if err != nil {
    return err
  }
  defer in.Close()

  out, err := os.Create("stem.txt")
  if err != nil {
    return err
  }
  defer out.Close()

  scanner := bufio.NewScanner(in)
  for scanner.Scan() {
    txt := scanner.Text()
    stem := gwizo.Stem(txt)
    if _, err := fmt.Fprintln(out, stem); err != nil {
      return err
    }
    fmt.Println(txt, "--->", stem)
  }
  return scanner.Err()
}
$ go run main.go
