Become a sponsor to Chris Mattmann

@chrismattmann

Chris Mattmann

La Canada Flintridge, CA

Chris Mattmann contributes to open source as a former Director at the Apache Software Foundation, where he was one of the initial contributors to Apache Nutch, the predecessor to Apache Hadoop, and a member of its project management committee. Mattmann is the progenitor of the Apache Tika framework, the digital "babel fish" and de facto framework for content detection and analysis.

Mattmann is the Director of the Information Retrieval & Data Science (IRDS) group at USC and an Adjunct Associate Professor. He teaches graduate courses in Content Detection & Analysis and in Search Engines & Information Retrieval. Mattmann has materially contributed to the understanding of the Deep Web and Dark Web through the DARPA MEMEX project. Mattmann's work helped uncover the Panama Papers scandal.

My goal is to get 2 sponsors who recognize that I do this work in my spare time and that it is valuable to their business. Tika-Python has at least that many users, and then some!

Featured work

  1. chrismattmann/tika-python

    Tika-Python is a Python binding to the Apache Tika™ REST services allowing Tika to be called natively in the Python community.

    Python 1,420
  2. chrismattmann/imagecat

    ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest 10s of millions of files (images, but could be extended to other files) in place, and to ext…

    Java 94
  3. chrismattmann/tika-similarity

    Tika-Similarity uses the Tika-Python package (Python port of Apache Tika) to compute file similarity based on Metadata features.

    Python 102
  4. chrismattmann/nutch-python

    Nutch-Python is a Python binding to the Apache Nutch™ REST services allowing Nutch to be called natively in the Python community.

    Python 36
  5. chrismattmann/lucene-geo-gazetteer

    Uses Apache Lucene, OpenNLP and GeoNames to extract locations from text and geocode them.

    Java 36
  6. chrismattmann/etllib

    This is the ETL lib package. It provides an API to munge and prepare JSON, TSV and other data using Apache Tika and JSON parsing/loading for ETL via Apache OODT (or other libs) into Apache Solr.

    Python 16

0% towards 2 monthly sponsors goal

Be the first to sponsor this goal!

Select a tier

$100 a month

Basic Appreciation level

You appreciate the work that Chris does free of charge in his spare time on a variety of open source projects (e.g., Tika-Python) and simply want to show your support.