Merge pull request #113 from readbeyond/devel

aeneas v1.6.0
readbeyond · Sep 26, 2016 · 82a2a86
2 parents d7dbb8c + 481bc6b

Showing 416 changed files with 44,474 additions and 4,191 deletions.
diff --git a/.gitignore b/.gitignore
@@ -10,6 +10,7 @@ bak
 build
 dist
 docs/build
+venvs
 tmp
 
 # service scripts

diff --git a/MANIFEST.in b/MANIFEST.in
@@ -1,12 +1,14 @@
 recursive-include aeneas/cdtw *
 recursive-include aeneas/cew *
+recursive-include aeneas/cfw *
 recursive-include aeneas/cint *
 recursive-include aeneas/cmfcc *
 recursive-include aeneas/cwave *
 recursive-include aeneas/extra *
 prune aeneas/extra/ctw_speect
 recursive-include aeneas/res *
 recursive-include aeneas/tools/res *
+recursive-include aeneas/ttswrappers *
 include aeneas_check_setup.py
 recursive-include bin *
 recursive-include docs *
@@ -18,6 +20,7 @@ include output/.gitignore
 include README.md
 include README.rst
 include requirements.txt
+include setupmeta.py
 recursive-include thirdparty *
 include VERSION
 recursive-include wiki *
diff --git a/README.md b/README.md
@@ -2,8 +2,8 @@
 
 **aeneas** is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
 
-* Version: 1.5.1.0
-* Date: 2016-07-25
+* Version: 1.6.0.0
+* Date: 2016-09-26
 * Developed by: [ReadBeyond](http://www.readbeyond.it/)
 * Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
 * License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -61,8 +61,8 @@ or raw AUD/CSV/SSV/TSV/TXT/XML for further processing.
 2. [Python](https://python.org/) 2.7 (Linux, OS X, Windows) or 3.4 or later (Linux, OS X)
 3. [FFmpeg](https://www.ffmpeg.org/)
 4. [eSpeak](http://espeak.sourceforge.net/)
-5. Python modules `BeautifulSoup4`, `lxml`, and `numpy`
-6. Python C headers to compile the Python C extensions (optional but strongly recommended)
+5. Python packages `BeautifulSoup4`, `lxml`, and `numpy`
+6. Python headers to compile the Python C/C++ extensions (optional but strongly recommended)
 7. A shell supporting UTF-8 (optional but strongly recommended)
 
 ### Supported Platforms
@@ -228,21 +228,22 @@ which explains how to use the built-in command line tools.
 * Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
 * Input audio file formats: all those readable by `ffmpeg`
 * Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB, TSV, TTML, TXT, VTT, XML
-* Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+* Confirmed working on 37 languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
 * MFCC and DTW computed via Python C extensions to reduce the processing time
-* Several built-in TTS engine wrappers: eSpeak (default, FLOSS), Festival (FLOSS), Nuance TTS API (commercial)
+* Several built-in TTS engine wrappers: eSpeak (default), eSpeak-ng, Festival, Nuance TTS API
 * Default TTS (eSpeak) called via a Python C extension for fast audio synthesis
-* A custom, user-provided TTS engine Python wrapper can be used instead of the built-in ones (included example for speect)
+* Possibility of running a custom, user-provided TTS engine Python wrapper (e.g., included example for speect)
 * Batch processing of multiple audio/text pairs
 * Download audio from a YouTube video
 * In multilevel mode, recursive alignment from paragraph to sentence to word level
+* In multilevel mode, time resolution and/or TTS engine can be specified for each level independently
 * Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
 * Adjustable splitting times, including a max character/second constraint for CC applications
 * Automated detection of audio head/tail
 * Output an HTML file for fine tuning the sync map manually (`finetuneas` project)
 * Execution parameters tunable at runtime
 * Code suitable for Web app deployment (e.g., on-demand cloud computing)
-* Extensive test suite including 898 unit/integration/performance tests, that run and must pass before each release
+* Extensive test suite including 800+ unit/integration/performance tests, that run and must pass before each release
 
 
 ## Limitations and Missing Features 
@@ -299,37 +300,21 @@ Feel free to
 
 ### Contributing
 
-If you think you found a bug,
+If you think you found a bug
+or you have a feature request,
 please use the
 [GitHub issue tracker](https://github.com/readbeyond/aeneas/issues)
-to file a bug report.
+to submit it.
 
-If you are able to contribute code directly, that is awesome!
-I will be glad to merge it!
-Just a few rules, to make life easier for both you and me:
+If you want to ask a question
+about using **aeneas**,
+your best option consists in sending an email to the
+[mailing list](https://groups.google.com/d/forum/aeneas-forced-alignment).
 
-1. Please do not work on the `master` branch.
- Instead, create a new branch on your GitHub repo
- by cheking out the `devel` branch.
- Open a pull request from your branch on your repo
- to the `devel` branch on this GitHub repo.
-
-2. Please make your code consistent with
- the existing code base style
- (see the
- [Google Python Style Guide](https://google-styleguide.googlecode.com/svn/trunk/pyguide.html)
- ), and test your contributed code
- against the unit tests
- before opening the pull request.
-
-3. Ideally, add some unit tests for the code you are submitting,
- either adding them to the existing unit tests or creating a new file
- in `aeneas/tests/`.
-
-4. **Please note that, by opening a pull request,
- you automatically agree to apply
- the AGPL v3 license
- to the code you contribute.**
+Finally, code contributions are welcome!
+Please refer to the
+[Code Contribution Guide](https://github.com/readbeyond/aeneas/blob/master/wiki/CONTRIBUTING.md)
+for details about the branch policies and the code style to follow.
 
 
 ## Acknowledgments
@@ -347,6 +332,9 @@ for its asynchronous usage.
 **Chris Hubbard** prepared the files for
 packaging aeneas as a Debian/Ubuntu `.deb`.
 
+**Daniel Bair** prepared the `brew` formula
+for installing **aeneas** and its dependencies on Mac OS X.
+
 **Daniel Bair**, **Chris Hubbard**, and **Richard Margetts**
 packaged the installers for Mac OS X and Windows.
 

diff --git a/README.rst b/README.rst
@@ -4,8 +4,8 @@ aeneas
 **aeneas** is a Python/C library and a set of tools to automagically
 synchronize audio and text (aka forced alignment).
 
-- Version: 1.5.1.0
-- Date: 2016-07-25
+- Version: 1.6.0.0
+- Date: 2016-09-26
 - Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 - Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 - License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -74,8 +74,8 @@ System Requirements
  later (Linux, OS X)
 3. `FFmpeg <https://www.ffmpeg.org/>`__
 4. `eSpeak <http://espeak.sourceforge.net/>`__
-5. Python modules ``BeautifulSoup4``, ``lxml``, and ``numpy``
-6. Python C headers to compile the Python C extensions (optional but
+5. Python packages ``BeautifulSoup4``, ``lxml``, and ``numpy``
+6. Python headers to compile the Python C/C++ extensions (optional but
  strongly recommended)
 7. A shell supporting UTF-8 (optional but strongly recommended)
 
@@ -235,22 +235,24 @@ Supported Features
 - Input audio file formats: all those readable by ``ffmpeg``
 - Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB,
  TSV, TTML, TXT, VTT, XML
-- Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU,
+- Confirmed working on 37 languages: ARA, BUL, CAT, CYM, CES, DAN, DEU,
  ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN,
  LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE,
  TUR, UKR
 - MFCC and DTW computed via Python C extensions to reduce the
  processing time
-- Several built-in TTS engine wrappers: eSpeak (default, FLOSS),
- Festival (FLOSS), Nuance TTS API (commercial)
+- Several built-in TTS engine wrappers: eSpeak (default), eSpeak-ng,
+ Festival, Nuance TTS API
 - Default TTS (eSpeak) called via a Python C extension for fast audio
  synthesis
-- A custom, user-provided TTS engine Python wrapper can be used instead
- of the built-in ones (included example for speect)
+- Possibility of running a custom, user-provided TTS engine Python
+ wrapper (e.g., included example for speect)
 - Batch processing of multiple audio/text pairs
 - Download audio from a YouTube video
 - In multilevel mode, recursive alignment from paragraph to sentence to
  word level
+- In multilevel mode, time resolution and/or TTS engine can be
+ specified for each level independently
 - Robust against misspelled/mispronounced words, local rearrangements
  of words, background noise/sporadic spikes
 - Adjustable splitting times, including a max character/second
@@ -261,7 +263,7 @@ Supported Features
 - Execution parameters tunable at runtime
 - Code suitable for Web app deployment (e.g., on-demand cloud
  computing)
-- Extensive test suite including 898 unit/integration/performance
+- Extensive test suite including 800+ unit/integration/performance
  tests, that run and must pass before each release
 
 Limitations and Missing Features
@@ -333,31 +335,18 @@ Feel free to `get in touch <mailto:[email protected]>`__.
 Contributing
 ~~~~~~~~~~~~
 
-If you think you found a bug, please use the `GitHub issue
-tracker <https://github.com/readbeyond/aeneas/issues>`__ to file a bug
-report.
+If you think you found a bug or you have a feature request, please use
+the `GitHub issue
+tracker <https://github.com/readbeyond/aeneas/issues>`__ to submit it.
 
-If you are able to contribute code directly, that is awesome! I will be
-glad to merge it! Just a few rules, to make life easier for both you and
-me:
+If you want to ask a question about using **aeneas**, your best option
+consists in sending an email to the `mailing
+list <https://groups.google.com/d/forum/aeneas-forced-alignment>`__.
 
-1. Please do not work on the ``master`` branch. Instead, create a new
- branch on your GitHub repo by cheking out the ``devel`` branch. Open
- a pull request from your branch on your repo to the ``devel`` branch
- on this GitHub repo.
-
-2. Please make your code consistent with the existing code base style
- (see the `Google Python Style
- Guide <https://google-styleguide.googlecode.com/svn/trunk/pyguide.html>`__
- ), and test your contributed code against the unit tests before
- opening the pull request.
-
-3. Ideally, add some unit tests for the code you are submitting, either
- adding them to the existing unit tests or creating a new file in
- ``aeneas/tests/``.
-
-4. **Please note that, by opening a pull request, you automatically
- agree to apply the AGPL v3 license to the code you contribute.**
+Finally, code contributions are welcome! Please refer to the `Code
+Contribution
+Guide <https://github.com/readbeyond/aeneas/blob/master/wiki/CONTRIBUTING.md>`__
+for details about the branch policies and the code style to follow.
 
 Acknowledgments
 ---------------
@@ -373,6 +362,9 @@ asynchronous usage.
 **Chris Hubbard** prepared the files for packaging aeneas as a
 Debian/Ubuntu ``.deb``.
 
+**Daniel Bair** prepared the ``brew`` formula for installing **aeneas**
+and its dependencies on Mac OS X.
+
 **Daniel Bair**, **Chris Hubbard**, and **Richard Margetts** packaged
 the installers for Mac OS X and Windows.
 

diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-1.5.1
+1.6.0
diff --git a/aeneas/README.md b/aeneas/README.md
@@ -1,11 +1,12 @@
 # aeneas Main Library 
 
-This Python module (directory) contains the main ``aeneas`` library.
+This Python package contains the main ``aeneas`` library.
 
-Unit tests are contained in the ``aeneas.tests`` submodule.
+Wrappers for the built-in TTS engines are located
+in the ``aeneas.ttswrappers`` subpackage.
 
-The ``aeneas.tools`` submodule define several command line tools
+The ``aeneas.tools`` subpackage define several command line tools
 which use the main ``aeneas`` library.
 
-
+Unit tests are contained in the ``aeneas.tests`` subpackage.
 
diff --git a/aeneas/__init__.py b/aeneas/__init__.py
@@ -1,21 +1,38 @@
 #!/usr/bin/env python
 # coding=utf-8
 
+# aeneas is a Python/C library and a set of tools
+# to automagically synchronize audio and text (aka forced alignment)
+#
+# Copyright (C) 2012-2013, Alberto Pettarin (www.albertopettarin.it)
+# Copyright (C) 2013-2015, ReadBeyond Srl (www.readbeyond.it)
+# Copyright (C) 2015-2016, Alberto Pettarin (www.albertopettarin.it)
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU Affero General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU Affero General Public License for more details.
+#
+# You should have received a copy of the GNU Affero General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
 """
 **aeneas** is a Python/C library and a set of tools
 to automagically synchronize audio and text (aka forced alignment).
 """
 
 __author__ = "Alberto Pettarin"
+__email__ = "[email protected]"
 __copyright__ = """
  Copyright 2012-2013, Alberto Pettarin (www.albertopettarin.it)
  Copyright 2013-2015, ReadBeyond Srl (www.readbeyond.it)
  Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
  """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.1"
-__email__ = "[email protected]"
 __status__ = "Production"
-
-
-
+__version__ = "1.6.0"