Merge pull request #94 from readbeyond/devel

aeneas v1.5.1
readbeyond · Jul 25, 2016 · d7dbb8c
2 parents faeaff6 + d30ad36
commit d7dbb8c

Showing 154 changed files with 2,064 additions and 1,705 deletions.
diff --git a/MANIFEST.in b/MANIFEST.in
@@ -14,6 +14,7 @@ prune docs/build
 include CHANGELOG
 include LICENSE
 recursive-include licenses *
+include output/.gitignore
 include README.md
 include README.rst
 include requirements.txt

diff --git a/README.md b/README.md
@@ -2,8 +2,8 @@
 
 **aeneas** is a Python/C library and a set of tools to automagically synchronize audio and text (aka forced alignment).
 
-* Version: 1.5.0.3
-* Date: 2016-04-23
+* Version: 1.5.1.0
+* Date: 2016-07-25
 * Developed by: [ReadBeyond](http://www.readbeyond.it/)
 * Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
 * License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -87,6 +87,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).
 
 ### Installation
 
+All-in-one installers are available for Mac OS X and Windows,
+and a Bash script for deb-based Linux distributions (Debian, Ubuntu)
+is provided in this repository.
+It is also possible to download a VirtualBox+Vagrant virtual machine.
+Please see the
+[INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
+for detailed, step-by-step installation procedures for different operating systems.
+
+The generic OS-independent procedure is simple:
+
 1. Install
  [Python](https://python.org/) (2.7.x preferred),
  [FFmpeg](https://www.ffmpeg.org/), and
@@ -102,20 +112,16 @@ which can be installed on any modern OS (Linux, Mac OS X, Windows).
  pip install aeneas
  ```
 
-See the
-[INSTALL file](https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md)
-for detailed, step-by-step procedures for Linux, OS X, and Windows.
-
-
-## Usage
-
-1. To **check** whether you installed **aeneas** correctly, run:
+4. To **check** whether you installed **aeneas** correctly, run:
 
  ```bash
  python -m aeneas.diagnostics
  ```
 
-2. Run without arguments to get the **usage message**:
+
+## Usage
+
+1. Run without arguments to get the **usage message**:
 
  ```bash
  python -m aeneas.tools.execute_task
@@ -131,7 +137,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
  python -m aeneas.tools.execute_task --examples-all
  ```
 
-3. To **compute a synchronization map** `map.json` for a pair
+2. To **compute a synchronization map** `map.json` for a pair
  (`audio.mp3`, `text.txt` in
  [plain](http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN)
  text format), you can run:
@@ -169,7 +175,7 @@ for detailed, step-by-step procedures for Linux, OS X, and Windows.
  [documentation](http://www.readbeyond.it/aeneas/docs/)
  for details.
 
-4. If you have several tasks to process,
+3. If you have several tasks to process,
  you can create a **job container**
  to batch process them:
 
@@ -222,12 +228,12 @@ which explains how to use the built-in command line tools.
 * Arbitrary text fragment granularity (single word, subphrase, phrase, paragraph, etc.)
 * Input audio file formats: all those readable by `ffmpeg`
 * Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB, TSV, TTML, TXT, VTT, XML
-* Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+* Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN, LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
 * MFCC and DTW computed via Python C extensions to reduce the processing time
-* On Linux, eSpeak called via a Python C extension for faster audio synthesis
-* Batch processing of multiple audio/text pairs
 * Several built-in TTS engine wrappers: eSpeak (default, FLOSS), Festival (FLOSS), Nuance TTS API (commercial)
-* Use custom TTS engine wrappers besides the built-in ones
+* Default TTS (eSpeak) called via a Python C extension for fast audio synthesis
+* A custom, user-provided TTS engine Python wrapper can be used instead of the built-in ones (included example for speect)
+* Batch processing of multiple audio/text pairs
 * Download audio from a YouTube video
 * In multilevel mode, recursive alignment from paragraph to sentence to word level
 * Robust against misspelled/mispronounced words, local rearrangements of words, background noise/sporadic spikes
@@ -236,13 +242,14 @@ which explains how to use the built-in command line tools.
 * Output an HTML file for fine tuning the sync map manually (`finetuneas` project)
 * Execution parameters tunable at runtime
 * Code suitable for Web app deployment (e.g., on-demand cloud computing)
+* Extensive test suite including 898 unit/integration/performance tests, that run and must pass before each release
 
 
 ## Limitations and Missing Features 
 
 * Audio should match the text: large portions of spurious text or audio might produce a wrong sync map
-* Audio is assumed to be spoken: not suitable/YMMV for song captioning
-* No protection against memory trashing if you feed extremely long audio files
+* Audio is assumed to be spoken: not suitable for song captioning, YMMV for CC applications
+* No protection against memory trashing if you feed extremely long audio files (>1.5h per single audio file)
 * [Open issues](https://github.com/readbeyond/aeneas/issues)
 
 
@@ -340,6 +347,9 @@ for its asynchronous usage.
 **Chris Hubbard** prepared the files for
 packaging aeneas as a Debian/Ubuntu `.deb`.
 
+**Daniel Bair**, **Chris Hubbard**, and **Richard Margetts**
+packaged the installers for Mac OS X and Windows.
+
 **Firat Ozdemir** contributed the `finetuneas`
 HTML/JS code for fine tuning sync maps in the browser.
 

diff --git a/README.rst b/README.rst
@@ -4,8 +4,8 @@ aeneas
 **aeneas** is a Python/C library and a set of tools to automagically
 synchronize audio and text (aka forced alignment).
 
-- Version: 1.5.0.3
-- Date: 2016-04-23
+- Version: 1.5.1.0
+- Date: 2016-07-25
 - Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
 - Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
 - License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -100,6 +100,16 @@ modern OS (Linux, Mac OS X, Windows).
 Installation
 ~~~~~~~~~~~~
 
+All-in-one installers are available for Mac OS X and Windows, and a Bash
+script for deb-based Linux distributions (Debian, Ubuntu) is provided in
+this repository. It is also possible to download a VirtualBox+Vagrant
+virtual machine. Please see the `INSTALL
+file <https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md>`__
+for detailed, step-by-step installation procedures for different
+operating systems.
+
+The generic OS-independent procedure is simple:
+
 1. Install `Python <https://python.org/>`__ (2.7.x preferred),
  `FFmpeg <https://www.ffmpeg.org/>`__, and
  `eSpeak <http://espeak.sourceforge.net/>`__
@@ -114,18 +124,14 @@ Installation
  pip install numpy
  pip install aeneas
 
-See the `INSTALL
-file <https://github.com/readbeyond/aeneas/blob/master/wiki/INSTALL.md>`__
-for detailed, step-by-step procedures for Linux, OS X, and Windows.
+4. To **check** whether you installed **aeneas** correctly, run:
+
+``bash python -m aeneas.diagnostics``
 
 Usage
 -----
 
-1. To **check** whether you installed **aeneas** correctly, run:
-
-``bash python -m aeneas.diagnostics``
-
-2. Run without arguments to get the **usage message**:
+1. Run without arguments to get the **usage message**:
 
  .. code:: bash
 
@@ -140,7 +146,7 @@ Usage
  python -m aeneas.tools.execute_task --examples
  python -m aeneas.tools.execute_task --examples-all
 
-3. To **compute a synchronization map** ``map.json`` for a pair
+2. To **compute a synchronization map** ``map.json`` for a pair
  (``audio.mp3``, ``text.txt`` in
  `plain <http://www.readbeyond.it/aeneas/docs/textfile.html#aeneas.textfile.TextFileFormat.PLAIN>`__
  text format), you can run:
@@ -178,7 +184,7 @@ specifies the parameters controlling the I/O formats and the processing
 options for the task. Consult the
 `documentation <http://www.readbeyond.it/aeneas/docs/>`__ for details.
 
-4. If you have several tasks to process, you can create a **job
+3. If you have several tasks to process, you can create a **job
  container** to batch process them:
 
  .. code:: bash
@@ -229,17 +235,19 @@ Supported Features
 - Input audio file formats: all those readable by ``ffmpeg``
 - Output sync map formats: AUD, CSV, EAF, JSON, SMIL, SRT, SSV, SUB,
  TSV, TTML, TXT, VTT, XML
-- Tested languages: ARA, BUL, CAT, CYM, CES, DAN, DEU, ELL, ENG, EPO,
- EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, LAT, LAV, LIT, NLD,
- NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE, TUR, UKR
+- Confirmed working on languages: ARA, BUL, CAT, CYM, CES, DAN, DEU,
+ ELL, ENG, EPO, EST, FAS, FIN, FRA, GLE, GRC, HRV, HUN, ISL, ITA, JPN,
+ LAT, LAV, LIT, NLD, NOR, RON, RUS, POL, POR, SLK, SPA, SRP, SWA, SWE,
+ TUR, UKR
 - MFCC and DTW computed via Python C extensions to reduce the
  processing time
-- On Linux, eSpeak called via a Python C extension for faster audio
- synthesis
-- Batch processing of multiple audio/text pairs
 - Several built-in TTS engine wrappers: eSpeak (default, FLOSS),
  Festival (FLOSS), Nuance TTS API (commercial)
-- Use custom TTS engine wrappers besides the built-in ones
+- Default TTS (eSpeak) called via a Python C extension for fast audio
+ synthesis
+- A custom, user-provided TTS engine Python wrapper can be used instead
+ of the built-in ones (included example for speect)
+- Batch processing of multiple audio/text pairs
 - Download audio from a YouTube video
 - In multilevel mode, recursive alignment from paragraph to sentence to
  word level
@@ -253,15 +261,18 @@ Supported Features
 - Execution parameters tunable at runtime
 - Code suitable for Web app deployment (e.g., on-demand cloud
  computing)
+- Extensive test suite including 898 unit/integration/performance
+ tests, that run and must pass before each release
 
 Limitations and Missing Features
 --------------------------------
 
 - Audio should match the text: large portions of spurious text or audio
  might produce a wrong sync map
-- Audio is assumed to be spoken: not suitable/YMMV for song captioning
+- Audio is assumed to be spoken: not suitable for song captioning, YMMV
+ for CC applications
 - No protection against memory trashing if you feed extremely long
- audio files
+ audio files (>1.5h per single audio file)
 - `Open issues <https://github.com/readbeyond/aeneas/issues>`__
 
 License
@@ -362,6 +373,9 @@ asynchronous usage.
 **Chris Hubbard** prepared the files for packaging aeneas as a
 Debian/Ubuntu ``.deb``.
 
+**Daniel Bair**, **Chris Hubbard**, and **Richard Margetts** packaged
+the installers for Mac OS X and Windows.
+
 **Firat Ozdemir** contributed the ``finetuneas`` HTML/JS code for fine
 tuning sync maps in the browser.
 

diff --git a/VERSION b/VERSION
@@ -1 +1 @@
-1.5.0
+1.5.1
diff --git a/aeneas/__init__.py b/aeneas/__init__.py
@@ -13,7 +13,7 @@
  Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
  """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 

diff --git a/aeneas/adjustboundaryalgorithm.py b/aeneas/adjustboundaryalgorithm.py
@@ -30,7 +30,7 @@
  Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
  """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 

diff --git a/aeneas/analyzecontainer.py b/aeneas/analyzecontainer.py
@@ -32,7 +32,7 @@
  Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
  """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 

diff --git a/aeneas/audiofile.py b/aeneas/audiofile.py
@@ -37,7 +37,7 @@
  Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
  """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 
@@ -116,6 +116,77 @@ class AudioFile(Loggable):
  :type logger: :class:`~aeneas.logger.Logger`
  """
 
+ FILE_EXTENSIONS = [
+ u"3g2",
+ u"3gp",
+ u"aa",
+ u"aa3",
+ u"aac",
+ u"aax",
+ u"aiff",
+ u"alac",
+ u"amr",
+ u"ape",
+ u"asf",
+ u"at3",
+ u"at9",
+ u"au",
+ u"avi",
+ u"awb",
+ u"celt",
+ u"dct",
+ u"dss",
+ u"dvf",
+ u"eac",
+ u"flac",
+ u"flv",
+ u"gsm",
+ u"m4a",
+ u"m4b",
+ u"m4p",
+ u"m4v",
+ u"mid",
+ u"midi",
+ u"mkv",
+ u"mmf",
+ u"mov",
+ u"mp2",
+ u"mp3",
+ u"mp4",
+ u"mpc",
+ u"mpeg",
+ u"mpg",
+ u"mpv",
+ u"msv",
+ u"oga",
+ u"ogg",
+ u"ogv",
+ u"oma",
+ u"opus",
+ u"pcm",
+ u"qt",
+ u"ra",
+ u"ram",
+ u"raw",
+ u"riff",
+ u"rm",
+ u"rmvb",
+ u"shn",
+ u"sln",
+ u"theora",
+ u"tta",
+ u"vob",
+ u"vorbis",
+ u"vox",
+ u"wav",
+ u"webm",
+ u"wma",
+ u"wmv",
+ u"wv",
+ u"yuv",
+ ]
+ """ Extensions of common formats for audio (and video) files. """
+
  TAG = u"AudioFile"
 
  def __init__(self, file_path=None, is_mono_wave=False, rconf=None, logger=None):
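
Review note: the hunk above adds `AudioFile.FILE_EXTENSIONS`, a list of lowercase extensions without the leading dot. The diff does not show how the list is consumed, so the following is only a minimal sketch of one plausible use, assuming a case-insensitive match on the file extension. `KNOWN_EXTENSIONS` (a short subset of the list) and `has_known_extension` are hypothetical names, not part of aeneas.

```python
import os

# Hypothetical subset of the extension list added in this diff;
# entries are lowercase and carry no leading dot, as in the hunk above.
KNOWN_EXTENSIONS = [u"flac", u"m4a", u"mp3", u"ogg", u"wav"]


def has_known_extension(path, extensions=KNOWN_EXTENSIONS):
    """Return True if the extension of ``path`` is listed in ``extensions``.

    The comparison is case-insensitive and ignores the leading dot.
    Paths without an extension never match.
    """
    # os.path.splitext returns (root, ext), where ext is "" or starts with "."
    ext = os.path.splitext(path)[1]
    return ext[1:].lower() in extensions
```

A check like this only inspects the file name; whether `ffmpeg` can actually decode the file is a separate question.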

diff --git a/aeneas/audiofilemfcc.py b/aeneas/audiofilemfcc.py
@@ -29,7 +29,7 @@
  Copyright 2015-2016, Alberto Pettarin (www.albertopettarin.it)
  """
 __license__ = "GNU AGPL v3"
-__version__ = "1.5.0"
+__version__ = "1.5.1"
 __email__ = "[email protected]"
 __status__ = "Production"
 
@@ -134,7 +134,7 @@ def __init__(
  self._compute_mfcc_c_extension,
  self._compute_mfcc_pure_python,
  (),
- c_extension=self.rconf[RuntimeConfiguration.C_EXTENSIONS]
+ rconf=self.rconf
  )
  self.audio_length = self.audio_file.audio_length
  if audio_file_was_none:

diff --git a/aeneas/cdtw/000_compile_driver.sh b/aeneas/cdtw/000_compile_driver.sh
@@ -1,6 +1,6 @@
 #!/bin/bash
 
-gcc cdtw_driver.c cdtw_func.c cint.c -o cdtw_driver -lm -Wall -pedantic -std=c99
+gcc cdtw_driver.c cdtw_func.c ../cint/cint.c -o cdtw_driver -lm -Wall -pedantic -std=c99
 
 
 
diff --git a/aeneas/cdtw/900_clean.sh b/aeneas/cdtw/900_clean.sh
@@ -0,0 +1,3 @@
+#!/bin/bash
+
+rm -rf build __pycache__ *.so cdtw_driver