This repository has been archived by the owner on May 26, 2021. It is now read-only.

Periods in filenames #7

Open
kevinashaw opened this issue Nov 19, 2017 · 9 comments

@kevinashaw

Are periods allowed in filenames? I'm getting an error from SAI:

OSError: [Errno 2] No such file or directory: 
'/Users/foo/filtered/Recording_Start2017-11-10T19-16-31-538Z_Part000_Dur.wav'

The original source filename is:
/Users/foo/Recording_Start.2017-11-10T19-16-31-538Z_Part.000_Dur.0m24s373.wav
Is SAI stripping out characters from the filename?
I will also note that the ~src/filtered directory does not seem to be showing up. Of course, this may be due to the filename issue.

@kevinashaw
Author

Never mind. This was my mistake. Sorry.

@aalireza
Owner

aalireza commented Nov 20, 2017

Okay! However, I'd appreciate knowing whether clearer documentation would have prevented the problem and, if so, what was unclear.

For future reference:

  1. Periods are allowed in filenames.
  2. Only non-printable characters, which typically can't be part of a filename, are stripped.
  3. In versions before 1.0, a filename could be partially read, so that part of it appeared to be stripped, but only if it had the form [some string].[some allowed format].[actual format of the file] and was then renamed to [some string].[some allowed format]. For example, test.wav.mp3 renamed to test.wav would be treated as wav, not mp3. Now only wav is supported, and the specification already states that the filename is checked to determine the file's format.
  4. Irrespective of the above (which wouldn't have applied to your particular file), filtered should be created before any file is touched. That step depends on the indexer being initialized, which means the assertion that src_dir exists must have passed. So either filtered was created correctly but under a wrongly provided path, or it wasn't created and you got no error message before that step, which is not expected.
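To illustrate points 1 and 3, here is a minimal sketch of extension-based format detection that leaves periods in the rest of the name untouched. It is not SAI's actual code; the helper names `split_audio_name` and `is_supported` and the `ALLOWED_EXTENSIONS` set are hypothetical and exist only for this example.

```python
import os

# Assumption for this sketch: only wav is accepted, as stated in point 3.
ALLOWED_EXTENSIONS = {".wav"}


def split_audio_name(path):
    """Split a path into (stem, extension) at the LAST period only."""
    base = os.path.basename(path)
    stem, ext = os.path.splitext(base)
    return stem, ext.lower()


def is_supported(path):
    """Decide the format from the final extension, nothing else."""
    return split_audio_name(path)[1] in ALLOWED_EXTENSIONS


name = "/Users/foo/Recording_Start.2017-11-10T19-16-31-538Z_Part.000_Dur.0m24s373.wav"
stem, ext = split_audio_name(name)
print(stem)  # Recording_Start.2017-11-10T19-16-31-538Z_Part.000_Dur.0m24s373
print(ext)   # .wav
```

With this approach the periods inside the stem survive, and a file such as test.wav.mp3 is judged by its final extension only.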

@kevinashaw
Author

@aalireza : Thanks for the reply. We are using R1.0.0.
From what I am seeing, it seems that SAI is stripping periods (".") from filenames.
Once I used filenames without periods (other than the extension) all was fine.
Since I use pydub to convert our .m4a files to .wav prior to processing, I am just renaming the file to "temp.wav" and letting the Speech-to-text processor work on that. I then delete the temp file when the results are complete. Once this was done, the filtered directory showed up fine and everything just worked.
Thanks again!
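The workaround above could be sketched roughly as follows. This is a variation, not the exact steps used in this comment: instead of a fixed "temp.wav", the hypothetical helper `copy_with_period_free_name` keeps a recognizable name by replacing the periods in the stem with underscores. It assumes the file has already been converted to wav.

```python
import os
import shutil
import tempfile


def copy_with_period_free_name(src_path, dest_dir):
    """Copy src_path into dest_dir under a name whose only period is the extension's."""
    base = os.path.basename(src_path)
    stem, ext = os.path.splitext(base)
    safe_stem = stem.replace(".", "_")  # periods in the stem become underscores
    dest_path = os.path.join(dest_dir, safe_stem + ext)
    shutil.copyfile(src_path, dest_path)
    return dest_path


# Small demonstration with a throwaway file standing in for a real recording.
with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "Recording_Start.2017-11-10_Part.000.wav")
    with open(original, "wb") as handle:
        handle.write(b"placeholder bytes, not real audio")
    work_dir = os.path.join(tmp, "work")
    os.mkdir(work_dir)
    copied = copy_with_period_free_name(original, work_dir)
    copied_name = os.path.basename(copied)
    print(copied_name)  # Recording_Start_2017-11-10_Part_000.wav
```

The copy can be deleted once indexing is done, as described above.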

@aalireza
Owner

aalireza commented Dec 9, 2017

You're right! Periods are being stripped. That's a bug. I'll fix it in a couple of days after the finals.

@aalireza aalireza reopened this Dec 9, 2017
@aalireza aalireza added the bug label Dec 9, 2017
@enviz

enviz commented Mar 16, 2019

https://simpleaudioindexer.readthedocs.io/usage/#as-a-python-library

I’ve read the documentation and tried to implement it, but I get an error:

from SimpleAudioIndexer import SimpleAudioIndexer as sai

indexer = sai(mode="cmu",src_dir="C:/Users/Vikram/audio/small_audio.wav") #even tried it with the IBM mode as well

indexer.index_audio() #the code stops at this statement and throws an error

indexer.save_indexed_audio("{}/indexed_audio".format(indexer.src_dir))

indexer.load_indexed_audio("{}/indexed_audio.txt".format(indexer.src_dir))

FileNotFoundError: [WinError 3] The system cannot find the path specified: 'C:/Users/Vikram/simple/small_audio.wav/filtered'

@enviz

enviz commented Mar 16, 2019

As per the documentation, indexer.index_audio() is supposed to create two directories, filtered and staging, but it doesn't seem to do so.

If this is a bug, let me know. Thanks in advance.
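One detail in the traceback above: the missing path ends in small_audio.wav/filtered, which suggests that src_dir was treated as a directory while the value passed in pointed at a wav file. A minimal sketch of guarding against that, assuming (this is not confirmed in this thread) that src_dir is meant to be the folder holding the audio files. The helper `as_source_directory` is hypothetical and not part of SimpleAudioIndexer.

```python
import os
import tempfile


def as_source_directory(path):
    """Return a directory: the path itself, or its parent when it points at a file."""
    if os.path.isdir(path):
        return os.path.abspath(path)
    if os.path.isfile(path):
        return os.path.dirname(os.path.abspath(path))
    raise FileNotFoundError("No such file or directory: {!r}".format(path))


# Demonstration with a throwaway folder and a placeholder file.
with tempfile.TemporaryDirectory() as tmp:
    audio_file = os.path.join(tmp, "small_audio.wav")
    with open(audio_file, "wb") as handle:
        handle.write(b"placeholder bytes, not real audio")
    from_file = as_source_directory(audio_file)
    from_dir = as_source_directory(tmp)
    expected = os.path.abspath(tmp)
    print(from_file == from_dir)  # True
```

The returned value is what would then be handed over as src_dir.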

@aalireza
Owner

@enviz Not a bug. Windows is not supported.

If you’re on a Windows system, or for some reason don’t want to install natively, you may use the software within a Docker image.

sai needs sox, ffmpeg and (optionally) pocketsphinx. I don't know how to install and expose these on Windows, as I don't have a Windows system.
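A quick way to check whether those tools are reachable from Python is to look them up on PATH. This is only a sketch using the standard library's `shutil.which`; the helper names are hypothetical, and the executable name used for pocketsphinx is an assumption that may differ on your system.

```python
import shutil

REQUIRED_TOOLS = ("sox", "ffmpeg")
# Assumption: the pocketsphinx executable is called "pocketsphinx_continuous".
OPTIONAL_TOOLS = ("pocketsphinx_continuous",)


def find_tools(names):
    """Map each executable name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(names):
    """Return the names from `names` that could not be found on PATH."""
    return [name for name, path in find_tools(names).items() if path is None]


print("missing required:", missing_tools(REQUIRED_TOOLS))
print("missing optional:", missing_tools(OPTIONAL_TOOLS))
```

An empty "missing required" list means sox and ffmpeg were found. It says nothing about whether their versions are compatible.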

@enviz

enviz commented Mar 17, 2019 via email

@aalireza
Owner

@enviz
It should work. The Dockerfile is here (https://raw.githubusercontent.com/aalireza/SimpleAudioIndexer/master/Dockerfile). Follow the instructions (https://simpleaudioindexer.readthedocs.io/installation/#docker-route). Essentially, all you're doing is getting shell access to an Ubuntu system.

If you get an error, please be more specific so that I can help you, and open a separate issue for it, since the Docker setup is unrelated to this one.

Repository owner deleted a comment from enviz Mar 17, 2019
Repository owner locked as off-topic and limited conversation to collaborators Mar 17, 2019
3 participants