Maelstrom: uninformative error when duplicate regions present #318

Open
shaunmahony opened this issue Jun 16, 2024 · 0 comments
Hi there,

I ran into the following error when running maelstrom on an input file containing identical regions (with different labels). It took quite a while to track down the cause, so a more helpful error message would be good. Alternatively, the tutorial site could state explicitly that the input must not contain duplicate entries.

2024-06-16 13:39:27,642 - INFO - Starting maelstrom
2024-06-16 13:39:28,107 - INFO - motif scanning (counts)
2024-06-16 13:39:28,108 - INFO - reading table
2024-06-16 13:39:37,351 - INFO - using 14000 sequences
2024-06-16 13:39:37,369 - INFO - Creating index for genomic GC frequencies.
2024-06-16 13:43:26,698 - INFO - setting threshold
2024-06-16 13:45:09,928 - INFO - creating count table
2024-06-16 14:00:08,540 - INFO - done
2024-06-16 14:00:08,547 - INFO - creating dataframe
Traceback (most recent call last):
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/bin/gimme", line 12, in <module>
    cli(sys.argv[1:])
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/gimmemotifs/cli.py", line 755, in cli
    args.func(args)
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/gimmemotifs/commands/maelstrom.py", line 42, in maelstrom
    run_maelstrom(
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/gimmemotifs/maelstrom/__init__.py", line 192, in run_maelstrom
    counts = scan_regionfile_to_table(
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/gimmemotifs/scanner/__init__.py", line 180, in scan_regionfile_to_table
    df = pd.DataFrame(scores, index=idx, columns=motif_names, dtype=dtype)
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/pandas/core/frame.py", line 754, in __init__
    mgr = arrays_to_mgr(
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 123, in arrays_to_mgr
    arrays = _homogenize(arrays, index, dtype)
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 620, in _homogenize
    com.require_length_match(val, index)
  File "/storage/home/sam77/work/software/miniconda3/envs/gimme/lib/python3.10/site-packages/pandas/core/common.py", line 571, in require_length_match
    raise ValueError(
ValueError: Length of values (235020) does not match length of index (235653)

Installation information:

  • OS: Linux
  • Installation: conda
  • Version [e.g. 0.18.0]
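
For what it's worth, a check along these lines, run on the input before scanning, would have pointed me at the problem right away. This is only a sketch: the function names are hypothetical and not part of gimmemotifs, and it assumes the regions are available as plain strings such as `chr1:100-300`.

```python
from collections import Counter


def find_duplicate_regions(regions):
    """Return {region: count} for every region listed more than once."""
    counts = Counter(r.strip() for r in regions)
    return {region: n for region, n in counts.items() if n > 1}


def check_unique_regions(regions):
    """Raise a descriptive ValueError if any region is duplicated."""
    dups = find_duplicate_regions(regions)
    if dups:
        # Show at most five offending regions to keep the message short.
        examples = ", ".join(sorted(dups)[:5])
        raise ValueError(
            f"Input contains {len(dups)} duplicated region(s), "
            f"e.g. {examples}. Each region must appear only once."
        )


# Hypothetical input: the same region listed twice (with different labels).
example = ["chr1:100-300", "chr2:500-700", "chr1:100-300"]
print(find_duplicate_regions(example))  # {'chr1:100-300': 2}
```

Calling `check_unique_regions(example)` raises a `ValueError` that names the duplicated region, which is much easier to act on than the length mismatch above.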