-
Notifications
You must be signed in to change notification settings - Fork 18
/
CHANGELOG
430 lines (391 loc) · 21.8 KB
/
CHANGELOG
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
1.1.alpha19
-----------
* Closed GH-352: Showing percent identity in the CSV file.
1.1.alpha18
-----------
* Closed GH-346: Adding proper support for ambigious amino acid codes.
1.1.alpha17
-----------
* Closed GH-340: "malformed row in jplace" after pplacer warning "computed function value is infinite or NaN".
* Closed GH-339: Better error message concerning strange mass.
1.1.alpha16
-----------
* Closed GH-334: multiclass_concat.py modifies placement database such that classif_table.py returns accurate tallies.
1.1.alpha15
-----------
* Closed GH-320: Now compiles fine with Batteries 2.
* Closed GH-327: Ignore sequences that aren't in both jplace and sequence files.
* Closed GH-312: An improved script for managing classification results, called ``classif_table.py``; ``classif_rect.py`` is now deprecated.
* Closed GH-315: support ``rppr info`` on reference packages lacking a taxonomy
* Closed GH-316: pplacer warns rather than fails when an alignment looks like nucleotides
* Closed GH-317: ``pplacer --write-masked-out`` respects ``--out-dir``
* Closed GH-318: Apply alignment mask on a per-placement basis
* Closed GH-319: ``pplacer --write-pre-masked`` no longer exits after writing mask.
* Closed GH-322: Moved to Batteries version 2.1.0. Supports OCaml 4.01.0
* Closed GH-323: ``pplacer``, ``guppy``, and ``rppr`` support gzip-compressed alignment and ``.jplace`` files.
* Closed GH-324: ``deduplicate_sequences.py`` supports gzip-compressed FASTA files.
1.1.alpha14
-----------
* Closed GH-181: Added ``guppy indep_c``; added ``--leaf-values`` to ``guppy mft``.
* Closed GH-215: Added ``pca_for_qiime.py`` and ``extract_taxonomy_from_biom.py``;
changed guppy and rppr to allow BIOM files to be passed as arguments where
placefiles are allowed.
* Closed GH-241: On the NBC classifier of ``guppy classify``: renamed
``--target-rank`` to ``--nbc-rank``; added ``--nbc-rank auto``.
* Closed GH-248: Added premasking by default to the NBC classifier of ``guppy
classify``; correspondingly added ``--no-pre-mask``.
* Closed GH-250: Completely revamped classification.
* Closed GH-251: Added the hybrid5 classifier.
* Closed GH-256: Added ``--mmap-file`` to pplacer; changed ``--pretend`` in
pplacer to show the amount of memory which would be allocated.
* Closed GH-259: Added ``guppy mcl`` and a ``--mcl`` flag to ``guppy compress``.
* Closed GH-260: Added ``rooted_pd`` to ``guppy fpd``; Added a ``--variance`` flag
to ``guppy rarefact``.
* Closed GH-261: Added ``rppr convex_taxids``.
* Closed GH-262: Added ``--as-mass`` to ``guppy redup``; updated
``sort_placefile.py`` for the new placefile format; fixed pplacer to better
deal with internal edges on a tree with a branch length of 0.
* Closed GH-263: Fixed a bug that could sometimes cause failures in ``rppr
min_adcl`` with the PAM algorithm.
* Closed GH-265: Fixed a bug that could cometimes cause ``rppr reroot`` to
return a suboptimal rooting.
* Closed GH-266: Adding ``map_ratio`` and ``map_overlap`` columns to ``guppy
to_csv``.
* Closed GH-267: Added a ``leaf_count`` column to ``rppr convex_taxids``.
* Closed GH-268: Merged various minor fixes.
* Closed GH-273: Fix convexify for multifurcating trees
* Closed GH-274: Added ``--weight-as-count`` to ``guppy rarefact``.
* Closed GH-276: Added ``-k`` to ``guppy rarefact``.
* Closed GH-277: Added ``--weight-as-count`` to ``guppy rarefy``.
* Closed GH-278: Changed ``guppy rarefact`` to allow multiple placefiles to be
passed instead of just one.
* Closed GH-282: Added a ``--inflation`` flag to ``guppy mcl`` and ``guppy
compress``.
* Closed GH-286: Changed pplacer's ``--mmap-file`` to accept a path to either a
directory or a file.
* Closed GH-287: Added a more informative error message for the case of an
input alignment containing some, but not all reference sequence names.
* Closed GH-288: Added ``guppy placemat``
* Closed GH-290: Random tie breaking in NBC classifier
* Closed GH-291: Added qD(T) alpha diversity measure from Chao et. al., 2010
(doi:10.1098/rstb.2010.0272)
* Closed GH-293: Switched build system to use OPAM
* Closed GH-294: Added ``--unitize`` option to ``guppy squash``
* Closed GH-297: Support multiprocessing during squash cluster bootstrapping
* Closed GH-298: Faster upper-limit finding during integration
* Closed GH-299: Fix NaN likelihoods for empirical frequencies + unseen state
* Closed GH-303: Updated Batteries dependency to version 2.0.0
* Closed GH-305: Eliminate compilation warnings from CDD
* Closed GH-306: Random tie breaking is default in NBC
* Closed GH-307: Renamed ``--kappa`` to ``--theta``, ``awpd`` to ``bwpd`` in ``guppy fpd``
* Closed GH-309: Support overlap minimization (SOM) added to ``guppy epca`` (see also GH-221)
1.1.alpha13
-----------
* Fixed a bug where pplacer would sometimes fail abruptly with a Child_error
of EBADF on close(2). [rc2]
* Added missing ``-o`` and related flags to ``guppy to_csv``. [rc3]
* Fixed minor errors in the documentation [r2]
* Closed GH-216: Added ``guppy trim`` and ``guppy check``.
* Closed GH-225: Added ``rppr vorotree``.
* Closed GH-235: Added ``--rep-edges`` flag for the splitify subcommands.
* Closed GH-237: Added a multiplicity column to ``guppy diplac``.
* Closed GH-238: Added ``--epsilon`` for splitify commands to filter constant
edges out of the resulting splitify matrices.
* Closed GH-239: Added the ``pam`` algorithm to voronoi commands; cleaned up
voronoi full algorithm and implementation.
* Closed GH-240: Added a ``-o`` flag for pplacer.
* Closed GH-242: Renamed ECLD to ADCL; added --max-adcl to voronoi commands.
* Closed GH-243: Updated ``rppr reroot`` so it can run on reference packages
which have unannotated sequences.
* Closed GH-244: Updated ``rppr infer`` to handle sequences which could not be
reasonably placed anywhere on the tree.
* Closed GH-245: Updated code to allow refpkgs to be passed as .zip files as
well as directories.
* Closed GH-246: Added --always-include to voronoi commands.
* Closed GH-247: Added ``guppy unifrac``.
* Closed GH-249: Added ``guppy rarefy``.
* Closed GH-252: Added ``--all-alternates`` to ``rppr convexify``.
* Closed GH-253: Added ``--no-collapse`` to ``guppy diplac``.
* Closed GH-255: Added ``guppy to_csv``.
* Closed GH-257: Renamed ``rppr voronoi`` to ``rppr min_adcl``, ``rppr vorotree`` to
``rppr min_adcl_tree``, and ``guppy diplac`` to ``guppy adcl``. Added ``--seed`` to
``rppr min_adcl`` and ``rppr min_adcl_tree``.
* Closed GH-280: Fixed a bug in which pplacer would fail if passed -t and
-s flags without also passing a -c or -r flag.
1.1.alpha12
-----------
* Added ``guppy to_rdp`` and ``guppy classify_rdp``.
* Closed GH-113: Updated the ``best_classification`` and ``multiclass`` SQL views;
correspondingly changed the flags to ``rppr prep_db``; added the
``placement_evidence`` table for bayes classification.
* Closed GH-183: Added documentation for more functions.
* Closed GH-189: Added ``guppy pentropy``.
* Closed GH-191: Added -m to ``guppy redup``; bumped the jplace format version
to 3 with the ability to give each name for a pquery a different mass.
* Closed GH-192: Added ``rppr infer``.
* Closed GH-193: Added ``rppr reclass``.
* Closed GH-194: Added a ``mass`` column to ``placement_names`` to parallel new
features in the jplace format.
* Closed GH-195: Added a --groups flag to pplacer for placing metagenomic
sequences onto trees built from very wide alignments.
* Closed GH-196: Added code to pplacer to better handle errors raised when
calling fork(2).
* Closed GH-198: Added progress reporting code to ``guppy compress``.
* Closed GH-199: Renamed ``guppy pentropy`` to ``guppy entropy``; added quadratic
entropy calculation.
* Closed GH-200: Added --discard-below to ``guppy {islands,compress}``.
* Closed GH-201: Added a setup.py file to the scripts directory.
* Closed GH-202: Added ``rppr prepsim``; added --include-pendant to
``guppy diplac``.
* Closed GH-204: Added more columns to ``rppr reclass``.
* Closed GH-205: Added ``guppy error``.
* Closed GH-206: Updated some pplacer documentation.
* Closed GH-207: Added CSV output to ``guppy edpl``.
* Closed GH-208: Fixed the -o flag to reflect the actual file written to for
``guppy {sing,tog,edpl}``.
* Closed GH-209: Fixed a bug preventing Gcat_model refinement if the supplied
alignment had a different number of sites from the refpkg; added
--always-refine to pplacer.
* Closed GH-210: Merged ``guppy {pd,wpd,entropy}`` into ``guppy fpd``.
* Closed GH-212: Changed parsing error messages to be more specific.
* Closed GH-213: Added PyNAST support to refpkg_align.py.
* Closed GH-214: Added ``guppy ograph``.
* Closed GH-218: Added --split-csv to ``guppy merge``.
* Closed GH-219: Fixed some guppy commands which needed a zero root branch
length.
* Closed GH-220: Added error messages for some exceptions; fixed
``guppy rarefact``.
* Closed GH-222: Added a -c flag to ``guppy bary``.
* Closed GH-236: Fixed support for numeric leaf names when writing out XML.
1.1.alpha11
-----------
* Fixed a bug with parsing MAP identity from a jplace file.
* Fixed a bug that occurred sometimes when running ``rppr convexify`` on a tree
where not every leaf had a corresponding tax_id at a particular rank.
* Fixed a bug with fasttree placements which caused failure when the site
categories had to be calculated (ie. when using a combined reference + query
alignment)
* Changed versions to be parsed only out of git tags.
* Closed GH-42: Added --edpl flag to ``guppy fat`` for EDPL coloring in fat
trees.
* Closed GH-146: Added an --unweighted flag to ``guppy pca``.
* Closed GH-168: Changed guppy and rppr to list the subcommands in --help;
reformatted the options flags to be more readable.
* Closed GH-169: Fixed all commands that work on reference packages to use the
-c flag to specify the reference package.
* Closed GH-176: Fixed ``rppr reroot`` to handle cases where not all tax_ids are
specified under the rank the rerooting is happening at.
* Closed GH-177: Added --mass-gt and --mass-le flags to ``guppy filter``.
* Closed GH-178: Fixed ``guppy compress`` to add a zero root branch length.
* Closed GH-179: Changed the JSON emitter to be able to represent a wider
range of floats.
* Closed GH-182: Changed 'weighted' to 'spread' and 'unweighted' to 'point'
across all the guppy and rppr commands.
* Closed GH-184: Added a placement_median_identites table for guppy classify.
* Closed GH-188: Added support for a reference package to be able to specify
that some reference sequences are unclassified.
1.1.alpha10
-----------
* Closed GH-126: Split Glv/Model into Gmix_model and Gcat_model with the
option of adding more models to pplacer later.
* Closed GH-154: Added ``guppy islands``.
* Closed GH-156: Added functionality for calculating multiple KR distances
with only one tree traversal.
* Closed GH-157: Added ``guppy compress``.
* Closed GH-162: Added fig ranking to pplacer.
* Closed GH-163: Added a --node-numbers flag to every command that emits a
phyloxml tree which adds the tree's node numbers to the phyloxml output as
bootstrap values.
* Closed GH-172: Switched from using xml-light to xmlm.
1.1.alpha09
-----------
* Fixed a bug that was making classifications appear too specific when the
placement was on the edge just proximal to an MRCA.
* Fixed a bug wherein the Newick parser would add a '.' to the end of leaf
names if the name was an integer.
* In fixing a bug, made ``guppy diplac`` only emit the first name in a pquery in
its results.
* Fixed a bug that caused the trailing } in a JSON file to be omitted when the
file was large enough.
* Added a ``multiclass`` view and associated parameters to ``rppr prep_db``.
* Fixed a bug where sometimes placements that fell too close to the root would
vanish with the --map-identity flag on.
* Fixed identity to give 0% sequence identity in the case of no overlap
between sequences.
* Fixed a bug by flushing all output buffers before forking, so that the same
data isn't written multiple times.
* Fixed a bug in pplacer in which it wouldn't write the original reference
tree when creating a placefile.
* Fixed some issues with phyloxml generation.
* Closed GH-27: Reorganized functions that operate on alignments.
* Closed GH-67: Added ``guppy wpd``.
* Closed GH-72: Added new features to ``rppr convexify`` and fixed the
implementation of the convexify algorithm, as it cut too many leaves in some
corner cases.
* Closed GH-75: Changed mass maps to use records types instead of tuples.
* Closed GH-76: Added a check in ``guppy classify`` to ensure that the sqlite
database has been properly initialized.
* Closed GH-78: Added an --average flag to ``guppy fat``.
* Closed GH-81: Removed old code that was deprecated by batteries; changed
most code to be using batteries.
* Closed GH-82: Merged the ``pick`` branch into dev.
* Closed GH-84: Renamed .json to .jplace for new-format placefiles.
* Closed GH-85: Added better error messages for potential premasking failures.
* Closed GH-86: Dropped support for older versions of the reference package
format.
* Closed GH-88: Improved handling very small alphas in pplacer, working around
a problem with the GSL that wasn't allowing us to invert the CDF for the
gamma distribution.
* Closed GH-89: Fixed some bugs in the convexify algorithm.
* Closed GH-90: Added an error message for if pplacer is invoked with no
arguments.
* Closed GH-91: Added more documentation to .mli files.
* Closed GH-92: Added searching and alignment scripts.
* Closed GH-94: Improved the leaf selection criteria in ``rppr voronoi``.
* Closed GH-96: Now using colors from colorbrewer2.org for heat.
* Closed GH-98: Added a --map-identity flag to pplacer.
* Closed GH-99: Added more examples for guppy batchfiles.
* Closed GH-101: Added virtual placefiles to guppy batchfiles.
* Closed GH-103: Added batteries as a dependency.
* Closed GH-104: Added a naive reference implementation of convexify,
accessible with a --naive flag.
* Closed GH-105: Added parameter substitution to guppy batchfiles.
* Closed GH-108: Added more data from jplace files into the taxtable schema;
added a way to change the cutoff for best_classifications.
* Closed GH-109: Added a ``rppr info`` subcommand.
* Closed GH-110: Added a --no-early flag to ``rppr convexify``.
* Closed GH-111: Added a script to update reference packages to version 1.1.
* Closed GH-112: Improved some of pplacer's error messages.
* Closed GH-114: Changed MAP identity in placefiles to be stored flat instead
of nested.
* Closed GH-115: Updated the documentation to reflect changes to the SQL
schema in ``rppr prep_db``.
* Closed GH-116: Added --keep-at-most and --keep-factor flags to pplacer.
* Closed GH-117: Added the output of ``git describe`` to --version output.
* Closed Gh-118: Added a --quiet flag to guppy and rppr.
* Closed GH-119: Added a ``rppr reroot`` subcommand.
* Closed GH-120: Added a --rank-limit flag to ``rppr convexify``.
* Closed GH-122: Renamed ``rppr taxtable`` to ``rppr prep_db``.
* Closed GH-124: Rewrote the Newick parser.
* Closed GH-125: Added a --rooted flag to ``rppr convexify``.
* Closed GH-128: Added informative priors to pplacer integration.
* Closed GH-129: Added the ability for ``rppr convexify`` to convexify a tree
not in a reference package.
* Closed GH-130: Added ``guppy pd``.
* Closed GH-131: Changed pplacer to set some defaults with the -p flag.
* Closed GH-133: Added edge painting classification by default to pplacer and
a corresponding --painted flag to ``rppr ref_tree``.
* Re-closed GH-133: Added checks for the case when not all ranks are
represented by some sequence in the reference package.
* Closed GH-135: Generally improved error messages in pplacer, guppy, and
rppr.
* Closed GH-138: Removed the --transform flag from everything except ``guppy
mft``; removed the --unweighted flag from commands which don't respect it.
* Closed GH-139: Added a check to pplacer to ensure all query sequences have
at least some overlap with some reference sequence.
* Closed GH-140: Allowed pplacer to be run with -t and without -r or -c.
* Closed GH-141: Removed ``compare_placements.py``; added ``check_placements.py``;
documented the rest of the python scripts.
* Closed GH-143: Changed the --leaves argument to ``rppr voronoi`` to specify
the number of leaves to keep in the tree instead of the number of leaves to
prune.
* Closed GH-144: Changed pplacer to explicitly disallow unplaced sequences
from being written out; changed ``best_classifications`` and ``multiclass`` views
to yield NULL for rows where no placements match all the criteria.
* Closed GH-149: Updated the jplace version to 2; changed the Newick parser
and emitter to allow for the new format of trees with edge numbers specified
in braces instead of brackets.
* Closed GH-153: Changed ``guppy mft`` to not squash name lists to mass by
default.
* Closed GH-155: Changed pplacer to always check for invalid bases when
pre-masking sequences.
* Closed GH-159: Updated to use the v2 batteries API.
* Closed GH-164: Fixed some bugs related to various APIs not knowing how to
deal with node labels on non-leaves.
* Closed GH-167: Fixed an issue where ``rppr reroot`` wasn't updating the bark
of trees correctly.
1.1.alpha08
-----------
* Closed GH-46: Added a --timing flag to pplacer to show where time was spent
during placements.
* Closed GH-47: Added unit tests for gaussian kr distance.
* Closed GH-49: Added a more informative error message to pplacer if reference
sequences occur multiple times in a query file.
* Closed GH-52: Fixed padding in ``guppy squash``.
* Closed GH-58: Improved navigation for the documentation on guppy commands.
* Closed GH-59: Improved documentation for guppy commands using sqlite.
* Closed GH-60: Improved documentation for ``guppy classify``.
* Closed GH-65: Released alpha08.
* Closed GH-66: Added a FAQ to the documentation.
* Closed GH-68: Changed ``guppy info`` to show multiplicity.
* Closed GH-69: Changed ``guppy tog`` to use polytomy instead of a _ delimiter
joining sequence names.
* Closed GH-70: Added documentation for how to compile pplacer.
* Closed GH-73: Added a pre-masking feature to cut out irrelevant columns of
the alignment.
* Closed GH-74: Improved how pplacer dedupes sequences before a placerun.
* Added a script to sort placefiles for easier visual diffs.
* Added more informative error messages when the length of the query alignment
and reference alignment don't match.
* ``guppy edpl`` now calculates the EDPL (expected distance between placement
locations) for placements
1.1.alpha07
-----------
* Fixed a serious issue with any command that used a ``--prefix`` flag.
1.1.alpha06
-----------
* Closed GH-20: efficiency of kr_distance evaluated and approved.
* Closed GH-44: improved sqlite3 support in guppy.
* Closed GH-45: better consistency of output flags across all guppy commands.
* Closed GH-48: better error reporting when loading json placefiles.
* Closed GH-50: fixed the output of ``guppy classify``.
* Closed GH-51: only parsing the first word as a sequence name in a fasta file.
* Renamed the ``--normal`` flag for ``guppy kr`` to ``--gaussian`` to avoid confusion
with normalization.
* Only one shuffle at a time is now stored in memory (big savings, obviously).
* Fixed bug that was throwing off ``guppy kr`` shuffling significance estimation.
* ``guppy pca`` now defaults to scaling eigenvalues to percent variance; now has
a --raw-eval flag to specify otherwise.
* Fixed all of the sequence parsers to be tail-recursive, so parsing large
files no longer causes segfaults.
* Added a ``guppy redup`` command for re-adding duplicate sequences to
placefiles generated from deduplicated sequence files.
1.1.alpha05
-----------
* Closed GH-11: options for the tree-generating subcommands are more uniform.
* Closed GH-22: added more unit tests for ``guppy kr``.
* Closed GH-32: using power iteration rather than full
eigendecomposition. Leads to 60x speedup for PCA on demo data.
* Closed GH-35: added a remark in the documentation about multiple comparison.
* Closed GH-41: added docs for the json placefile format.
* Added ``guppy splitify``. Spits out differences of mass across splits.
* Changed default behavior of ``guppy kr`` to not normalize, and
correspondingly added in a ``--normalize`` command line flag.
* Improved documentation.
* Moved ``guppy heat`` to ``guppy kr_heat``.
* The ``guppy heat`` subcommand now maps an arbitrary vector to a phylogenetic
tree contained in a reference package.
* Removed redundant ``classic`` subcommand from guppy.
* Changed ``guppy round`` to use the new placefile format.
* Added support for posterior probabilities in json placefiles.
* Removed csv and old-placefile-format generation from pplacer.
* Added the ability to run pplacer in multiple processes in parallel with the
-j flag.
* Removed friend-finding from pplacer, as it was superseded by some of the
multiprocessing code.
1.1.alpha03 to 1.1.alpha04
--------------------------
* Closed GH-12: removed Placement.classif, and renamed
Placement.contain_classif to Placement.classif.
* Closed GH-29: added ``guppy merge`` to merge multiple placefiles.
* Closed GH-30: accepting \r, \n, and \r\n as delimiters everywhere in pplacer
and guppy.
* Closed GH-36: added ``guppy {classic,tog,sing,filter}``. The former three were
originally a part of placeviz, and the lattermost filters placefiles to only
a specific subset of their placements.
* Closed GH-37: added a trailing newline to the end of json files.
* Closed GH-38: added this changelog.
* Bugfix in ``guppy classify``. From the commit:
Previously, a file was being opened and closed for every pquery in every
placerun. This resulted in only the last pquery being classified and
written out. Now only one file is being opened for each placerun.