evn.diff
328d327
< Updated ignore patterns. 2019-10-23T18:40:46+00:00
335d333
< ignore *.fomabin. 2019-10-08T06:35:05+00:00
337,340d334
< ign 2019-10-07T21:32:11+00:00
< ign 2019-10-07T21:15:15+00:00
< ign 2019-10-07T21:13:09+00:00
< Force unix line endings, to make sure it works ok also on the Windows subsystem for Linux. 2019-10-07T17:16:53+00:00
350d343
< Updating svn ignores for tools/analysers/. 2019-06-14T06:38:51+00:00
357,358d349
< Updating svn ignores. 2019-05-24T09:55:04+00:00
< Updating svn ignores. 2019-05-24T09:44:55+00:00
369d359
< Updated svn ignores. 2019-02-27T10:18:02+00:00
381,382d370
< Ignore compiled cg3 files in tools/tokenisers/. 2019-01-08T07:08:34+00:00
< Ignore more files, including files that are automatically added to svn when populating a new language. This is done to avoid them showing up as noise for external languages, in which case these files might not be in our svn (but in the external svn repo instead). 2019-01-08T06:55:51+00:00
394,395d381
< ignore for bin 2018-10-14T13:31:01+00:00
< added korp.cg3 to svn ignore. 2018-10-14T12:56:20+00:00
416,417d401
< svn ignore update 2018-09-20T08:44:05+00:00
< updated svn ignore. 2018-09-20T08:28:11+00:00
421d404
< More general ignore pattern for tools/mt/apertium/tagsets/. 2018-09-10T11:16:40+00:00
424d406
< Updated svn ignore patterns. 2018-09-08T05:26:27+00:00
434d415
< Updated svn ignores. 2018-08-30T16:00:09+00:00
437d417
< Updated svn ignores. 2018-08-29T05:25:34+00:00
439d418
< Updating svn ignores. 2018-08-28T10:47:06+00:00
459d437
< More things to ignore. 2018-05-14T10:33:30+00:00
473,476d450
< Added ignore pattern for in.txt 2018-03-01T07:09:50+00:00
< More ignores 2018-03-01T06:52:33+00:00
< More svn ignores. 2018-03-01T06:25:59+00:00
< Added svnignore pattern for sigma.txt. 2018-02-21T09:49:57+00:00
479d452
< Two more files to ignore. 2018-02-06T09:44:18+00:00
490d462
< Updated svn ignores. 2018-01-31T12:13:59+00:00
523d494
< Updated svn ignores. 2017-12-11T12:55:46+00:00
544,545d514
< Updated svn ignores for tokenisers and grammar checkers + subdirs. 2017-10-11T11:47:18+00:00
< Updated svn ignores for tokenisers and grammar checkers + subdirs. 2017-10-11T11:22:45+00:00
557d525
< Updating svn ignores. 2017-08-25T10:22:58+00:00
571,572d538
< Updated svn ignores. 2017-06-28T23:37:25+00:00
< Updated svn ignores. 2017-06-28T23:08:42+00:00
579d544
< ign 2017-03-21T19:49:19+00:00
591d555
< Updated svn ignores. 2017-03-01T12:02:48+00:00
607d570
< Updated svn ignores. 2017-01-30T10:04:48+00:00
675d637
< Updated svn ignores. 2016-06-09T20:11:13+00:00
694d655
< Setting svn ignore patterns on tools/spellcheckers/filters/. 2016-05-10T01:00:11+00:00
715d675
< Ignore more preprocessor files = fst’s. 2016-04-14T16:01:04+00:00
719d678
< Updated svn ignores. 2016-03-15T19:54:49+00:00
722d680
< Use a more general svn ignore pattern in src/morphology/. 2016-03-07T17:10:12+00:00
743d700
< Updated the svn ignore property for recent changes in the infrastructure. 2016-02-16T22:36:51+00:00
748d704
< Updating svn:ignore’s. 2016-02-02T15:34:45+00:00
753,754d708
< Updated svn:ignore’s. 2016-02-02T10:33:44+00:00
< Updated svn:ignore’s. 2016-02-02T10:16:28+00:00
758d711
< Updated svn ignores. 2016-01-25T08:11:45+00:00
771d723
< Updated svn:ignore’s. 2015-11-18T23:05:40+00:00
786d737
< Updated svn ignores. 2015-10-20T07:50:38+00:00
812d762
< Ignore temporary files generated by the speller suggestion test script. 2015-09-03T04:23:51+00:00
848a799,800
> symlink in langs, not in startup-langs. 2015-04-23T14:28:19+00:00
> removing symlink 2015-04-23T14:27:12+00:00
850c802,997
< moving evn from startuplangs to langs. 2015-04-23T11:39:16+00:00
---
> [Template merge - langs/und] Finally got all weighting to work as intended, including the no-sugg weights. 2015-04-23T06:53:23+00:00
> typo 2015-04-21T13:13:31+00:00
> [Template merge - langs/und] Further modularisation and improvements to weighted spellers. With hfst3 revision 4329, using a tab-separated tag reweighting file is working. 2015-04-21T11:47:05+00:00
> testfile 2015-04-21T09:43:04+00:00
> Improving the automaton. 2015-04-21T09:42:45+00:00
> docy 2015-04-19T17:08:52+00:00
> Added archiphoneme {R} for present tense (r)A, and 5 rules to govern its behaviour. They still do not, so return to this. Also added testcases revealing the sad state. 2015-04-19T17:08:29+00:00
> Started working on the verbs. Vowel harmony for variable present tense marker problematic (still not working). 2015-04-19T17:07:13+00:00
> added l and r to the left context of D:д. 2015-04-17T02:55:15+00:00
> docu 2015-04-17T01:28:32+00:00
> short/long. Note the long ones are two characters. 2015-04-17T01:17:52+00:00
> pairs 2015-04-17T01:14:52+00:00
> The spellrelaxer now works. The problem was that this fst had length as two characters, but the spellrelaxer wrote the vowels fused. Tuning and normativity still needed, but now we may type in unmarked forms. 2015-04-17T01:14:22+00:00
> [Template merge - langs/und] Do not remove usage tags when building spellers, speller tags were throwned out. 2015-04-16T09:55:55+00:00
> corp 2015-04-16T09:45:35+00:00
> Possessive plural, only nominative correct. 2015-04-16T09:45:05+00:00
> newfiles 2015-04-16T09:44:31+00:00
> additions 2015-04-16T09:44:09+00:00
> [Template merge - langs/und] Added an attempt at normalising the corpus-based weights towards a standard max upper weight, to allow a much higher weight for strings not to be suggested. Also split the processing of adding corpus-based weights and morphology weights into more steps - retaining each intermediate fst - to allow easier debugging of the weight assignments. 2015-04-16T08:24:24+00:00
> corrected 2015-04-16T03:41:39+00:00
> spellrelaxing long vowels. 2015-04-16T03:28:25+00:00
> long vowels. 2015-04-16T03:27:50+00:00
> lemma with long vowel sign. 2015-04-16T03:27:11+00:00
> Новые слов. 2015-04-16T01:15:28+00:00
> speller: Evenki as Belorussian. 2015-04-14T00:44:08+00:00
> corrected the spellrelax to the Evenki practice of non-precomposed in the code. 2015-04-14T00:43:30+00:00
> commented-in-nouns 2015-04-14T00:42:28+00:00
> ref to pron 2015-04-14T00:34:22+00:00
> personal and reflexive pronouns. 2015-04-14T00:33:05+00:00
> [Template merge - langs/und] Xerox composition of weights and lexical fst. 2015-04-13T12:50:00+00:00
> to make it compile, more to come. 2015-04-12T07:35:45+00:00
> updates 2015-04-12T03:42:02+00:00
> formatting 2015-04-11T18:45:36+00:00
> Evenki work in progress. 2015-04-11T18:06:36+00:00
> nouns 2015-04-11T09:07:02+00:00
> no noun failures. 2015-04-11T09:06:31+00:00
> docu 2015-04-11T09:05:31+00:00
> Vowel harmony, 3 noun classes, tags. 2015-04-11T09:05:08+00:00
> [Template merge - langs/und] Moved a script for cleaning weighting corpus to the core. Require new core. 2015-04-10T12:41:19+00:00
> [Template merge - langs/und] Fixed bugs related to the new support for frequency-weighted spellers: missing checks for required tools. 2015-04-10T11:14:50+00:00
> [Template merge - langs/und] Stupid copy-paste error turned the positive test into a negative. Now corrected. 2015-04-10T10:20:48+00:00
> [Template merge - langs/und] Skip Xerox testing if no test data is found. Added comments. 2015-04-10T10:02:12+00:00
> [Template merge - langs/und] Added pair-test for hfst, improved pair-testing for Xerox' twolc. 2015-04-10T09:01:40+00:00
> [Template merge - langs/und] Add a huge weight to words tagged with +Use/SpellNoSugg. 2015-04-09T11:41:39+00:00
> [Template merge - langs/und] Added support for corpus-based (frequency) weighting of the speller fst's. Also reorganised where to specify the tag-based weights (and this is subject to change pending a bug fix in hfst-reweight). All languages are given a toy corpus, which can be replaced with a real one. This is finally the core of Tommi's dissertation applied to all languages. 2015-04-09T09:48:58+00:00
> pron 2015-04-07T21:57:21+00:00
> Evenki ported from Apertium, thanks to Francis, a first start. 2015-04-07T19:48:34+00:00
> [Template merge - langs/und] More robust testing for Xerox fst's - will properly report all generation fails. 2015-04-07T11:58:45+00:00
> [Template merge - langs/und] Corrected tests for nouns and propernouns. Now nouns behave correctly with hfst, and proper nouns have correct tags. 2015-04-07T06:59:56+00:00
> [Template merge - langs/und] Modernised the generate-noun-lemmas.sh.in script, added similar scripts for adj, proper nouns and verbs. 2015-04-02T06:27:37+00:00
> [Template merge - langs/und] Check that yaml testing is enabled before running yaml tests in test/tools/. 2015-04-01T11:17:14+00:00
> [Template merge - langs/und] Require new version of the core, updated comments about Err tags. 2015-03-30T10:14:05+00:00
> +Err/Sub -> +Err/Orth, according to the meeting 19.3., with notes in http://divvun.no/doc/lang/common/ErrorTags.html. Commandos used for the tag change: 2015-03-29T09:34:21+00:00
> [Template merge - langs/und] Removed CmpNP tags from downcase-derived-proper-strings.xfscript.in. 2015-03-19T07:36:08+00:00
> [Template merge - langs/und] When doing 'make clean', remove generated html files in the root dir. 2015-03-14T10:31:14+00:00
> [Template merge - langs/und] Removed multichar definition of superfluous flag diacritics. 2015-03-13T13:57:48+00:00
> [Template merge - langs/und] Added a new directory named devtools/ to each language, with the idea that it should contain tools useful for development, but not necessarily suitable for automake testing. Initially it contains shell scripts to generate a table of generated word forms for each continuation lexicon. 2015-03-13T11:21:36+00:00
> [Template merge - langs/und] Removed corpus names from tools/spellcheckers/fstbased/hfst/data/Makefile.am. It caused the build to stop with an error for all languages except FIN. 2015-03-12T08:50:49+00:00
> [Template merge - langs/und] Make building the abbr.txt configurable (default=no), check for the existence of src/morphology/stems/abbreviations.lexc, and error out if not found. 2015-03-12T06:31:56+00:00
> [Template merge - langs/und] Forgot to include the new Makefile (r109076) in configure.ac. 2015-03-12T05:49:15+00:00
> [Template merge - langs/und] Preparations for supporting corpus-based frequency weights, as per TommiP. 2015-03-11T17:45:49+00:00
> [Template merge - langs/und] Enabled weighting of speller fst's. Adjust weights and tags as needed. 2015-03-11T13:26:56+00:00
> [Template merge - langs/und] Added support for all languages to generate the abbr.txt file used by $GTCORE/scripts/preprocess. At the same time added initial support for compiling pmatch scripts into fst's for hfst-proc2, which is the future alternative to preprocess. 2015-03-09T15:13:06+00:00
> [Template merge - langs/und] Forgot to remove some debug statements from the yaml test runner. Now cleaned. 2015-03-06T16:32:05+00:00
> [Template merge - langs/und] Moved MWE tag processing into the core - we want this for many languages. 2015-03-06T15:30:22+00:00
> [Template merge - langs/und] Added support for a new type of yaml tests: speller acceptance testing. The basic idea is to just give a list of words and word constructions (compounds, derivations, etc) the speller should accept or reject, and let the yaml test bench verify whether this is actually the case. 2015-03-06T13:29:54+00:00
> [Template merge - langs/und] Several changes to properly support all position-based +CmpN/XX tags: * moved tag path splitting and tag-to-flag conversion into separate regex files in the core. * added support for compiling and using the new regexes * added support for a new type +CmpN/Suff * added the required multichar symbols to the root.lexc files * increased required core version number 2015-03-05T15:48:07+00:00
> [Template merge - langs/und] Fixed a bug in the yaml test bench when both hfst and xfst was enabled, but where only one type is built, e.g. for Apertium. 2015-03-05T14:12:35+00:00
> [Template merge - langs/und] Added build support for alternate orthographies: default fst's, dicts and oapha. 2015-03-04T15:51:40+00:00
> [Template merge - langs/und] Fixed a bug that caused the wrong fst to be picked in certain cases, which caused the test script to fail. 2015-03-04T10:58:15+00:00
> [Template merge - langs/und] A couple of changes related to testing: * require Python 3.3+ * require new gtcore * update YAML test runner to make SMS testing work as intended also with Xerox 2015-03-03T16:42:49+00:00
> [Template merge - langs/und] Added support for country/region specific proofing tools in configure.ac. 2015-03-02T13:16:18+00:00
> [Template merge - langs/und] We do not support anything but the latest/newest Voikko now. 2015-02-27T12:44:31+00:00
> [Template merge - langs/und] Finalised the basic multiple writing system support, by adding support for Oahpa and dictionary fst's. 2015-02-27T11:23:58+00:00
> [Template merge - langs/und] Added a configuration flag to enable two-step compose-intersect. In most cases this will not make any difference, but for some languages it will correct a bug in compose-intersect that would otherwise create a bad fst, and for other languages it will make the operation much slower without changing the fst. Disabled by default, whether it is useful must be tested in each case / language. 2015-02-27T09:21:50+00:00
> [Template merge - langs/und] Corrected errors in hfst compilation of alternative writing system fst's. 2015-02-26T18:08:55+00:00
> [Template merge - langs/und] Added test runners for genation and analysis tests only for the descriptive fst. 2015-02-26T12:39:20+00:00
> [Template merge - langs/und] Compilation of the default set of fst's with alternate writing systems working. 2015-02-26T08:41:42+00:00
> [Template merge - langs/und] First step in adding support for alternate writing systems and orthographies: adding variables to configure.ac. Removed the variable LO_min_version, it isn't used. 2015-02-25T06:54:29+00:00
> [Template merge - langs/und] Split the m4/ax_python_module.m4 file, it contained mostly java autotools stuff. Improved the message to update the gtcore. 2015-02-24T15:20:24+00:00
> [Template merge - langs/und] Added the make-optional-hyph-tags filter to the generators. Fixes bug #1914. 2015-02-12T22:51:16+00:00
> [Template merge - langs/und] Make use of the new remove-adv_comp filter. Require new core and newest hfst. 2015-02-12T15:09:33+00:00
> [Template merge - langs/und] Put to use the make-optional-adv_comp filter. 2015-02-12T11:34:03+00:00
> [Template merge - langs/und] Don't build xerox fst's within the Apertium dir tree - no need for it. 2015-02-12T11:21:36+00:00
> [Template merge - langs/und] Require new core because of new filters. Use hfst-optimized-lookup in the yaml testing (but only if possible), this should speed up hfst testing quite a bit. 2015-02-11T21:49:36+00:00
> [Template merge - langs/und] Put the new optional minip filter to use, and increased required gtcore version. 2015-02-11T04:54:18+00:00
> [Template merge - langs/und] Replaced all instances of sub and lexsub filters with the new, generated error filters. 2015-02-10T20:56:58+00:00
> [Template merge - langs/und] Added support for extracting error tags and constructing filters for manipulating error strings and tags. Updated required version of gtcore. 2015-02-10T14:43:26+00:00
> [Template merge - langs/und] Remove variant tags in disamb analyser. 2015-02-10T13:49:58+00:00
> [Template merge - langs/und] Xerox fst's are irrelevant to Apertium, don't even try to build them. 2015-02-10T09:15:35+00:00
> [Template merge - langs/und] Use the new make-optional-v1-tags filter for apertium generators. 2015-02-09T19:52:03+00:00
> [Template merge - langs/und] Forgot to include the new regex in the src file listing in the previous commit. 2015-02-09T18:05:34+00:00
> [Template merge - langs/und] Corrected dictionary generators to require a variant tag except for +v1, which is optional. 2015-02-09T17:43:07+00:00
> [Template merge - langs/und] Removed 'invert net' from a couple of more instances. 2015-02-09T17:01:44+00:00
> [Template merge - langs/und] Treat Hfst and Xerox the same during *tmp.Xfst and *.Xfst build - invert both only in the last step when going from tmp to non-tmp fst (invert the analyser for hfst, the generator for xfst). This should remove one more confusing difference between the two. 2015-02-09T15:34:18+00:00
> [Template merge - langs/und] Check that we have at least Python3.1 when enabling Apertium, error out if not. Also add AM check for hfst-optimized-lookup. 2015-02-09T14:47:56+00:00
> [Template merge - langs/und] A small, functionally equivalent change: from suffix rule to pattern rule. 2015-01-30T16:52:02+00:00
> [Template merge - langs/und] Now +CmpN/Pref is correctly supported (earlier it was treated as +CmpN/First). 2015-01-30T13:01:24+00:00
> [Template merge - langs/und] Corrected fst file reference in test shell script. 2015-01-29T20:06:18+00:00
> [Template merge - langs/und] Corrected source file reference in test shell script. 2015-01-29T19:07:41+00:00
> [Template merge - langs/und] Changes to a couple of Makefile.am files to fix issues with 'make dist'. 2015-01-29T09:52:41+00:00
> [Template merge - langs/und] The last part of the CmpN location restriction flag diacritics added. 2015-01-29T08:01:20+00:00
> Makefile.in does not belong in svn. 2015-01-28T08:57:33+00:00
> [Template merge - langs/und] Code cleanup: no use for the M4 part - the null alternative did not work. 2015-01-27T20:11:06+00:00
> [Template merge - langs/und] Finally nailed all combinations of fst compilator and lexicon minimisation - now downcasing of derived proper nouns is working as it should again for both Xerox, Hfst hyperminimised and Hfst normal lexc compilation. 2015-01-27T15:38:42+00:00
> [Template merge - langs/und] +CmpN/Only supported, first steps in tag splitting taken. 2015-01-27T13:15:02+00:00
> [Template merge - langs/und] Moved code common to all yaml testrunner shell scripts to an include file in GTCORE to avoid code duplication and reduce the risk for introducing bugs. This requires the newest version of the CORE. Because of the inclusion, I had to rename the test runner to .sh.in, and added autoconf processing of it. Also added a test file for testing the base speller fst (it must be tailored to each language of course). 2015-01-26T15:33:56+00:00
> [Template merge - langs/und] Last change to get hyperminimisation to produce the correct output: made the derived-proper downcase script being processed by autoconf, so that we can require a symbol in a certain context, and at the same time in the end let the symbol be empty if not needed. 2015-01-22T11:09:19+00:00
> [Template merge - langs/und] Added optional flag diacritic inserted by Hfst hyperminimisation. This resolves the remaining cases of errors after the hfst team fixed a bug in lexc compilation with hyperminimisation turned on. Since it is optional, it does not make any harm when using Xerox or when not using hyperminimisation. 2015-01-22T08:42:30+00:00
> [Template merge - langs/und] Added xerox variable flag-is-epsilon to the tag reorder regex. This fixes most of the cases of errors after the hyperminimisation bug was fixed in hfst-lexc. The remaining errors must be fixed in the downcase-derived-proper regex. 2015-01-22T07:21:19+00:00
> [Template merge - langs/und] Added more silent builds for hfst tools. 2015-01-20T20:37:55+00:00
> [Template merge - langs/und] Added conversion of tags to flag diacritica for position-restricting tags. These are currently used in sma, sme, smj and sje. Together with some additions to the R lexicon, the tags will finally do what they are meant to do for hfst-based spellers. 2015-01-20T09:16:17+00:00
> [Template merge - langs/und] Added Multichar symbol definitions for flag diacritica controlling compounding based on position tags. Done for most langs, the symbols will be ignored if not used. 2015-01-19T21:29:59+00:00
> [Template merge - langs/und] New: added example test file for the fstspeller fst file (starting point for foma and hfst spellers). 2015-01-14T18:15:21+00:00
> [Template merge - langs/und] Fixed: errors in the yaml test runner when the fst has a suffix 'hfst'. 2015-01-14T18:04:59+00:00
> [Template merge - langs/und] Fixed: directory and fst names in the yaml runner shell script. 2015-01-14T17:28:34+00:00
> [Template merge - langs/und] Added support for yaml tests for speller fst's. 2015-01-14T17:08:17+00:00
> [Template merge - langs/und] Added support for Xerox fst's in tools/spellcheckers/fstbased, mainly to help in debugging hfst. Turned out to be very useful. 2015-01-13T23:44:01+00:00
> [Template merge - langs/und] Improved comments to make the lemma generation script easier to adapt. 2015-01-13T15:42:21+00:00
> [Template merge - langs/und] Additions to generate the inverted fst's, to enable symmetric yaml testing. 2015-01-13T06:24:21+00:00
> [Template merge - langs/und] Fixed: order of filter application was wrong, causing all Use/-Spell forms to be included in the spellers. 2015-01-12T21:41:21+00:00
> [Template merge - langs/und] Make sure the easter egg is rebuilt every time the fst is rebuilt. 2015-01-09T10:15:40+00:00
> [Template merge - langs/und] Fixed: The MacVoikko target contained one subtarget that built even when spellers were not enabled, and thus failed because of a missing dependency. 2015-01-08T10:08:05+00:00
> [Template merge - langs/und] A number of changes to make the MacVoikko.service build cleanly with proper dependency tracking. Also a bit safer cleaning. 2015-01-07T11:10:56+00:00
> [Template merge - langs/und] Fixed: The MacVoikko target was missing from noinst_DATA, thus it was not built. 2015-01-07T09:42:25+00:00
> [Template merge - langs/und] Two template merges: * Added initial support for building language-specific macosx systemwide spellers. [r105104] * Added strip function to get rid of extra spaces, resolves bug in abbr.txt build. [r104185] 2015-01-05T16:11:58+00:00
> [Template merge - langs/und] Expanded the source file base for building the abbr file, more like the old infra. Included lexc files in src/morphology/ in the abbr file making. 2014-11-19T11:45:17+00:00
> [Template merge - und] Only delete (aka 'make clean') generated corpus files used for weighting if such files exist. Removes a very dangerous 'rm -rf .*' command. 2014-11-18T07:37:22+00:00
> [Template merge - und] Fixed bug in the phonology building that caused extra source files not to be compiled. 2014-11-14T10:19:32+00:00
> [Template merge - und] Removing Use/LexSub strings from all normative fst's. Fixes bug #1904. 2014-11-11T09:48:28+00:00
> [Template merge - und] Added support for turning off building of vislcg3/syntactic tools. 2014-11-04T10:55:58+00:00
> [Template merge - und] Improvements and corrections in the README file. 2014-10-28T23:23:39+00:00
> [Template merge - und] Changed Hfst configuration: * moved xerox check before hfst check to ... * automatically enable hfst if the Xerox tools are not found * moved minimum version requirement definition to configure.ac * removed hfst-foma requirement, instead checking for all required tools * removed path check for obsolete hfst tools * improved hfst configuration messages * updated the summary text to reflect that hfst is automatically enabled These changes should ease configuration on systems without Xerox. 2014-10-28T22:27:33+00:00
> [Template merge - und] Corrected names of compiled twolc files in test/src/phonology/pair-test*.sh.in. We need to use the 'compose' fst because compiled twolc files are not treated the same as other fst's. We can't just skip the new lookup-friendly filenames either, because morphophonological rules can be written using xfscript, in which case the lookup renaming (and inversion) is essential. 2014-10-27T15:11:42+00:00
> [Template merge - und] Corrrected references to the new lookup style fst names. Fixes broken inituppercase tests. Updated config header in initcap yaml file correspondingly. 2014-10-23T09:15:10+00:00
> [Template merge - und] Now both general and language-pair specific relabelling using regexes are supported, in addition to using relabel files. The regexes allow context-dependent and multisymbol changes, whereas the relabel files only cover 1:1 mappings of single symbols. The actual change was to add support for regex files in the language-pair independent processing. The tools/mt/apertium/tagsets/README.txt file was more or less completely rewritten to better document the filenames being recognised, and how they should be used. 2014-10-23T06:15:55+00:00
> [Template merge - und] Retain the regular non-optimised hfst analyser for easy paradigm generation using a regex plus composition. 2014-10-21T11:02:19+00:00
> [Template merge - und] Fixed a bug in the Apertium build that blocked building of AP-tagged analysers. 2014-10-20T19:14:52+00:00
> [Template merge - und] Make sure there is always an apertium analyser for 'und' if nothing else. 2014-10-20T14:41:04+00:00
> [Template merge - und] Do not remove homonymy tags from the apertium fst's. Also simplified the automatic conversion by moving all non-automatic changes to a separate file, run as a sort of tag conversion postprocessing. Updated the tagset/README.txt file to contain info aobut the manually maintained postprocessing relabel file. Added an initial postprocessing relabel file containing word boundary and homonymy tag changes. 2014-10-14T19:22:50+00:00
> [Template merge - und] Do not remove homonymy tags from the regular analysers. 2014-10-14T14:39:21+00:00
> [Template merge - und] Fixed a bug in building Oahpa generators - orig-lang tags were not removed. Clean *.hfstol files in tools/mt/apertium/. 2014-10-14T11:15:36+00:00
> [Template merge - und] Moved Apertium tagset creation and relabeling from src/tagsets/ to tools/mt/apertium/tagsets/. This should fix building of apertium fst's for fin, smn. 2014-10-10T14:28:42+00:00
> [Template merge * 2 - und] 1) Improved test for gnu awk. 2) Renamed AWK to GAWK in relevant places to get around another AWK test. Now gawk is found properly in all cases. 2014-10-09T11:02:07+00:00
> [Template merge - und] Require newest core to force people to upgrade to get an important bugfix. 2014-10-07T14:07:39+00:00
> [Template merge - und] Fixed a bug in the core for generated regexes - a reserved char was not escaped. Required core version bumped. 2014-10-06T07:34:07+00:00
> [Template merge - und] Hfst 3.8.0 is out, with a number of important bug fixes and improvements, including new options required to make our code build properly. 2014-10-04T07:16:41+00:00
> [Template merge - und] Several changes to accomodate a downcaseerror variant of the L2 error fst for Oahpa: * added configure.ac option --enable-downcaseerror (independent of the L2 opt) * a number of changes to the build instructions for Oahpa to support the new fst * made the error fst compilation independent of whether an L2 twolc/xfscript file is used - if not, it will just use the ordinary twolc/xfscript file. This way it is possible to build a downcaseerror fst without starting L2 development. * svn-copied regexes from the old to the new infra, including to the core * increased gtcore version number and required version number due to new regexes 2014-10-02T14:34:27+00:00
> [Template merge - und] Corrected wrong filenames and file references that blocked the oahpa L2 build. 2014-10-02T06:39:24+00:00
> [Template merge - und] Tagset relabeling didn't work for xfst files, now it does. Also generalised the use of relabel files (for use with hfst-relabel). 2014-10-02T06:24:06+00:00
> [Template merge - und] Simplified the building of hfst's with alternative tagsets, now that the *.hfst files are not in optimised lookup format. Silenced regex compilation. 2014-10-01T14:44:20+00:00
> [Template merge - und] Last part of the lookup & composition cleanup: phonetics and phonology now covered. Now all non-lexical and non-filter files have a suffix .compose.* or .lookup.* depending on their intended use, and they are all properly inverted where needed (i.e. only for Xerox' lookup tool). There might still be source files to clean, but that is a separate step. 2014-09-30T22:55:10+00:00
> [Template merge - und] Corrected a couple of cases where old filenames were still used, and thus broke compilation. Also improved filtering of transcriptors, and constructed transcriptor target names dynamically based on the source files. 2014-09-30T21:30:41+00:00
> [Template merge - und] Xfscript and lookup cleanup: now we explicitly build files made for lookup and composition marked in the filenames. This is done for hyphenation and for orthography, phonology and phonetics still to be done. From now on there should be no need to use invert as part of the xfscript code - DON'T DO IT! All targets updated to use the new filenames. Removed inversion from the hyphenation xfscript. 2014-09-30T19:54:37+00:00
> [Template merge - und] Use explicit pipe mode with hfst-xfst. 2014-09-30T10:50:52+00:00
> [Template merge - und] Moved Apertium target language specification from configure.ac to tools/mt/apertium/Makefile.am. Changed the target filename construction to better follow the Apertium naming scheme. Fixed a bug introduced about four weeks ago that destroyed the dependency chain (due to a bug/fragileness in GNU make). 2014-09-30T10:37:59+00:00
> [Template merge - und] Cleaned up building of target fst's using the lookup-include.am file. Now all hfst transducers in optimised lookup format have the suffix .hfstol, and optimisation should not be hidden or implisit anymore. All test scripts should be updated as well. Also move all common targets from src/Makefile.am to am-shared/src-dir-include.am and sub-included AM files. This cleans up the src/ dir Makefile.am quite a lot. 2014-09-29T21:22:26+00:00
> [Template merge - und] Added support for additional local lexc files not part of the lexical fst. 2014-09-29T11:51:44+00:00
> [Template merge - und] Several changes to clean up the mess with the transcriptors: * moved transcriptor final builds from src/ to src/transcriptions/ * renamed transcriptor source files and targets * streamlined transcriptor compilation to use lexc-include and lookup-include * also silenced xfst in lookup-include.am 2014-09-27T10:35:34+00:00
> [Template merge * 2 - und] There were a couple of issues in the previous commit: * vpath directive didn't work reliably * L1 and L2 variabless were declared for easy merging, but in a way that AM didn't like * forgot to change the name of the lexical fst in the filter processing 2014-09-24T12:17:05+00:00
> [Template merge - und] Added support for filters written in lexc and xfscript. Renamed variables and added a lexc-include.am file to support general lexc compilation. 2014-09-22T12:19:48+00:00
> [Template merge - und] Fixed an unfortunate AM syntax error that blocked Automake, and thus all builds. 2014-09-20T10:04:31+00:00
> [Template merge - und] Three template updates at once: * Cleaned the filter build files even more. Now only local / language specific regex source files need to be listed in the local Makefile.am. * Fixed a problem with MT filter compilation that only revealed itself in sme. * Another filter build cleanup: all filter regexes in core are now built for all languages. One obsolete filter was removed. 2014-09-18T17:43:37+00:00
> [Template merge - und] Added a new filter to the filter compilation. Used the new filter to build correct fst's for dictionary analysis and generation. Increased the version number of the required gtd core version, due to the new and required filter in the core. 2014-09-18T07:45:56+00:00
> [Template merge - und] Major cleanup of filter and tagset compilation: * moved all non-local data and build instructions into am-shared/ * created dir-specific am-include files * clean use of regex-include.am * removed sme-specific source files from tools/mt/apertium/tagsets/Makefile.am * switched the apertium filter use to use the one built in src/filters/ instead of rebuilding it 2014-09-17T11:52:42+00:00
> [Template merge - und] analyser-oahpa-gt-desc should be analyser-oahpa-gt-norm. Now renamed. 2014-09-15T17:03:46+00:00
> [Template merge - und] The listbased speller fst is now generated properly using both Xerox and Hfst. 2014-09-15T13:00:26+00:00
> [Template merge - und] Fixed a logical error that turned off all hfst spellers. Renamed a variable. 2014-09-15T10:03:01+00:00
> [Template merge - und] Only build Apertium tagsets in tools/mt/ if Apertium is turned on. 2014-09-15T08:41:49+00:00
> [Template merge - und] Corrected a syntax error in the src_disamb-include.am file. Moved all fst trimming of general interest from tools-spellcheckers-listbased to tools-spellcheckers. Made the configuration so that list-based spellers will only compile if configured to build Hunspell. Also tried to make the configuration of other spellers such that they are automatically off when spellers are off. 2014-09-15T06:04:55+00:00
> [Template merge - und] Downcasing of the initial letter of derived proper nouns (Pariisi -> pariisilainen) is now finally working with Hfst. It requires Hfst svn rev. 4000. 2014-09-12T10:36:40+00:00
> [Template merge - und] The first major step for adding support for generating list-based spellers such as Hunspell and the PLX (Polderland/MS Word) spellers. The conversion is not trivial, since we try to control compounding according to the linguistic specification in the lexicon (using tags). Although PLX is only for three Sámi languages, Hunspell conversion should be useful for all languages in our infrastructure. No real Hunspell or PLX files produced yet, only prerequisite fst's. - At the same time fixed a glitch in the version checking of VislCG3 that would turn off support for CG files now that the vislcg3 svn revision number has turned 10 000. 2014-09-08T21:33:54+00:00
> [Template merge - und] Added support for local overrides of the base speller fst. 2014-09-05T20:55:33+00:00
> [Template merge - und] Generalised and simplified the code for building oxt's - no more hard-coded filenames. Now the LO-voikko versions supported as well as the platforms are just defined in two variables, and the rest follows from there. The build code also handles cases of unsupported combinations of voikko versions and platforms. Also silenced the build quite a lot in non-verbose mode. 2014-09-04T08:23:38+00:00
> [Template merge - und] Switched to universal binary build for the LO41 voikko OXT. 2014-09-02T10:13:12+00:00
> [Template merge - und] Made the hfst optimised lookup file format explicit by using the .hfstol suffix, and by optimising files for lookup in a separate build step, instead of implicitly as before. So far only for tools/mt/apertium/, but more will come. Removed the removal of semantic tags - they are already optional, which should be more flexible and robust. 2014-08-27T11:58:27+00:00
> [Template merge - und] Made speller minimisation default to yes, specified where to push weights. 2014-08-26T07:08:16+00:00
> [Template merge - und] Added --encode-weights to determinise and minimise. This fixed the never-ending compilation of Finnish spellers. 2014-08-25T14:32:36+00:00
> [Template merge - und] The optimisations that worked for Greenlandic didn't work for Finnish, potentially due to Finnish being corpus-weighted and thus posing more challenges to determinisation and minimisation. Because of this the Greenlandic optimisation is now rolled into the configuration option --enable-minimised-spellers, OFF by default. 2014-08-22T17:40:51+00:00
> [Template merge - und] Added size and speed optimisations to the speller compilation process: remove-epsilons, push-weights, determinise and minimise. Together this made the KAL speller *much* smaller and *much* faster. It is now as fast and small as any other fst-based speller. 2014-08-22T09:32:19+00:00
> [Template merge - und] Hyperminimisation seems to be stable now, and I have added it as a standard configuration option. Also added autoconf support for the preliminary tool hfst-proc2, to facilitate easier testing of the tokeniser/analyser. 2014-08-21T08:51:07+00:00
> [Template merge - und] Updated the tagset targets to support Xerox fst's, and tagset replacement using regexes instead of the hfst-only relabel tool. Now all languages can get localised analysis and generation tags by adding a regex file and specifying a few targets. 2014-08-19T16:43:15+00:00
> [Template merge - und] Added build step to explicitly convert hfst transducers to optimised lookup format. Whitespace changes in the silent rule variables. Included the new lookup-include file in src-dir-include.am. 2014-08-19T11:34:06+00:00
> [Template merge - und] Preparations for better handling of lookup & testing of free-standing lexc and rewrite rule transducers: added build rules to do inversion of fst's intended for lookup. 2014-08-19T07:53:39+00:00
> [Template merge - und] Added a test dir for the upcoming hfst-based tokeniser. 2014-08-19T07:01:24+00:00
> [Template merge - und] Corrected some paths to enable VPATH building of spellers. Added support for retaining intermediate files when building using "make --debug". 2014-08-15T06:56:35+00:00
> [Template merge - und] Added support for building OXT for LO/OOo 3.6-4.0 for Mac. Language support is limited. 2014-08-11T09:05:30+00:00
> [Template merge - und] Brought all start-up langs in line with the und template, merging in the following revisions (some languages only merged a subset of these, as they have a more recent baseline from the previous merge or creation): 2014-08-08T06:58:04+00:00
> Use/Sub to Err/Sub Command used: cd $GTHOME perl -p -i -e 's/Use\/Sub/Err\/Sub/g' `grep -rl 'Use/Sub' * 2> /dev/null` 2014-06-04T08:13:54+00:00
> [Template merge - und] Massive template merge to bring the startup languages in line with the other languages and the und/ template. 2014-05-28T07:58:18+00:00
> The generate-noun-lemmas test actually passes for EVN. 2014-04-10T07:20:50+00:00
> [Template merge - und] Modified the gttags.txt target to produce output also in cases where no GTD tags are defined (this is the case in some of the experimental languages). Earlier the build would break in this case. 2014-04-10T07:15:23+00:00
> Resolved an incomplete merge from yesterday's massive update - some variables were defined twice. 2014-04-10T07:12:05+00:00
> Massive template merge - updated all the startup languages to the present state of the und template. 2014-04-09T11:52:05+00:00
> Corrected the path for the documentation pages. 2013-12-16T23:03:48+00:00
> Symlink to startup catalogue. 2013-12-16T22:36:14+00:00
858d1004
< Ignore generated regex files in the src/filters/ dir. 2013-12-03T11:56:43+00:00
866,867d1011
< Extra ignores in the filters/ dir. 2013-11-24T15:24:39+00:00
< igno 2013-11-24T15:13:39+00:00
871a1016
> Evenki, Selkup 2013-11-23T20:03:36+00:00