Export to Alan: Some Fixes Required #476

Open
5 tasks
tajmone opened this issue Feb 24, 2019 · 3 comments
@tajmone

tajmone commented Feb 24, 2019

There are a few problems when exporting maps to Alan (i.e. Alan 3, the latest incarnation of Alan) which need to be addressed:

  • Remove BOM.
  • Export sources as ISO-8859-1.
  • Enclose IDs in single quotes.
  • Escape Quotes in Identifiers.
  • Unconditional Exits shouldn't have an End.

Hopefully, these will improve Trizbort support for Alan IF authoring.

Don't Add BOM to Source File

The exported ALAN sources contain a BOM, which prevents compiling the generated map-code — indeed, even pasting parts of it into an Alan source could break its encoding in many editors (i.e. cause the editor to switch to UTF-8 encoding).
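
As a minimal sketch of the fix (in Python for illustration only; Trizbort's own exporter code is not shown in this issue, and the name strip_bom is hypothetical), removing a leading UTF-8 BOM from the exported bytes could look like this:

```python
import codecs


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Hypothetical exported bytes: a BOM followed by Alan source text.
exported = codecs.BOM_UTF8 + b"The bedroom isa location\nend The bedroom.\n"
clean = strip_bom(exported)
```

In practice it would be simpler to never write the BOM in the first place; the sketch only shows which three bytes are the problem.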

Export as ISO-8859-1

Alan source files must be in ISO-8859-1 (some other encodings are possible, but they are all single-byte encodings). Exporting the file as UTF-8 without BOM should generally be safe enough if the user didn't use any non-ASCII characters in the project (Alan users are aware of the problem).

If Trizbort could enforce conversion to ISO-8859-1 at export time it would be safer because it would correctly represent those valid ISO-8859-1 characters which are encoded with two-byte sequences in UTF-8 — i.e. ISO chars in the 128-255 ($80-$FF) range, which include some currency symbols, vowels and consonants with accents, umlaut or other diacritical marks or ligatures used in some European alphabets.

Adding ISO conversion would introduce the complication of how to handle out-of-range characters (i.e. chars beyond 255/$FF) — the problem here being that users might use Unicode characters in the Trizbort project to correctly represent text in the map image/PDF file, but will need to omit them from the generated Alan sources, where they are not supported.

Alan users should just stick to using only valid ISO-8859-1 characters in their Trizbort projects, to avoid problems when exporting to Alan source.
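
One possible way to handle the conversion, sketched in Python under stated assumptions (the function name to_iso_8859_1 and the choice of "?" as a replacement are hypothetical, not something Trizbort does today): encode everything up to $FF as ISO-8859-1, substitute anything beyond it, and report the rejected characters so the user can be warned.

```python
from typing import List, Tuple


def to_iso_8859_1(text: str, replacement: str = "?") -> Tuple[bytes, List[str]]:
    """Encode text as ISO-8859-1, replacing and reporting out-of-range characters."""
    # Characters above U+00FF have no ISO-8859-1 representation.
    rejected = sorted({ch for ch in text if ord(ch) > 0xFF})
    safe = "".join(ch if ord(ch) <= 0xFF else replacement for ch in text)
    return safe.encode("iso-8859-1"), rejected


# "é" and "ü" are valid ISO-8859-1; the arrow (U+2192) is not.
data, rejected = to_iso_8859_1("Caf\u00e9 \u2192 K\u00fcche")
```

Whether to replace, drop, or abort on out-of-range characters is a design decision left open here; returning the rejected list at least makes the loss visible.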

Identifiers Within Single Quotes

Room, object and exit identifiers in exported Alan maps should always be enclosed in single quotes, as Trizbort already does with NAME. Currently, an exported map location looks like this:

The Your Bedroom isa location Name 'Your Bedroom'
...
end The Your Bedroom.

whereas it should be:

The 'Your Bedroom' isa location Name 'Your Bedroom'
...
end The 'Your Bedroom'.

This would be a safe approach, even with single-word identifiers (where quoting might not be required), because enclosing an ID in single quotes allows stropping Alan keywords and using them as identifiers, e.g.:

The 'The Bedroom' isa location Name 'The Bedroom'
  Exit 'Exit' to corridor.
end The 'The Bedroom'.

Here The and Exit are Alan keywords, but they can be safely used in the room and exit IDs since they are stropped by single quotes.

Escaping Quotes in Identifiers

If an identifier contains an apostrophe (i.e. a single quote character, ') it must be escaped by doubling it:

The 'Bob''s Bedroom' isa location Name 'Bob''s Bedroom'
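
The quoting and escaping rules from the last two sections can be combined into one small helper. This is only an illustrative Python sketch; quote_identifier and location_header are hypothetical names, not part of Trizbort.

```python
def quote_identifier(name: str) -> str:
    """Wrap an Alan identifier in single quotes, doubling any embedded quote."""
    return "'" + name.replace("'", "''") + "'"


def location_header(name: str) -> str:
    """Build the opening line of a location, using the same text as ID and Name."""
    quoted = quote_identifier(name)
    return f"The {quoted} isa location Name {quoted}"


header = location_header("Bob's Bedroom")
```

Applying the helper unconditionally, even to single-word IDs, avoids having to maintain a list of Alan keywords in the exporter.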

Unconditional Exits Syntax

When an Exit has no Check clause, it should not be followed by End exit. So this:

The The Foyer isa location Name 'The Foyer'
  Exit North to Kitchen.
  End exit.

should be:

The The Foyer isa location Name 'The Foyer'
  Exit North to Kitchen.
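
A rough sketch of how an exporter might choose between the two forms, again in Python and with hypothetical names (emit_exit, body). The layout of the conditional form is an assumption: the body lines are passed through verbatim and no attempt is made to validate them as Alan code.

```python
from typing import List, Optional, Sequence


def emit_exit(direction: str, target: str,
              body: Optional[Sequence[str]] = None) -> List[str]:
    """Return the source lines for one exit; add 'End exit.' only when a body exists."""
    if not body:
        # Unconditional exit: a single statement terminated by a period.
        return [f"  Exit {direction} to {target}."]
    lines = [f"  Exit {direction} to {target}"]
    lines.extend(f"    {line}" for line in body)  # body text is emitted as given
    lines.append("  End exit.")
    return lines


unconditional = emit_exit("North", "Kitchen")
```
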
@JasonLautzenheiser
Owner

Thanks for the feedback. Not an ALAN user so this is very helpful.

@github-actions

Stale issue message

@tajmone
Author

tajmone commented Aug 22, 2020

Notes on ISO-8859-1 Conversion

I've edited the original Issue to add a request for converting ALAN sources to ISO-8859-1 encoding when exporting projects to ALAN.

I would like to add a few considerations here.

  1. This problem might affect other IF systems too, especially those that weren't updated since the introduction of Unicode.
  2. Exporting as UTF-8 might break sources in many European languages, i.e. all those languages which use accented letters, other diacritical variants, or ligatures.

In the pre-Unicode era, when code pages and ISO encodings were still the norm, most editors and IDEs had problems handling Unicode/UTF-8 sources. Today the contrary is true, with most editors assuming UTF-8 as the base encoding and switching to it whenever a multi-byte character is pasted into a source.

Setting a source file to ISO-8859-1 encoding is a manual operation, since there's no BOM-like marker to signal that a source file uses a particular encoding other than ASCII. When an out-of-range character (>$FF) is present in the clipboard during a paste operation, the editor will have to switch to UTF-8 in order to accommodate the character (smart editors might prevent the paste operation). If the clipboard contains a valid ISO character stored as a two-byte UTF-8 sequence (i.e. ISO chars $80-$FF), the editor will either switch the source file to UTF-8 encoding or convert the clipboard on the fly to preserve the correct encoding.

The problematic ISO characters are those in the 128-255/$80-$FF range, which include a few commonly used symbols and letters present in many European languages (see ISO-8859-1 code page layout on Wikipedia):

£ ¥ © ¿

À Á Â Ã Ä Å È É Ê Ë Ì Í Î Ï Ò Ó Ô Õ Ö Ù Ú Û Ü
à á â ã ä å è é ê ë ì í î ï ò ó ô õ ö ù ú û ü

Æ Ç Ð Ñ Ø Ý Þ ß
æ ç ð ñ ø ý þ ÿ

As the above table shows, many of these characters are fundamental for writing adventures in many languages other than English, and need to be correctly handled when exporting Trizbort maps to source code for any IF system that does not support UTF-8 sources.
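
To make the byte-level difference concrete, here is a small Python illustration (byte_lengths is a hypothetical helper name) comparing how many bytes a character needs in each encoding:

```python
from typing import Tuple


def byte_lengths(ch: str) -> Tuple[int, int]:
    """Return (ISO-8859-1 length, UTF-8 length) in bytes for one character."""
    return len(ch.encode("iso-8859-1")), len(ch.encode("utf-8"))


# U+00E9 is "é": one byte ($E9) in ISO-8859-1, two bytes ($C3 $A9) in UTF-8.
e_acute = byte_lengths("\u00e9")
# Plain ASCII letters are one byte in both encodings.
ascii_a = byte_lengths("A")
```

An Alan compiler reading the UTF-8 file as ISO-8859-1 would see the two bytes as two separate characters, which is why the accented letters above come out garbled.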

Alternative Solution: Export Filters

A quick alternative to adding an internal encoding converter to Trizbort could be to let end users set filters on code export operations for specific IF systems, i.e. specify a tool of their choice through which the generated source is piped (before saving) at export time. For OSs which natively ship with the iconv tool, this could be the default filter preset. Other OSs would require end users to install and set a tool of their choice.
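
The filter idea could be sketched as follows, in Python and purely as an illustration. The names run_export_filter and ICONV_FILTER are hypothetical; the only real external tool referenced is iconv with its -f and -t options, and the sketch assumes the generated source is handed to the filter as UTF-8 on standard input.

```python
import subprocess
from typing import Sequence


def run_export_filter(source: str, command: Sequence[str]) -> bytes:
    """Pipe the generated source (as UTF-8) through a user-chosen filter command."""
    result = subprocess.run(
        command,
        input=source.encode("utf-8"),
        stdout=subprocess.PIPE,
        check=True,  # raise CalledProcessError if the filter fails
    )
    return result.stdout


# A possible default preset on systems that ship iconv (not executed here).
ICONV_FILTER = ["iconv", "-f", "UTF-8", "-t", "ISO-8859-1"]
```

The bytes returned by the filter would then be written to disk as-is, so the filter alone decides the final encoding.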
