Special characters in bib-file encoded incorrectly #2

Open
apragsdale opened this issue Oct 27, 2020 · 5 comments

@apragsdale

I've been struggling with this for a little while now - my bib file has special characters (say author names with accents or umlauts, etc), which are encoded in the .bib file as follows, for example:

@ARTICLE{Skov2020-jc,
  title     = "The nature of Neanderthal introgression revealed by 27,566
               Icelandic genomes",
  author    = "Skov, Laurits and Coll Maci{\`a}, Mois{\`e}s and
               Sveinbj{\"o}rnsson, Gar{\dh}ar and Mafessoni, Fabrizio and
               Lucotte, Elise A and Einarsd{\'o}ttir, Margret S and Jonsson,
               Hakon and Halldorsson, Bjarni and Gudbjartsson, Daniel F and
               Helgason, Agnar and Schierup, Mikkel Heide and Stefansson, Kari",
  journal   = "Nature",
  publisher = "nature.com",
  volume    =  582,
  number    =  7810,
  pages     = "78--83",
  month     =  jun,
  year      =  2020,
  language  = "en"
}

These are usually handled by setting \usepackage[T1]{fontenc} in the preamble.

However, in the auto-generated .tex file these are converted to rich text in the bib entries, and LaTeX fails with Undefined control sequence when it encounters those non-ASCII characters.

I can't figure out how to stop R Markdown from converting these characters to rich text. In RStudio, I've tried Tools -> General options -> Code -> Saving and then setting Default text encoding to ASCII, but that did not seem to change the behavior. Any ideas?

@molpopgen
Member

So, the "good news" is that this actually is working. The only problem is the {\dh} diacritic.

With that removed, the document processes with or without fontenc: "T1" added to the bookdown bit of the YAML.
I'll dig deeper later.

@apragsdale
Author

Ah, I see - I thought it was because LaTeX wouldn't recognize the rendered characters, but it's the other way around. I remember having to add \usepackage[T1]{fontenc} to pure LaTeX to get it to recognize that character. So I had hoped adding it in the preamble would fix that issue here as well.

This is an automatically output bib file from paperpile, which gets updated when I add new docs, so I was hoping I wouldn't have to manually remove the {\dh} each time I have a new bib.

@molpopgen
Member

It must be possible to get the desired workflow. Ideally, one would prevent pandoc from processing the LaTeX here, leaving it to be processed in the resulting .tex file. I'll look into it more, but will fix #3 first.

@molpopgen
Member

molpopgen commented Oct 27, 2020

This seems to be at the pandoc-citeproc step. The following sentence in a regular Rmd does just fine:

Let's write \dh\ and see what happens.

@molpopgen
Member

@apragsdale -- The following change to the bibtex record allows everything to pass through nicely. This isn't a solution necessarily, but more of a hint.

  author    = "Skov, Laurits and Coll Maci{\`a}, Mois{\`e}s and
               Sveinbj{\"o}rnsson, Gar\dh ar and Mafessoni, Fabrizio and
               Lucotte, Elise A and Einarsd{\'o}ttir, Margret S and Jonsson,
               Hakon and Halldorsson, Bjarni and Gudbjartsson, Daniel F and
               Helgason, Agnar and Schierup, Mikkel Heide and Stefansson, Kari",
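
Since the bib file is regenerated by paperpile, that hint could in principle be applied automatically before each build rather than by hand. Below is a rough Python sketch of the idea, not a tested fix for this template: the function name unbrace_dh is made up for illustration, and it only handles the one {\dh} case seen in this thread. Whether other braced macros need the same treatment is not known from the discussion above.

```python
# Hypothetical helper (name invented for this sketch): rewrite the braced
# form {\dh} as \dh followed by a space, mirroring the manual edit above
# (Gar{\dh}ar -> Gar\dh ar). All other escapes are left untouched.
def unbrace_dh(bib_text: str) -> str:
    return bib_text.replace(r"{\dh}", "\\dh ")


if __name__ == "__main__":
    # Fragment of the author field from the record in this issue.
    sample = r'Sveinbj{\"o}rnsson, Gar{\dh}ar and Mafessoni, Fabrizio'
    print(unbrace_dh(sample))
    # prints: Sveinbj{\"o}rnsson, Gar\dh ar and Mafessoni, Fabrizio
```

To use it, one would read the exported .bib file as text, pass the contents through the function, and write the result to the file the document actually cites, so that the paperpile export itself never has to be edited.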
