Whenever you work with XML files, expect these character entities and check your data for them. You should never translate back and forth between regular text and entity-replaced text by hand. Normally, every program and function capable of reading and writing XML should do this for you. Unfortunately, this does not always work, and you may still find these character combinations in your data. To decode a string s that contains XML entities in R, you can use xml_text(read_xml(paste0("<x>", s, "</x>"))) from the xml2 package.
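As a minimal sketch, the xml2 one-liner can be wrapped in a small helper and applied to a whole character vector. The function name decode_xml_entities and the sample strings are made up for illustration; only read_xml() and xml_text() come from xml2. The sketch assumes each input is a well-formed XML fragment once wrapped in a dummy element, so a stray & that is not part of an entity, or an HTML-only entity such as &nbsp;, would most likely cause a parse error rather than pass through unchanged.

```r
library(xml2)

# Hypothetical helper (not part of xml2): wrap the string in a dummy <x>
# element and let the XML parser replace the entities.
decode_xml_entities <- function(s) {
  xml_text(read_xml(paste0("<x>", s, "</x>")))
}

decode_xml_entities("Tom &amp; Jerry &lt;3")
# [1] "Tom & Jerry <3"

# read_xml() parses one document at a time, so decode a vector element-wise.
titles <- c("R &amp; D", "a &lt; b", "plain text")
decoded <- vapply(titles, decode_xml_entities, character(1), USE.NAMES = FALSE)
decoded
```

Strings without any entities come back unchanged, so the helper can be run over a whole column without first checking which values need decoding.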