We are cleaning some marketing data in Traditional Chinese. R reads the UTF-8 Traditional Chinese variable names without any problem, but we cannot get valid UTF-8 output from it. For example:
If we run: unique(rframe$性別)
This is what we get: [1] "\u5973" "\u7537"
Here 性別 means "gender", \u5973 is the escaped form of 女 (female), and \u7537 is the escaped form of 男 (male).
The most interesting thing is that R on Linux produces valid UTF-8 Chinese output from the same UTF-8 CSV file. Why can the same RStudio, which outputs Chinese in UTF-8 correctly on Linux, not do so on the Mac?
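One way to compare the two machines might be the sketch below. It assumes the difference lies in the locale of the R session, which we have not confirmed. The data frame here is a hypothetical stand-in for our real rframe, which is read from the CSV file.

```r
# Hypothetical stand-in for our data; the real rframe comes from a UTF-8 CSV file.
rframe <- data.frame(gender = c("\u5973", "\u7537", "\u5973"),
                     stringsAsFactors = FALSE)
names(rframe) <- "\u6027\u5225"   # the column name 性別, written as escapes

# Locale of the current session (assumption: this may differ between Mac and Linux).
locale_ctype <- Sys.getlocale("LC_CTYPE")
is_utf8 <- l10n_info()[["UTF-8"]]
print(locale_ctype)
print(is_utf8)

values <- unique(rframe[["\u6027\u5225"]])
print(Encoding(values))   # declared encoding of each string
print(values)             # print() may escape characters the locale cannot display
cat(values, sep = "\n")   # cat() writes the strings as is, without print()'s escaping
```

Running the same lines on both platforms and comparing what Sys.getlocale() and l10n_info() report should show whether the two sessions really use different locales.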
This troublesome issue has been around for a long while. In fact, an older RStudio version gave us valid UTF-8 output. Can anyone help us?
Much obliged.
Chandler
https://stackoverflow.com/questions/66832589/how-to-get-the-valid-encoding-output-in-the-chinese-characters-on-rstudio-in-mac March 27, 2021 at 10:46PM