I have a simple Java program that takes in hex and converts it to ASCII. Using JDK 8, I compiled the following:
    import java.nio.charset.Charset;
    import java.util.Scanner;

    public class Main {
        public static void main(String[] args) {
            System.out.println("Charset: " + Charset.defaultCharset());
            Scanner in = new Scanner(System.in);
            System.out.print("Type a HEX string: ");
            String s = in.nextLine();
            String asciiStr = "";
            // Split the string into an array of hex pairs
            String[] hexes = s.split(":");
            // For each hex pair, translate it to a character
            for (String hex : hexes) {
                System.out.print(" " + Integer.parseInt(hex, 16) + "|" + (char) Integer.parseInt(hex, 16));
                asciiStr += (char) Integer.parseInt(hex, 16);
            }
            System.out.println("\nthe ASCII string is " + asciiStr);
            in.close();
        }
    }

I am passing the hex string C0:A8:96:FE to the program. My main concern is the 0x96 value, because it falls in the control-character range (128 to 159).
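As I understand it, the `(char)` cast just reinterprets the parsed int as a UTF-16 code unit, with no charset involved at that point. A minimal check of that assumption (the class name `ParseDemo` is mine):

```java
public class ParseDemo {
    public static void main(String[] args) {
        // The cast turns the numeric value directly into a UTF-16 char;
        // no charset is consulted at this step.
        char c = (char) Integer.parseInt("96", 16);
        System.out.println((int) c);        // 150, i.e. U+0096
        System.out.println(c == '\u0096');  // true
    }
}
```

So if this is right, the string in memory is the same either way, and any difference must come from how it is written out.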
The output when I run the program without any JVM flags is the following:
    Charset: windows-1252
    Type a HEX string: C0:A8:96:FE
     192|À 168|¨ 150|? 254|þ
    the ASCII string is À¨?þ

The output when I use the JVM flag -Dfile.encoding=ISO-8859-1 to set the character encoding is the following:
    Charset: ISO-8859-1
    Type a HEX string: C0:A8:96:FE
     192|À 168|¨ 150|– 254|þ
    the ASCII string is À¨–þ

Why, when the character encoding is set to ISO-8859-1, do I get the Windows-1252 characters for the 128 to 159 range? Those code points shouldn't be defined as printable characters in ISO-8859-1, but they are in Windows-1252, yet the behavior appears backwards here. Under ISO-8859-1 I would expect 0x96 to come out as a blank character, but that is not the case. Instead, it is the Windows-1252 run that prints it blank, when Windows-1252 should properly render it as a –. Any help here?
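For comparison, decoding the same raw bytes with each charset named explicitly, with no -Dfile.encoding flag involved, shows the mapping I expected for 0x96 in each. This is a minimal sketch; the class name `CharsetDemo` is mine:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    public static void main(String[] args) {
        // The raw bytes C0:A8:96:FE from the question.
        byte[] bytes = { (byte) 0xC0, (byte) 0xA8, (byte) 0x96, (byte) 0xFE };
        // Decode explicitly with each charset, independent of the JVM default.
        String latin1 = new String(bytes, StandardCharsets.ISO_8859_1);
        String cp1252 = new String(bytes, Charset.forName("windows-1252"));
        // Code point of the third character (the 0x96 byte) in each decoding:
        System.out.println((int) latin1.charAt(2));  // 150  (U+0096, a C1 control)
        System.out.println((int) cp1252.charAt(2));  // 8211 (U+2013, en dash)
    }
}
```

Printing the code points rather than the glyphs avoids the console's own output encoding getting in the way.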