Java HTML character conversion method

Here's the source code for a Java method that takes a String and returns a new String in which characters that can cause problems when rendered as HTML are replaced by their ISO Latin-1 character entity equivalents (for example, < becomes &lt;):

  /**
   * Convert extended characters to ISO-Latin equivalents.
   * Might be improved by using a HashMap to map the key/values.
   * However, I'm currently limited to a JDK 1.1.x environment, not Java 2 (JDK 1.2).
   */
  public static String convertExtendedCharactersToIsoLatin(String input)
  {
    StringBuffer buffer = new StringBuffer(input);
    StringBuffer output = new StringBuffer();
    int length = buffer.length();
    for (int i = 0; i < length; i++)
    {
      // Replace each punctuation character with its HTML entity:
      // a named entity for ", &, < and >, a numeric character reference otherwise.
      if (buffer.charAt(i) == '"')
        output.append("&quot;");
      else if (buffer.charAt(i) == '!')
        output.append("&#33;");
      else if (buffer.charAt(i) == '#')
        output.append("&#35;");
      else if (buffer.charAt(i) == '&')
        output.append("&amp;");
      else if (buffer.charAt(i) == '\'')
        output.append("&#39;");
      else if (buffer.charAt(i) == '(')
        output.append("&#40;");
      else if (buffer.charAt(i) == ')')
        output.append("&#41;");
      else if (buffer.charAt(i) == '*')
        output.append("&#42;");
      else if (buffer.charAt(i) == '+')
        output.append("&#43;");
      else if (buffer.charAt(i) == ',')
        output.append("&#44;");
      else if (buffer.charAt(i) == '-')
        output.append("&#45;");
      else if (buffer.charAt(i) == '.')
        output.append("&#46;");
      else if (buffer.charAt(i) == '/')
        output.append("&#47;");
      else if (buffer.charAt(i) == ':')
        output.append("&#58;");
      else if (buffer.charAt(i) == ';')
        output.append("&#59;");
      else if (buffer.charAt(i) == '<')
        output.append("&lt;");
      else if (buffer.charAt(i) == '=')
        output.append("&#61;");
      else if (buffer.charAt(i) == '>')
        output.append("&gt;");
      else if (buffer.charAt(i) == '?')
        output.append("&#63;");
      else if (buffer.charAt(i) == '@')
        output.append("&#64;");
      else if (buffer.charAt(i) == '[')
        output.append("&#91;");
      else if (buffer.charAt(i) == '\\')
        output.append("&#92;");
      else if (buffer.charAt(i) == ']')
        output.append("&#93;");
      else if (buffer.charAt(i) == '^')
        output.append("&#94;");
      else if (buffer.charAt(i) == '_')
        output.append("&#95;");
      else if (buffer.charAt(i) == '`')
        output.append("&#96;");
      else if (buffer.charAt(i) == '{')
        output.append("&#123;");
      else if (buffer.charAt(i) == '|')
        output.append("&#124;");
      else if (buffer.charAt(i) == '}')
        output.append("&#125;");
      else if (buffer.charAt(i) == '~')
        output.append("&#126;");
      else
        output.append(buffer.charAt(i));
    }
    return output.toString();
  }

 

Note: Looking at this code a few years later, I don't know why I didn't write it using a 2D array and a simpler loop, but in the end it works, and performance hasn't been an issue.
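
For what it's worth, here is a rough sketch of what that 2D-array version might look like. The class name HtmlEscaper, the method name convert, and the REPLACEMENTS table are made up for this illustration, and the table lists only a handful of characters; the rest of the punctuation handled by the method above would be added as extra rows. It sticks to arrays and StringBuffer, so it should not need anything beyond what a JDK 1.1.x environment offers.

```java
public class HtmlEscaper {

    // Hypothetical lookup table: each row pairs a character (held as a
    // one-character String) with the HTML entity that should replace it.
    // Only a few rows are shown; more can be added without touching the loop.
    private static final String[][] REPLACEMENTS = {
        { "\"", "&quot;" },
        { "&",  "&amp;"  },
        { "'",  "&#39;"  },
        { "<",  "&lt;"   },
        { ">",  "&gt;"   },
    };

    public static String convert(String input) {
        StringBuffer output = new StringBuffer();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String replacement = null;
            // Linear scan of the table; fine for a few dozen rows.
            for (int j = 0; j < REPLACEMENTS.length; j++) {
                if (REPLACEMENTS[j][0].charAt(0) == c) {
                    replacement = REPLACEMENTS[j][1];
                    break;
                }
            }
            if (replacement != null) {
                output.append(replacement);
            } else {
                output.append(c); // not in the table, so copy it unchanged
            }
        }
        return output.toString();
    }

    public static void main(String[] args) {
        // prints: &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
        System.out.println(convert("<a href=\"x\">Tom & Jerry</a>"));
    }
}
```

The loop no longer changes when a character is added or removed; only the table does. The trade-off is a scan of the table for every input character, which is unlikely to matter for a table this small.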
