[SOLVED] How can non-ASCII characters be removed from a string?

Issue

I have strings "A função", "Ãugent" in which I need to replace characters like ç, ã, and à with empty strings.

How can I remove those non-ASCII characters from my string?

I have attempted to implement this using the following function, but it is not working properly. One problem is that the unwanted characters are getting replaced by the space character.

public static String matchAndReplaceNonEnglishChar(String tmpsrcdta) {
    String newsrcdta = null;
    char array[] = Arrays.stringToCharArray(tmpsrcdta);
    if (array == null)
        return newsrcdta;

    for (int i = 0; i < array.length; i++) {
        int nVal = (int) array[i];
        boolean bISO =
                // Is character ISO control
                Character.isISOControl(array[i]);
        boolean bIgnorable =
                // Is Ignorable identifier
                Character.isIdentifierIgnorable(array[i]);
        // Remove tab and other unwanted characters..
        if (nVal == 9 || bISO || bIgnorable)
            array[i] = ' ';
        else if (nVal > 255)
            array[i] = ' ';
    }
    newsrcdta = Arrays.charArrayToString(array);

    return newsrcdta;
}

Solution

This will search and replace all non ASCII letters:

String resultString = subjectString.replaceAll("[^\\x00-\\x7F]", "");

Answered By – FailedDev

Answer Checked By – Senaida (BugsFixing Volunteer)

Leave a Reply

Your email address will not be published. Required fields are marked *