public final class UriEncoder
extends java.lang.Object
Per Section 2.1 of RFC 3986, URIs should contain only characters that are part of US-ASCII, and some characters are further reserved to delimit components or subcomponents; therefore, characters that are outside the allowed set need to be encoded. This is done using the escape sequence "%XX" where XX is the hexadecimal value of the bytewise representation of the character.
This encoding format is used for the application/x-www-form-urlencoded content type, as defined by section 17.13.4 of the W3C's HTML 4.01 Specification.
For example, the Unicode string "flambé" is represented as the byte sequence [0x66, 0x6c, 0x61, 0x6d, 0x62, 0xe9] in ISO-8859-1. In UTF-8, it is represented as [0x66, 0x6c, 0x61, 0x6d, 0x62, 0xc3, 0xa9]. The first five characters are unreserved and do not require encoding, but the last character is not, so the URI representation is "flamb%E9" in ISO-8859-1 and "flamb%C3%A9" in UTF-8. Escape sequences are not case-sensitive.
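The "flambé" example can be reproduced with the JDK's own form encoder. The sketch below is a stand-in, not UriEncoder itself: it assumes Java 10 or later (for the Charset overloads of java.net.URLEncoder and java.net.URLDecoder), and the class name PercentEncodingExample is hypothetical.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the JDK form encoder is used as a stand-in for UriEncoder.
final class PercentEncodingExample {

    // Converts the input to ISO-8859-1 bytes, then percent-encodes them.
    static String encodeLatin1(String s) {
        return URLEncoder.encode(s, StandardCharsets.ISO_8859_1);
    }

    // Converts the input to UTF-8 bytes, then percent-encodes them.
    static String encodeUtf8(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "flambe" with an acute accent on the final e, written as a Unicode escape.
        String word = "flamb\u00e9";
        System.out.println(encodeLatin1(word)); // flamb%E9
        System.out.println(encodeUtf8(word));   // flamb%C3%A9
        // Hex digits in escape sequences are accepted in either case when decoding.
        System.out.println(URLDecoder.decode("flamb%c3%a9", StandardCharsets.UTF_8).equals(word)); // true
    }
}
```

The same input yields one escape sequence in ISO-8859-1 and two in UTF-8 because the final character occupies one byte in the former and two in the latter.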
See Also: Uri
Modifier and Type | Field and Description |
---|---|
static java.nio.charset.Charset | DEFAULT_ENCODING - The default character encoding, UTF-8, per Section 2.5 of RFC 3986. |
Modifier and Type | Method and Description |
---|---|
static java.lang.String | decode(java.lang.String string) - Percent-decodes a US-ASCII string into a Unicode string. |
static java.lang.String | decode(java.lang.String string, java.nio.charset.Charset encoding) - Percent-decodes a US-ASCII string into a Unicode string. |
static java.lang.String | encode(java.lang.String string) - Percent-encodes a Unicode string into a US-ASCII string. |
static java.lang.String | encode(java.lang.String string, java.nio.charset.Charset encoding) - Percent-encodes a Unicode string into a US-ASCII string. |
public static java.lang.String encode(java.lang.String string)
Percent-encodes a Unicode string into a US-ASCII string. DEFAULT_ENCODING, UTF-8, is used to determine how non-US-ASCII and reserved characters should be represented as consecutive sequences of the form "%XX".
This method replaces ' ' with '+', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
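To illustrate why the space-to-plus rule matters, the sketch below uses java.net.URLEncoder as a stand-in for a form-style encoder (an assumption; it is not UriEncoder itself). The class SpaceEncodingExample and the helper encodePathSegment are hypothetical, and rewriting '+' to "%20" is a common workaround rather than a complete RFC 3986 path encoder.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; URLEncoder stands in for a form-style encoder.
final class SpaceEncodingExample {

    // Form-style encoding: a space becomes '+'.
    static String encodeFormValue(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    // Workaround for path segments: rewrite '+' as "%20" after form encoding.
    // A literal '+' in the input has already become "%2B", so only encoded spaces change.
    static String encodePathSegment(String s) {
        return encodeFormValue(s).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(encodeFormValue("my file+1.txt"));   // my+file%2B1.txt
        System.out.println(encodePathSegment("my file+1.txt")); // my%20file%2B1.txt
    }
}
```

In a path, '+' is an ordinary character, so a server would read "my+file" as a name containing a plus sign rather than a space.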
Parameters:
string - a Unicode string
Throws:
java.lang.NullPointerException - if string is null

public static java.lang.String encode(java.lang.String string, java.nio.charset.Charset encoding)
Percent-encodes a Unicode string into a US-ASCII string.
This method replaces ' ' with '+', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
Parameters:
string - a Unicode string
encoding - a character encoding
Throws:
java.lang.NullPointerException - if any argument is null

public static java.lang.String decode(java.lang.String string)
Percent-decodes a US-ASCII string into a Unicode string. DEFAULT_ENCODING, UTF-8, is used to determine what characters are represented by any consecutive sequences of the form "%XX".
This method replaces '+' with ' ', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
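The effect of the plus-to-space rule on decoding can be sketched with java.net.URLDecoder, used here as a stand-in with the same form-style behavior (an assumption; it is not UriEncoder itself). The class name PlusDecodingExample is hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; URLDecoder stands in for a form-style decoder.
final class PlusDecodingExample {

    // Form-style decoding: '+' is read as a space and "%2B" as a literal plus sign.
    static String decodeFormValue(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decodeFormValue("a+b%2Bc")); // a b+c
        // A path that contains literal plus signs is damaged by form-style decoding:
        // both '+' characters in "/c++/index.html" come back as spaces.
        System.out.println(decodeFormValue("/c++/index.html"));
    }
}
```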
Parameters:
string - a percent-encoded US-ASCII string
Throws:
java.lang.NullPointerException - if string is null

public static java.lang.String decode(java.lang.String string, java.nio.charset.Charset encoding)
Percent-decodes a US-ASCII string into a Unicode string.
This method replaces '+' with ' ', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
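The encoding passed to a decoder must match the one used when the string was encoded. The sketch below shows a mismatch, again using java.net.URLDecoder as a stand-in (an assumption; it is not UriEncoder itself). The class name CharsetMismatchExample is hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; URLDecoder stands in for a decoder that takes a Charset.
final class CharsetMismatchExample {

    // Decodes the escape sequences to bytes, then interprets the bytes in the given charset.
    static String decode(String s, Charset charset) {
        return URLDecoder.decode(s, charset);
    }

    public static void main(String[] args) {
        String latin1Encoded = "flamb%E9"; // produced with ISO-8859-1
        // Matching charset: the single byte 0xE9 maps back to e with an acute accent.
        System.out.println(decode(latin1Encoded, StandardCharsets.ISO_8859_1));
        // Mismatched charset: a lone 0xE9 byte is not valid UTF-8, so the original
        // character is not recovered.
        System.out.println(decode(latin1Encoded, StandardCharsets.UTF_8));
    }
}
```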
Parameters:
string - a percent-encoded US-ASCII string
encoding - a character encoding
Throws:
java.lang.NullPointerException - if any argument is null
java.lang.RuntimeException - if decoding fails because a % escape sequence is invalid (for example, "%HH")
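Callers that decode untrusted input may want to handle the invalid-sequence case explicitly. The sketch below uses java.net.URLDecoder as a stand-in (an assumption; it is not UriEncoder itself). URLDecoder reports a malformed escape with IllegalArgumentException, which is a RuntimeException. The class InvalidEscapeExample and the helper tryDecode are hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical demo class; URLDecoder stands in for a decoder that rejects bad escapes.
final class InvalidEscapeExample {

    // Returns the decoded value, or an empty Optional if an escape sequence is malformed.
    static Optional<String> tryDecode(String s) {
        try {
            return Optional.of(URLDecoder.decode(s, StandardCharsets.UTF_8));
        } catch (RuntimeException e) {
            // URLDecoder throws IllegalArgumentException for sequences such as "%HH".
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryDecode("a%20b").isPresent()); // true
        System.out.println(tryDecode("%HH").isPresent());   // false
    }
}
```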