org.apache.xml.utils
Class URI
java.lang.Object
org.apache.xml.utils.URI
- All Implemented Interfaces:
- Serializable
- public class URI
- extends Object
- implements Serializable
A class to represent a Uniform Resource Identifier (URI). This class
is designed to handle the parsing of URIs and provide access to
the various components (scheme, host, port, userinfo, path, query
string and fragment) that may constitute a URI.
Parsing of a URI specification is done according to the URI
syntax described in RFC 2396. Every URI consists
of a scheme, followed by a colon (':'), followed by a scheme-specific
part. For URIs that follow the "generic URI" syntax, the
scheme-specific part begins with two slashes ("//") and may be followed
by an authority segment (comprising user information, host, and
port), path segment, query segment and fragment. Note that RFC 2396
no longer specifies the use of the parameters segment and excludes
the "user:password" syntax as part of the authority segment. If
"user:password" appears in a URI, the entire user/password string
is stored as userinfo.
For URIs that do not follow the "generic URI" syntax (e.g. mailto),
the entire scheme-specific part is treated as the "path" portion
of the URI.
Note that, unlike the java.net.URL class, this class does not provide
any built-in network access functionality nor does it provide any
scheme-specific functionality (for example, it does not know a
default port for a specific scheme). Rather, it only knows the
grammar and basic set of operations that can be applied to a URI.
- See Also:
- Serialized Form
Nested Class Summary |
static class |
URI.MalformedURIException
MalformedURIExceptions are thrown in the process of building a URI
or setting fields on a URI when an operation would result in an
invalid URI specification. |
Field Summary |
private static boolean |
DEBUG
Indicate whether in DEBUG mode |
private String |
m_fragment
If specified, stores the fragment for this URI; otherwise null. |
private String |
m_host
If specified, stores the host for this URI; otherwise null. |
private String |
m_path
If specified, stores the path for this URI; otherwise null. |
private int |
m_port
If specified, stores the port for this URI; otherwise -1. |
private String |
m_queryString
If specified, stores the query string for this URI; otherwise
null. |
private String |
m_scheme
Stores the scheme (usually the protocol) for this URI. |
private String |
m_userinfo
If specified, stores the userinfo for this URI; otherwise null. |
private static String |
MARK_CHARACTERS
URI punctuation mark characters - these, combined with
alphanumerics, constitute the "unreserved" characters |
private static String |
RESERVED_CHARACTERS
reserved characters |
private static String |
SCHEME_CHARACTERS
scheme can be composed of alphanumerics and these characters |
private static String |
USERINFO_CHARACTERS
userinfo can be composed of unreserved, escaped and these
characters |
Constructor Summary |
URI()
Construct a new and uninitialized URI. |
URI(String p_uriSpec)
Construct a new URI from a URI specification string. |
URI(String p_scheme,
String p_schemeSpecificPart)
Construct a new URI that does not follow the generic URI syntax.
|
URI(String p_scheme,
String p_userinfo,
String p_host,
int p_port,
String p_path,
String p_queryString,
String p_fragment)
Construct a new URI that follows the generic URI syntax from its
component parts. |
URI(String p_scheme,
String p_host,
String p_path,
String p_queryString,
String p_fragment)
Construct a new URI that follows the generic URI syntax from its
component parts. |
URI(URI p_other)
Construct a new URI from another URI. |
URI(URI p_base,
String p_uriSpec)
Construct a new URI from a base URI and a URI specification string.
|
Method Summary |
void |
appendPath(String p_addToPath)
Append to the end of the path of this URI. |
boolean |
equals(Object p_test)
Determines if the passed-in Object is equivalent to this URI. |
String |
getFragment()
Get the fragment for this URI. |
String |
getHost()
Get the host for this URI. |
String |
getPath()
Get the path for this URI. |
String |
getPath(boolean p_includeQueryString,
boolean p_includeFragment)
Get the path for this URI (optionally with the query string and
fragment). |
int |
getPort()
Get the port for this URI. |
String |
getQueryString()
Get the query string for this URI. |
String |
getScheme()
Get the scheme for this URI. |
String |
getSchemeSpecificPart()
Get the scheme-specific part for this URI (everything following the
scheme and the first colon). |
String |
getUserinfo()
Get the userinfo for this URI. |
private void |
initialize(URI p_other)
Initialize all fields of this URI from another URI. |
private void |
initialize(URI p_base,
String p_uriSpec)
Initializes this URI from a base URI and a URI specification string.
|
private void |
initializeAuthority(String p_uriSpec)
Initialize the authority (userinfo, host and port) for this
URI from a URI string spec. |
private void |
initializePath(String p_uriSpec)
Initialize the path for this URI from a URI string spec. |
private void |
initializeScheme(String p_uriSpec)
Initialize the scheme for this URI from a URI string spec. |
private static boolean |
isAlpha(char p_char)
Determine whether a char is an alphabetic character: a-z or A-Z |
private static boolean |
isAlphanum(char p_char)
Determine whether a char is an alphanumeric: 0-9, a-z or A-Z |
static boolean |
isConformantSchemeName(String p_scheme)
Determine whether a scheme conforms to the rules for a scheme name.
|
private static boolean |
isDigit(char p_char)
Determine whether a char is a digit. |
boolean |
isGenericURI()
Get the indicator as to whether this URI uses the "generic URI"
syntax. |
private static boolean |
isHex(char p_char)
Determine whether a character is a hexadecimal character. |
private static boolean |
isReservedCharacter(char p_char)
Determine whether a character is a reserved character:
';', '/', '?', ':', '@', '&', '=', '+', '$' or ',' |
private static boolean |
isUnreservedCharacter(char p_char)
Determine whether a char is an unreserved character. |
private static boolean |
isURIString(String p_uric)
Determine whether a given string contains only URI characters (also
called "uric" in RFC 2396). The uric set consists of all reserved
characters, unreserved characters and escaped characters. |
static boolean |
isWellFormedAddress(String p_address)
Determine whether a string is syntactically capable of representing
a valid IPv4 address or the domain name of a network host. |
void |
setFragment(String p_fragment)
Set the fragment for this URI. |
void |
setHost(String p_host)
Set the host for this URI. |
void |
setPath(String p_path)
Set the path for this URI. |
void |
setPort(int p_port)
Set the port for this URI. -1 is used to indicate that the port is
not specified, otherwise valid port numbers are between 0 and 65535.
|
void |
setQueryString(String p_queryString)
Set the query string for this URI. |
void |
setScheme(String p_scheme)
Set the scheme for this URI. |
void |
setUserinfo(String p_userinfo)
Set the userinfo for this URI. |
String |
toString()
Get the URI as a string specification. |
RESERVED_CHARACTERS
private static final String RESERVED_CHARACTERS
- reserved characters
- See Also:
- Constant Field Values
MARK_CHARACTERS
private static final String MARK_CHARACTERS
- URI punctuation mark characters - these, combined with
alphanumerics, constitute the "unreserved" characters
- See Also:
- Constant Field Values
SCHEME_CHARACTERS
private static final String SCHEME_CHARACTERS
- scheme can be composed of alphanumerics and these characters
- See Also:
- Constant Field Values
USERINFO_CHARACTERS
private static final String USERINFO_CHARACTERS
- userinfo can be composed of unreserved, escaped and these
characters
- See Also:
- Constant Field Values
m_scheme
private String m_scheme
- Stores the scheme (usually the protocol) for this URI.
m_userinfo
private String m_userinfo
- If specified, stores the userinfo for this URI; otherwise null.
m_host
private String m_host
- If specified, stores the host for this URI; otherwise null.
m_port
private int m_port
- If specified, stores the port for this URI; otherwise -1.
m_path
private String m_path
- If specified, stores the path for this URI; otherwise null.
m_queryString
private String m_queryString
- If specified, stores the query string for this URI; otherwise
null.
m_fragment
private String m_fragment
- If specified, stores the fragment for this URI; otherwise null.
DEBUG
private static boolean DEBUG
- Indicate whether in DEBUG mode
URI
public URI()
- Construct a new and uninitialized URI.
URI
public URI(URI p_other)
- Construct a new URI from another URI. All fields for this URI are
set equal to the fields of the URI passed in.
- Parameters:
p_other
- the URI to copy (cannot be null)
URI
public URI(String p_uriSpec)
throws URI.MalformedURIException
- Construct a new URI from a URI specification string. If the
specification follows the "generic URI" syntax (two slashes
following the first colon), the specification will be parsed
accordingly, setting the scheme, userinfo, host, port, path, query
string and fragment fields as necessary. If the specification does
not follow the "generic URI" syntax, the specification is parsed
into a scheme and scheme-specific part (stored as the path) only.
- Parameters:
p_uriSpec
- the URI specification string (cannot be null or
empty)
- Throws:
URI.MalformedURIException
- if p_uriSpec violates any syntax
rules
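The split into components described above can be illustrated with the regular expression given in RFC 2396, Appendix B. This is only a sketch: the class name UriSplitSketch and its split method are hypothetical and not part of org.apache.xml.utils.URI, the authority is left as one string rather than being broken into userinfo, host and port, and no character validation is done.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class UriSplitSketch {
    // Regular expression from RFC 2396, Appendix B.
    private static final Pattern URI_PATTERN = Pattern.compile(
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    /** Returns {scheme, authority, path, query, fragment}; absent parts are null. */
    static String[] split(String spec) {
        Matcher m = URI_PATTERN.matcher(spec);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a URI reference: " + spec);
        }
        // A group that did not take part in the match is reported as null,
        // while a group that matched nothing is reported as "".
        return new String[] {
            m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)
        };
    }

    public static void main(String[] args) {
        String spec = "http://user@example.org:8080/docs/index.html?lang=en#top";
        for (String part : split(spec)) {
            System.out.println(part);
        }
    }
}
```

For a specification such as "mailto:someone@example.org" there is no "//" after the colon, so the authority comes back as null and the whole scheme-specific part ends up in the path slot, which mirrors the non-generic case described above.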
URI
public URI(URI p_base,
String p_uriSpec)
throws URI.MalformedURIException
- Construct a new URI from a base URI and a URI specification string.
The URI specification string may be a relative URI.
- Parameters:
p_base
- the base URI (cannot be null if p_uriSpec is null or
empty)
p_uriSpec
- the URI specification string (cannot be null or
empty if p_base is null)
- Throws:
URI.MalformedURIException
- if p_uriSpec violates any syntax
rules
URI
public URI(String p_scheme,
String p_schemeSpecificPart)
throws URI.MalformedURIException
- Construct a new URI that does not follow the generic URI syntax.
Only the scheme and scheme-specific part (stored as the path) are
initialized.
- Parameters:
p_scheme
- the URI scheme (cannot be null or empty)
p_schemeSpecificPart
- the scheme-specific part (cannot be
null or empty)
- Throws:
URI.MalformedURIException
- if p_scheme violates any
syntax rules
URI
public URI(String p_scheme,
String p_host,
String p_path,
String p_queryString,
String p_fragment)
throws URI.MalformedURIException
- Construct a new URI that follows the generic URI syntax from its
component parts. Each component is validated for syntax and some
basic semantic checks are performed as well. See the individual
setter methods for specifics.
- Parameters:
p_scheme
- the URI scheme (cannot be null or empty)
p_host
- the hostname or IPv4 address for the URI
p_path
- the URI path - if the path contains '?' or '#',
then the query string and/or fragment will be
set from the path; however, if the query and
fragment are specified both in the path and as
separate parameters, an exception is thrown
p_queryString
- the URI query string (cannot be specified
if path is null)
p_fragment
- the URI fragment (cannot be specified if path
is null)
- Throws:
URI.MalformedURIException
- if any of the parameters violates
syntax rules or semantic rules
URI
public URI(String p_scheme,
String p_userinfo,
String p_host,
int p_port,
String p_path,
String p_queryString,
String p_fragment)
throws URI.MalformedURIException
- Construct a new URI that follows the generic URI syntax from its
component parts. Each component is validated for syntax and some
basic semantic checks are performed as well. See the individual
setter methods for specifics.
- Parameters:
p_scheme
- the URI scheme (cannot be null or empty)
p_userinfo
- the URI userinfo (cannot be specified if host
is null)
p_host
- the hostname or IPv4 address for the URI
p_port
- the URI port (may be -1 for "unspecified"; cannot
be specified if host is null)
p_path
- the URI path - if the path contains '?' or '#',
then the query string and/or fragment will be
set from the path; however, if the query and
fragment are specified both in the path and as
separate parameters, an exception is thrown
p_queryString
- the URI query string (cannot be specified
if path is null)
p_fragment
- the URI fragment (cannot be specified if path
is null)
- Throws:
URI.MalformedURIException
- if any of the parameters violates
syntax rules or semantic rules
initialize
private void initialize(URI p_other)
- Initialize all fields of this URI from another URI.
- Parameters:
p_other
- the URI to copy (cannot be null)
initialize
private void initialize(URI p_base,
String p_uriSpec)
throws URI.MalformedURIException
- Initializes this URI from a base URI and a URI specification string.
See RFC 2396 Section 4 and Appendix B for specifications on parsing
the URI and Section 5 for specifications on resolving relative URIs
and relative paths.
- Parameters:
p_base
- the base URI (may be null if p_uriSpec is an absolute
URI)
p_uriSpec
- the URI spec string which may be an absolute or
relative URI (can only be null/empty if p_base
is not null)
- Throws:
URI.MalformedURIException
- if p_base is null and p_uriSpec
is not an absolute URI or if
p_uriSpec violates syntax rules
initializeScheme
private void initializeScheme(String p_uriSpec)
throws URI.MalformedURIException
- Initialize the scheme for this URI from a URI string spec.
- Parameters:
p_uriSpec
- the URI specification (cannot be null)
- Throws:
URI.MalformedURIException
- if URI does not have a conformant
scheme
initializeAuthority
private void initializeAuthority(String p_uriSpec)
throws URI.MalformedURIException
- Initialize the authority (userinfo, host and port) for this
URI from a URI string spec.
- Parameters:
p_uriSpec
- the URI specification (cannot be null)
- Throws:
URI.MalformedURIException
- if p_uriSpec violates syntax rules
initializePath
private void initializePath(String p_uriSpec)
throws URI.MalformedURIException
- Initialize the path for this URI from a URI string spec.
- Parameters:
p_uriSpec
- the URI specification (cannot be null)
- Throws:
URI.MalformedURIException
- if p_uriSpec violates syntax rules
getScheme
public String getScheme()
- Get the scheme for this URI.
- Returns:
- the scheme for this URI
getSchemeSpecificPart
public String getSchemeSpecificPart()
- Get the scheme-specific part for this URI (everything following the
scheme and the first colon). See RFC 2396 Section 5.2 for spec.
- Returns:
- the scheme-specific part for this URI
getUserinfo
public String getUserinfo()
- Get the userinfo for this URI.
- Returns:
- the userinfo for this URI (null if not specified).
getHost
public String getHost()
- Get the host for this URI.
- Returns:
- the host for this URI (null if not specified).
getPort
public int getPort()
- Get the port for this URI.
- Returns:
- the port for this URI (-1 if not specified).
getPath
public String getPath(boolean p_includeQueryString,
boolean p_includeFragment)
- Get the path for this URI (optionally with the query string and
fragment).
- Parameters:
p_includeQueryString
- if true (and query string is not null),
then a "?" followed by the query string
will be appended
p_includeFragment
- if true (and fragment is not null),
then a "#" followed by the fragment
will be appended
- Returns:
- the path for this URI possibly including the query string
and fragment
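A minimal sketch of how the two flags could combine with the stored fields is shown here. The class PathAssemblySketch and its parameter names are hypothetical, and treating a null path as an empty string is an assumption of the sketch rather than documented behaviour.

```java
// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class PathAssemblySketch {
    static String path(String path, String query, String fragment,
                       boolean includeQueryString, boolean includeFragment) {
        // Assumption: a null path is treated as empty.
        StringBuilder sb = new StringBuilder(path == null ? "" : path);
        if (includeQueryString && query != null) {
            sb.append('?').append(query);
        }
        if (includeFragment && fragment != null) {
            sb.append('#').append(fragment);
        }
        return sb.toString();
    }
}
```

Under this reading an empty (but non-null) query string still produces a trailing "?", so the distinction between a missing and an empty query survives a round trip.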
getPath
public String getPath()
- Get the path for this URI. Note that the value returned is the path
only and does not include the query string or fragment.
- Returns:
- the path for this URI.
getQueryString
public String getQueryString()
- Get the query string for this URI.
- Returns:
- the query string for this URI. Null is returned if there
was no "?" in the URI spec, empty string if there was a
"?" but no query string following it.
getFragment
public String getFragment()
- Get the fragment for this URI.
- Returns:
- the fragment for this URI. Null is returned if there
was no "#" in the URI spec, empty string if there was a
"#" but no fragment following it.
setScheme
public void setScheme(String p_scheme)
throws URI.MalformedURIException
- Set the scheme for this URI. The scheme is converted to lowercase
before it is set.
- Parameters:
p_scheme
- the scheme for this URI (cannot be null)
- Throws:
URI.MalformedURIException
- if p_scheme is not a conformant
scheme name
setUserinfo
public void setUserinfo(String p_userinfo)
throws URI.MalformedURIException
- Set the userinfo for this URI. If a non-null value is passed in and
the host value is null, then an exception is thrown.
- Parameters:
p_userinfo
- the userinfo for this URI
- Throws:
URI.MalformedURIException
- if p_userinfo contains invalid
characters
setHost
public void setHost(String p_host)
throws URI.MalformedURIException
- Set the host for this URI. If null is passed in, the userinfo
field is also set to null and the port is set to -1.
- Parameters:
p_host
- the host for this URI
- Throws:
URI.MalformedURIException
- if p_host is not a valid IP
address or DNS hostname.
setPort
public void setPort(int p_port)
throws URI.MalformedURIException
- Set the port for this URI. -1 is used to indicate that the port is
not specified, otherwise valid port numbers are between 0 and 65535.
If a valid port number is passed in and the host field is null,
an exception is thrown.
- Parameters:
p_port
- the port number for this URI
- Throws:
URI.MalformedURIException
- if p_port is not -1 and not a
valid port number
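The port rules above can be summarised as a small predicate. This is a sketch under assumptions: PortCheckSketch and isAcceptablePort are hypothetical names, and the sketch returns false where the real setter is documented to throw URI.MalformedURIException.

```java
// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class PortCheckSketch {
    static boolean isAcceptablePort(int port, boolean hostIsSet) {
        if (port == -1) {
            return true;            // "unspecified" is always allowed
        }
        if (port < 0 || port > 65535) {
            return false;           // outside the valid port range
        }
        return hostIsSet;           // a real port number needs a host
    }
}
```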
setPath
public void setPath(String p_path)
throws URI.MalformedURIException
- Set the path for this URI. If the supplied path is null, then the
query string and fragment are set to null as well. If the supplied
path includes a query string and/or fragment, these fields will be
parsed and set as well. Note that, for URIs following the "generic
URI" syntax, the path specified should start with a slash.
For URIs that do not follow the generic URI syntax, this method
sets the scheme-specific part.
- Parameters:
p_path
- the path for this URI (may be null)
- Throws:
URI.MalformedURIException
- if p_path contains invalid
characters
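One way to picture the splitting of a supplied path into path, query string and fragment is sketched below. PathSplitSketch is a hypothetical helper; it assumes the fragment starts at the first '#' and the query at the first '?' before it, and it performs none of the character validation that the real setter is documented to do.

```java
// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class PathSplitSketch {
    /** Returns {path, query, fragment}; parts that are not present are null. */
    static String[] split(String path) {
        if (path == null) {
            // A null path clears the query string and fragment as well.
            return new String[] { null, null, null };
        }
        String query = null;
        String fragment = null;
        int hash = path.indexOf('#');
        if (hash != -1) {
            fragment = path.substring(hash + 1);
            path = path.substring(0, hash);
        }
        int question = path.indexOf('?');
        if (question != -1) {
            query = path.substring(question + 1);
            path = path.substring(0, question);
        }
        return new String[] { path, query, fragment };
    }
}
```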
appendPath
public void appendPath(String p_addToPath)
throws URI.MalformedURIException
- Append to the end of the path of this URI. If the current path does
not end in a slash and the path to be appended does not begin with
a slash, a slash will be appended to the current path before the
new segment is added. Also, if the current path ends in a slash
and the new segment begins with a slash, the extra slash will be
removed before the new segment is appended.
- Parameters:
p_addToPath
- the new segment to be added to the current path
- Throws:
URI.MalformedURIException
- if p_addToPath contains syntax
errors
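The slash handling described for appendPath can be sketched as a pure string function. AppendPathSketch.join is a hypothetical name, both arguments are assumed to be non-null, and validation of the appended segment is left out.

```java
// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class AppendPathSketch {
    static String join(String currentPath, String addToPath) {
        boolean currentEndsWithSlash = currentPath.endsWith("/");
        boolean addedStartsWithSlash = addToPath.startsWith("/");
        if (currentEndsWithSlash && addedStartsWithSlash) {
            // Drop the extra slash.
            return currentPath + addToPath.substring(1);
        }
        if (!currentEndsWithSlash && !addedStartsWithSlash) {
            // Insert the missing slash.
            return currentPath + "/" + addToPath;
        }
        // Exactly one slash is already present at the join point.
        return currentPath + addToPath;
    }
}
```

All four combinations of "/a" or "/a/" with "b" or "/b" give "/a/b" under this sketch.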
setQueryString
public void setQueryString(String p_queryString)
throws URI.MalformedURIException
- Set the query string for this URI. A non-null value is valid only
if this is a URI conforming to the generic URI syntax and
the path value is not null.
- Parameters:
p_queryString
- the query string for this URI
- Throws:
URI.MalformedURIException
- if p_queryString is not null and this
URI does not conform to the generic
URI syntax or if the path is null
setFragment
public void setFragment(String p_fragment)
throws URI.MalformedURIException
- Set the fragment for this URI. A non-null value is valid only
if this is a URI conforming to the generic URI syntax and
the path value is not null.
- Parameters:
p_fragment
- the fragment for this URI
- Throws:
URI.MalformedURIException
- if p_fragment is not null and this
URI does not conform to the generic
URI syntax or if the path is null
equals
public boolean equals(Object p_test)
- Determines if the passed-in Object is equivalent to this URI.
- Overrides:
equals
in class Object
- Parameters:
p_test
- the Object to test for equality.
- Returns:
- true if p_test is a URI with all values equal to this
URI, false otherwise
- See Also:
Object.hashCode(), Hashtable
toString
public String toString()
- Get the URI as a string specification. See RFC 2396 Section 5.2.
- Overrides:
toString
in class Object
- Returns:
- the URI string specification
isGenericURI
public boolean isGenericURI()
- Get the indicator as to whether this URI uses the "generic URI"
syntax.
- Returns:
- true if this URI uses the "generic URI" syntax, false
otherwise
isConformantSchemeName
public static boolean isConformantSchemeName(String p_scheme)
- Determine whether a scheme conforms to the rules for a scheme name.
A scheme is conformant if it starts with an alphanumeric, and
contains only alphanumerics, '+','-' and '.'.
- Parameters:
p_scheme
- The scheme name to check
- Returns:
- true if the scheme is conformant, false otherwise
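A scheme-name check along these lines might look like the following sketch. SchemeCheckSketch is a hypothetical class. Note one assumption: the sketch follows the grammar in RFC 2396 Section 3.1, which requires the first character to be alphabetic, a slightly stricter reading than "alphanumeric".

```java
// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class SchemeCheckSketch {
    static boolean isConformantSchemeName(String scheme) {
        if (scheme == null || scheme.isEmpty()) {
            return false;
        }
        // Assumption: first character must be a letter (RFC 2396, Section 3.1).
        if (!isAlpha(scheme.charAt(0))) {
            return false;
        }
        for (int i = 1; i < scheme.length(); i++) {
            char c = scheme.charAt(i);
            if (!isAlpha(c) && !isDigit(c) && "+-.".indexOf(c) == -1) {
                return false;
            }
        }
        return true;
    }

    private static boolean isAlpha(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
```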
isWellFormedAddress
public static boolean isWellFormedAddress(String p_address)
- Determine whether a string is syntactically capable of representing
a valid IPv4 address or the domain name of a network host. A valid
IPv4 address consists of four decimal digit groups separated by a
'.'. A hostname consists of domain labels (each of which must
begin and end with an alphanumeric but may contain '-') separated
by a '.'. See RFC 2396 Section 3.2.2.
- Parameters:
p_address
- The address string to check
- Returns:
- true if the string is a syntactically valid IPv4 address
or hostname
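The address rules can be sketched as below. AddressCheckSketch is a hypothetical class, and two choices in it are assumptions rather than documented behaviour: a last label that starts with a digit is taken to mean an IPv4 address (RFC 2396 requires the top label of a hostname to start with a letter), and a trailing '.' is rejected for simplicity even though the RFC grammar permits one.

```java
// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class AddressCheckSketch {
    static boolean isWellFormedAddress(String address) {
        if (address == null || address.isEmpty()) {
            return false;
        }
        // Limit -1 keeps empty labels, so "a..b" and "a." are detected.
        String[] labels = address.split("\\.", -1);
        String last = labels[labels.length - 1];
        if (!last.isEmpty() && isDigit(last.charAt(0))) {
            // Assumption: a numeric-looking last label means IPv4.
            return labels.length == 4 && allDigitGroups(labels);
        }
        for (String label : labels) {
            if (!isDomainLabel(label)) {
                return false;
            }
        }
        return true;
    }

    // Every group must be a non-empty run of decimal digits.
    private static boolean allDigitGroups(String[] groups) {
        for (String group : groups) {
            if (group.isEmpty()) {
                return false;
            }
            for (int i = 0; i < group.length(); i++) {
                if (!isDigit(group.charAt(i))) {
                    return false;
                }
            }
        }
        return true;
    }

    // A label begins and ends with an alphanumeric and may contain '-'.
    private static boolean isDomainLabel(String label) {
        if (label.isEmpty()
                || !isAlphanum(label.charAt(0))
                || !isAlphanum(label.charAt(label.length() - 1))) {
            return false;
        }
        for (int i = 0; i < label.length(); i++) {
            char c = label.charAt(i);
            if (!isAlphanum(c) && c != '-') {
                return false;
            }
        }
        return true;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private static boolean isAlphanum(char c) {
        return isDigit(c) || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

The check is purely syntactic: the digit groups of an IPv4 address are not range-checked, which matches the "four decimal digit groups" wording.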
isDigit
private static boolean isDigit(char p_char)
- Determine whether a char is a digit.
- Parameters:
p_char
- the character to check
- Returns:
- true if the char is between '0' and '9', false otherwise
isHex
private static boolean isHex(char p_char)
- Determine whether a character is a hexadecimal character.
- Parameters:
p_char
- the character to check
- Returns:
- true if the char is between '0' and '9', 'a' and 'f'
or 'A' and 'F', false otherwise
isAlpha
private static boolean isAlpha(char p_char)
- Determine whether a char is an alphabetic character: a-z or A-Z
- Parameters:
p_char
- the character to check
- Returns:
- true if the char is alphabetic, false otherwise
isAlphanum
private static boolean isAlphanum(char p_char)
- Determine whether a char is an alphanumeric: 0-9, a-z or A-Z
- Parameters:
p_char
- the character to check
- Returns:
- true if the char is alphanumeric, false otherwise
isReservedCharacter
private static boolean isReservedCharacter(char p_char)
- Determine whether a character is a reserved character:
';', '/', '?', ':', '@', '&', '=', '+', '$' or ','
- Parameters:
p_char
- the character to check
- Returns:
- true if the char is a reserved character, false otherwise
isUnreservedCharacter
private static boolean isUnreservedCharacter(char p_char)
- Determine whether a char is an unreserved character.
- Parameters:
p_char
- the character to check
- Returns:
- true if the char is unreserved, false otherwise
isURIString
private static boolean isURIString(String p_uric)
- Determine whether a given string contains only URI characters (also
called "uric" in RFC 2396). The uric set consists of all reserved
characters, unreserved characters and escaped characters.
- Parameters:
p_uric
- URI string
- Returns:
- true if the string is comprised of uric, false otherwise
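To make the "uric" definition concrete, here is a sketch of such a check. UricCheckSketch is a hypothetical class. The reserved set is the one listed for isReservedCharacter, while the mark characters are taken from RFC 2396 Section 2.3 as an assumption, since the actual values of the private constants are not shown here.

```java
// Hypothetical helper for illustration; not part of org.apache.xml.utils.URI.
class UricCheckSketch {
    private static final String RESERVED = ";/?:@&=+$,";
    // Assumption: mark characters as defined in RFC 2396, Section 2.3.
    private static final String MARK = "-_.!~*'()";

    static boolean isUriString(String s) {
        if (s == null) {
            return false;
        }
        int n = s.length();
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            if (c == '%') {
                // An escape is '%' followed by exactly two hex digits.
                if (i + 2 >= n || !isHex(s.charAt(i + 1)) || !isHex(s.charAt(i + 2))) {
                    return false;
                }
                i += 2;
            } else if (RESERVED.indexOf(c) == -1 && !isUnreserved(c)) {
                return false;
            }
        }
        return true;
    }

    // Unreserved characters are alphanumerics plus the mark characters.
    private static boolean isUnreserved(char c) {
        return isAlphanum(c) || MARK.indexOf(c) != -1;
    }

    private static boolean isAlphanum(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }
}
```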