java.util.regex
public final class java.util.regex.Pattern
Represents a pattern used for matching, searching, or replacing strings.
Patterns are specified in terms of regular expressions and compiled using
an instance of this class. They are then used in conjunction with a Matcher
to perform the actual search.
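As a minimal sketch of the compile-then-match workflow described above: the class name `BasicPatternUsage`, the helper `findNumbers`, and the sample input are illustrative assumptions, not part of the API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BasicPatternUsage {
    // Collects every run of digits found in the input.
    static List<String> findNumbers(String input) {
        Pattern pattern = Pattern.compile("\\d+");   // compile the expression once
        Matcher matcher = pattern.matcher(input);    // bind it to a concrete input
        List<String> result = new ArrayList<>();
        while (matcher.find()) {                     // step through successive matches
            result.add(matcher.group());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(findNumbers("order 66, aisle 7"));
    }
}
```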
The regular expressions used in this class are a superset of those used
in other implementations. Existing applications will therefore normally
work as expected, but in rare cases regular expression content that is
meant to be literal might be interpreted with a special meaning. The most
notable examples are the strings "&&" and "--", which represent character
class intersection and difference. Some of the flags are also handled
slightly differently:
- The flag CASE_INSENSITIVE silently assumes Unicode case-insensitivity.
- The flag CANON_EQ is not supported at all (throws exception).
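To illustrate why "&&" needs care, the sketch below shows it acting as an operator inside a character class. It was written against the standard JDK engine, where "&&" inside brackets denotes intersection; the class and method names are hypothetical, and the exact behavior on other engines is an assumption to verify.

```java
import java.util.regex.Pattern;

public class ClassIntersectionDemo {
    // Lowercase letters intersected with "not a vowel": consonants only.
    static final Pattern CONSONANT = Pattern.compile("[a-z&&[^aeiou]]");

    static boolean isConsonant(String s) {
        return CONSONANT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isConsonant("b"));
        System.out.println(isConsonant("a"));
        // The ampersand itself is not part of the class; it acted as an operator.
        System.out.println(isConsonant("&"));
    }
}
```

If a literal "&&" is intended, quoting the text (for example with Pattern.quote) avoids the special interpretation.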
Summary
Constants
| Type | Constant | Description | Value | Hex |
| --- | --- | --- | --- | --- |
| int | CANON_EQ | This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent. | 128 | 0x00000080 |
| int | CASE_INSENSITIVE | This constant specifies that a Pattern is matched case-insensitively. | 2 | 0x00000002 |
| int | COMMENTS | This constant specifies that a Pattern may contain whitespace or comments. | 4 | 0x00000004 |
| int | DOTALL | This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case. | 32 | 0x00000020 |
| int | LITERAL | This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings. | 16 | 0x00000010 |
| int | MULTILINE | This constant specifies that the meta characters '^' and '$' match only the beginning and end of an input line, respectively. | 8 | 0x00000008 |
| int | UNICODE_CASE | This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters. | 64 | 0x00000040 |
| int | UNIX_LINES | This constant specifies that a Pattern recognizes only Unix line endings ('\n') in the '.', '^', and '$' meta characters. | 1 | 0x00000001 |
Public Methods
Protected Methods
Inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Details
Constants
public static final int CANON_EQ
This constant specifies that a character in a Pattern and a character in
the input string only match if they are canonically equivalent. This
flag is (currently) not supported in Android.
Constant Value: 128 (0x00000080)
public static final int CASE_INSENSITIVE
This constant specifies that a Pattern is matched case-insensitively. Note
that for Android this always takes all Unicode characters into account,
whereas the JDK only does this for ASCII characters. That is, Android
silently assumes that UNICODE_CASE is also set.
Constant Value: 2 (0x00000002)
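A minimal sketch of the flag in use, restricted to ASCII input so that the result does not depend on the Unicode difference noted above. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class CaseInsensitiveDemo {
    // Matches the whole input against the regex, ignoring letter case.
    static boolean matchesIgnoringCase(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE)
                      .matcher(input)
                      .matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("hello", "HeLLo"));
        // Without the flag, the same comparison is case-sensitive.
        System.out.println(Pattern.matches("hello", "HeLLo"));
    }
}
```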
public static final int COMMENTS
This constant specifies that a Pattern may contain whitespace or
comments. Otherwise comments and whitespace are taken as literal
characters.
Constant Value: 4 (0x00000004)
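The following sketch shows how COMMENTS can make a longer expression self-documenting. It assumes the usual convention that '#' starts a comment running to the end of the line; the class name, the pattern, and the helper `valueOf` are illustrative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CommentsFlagDemo {
    // Whitespace and '#' comments inside the pattern are ignored by the engine.
    static final Pattern KEY_VALUE = Pattern.compile(
            "(\\w+)      # key\n"
          + "\\s*=\\s*   # separator with optional surrounding whitespace\n"
          + "(\\d+)      # numeric value",
            Pattern.COMMENTS);

    // Returns the numeric value of a "key = value" line, or null if it does not match.
    static String valueOf(String line) {
        Matcher m = KEY_VALUE.matcher(line);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(valueOf("width = 42"));
    }
}
```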
public static final int DOTALL
This constant specifies that the '.' meta character matches arbitrary
characters, including line endings, which is normally not the case.
Constant Value: 32 (0x00000020)
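A short sketch contrasting '.' with and without DOTALL; the class name and the helper `matchesAcrossLine` are hypothetical.

```java
import java.util.regex.Pattern;

public class DotAllDemo {
    // Tests "a.b" against the input, optionally letting '.' match line endings.
    static boolean matchesAcrossLine(String input, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesAcrossLine("a\nb", false));
        System.out.println(matchesAcrossLine("a\nb", true));
    }
}
```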
public static final int LITERAL
This constant specifies that the whole Pattern is to be taken literally,
that is, all meta characters lose their meanings.
Constant Value: 16 (0x00000010)
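As a small illustration, the sketch below compiles the same text once as a regular expression and once with LITERAL. The class and method names are made up for this example.

```java
import java.util.regex.Pattern;

public class LiteralFlagDemo {
    // With LITERAL, the '.' in "a.b" stands only for an actual dot.
    static boolean matchesLiterally(String text, String input) {
        return Pattern.compile(text, Pattern.LITERAL).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesLiterally("a.b", "a.b"));
        System.out.println(matchesLiterally("a.b", "axb"));
        // As an ordinary regular expression, '.' matches any character.
        System.out.println(Pattern.matches("a.b", "axb"));
    }
}
```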
public static final int MULTILINE
This constant specifies that the meta characters '^' and '$' match only
the beginning and end of an input line, respectively. Normally, they
match the beginning and the end of the complete input.
Constant Value: 8 (0x00000008)
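The sketch below counts lines that consist solely of digits, which only works when '^' and '$' anchor at line boundaries. The class name and the helper `countDigitLines` are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MultilineDemo {
    // Counts matches of "^\d+$", with or without per-line anchoring.
    static int countDigitLines(String text, boolean multiline) {
        int flags = multiline ? Pattern.MULTILINE : 0;
        Matcher m = Pattern.compile("^\\d+$", flags).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "12\nab\n34";
        System.out.println(countDigitLines(text, true));
        System.out.println(countDigitLines(text, false));
    }
}
```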
public static final int UNICODE_CASE
This constant specifies that a Pattern is matched case-insensitively
with regard to all Unicode characters. It is used in conjunction with
the CASE_INSENSITIVE flag. In the JDK, the CASE_INSENSITIVE flag alone
only achieves case-insensitivity with regard to ASCII characters. This
flag is (currently) not supported in Android.
Constant Value: 64 (0x00000040)
public static final int UNIX_LINES
This constant specifies that a Pattern recognizes only Unix line
endings ('\n') in the '.', '^', and '$' meta characters. This flag is
(currently) not supported in Android.
Constant Value: 1 (0x00000001)
Public Methods
public static Pattern compile(String pattern, int flags)
Compiles a regular expression, creating a new Pattern instance in the
process. The flags modify the behavior of the Pattern.
Parameters
| pattern | The regular expression. |
| flags | The flags to set. Any combination of the constants defined in this class is valid, combined with bitwise OR. Some of the flags (UNIX_LINES and CANON_EQ) are not supported in Android, though. |
Returns
- The new Pattern instance.
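Because the flag constants are distinct bits, several of them can be combined with bitwise OR. The following sketch combines CASE_INSENSITIVE and MULTILINE; the class name, the pattern, and the sample log text are illustrative assumptions.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompileWithFlagsDemo {
    // Lines that start with "error", in any letter case.
    static final Pattern ERROR_LINE =
            Pattern.compile("^error", Pattern.CASE_INSENSITIVE | Pattern.MULTILINE);

    static int countErrorLines(String log) {
        Matcher m = ERROR_LINE.matcher(log);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countErrorLines("ok\nERROR: disk\nError: net"));
    }
}
```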
public static Pattern compile(String pattern)
Compiles a regular expression, creating a new Pattern instance in the
process.
Parameters
| pattern | The regular expression. |
Returns
- The new Pattern instance.
public int flags()
Returns the flags that have been set for this Pattern instance.
Returns
- The flags that have been set. A combination of the constants
defined in this class.
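Since the returned value is a bit combination, an individual flag can be tested with a bitwise AND. A minimal sketch, where `hasFlag` is a hypothetical helper and not part of the API:

```java
import java.util.regex.Pattern;

public class FlagsInspectionDemo {
    // Returns true if the given flag bit is set on the pattern.
    static boolean hasFlag(Pattern pattern, int flag) {
        return (pattern.flags() & flag) != 0;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("a.c", Pattern.DOTALL | Pattern.CASE_INSENSITIVE);
        System.out.println(hasFlag(p, Pattern.DOTALL));
        System.out.println(hasFlag(p, Pattern.MULTILINE));
    }
}
```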
public Matcher matcher(CharSequence input)
Returns a matcher for the regular expression and a given input. The
matcher can be used to match the pattern against the whole input, find
occurrences of the pattern in the input, or replace parts of the input.
Parameters
| input | The input to process. |
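The three uses named above (whole-input match, search, and replacement) can be sketched as follows. The class name, the pattern, and the helper methods are illustrative; the Matcher calls used are matches(), find(), and replaceAll(String).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatcherUsesDemo {
    static final Pattern WORD = Pattern.compile("[a-z]+");

    // Whole-input match: true only if the entire input is one word.
    static boolean isWord(String input) {
        return WORD.matcher(input).matches();
    }

    // Search: counts the occurrences of the pattern in the input.
    static int countWords(String input) {
        Matcher m = WORD.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    // Replacement: substitutes every occurrence with an asterisk.
    static String mask(String input) {
        return WORD.matcher(input).replaceAll("*");
    }

    public static void main(String[] args) {
        System.out.println(isWord("abc"));
        System.out.println(countWords("abc def 12"));
        System.out.println(mask("abc def 12"));
    }
}
```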
public static boolean matches(String regex, CharSequence input)
Tries to match a given pattern against a given input. This is a
convenience method that compiles the pattern, builds a matcher for it,
and performs the match. If the same pattern is used for multiple
operations, compile the pattern explicitly and create a reusable matcher
instead.
Parameters
| regex | The regular expression. |
| input | The input to process. |
Returns
- true if and only if the pattern matches the input.
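The trade-off between the convenience method and an explicitly compiled pattern can be sketched like this. The class name, the hexadecimal pattern, and the helpers are assumptions for illustration; Matcher.reset(CharSequence) is used to reuse one matcher across inputs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchesVsCompileDemo {
    // One-off check: the pattern is compiled again on every call.
    static boolean isHexOnce(String s) {
        return Pattern.matches("[0-9a-fA-F]+", s);
    }

    // Repeated checks: compile once and reuse a single matcher.
    static final Pattern HEX = Pattern.compile("[0-9a-fA-F]+");

    static int countHex(List<String> items) {
        Matcher m = HEX.matcher("");
        int count = 0;
        for (String item : items) {
            if (m.reset(item).matches()) {   // rebind the matcher to the next input
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isHexOnce("ff00"));
        System.out.println(countHex(Arrays.asList("ff", "zz", "10")));
    }
}
```

A Matcher is not safe for concurrent use, so a shared reusable matcher like this suits single-threaded loops only.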
public String pattern()
Returns the regular expression that was compiled into this Pattern.
public static String quote(String s)
Quotes a given string, escaping all meta-characters with backslashes
in the process. The resulting string is guaranteed to have no unescaped
occurrences of the characters '?+[(){}^$|\./' anymore.
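Quoting is mainly useful when arbitrary text has to be searched for literally. A minimal sketch, in which the class name and the helper `containsLiteral` are hypothetical:

```java
import java.util.regex.Pattern;

public class QuoteDemo {
    // Searches for the needle as plain text, whatever characters it contains.
    static boolean containsLiteral(String haystack, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(containsLiteral("price: $5.00 (net)", "$5.00"));
        // Unquoted, the '.' in "5.00" would also match the 'x' here.
        System.out.println(containsLiteral("price: 5x00", "5.00"));
    }
}
```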
public String[] split(CharSequence inputSeq, int limit)
Splits the given input sequence around occurrences of the Pattern. The
function first determines all occurrences of the Pattern inside the input
sequence. It then builds an array of the "remaining" strings
before, in between, and after these occurrences. The limit parameter
determines the maximal number of entries in the resulting array and
the handling of trailing empty strings.
Parameters
| inputSeq | The input sequence. |
| limit | Determines the maximal number of entries in the resulting array.
- For limit > 0, it is guaranteed that the resulting array contains at most limit entries.
- For limit < 0, the length of the resulting array is exactly the number of occurrences of the pattern + 1. All entries are included.
- For limit == 0, the length of the resulting array is at most the number of occurrences of the pattern + 1. Empty strings at the end of the array are not included. |
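The three cases for the limit can be compared on one input with trailing separators. The class name, the comma pattern, and the sample string are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitLimitDemo {
    static final Pattern COMMA = Pattern.compile(",");

    // Thin wrapper so the effect of each limit can be compared directly.
    static String[] splitWithLimit(String input, int limit) {
        return COMMA.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,";   // four separators, two trailing empty fields
        // Positive limit: at most that many entries; the last one keeps the rest.
        System.out.println(Arrays.toString(splitWithLimit(input, 2)));
        // Negative limit: every field is kept, including trailing empty strings.
        System.out.println(Arrays.toString(splitWithLimit(input, -1)));
        // Zero: trailing empty strings are dropped.
        System.out.println(Arrays.toString(splitWithLimit(input, 0)));
    }
}
```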
public String[] split(CharSequence input)
Splits a given input around occurrences of a regular expression. This is
a convenience method that is equivalent to calling
split(java.lang.CharSequence, int) with a limit of 0.
Parameters
| input | The input sequence. |
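A brief sketch of the single-argument form, checking the stated equivalence with a limit of 0. The class name and the whitespace pattern are chosen only for illustration.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitDefaultDemo {
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits on runs of whitespace using the single-argument overload.
    static String[] words(String input) {
        return WHITESPACE.split(input);
    }

    public static void main(String[] args) {
        String input = "one two   three";
        System.out.println(Arrays.toString(words(input)));
        // Same result as passing a limit of 0 explicitly.
        System.out.println(Arrays.equals(words(input), WHITESPACE.split(input, 0)));
    }
}
```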
public String toString()
Returns a string containing a concise, human-readable description of
this Pattern.
Protected Methods
protected void finalize()
Called by the virtual machine when there are no longer any (non-weak)
references to the receiver. Subclasses can use this facility to guarantee
that any associated resources are cleaned up before the receiver is
garbage collected. Uncaught exceptions which are thrown during the
running of the method cause it to terminate immediately, but are
otherwise ignored.
Note: The virtual machine assumes that the implementation in class Object
is empty.