
Android - An Open Handset Alliance Project

java.util.regex
public final class

java.util.regex.Pattern

java.lang.Object
  java.util.regex.Pattern    implements Serializable

Represents a pattern used for matching, searching, or replacing strings. Patterns are specified in terms of regular expressions and compiled using an instance of this class. They are then used in conjunction with a Matcher to perform the actual search.
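The compile-then-match workflow described above might look like the following minimal sketch. The class name PatternBasics and the helper firstNumber are illustrative only and not part of the API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PatternBasics {
    // Compile the regular expression once and reuse the Pattern.
    private static final Pattern NUMBER = Pattern.compile("[0-9]+");

    /** Returns the first run of digits in the input, or null if there is none. */
    static String firstNumber(CharSequence input) {
        Matcher m = NUMBER.matcher(input); // the Matcher performs the actual search
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("order 42 shipped")); // prints 42
    }
}
```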

The regular expressions used in this class are a superset of those used in other implementations. Existing applications will therefore normally work as expected, but in rare cases regular expression content that is meant to be literal might be interpreted with a special meaning. The most notable examples are the strings "&&" and "--", which are interpreted as set operators on character classes (intersection and difference, respectively). Also, some of the flags are handled slightly differently:

  • The flag CASE_INSENSITIVE silently assumes Unicode case-insensitivity.
  • The flag CANON_EQ is not supported at all (throws exception).

Summary

Constants

Type  Name  Description  Value
int  CANON_EQ  This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent.  128  0x00000080
int  CASE_INSENSITIVE  This constant specifies that a Pattern is matched case-insensitively.  2  0x00000002
int  COMMENTS  This constant specifies that a Pattern may contain whitespace or comments.  4  0x00000004
int  DOTALL  This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case.  32  0x00000020
int  LITERAL  This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings.  16  0x00000010
int  MULTILINE  This constant specifies that the meta characters '^' and '$' match only the beginning and end of an input line, respectively.  8  0x00000008
int  UNICODE_CASE  This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters.  64  0x00000040
int  UNIX_LINES  This constant specifies that a Pattern recognizes only Unix line endings ('\n') in the '.', '^', and '$' meta characters.  1  0x00000001

Public Methods

      static  Pattern  compile(String pattern, int flags)
Compiles a regular expression, creating a new Pattern instance in the process.
      static  Pattern  compile(String pattern)
Compiles a regular expression, creating a new Pattern instance in the process.
        int  flags()
Returns the flags that have been set for this Pattern instance.
        Matcher  matcher(CharSequence input)
Returns a matcher for the regular expression and a given input.
      static  boolean  matches(String regex, CharSequence input)
Tries to match a given pattern against a given input.
        String  pattern()
Returns the regular expression that was compiled into this Pattern.
      static  String  quote(String s)
Quotes a given string, escaping all meta-characters with backslashes in the process.
        String[]  split(CharSequence inputSeq, int limit)
Splits the given input sequence around occurrences of the Pattern.
        String[]  split(CharSequence input)
Splits a given input around occurrences of a regular expression.
        String  toString()
Answers a string containing a concise, human-readable description of the receiver.

Protected Methods

        void  finalize()
Called by the virtual machine when there are no longer any (non-weak) references to the receiver.

Details

Constants

public static final int CANON_EQ

This constant specifies that a character in a Pattern and a character in the input string only match if they are canonically equivalent. This flag is (currently) not supported in Android.
Constant Value: 128 (0x00000080)

public static final int CASE_INSENSITIVE

This constant specifies that a Pattern is matched case-insensitively. Note that on Android this always takes all Unicode characters into account, whereas the JDK does so only for ASCII characters unless UNICODE_CASE is also set. That is, Android silently assumes that UNICODE_CASE is also set.
Constant Value: 2 (0x00000002)
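A small sketch of the flag in use. It sticks to ASCII input so that the result does not depend on the Unicode difference noted above; the class and method names are illustrative.

```java
import java.util.regex.Pattern;

final class CaseInsensitiveDemo {
    /** Whole-input match with the CASE_INSENSITIVE flag set. */
    static boolean matchesIgnoringCase(String regex, CharSequence input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    /** Whole-input match with no flags, for comparison. */
    static boolean matchesExactCase(String regex, CharSequence input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesIgnoringCase("android", "ANDROID")); // true
        System.out.println(matchesExactCase("android", "ANDROID"));    // false
    }
}
```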

public static final int COMMENTS

This constant specifies that a Pattern may contain whitespace or comments. Otherwise comments and whitespace are taken as literal characters.
Constant Value: 4 (0x00000004)
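With COMMENTS set, a pattern can be laid out over several lines and annotated. The following is a sketch only; the year-month pattern and the helper monthOf are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CommentsDemo {
    // Whitespace in the pattern is ignored; '#' starts a comment that runs to the end of the line.
    private static final Pattern YEAR_MONTH = Pattern.compile(
            "([0-9]{4})  # four-digit year\n"
          + "-           # separator\n"
          + "([0-9]{2})  # two-digit month",
            Pattern.COMMENTS);

    /** Returns the month part of a "yyyy-mm" string, or null if the input has another shape. */
    static String monthOf(CharSequence input) {
        Matcher m = YEAR_MONTH.matcher(input);
        return m.matches() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(monthOf("2008-06")); // prints 06
    }
}
```

Only whitespace in the pattern is ignored; whitespace in the input still has to match, so "2008 - 06" is rejected.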

public static final int DOTALL

This constant specifies that the '.' meta character matches arbitrary characters, including line endings, which is normally not the case.
Constant Value: 32 (0x00000020)
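The effect of DOTALL can be seen by matching the same pattern against input that contains a line ending. This is a minimal sketch with illustrative names.

```java
import java.util.regex.Pattern;

final class DotAllDemo {
    /** Matches the whole input against "a.b" using the given flags. */
    static boolean dotMatches(CharSequence input, int flags) {
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(dotMatches("a\nb", 0));              // false: '.' skips line endings
        System.out.println(dotMatches("a\nb", Pattern.DOTALL)); // true
    }
}
```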

public static final int LITERAL

This constant specifies that the whole Pattern is to be taken literally, that is, all meta characters lose their meanings.
Constant Value: 16 (0x00000010)
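As a sketch of the difference LITERAL makes, the string "1+1" is searched for once as a regular expression and once as literal text. The helper name found is an assumption for illustration.

```java
import java.util.regex.Pattern;

final class LiteralDemo {
    /** Reports whether the pattern, compiled with the given flags, occurs in the input. */
    static boolean found(String pattern, int flags, CharSequence input) {
        return Pattern.compile(pattern, flags).matcher(input).find();
    }

    public static void main(String[] args) {
        // As a regular expression, "1+1" means "one or more '1', then '1'".
        System.out.println(found("1+1", 0, "1+1=2"));               // false
        // With LITERAL, '+' loses its meaning and the text itself is searched for.
        System.out.println(found("1+1", Pattern.LITERAL, "1+1=2")); // true
    }
}
```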

public static final int MULTILINE

This constant specifies that the meta characters '^' and '$' match only the beginning and end of an input line, respectively. Normally, they match the beginning and the end of the complete input.
Constant Value: 8 (0x00000008)
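A short sketch of the difference, counting lines that start with '#'. The input text and the helper name are illustrative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MultilineDemo {
    /** Counts positions where "^#" matches, using the given flags. */
    static int countCommentLines(CharSequence text, int flags) {
        Matcher m = Pattern.compile("^#", flags).matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "# first\nplain\n# second";
        System.out.println(countCommentLines(text, 0));                 // 1: '^' is start of input only
        System.out.println(countCommentLines(text, Pattern.MULTILINE)); // 2: '^' is start of each line
    }
}
```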

public static final int UNICODE_CASE

This constant specifies that a Pattern is matched case-insensitively with regard to all Unicode characters. It is used in conjunction with the CASE_INSENSITIVE flag. In the JDK, the CASE_INSENSITIVE flag alone only achieves case-insensitivity with regard to ASCII characters. On Android, CASE_INSENSITIVE already covers all Unicode characters, and this flag itself is (currently) not supported.
Constant Value: 64 (0x00000040)

public static final int UNIX_LINES

This constant specifies that a Pattern recognizes only Unix line endings ('\n') in the '.', '^', and '$' meta characters. This flag is (currently) not supported in Android.
Constant Value: 1 (0x00000001)

Public Methods

public static Pattern compile(String pattern, int flags)

Compiles a regular expression, creating a new Pattern instance in the process. The flags argument modifies the behavior of the Pattern.

Parameters

pattern The regular expression.
flags The flags to set. Any combination of the constants defined in this class is valid; combine several flags with bitwise OR. Some of the flags (CANON_EQ, UNICODE_CASE, and UNIX_LINES, per their descriptions) are not supported in Android, though.

Returns

  • The new Pattern instance.

Throws

PatternSyntaxException If the regular expression is syntactically incorrect.
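The following sketch combines two flags with bitwise OR and handles a syntactically incorrect expression. The helper compileOrNull is hypothetical; returning null on failure is just one possible policy.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class CompileDemo {
    /** Compiles the expression, or returns null if it is syntactically incorrect. */
    static Pattern compileOrNull(String regex, int flags) {
        try {
            return Pattern.compile(regex, flags);
        } catch (PatternSyntaxException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Flags are plain int bit masks, so several can be combined with '|'.
        Pattern p = compileOrNull("hello.world", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        System.out.println(p.matcher("HELLO\nWORLD").matches()); // true
        System.out.println(compileOrNull("(unclosed", 0));        // null
    }
}
```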

public static Pattern compile(String pattern)

Compiles a regular expression, creating a new Pattern instance in the process.

Parameters

pattern The regular expression.

Returns

  • The new Pattern instance.

Throws

PatternSyntaxException If the regular expression is syntactically incorrect.

public int flags()

Returns the flags that have been set for this Pattern instance.

Returns

  • The flags that have been set. A combination of the constants defined in this class.
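Because the return value is a bit mask, an individual flag can be tested with bitwise AND. A minimal sketch, with hasFlag as an illustrative helper:

```java
import java.util.regex.Pattern;

final class FlagsDemo {
    /** Reports whether the given flag constant is set on the Pattern. */
    static boolean hasFlag(Pattern pattern, int flag) {
        return (pattern.flags() & flag) != 0;
    }

    public static void main(String[] args) {
        Pattern p = Pattern.compile("^a.b$", Pattern.MULTILINE | Pattern.DOTALL);
        System.out.println(hasFlag(p, Pattern.DOTALL));   // true
        System.out.println(hasFlag(p, Pattern.COMMENTS)); // false
    }
}
```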

public Matcher matcher(CharSequence input)

Returns a matcher for the regular expression and a given input. The matcher can be used to match the pattern against the whole input, find occurrences of the pattern in the input, or replace parts of the input.

Parameters

input The input to process.

Returns

  • The resulting Matcher.
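The three uses named above (whole-input match, finding occurrences, replacing) can be sketched as follows. The helper names are illustrative; the Matcher methods used are matches(), find(), and replaceAll(String).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MatcherDemo {
    private static final Pattern DIGITS = Pattern.compile("[0-9]+");

    /** Matches the pattern against the whole input. */
    static boolean isAllDigits(CharSequence input) {
        return DIGITS.matcher(input).matches();
    }

    /** Counts the occurrences of the pattern in the input. */
    static int countNumbers(CharSequence input) {
        Matcher m = DIGITS.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    /** Replaces every occurrence of the pattern with '#'. */
    static String maskNumbers(CharSequence input) {
        return DIGITS.matcher(input).replaceAll("#");
    }

    public static void main(String[] args) {
        System.out.println(isAllDigits("12345"));       // true
        System.out.println(countNumbers("a1 b22 c333")); // 3
        System.out.println(maskNumbers("a1 b22 c333"));  // a# b# c#
    }
}
```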

public static boolean matches(String regex, CharSequence input)

Tries to match a given pattern against a given input. This is a convenience method that compiles the pattern, builds a matcher for it, and performs the match. If the same pattern is used for multiple operations, compile the pattern explicitly once and create a reusable matcher.

Parameters

regex The regular expression.
input The input to process.

Returns

  • true if and only if the pattern matches the input.
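As a sketch of the trade-off described above, the first helper uses the convenience method for a one-off check, while the second compiles once and reuses a single Matcher through reset(CharSequence). The hexadecimal pattern and the class name are assumptions for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class HexCheck {
    private static final Pattern HEX = Pattern.compile("[0-9a-fA-F]+");

    /** One-off check: compiles the expression on every call. */
    static boolean isHex(CharSequence input) {
        return Pattern.matches("[0-9a-fA-F]+", input);
    }

    /** Repeated checks: one compiled Pattern and one Matcher, reset for each item. */
    static int countHex(String... items) {
        Matcher m = HEX.matcher("");
        int count = 0;
        for (String item : items) {
            if (m.reset(item).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isHex("CAFE"));              // true
        System.out.println(countHex("ff", "zz", "10")); // 2
    }
}
```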

public String pattern()

Returns the regular expression that was compiled into this Pattern.

Returns

  • The regular expression.

public static String quote(String s)

Quotes a given string, escaping all meta-characters with backslashes in the process. The resulting string is guaranteed to have no unescaped occurrences of the characters '?+[(){}^$|\./'.

Parameters

s The string to quote.

Returns

  • The quoted string.
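A typical use is turning arbitrary text into a pattern that matches it literally. The sketch below relies only on that behavior and makes no assumption about the exact form of the quoted string, which may differ between implementations. The helper names are illustrative.

```java
import java.util.regex.Pattern;

final class QuoteDemo {
    /** Reports whether the needle occurs in the text, treating the needle as literal text. */
    static boolean containsLiteral(CharSequence text, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(text).find();
    }

    /** Splits the input around a literal delimiter, even if it contains meta characters. */
    static String[] splitOnLiteral(CharSequence input, String delimiter) {
        return Pattern.compile(Pattern.quote(delimiter)).split(input);
    }

    public static void main(String[] args) {
        System.out.println(containsLiteral("total: $5.00", "$5.00")); // true
        System.out.println(containsLiteral("total: $5x00", "$5.00")); // false: '.' is literal here
        System.out.println(splitOnLiteral("a.b.c", ".").length);      // 3
    }
}
```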

public String[] split(CharSequence inputSeq, int limit)

Splits the given input sequence around occurrences of the Pattern. The function first determines all occurrences of the Pattern inside the input sequence. It then builds an array of the "remaining" strings before, in-between, and after these occurrences. An additional parameter determines the maximal number of entries in the resulting array and the handling of trailing empty strings.

Parameters

inputSeq The input sequence.
limit Determines the maximal number of entries in the resulting array and the handling of trailing empty strings.
  • For limit > 0, it is guaranteed that the resulting array contains at most limit entries.
  • For limit < 0, the length of the resulting array is exactly the number of occurrences of the pattern + 1. All entries are included.
  • For limit == 0, the length of the resulting array is at most the number of occurrences of the pattern + 1. Empty strings at the end of the array are not included.

Returns

  • The resulting array.
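The three cases for the limit can be sketched with a comma pattern and an input that ends in empty fields. The class name and sample input are illustrative.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitDemo {
    private static final Pattern COMMA = Pattern.compile(",");

    /** Splits the input around commas with the given limit. */
    static String[] splitCsv(CharSequence input, int limit) {
        return COMMA.split(input, limit);
    }

    public static void main(String[] args) {
        String input = "a,b,c,,"; // four occurrences of the pattern
        System.out.println(Arrays.toString(splitCsv(input, 2)));  // [a, b,c,,]
        System.out.println(Arrays.toString(splitCsv(input, -1))); // [a, b, c, , ]
        System.out.println(Arrays.toString(splitCsv(input, 0)));  // [a, b, c]
    }
}
```

With a positive limit, the last entry holds the unsplit remainder of the input.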

public String[] split(CharSequence input)

Splits a given input around occurrences of a regular expression. This is a convenience method that is equivalent to calling split(java.lang.CharSequence, int) with a limit of 0.

Parameters

input The input sequence.

Returns

  • The resulting array.

public String toString()

Answers a string containing a concise, human-readable description of the receiver.

Protected Methods

protected void finalize()

Called by the virtual machine when there are no longer any (non-weak) references to the receiver. Subclasses can use this facility to guarantee that any associated resources are cleaned up before the receiver is garbage collected. Uncaught exceptions which are thrown during the running of the method cause it to terminate immediately, but are otherwise ignored.

Note: The virtual machine assumes that the implementation in class Object is empty.

Build m5-rc15i - 10 Jun 2008 13:54