Friday, 27 January 2017

REGULAR EXPRESSIONS IN PYTHON

Regular expressions are patterns used to extract required information from given data. They are also used to check whether input data is in the proper format, for example email ID verification, mobile number verification, and password validation. Python's regular expression syntax is similar to Perl's, so most Perl-style patterns also work in Python. Python provides built-in functions for working with regular expressions, and all of them are in the re module.
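As a minimal sketch of the two uses described above (extraction and validation), the snippet below uses `re.findall` and `re.fullmatch` from the standard `re` module. The sample text, the 10-digit rule for mobile numbers, and the function name `is_valid_mobile` are assumptions made only for illustration.

```python
import re

# Hypothetical sample data, used only for illustration.
text = "Call 9876543210 or 9123456780 before Friday."

# Extraction: collect every standalone run of exactly 10 digits.
numbers = re.findall(r"\b\d{10}\b", text)
print(numbers)  # ['9876543210', '9123456780']

def is_valid_mobile(value):
    """Validation: True only if the whole string is 10 digits (assumed rule)."""
    return re.fullmatch(r"[0-9]{10}", value) is not None

print(is_valid_mobile("9876543210"))  # True
print(is_valid_mobile("98765"))       # False
```

`re.fullmatch` is used for validation because it succeeds only when the entire string fits the pattern, while `re.findall` scans for matches anywhere in the text.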

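The special characters used in regular expressions are listed below. In each example, the first line is the pattern and the following lines are test strings; #Error marks a string that the pattern does not match as a whole.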

  • * ----> It matches zero or more occurrences of the preceding character.

          Example –
          ab*c
          ac
          abc
          abbc
          abbbbbbc
          bbdc #Error

  • + ----> It matches one or more occurrences of the preceding character.

          Example –
          ab+c
          ac #Error
          abc
          abbc
          abbbbc

  • ? ----> It matches zero or one occurrence of the preceding character.

          Example –
          ab?c
          ac
          abc
          abbc #Error
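The three quantifiers above can be checked side by side with a short sketch. The helper name `matches` is hypothetical; it uses `re.fullmatch`, so a string counts only when the whole string fits the pattern, which mirrors the #Error markers in the examples.

```python
import re

def matches(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [s for s in candidates if re.fullmatch(pattern, s)]

candidates = ["ac", "abc", "abbc", "abbbbbbc", "bbdc"]

star = matches(r"ab*c", candidates)      # zero or more b
plus = matches(r"ab+c", candidates)      # one or more b
optional = matches(r"ab?c", candidates)  # zero or one b

print(star)      # ['ac', 'abc', 'abbc', 'abbbbbbc']
print(plus)      # ['abc', 'abbc', 'abbbbbbc']
print(optional)  # ['ac', 'abc']
```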

  • . ----> It matches any single character except a newline (a newline is matched too when the re.DOTALL flag is set).

Example –
a.c
agc
asc
a$c
abcd #Error
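A small sketch of the dot, assuming whole-string matching with `re.fullmatch`. The extra test string containing a newline is not from the examples above; it is added here to show the default newline behaviour and the effect of `re.DOTALL`.

```python
import re

# Map each test string to whether "a.c" matches it in full.
results = {s: bool(re.fullmatch(r"a.c", s)) for s in ["agc", "asc", "a$c", "abcd", "a\nc"]}
for s, ok in results.items():
    print(repr(s), ok)

# With re.DOTALL the dot also matches a newline character.
dotall_ok = bool(re.fullmatch(r"a.c", "a\nc", flags=re.DOTALL))
print(dotall_ok)  # True
```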

  • [ ] ----> It matches any single character in the given list.

Example –
b[aeiou]d
bad
bed
bid
bod
bud
bgd #Error

  • [^] ----> It matches any single character that is not in the given list.

Example –
b[^aeiou]d
bad #Error
bed #Error
bid #Error
bod #Error
bud #Error
bgd

  • [-] ----> It matches any single character in the given range.

Example –
x[a-e]y
xay
xby
xey
xfy #Error

[0-9] ----> any single digit
[a-z] ----> any one lowercase letter
[A-Z] ----> any one uppercase letter
[a-zA-Z] ----> any one letter
[a-zA-Z0-9] ----> any one alphanumeric character
[^0-9] ----> any single non-digit
[^A-Z] ----> any one character that is not an uppercase letter
[^a-z] ----> any one character that is not a lowercase letter
[^a-zA-Z] ----> any one non-letter
[^a-zA-Z0-9] ----> any one non-alphanumeric character
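The character lists, negated lists, and ranges above can be tried out with a sketch like the following. The variable names and the sample string `user_01@example.com` are assumptions chosen for illustration.

```python
import re

words = ["bad", "bed", "bgd", "bud"]

# [aeiou]: the middle character must be one of the listed vowels.
vowel_inside = [w for w in words if re.fullmatch(r"b[aeiou]d", w)]
# [^aeiou]: the middle character must NOT be one of the listed vowels.
other_inside = [w for w in words if re.fullmatch(r"b[^aeiou]d", w)]
# [a-e]: the middle character must fall in the range a to e.
in_range = [w for w in ["xay", "xby", "xey", "xfy"] if re.fullmatch(r"x[a-e]y", w)]
# [^a-zA-Z0-9]: every character that is not an ASCII letter or digit.
specials = re.findall(r"[^a-zA-Z0-9]", "user_01@example.com")

print(vowel_inside)  # ['bad', 'bed', 'bud']
print(other_inside)  # ['bgd']
print(in_range)      # ['xay', 'xby', 'xey']
print(specials)      # ['_', '@', '.']
```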

  • (|) ----> It matches any one of the strings in the list.

Example –
(java|hadoop|python)
java
hadoop
python
php #Error
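A brief sketch of alternation, assuming the pattern is compiled once with `re.compile` and reused. The sentence passed to `search` is a made-up example.

```python
import re

pattern = re.compile(r"(java|hadoop|python)")

# fullmatch: the whole word must be one of the alternatives.
accepted = [w for w in ["java", "hadoop", "python", "php"] if pattern.fullmatch(w)]
print(accepted)  # ['java', 'hadoop', 'python']

# search: find the first alternative that occurs anywhere in a longer string.
found = pattern.search("I am learning python and hadoop")
print(found.group(1))  # python
```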

  • {m} ----> It matches exactly m occurrences of the preceding character.

Example –
ab{3}c
abc #Error
abbc #Error
abbbc

  • {m,n} ----> It matches at least m and at most n occurrences of the preceding character.

Example –
ab{3,5}c
abbc #Error
abbbc
abbbbc

  • {m,} ----> It matches at least m occurrences of the preceding character, with no upper limit.

Example –
ab{3,}c
abc #Error
abbbbbbbc
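To compare the three counted forms {m}, {m,n} and {m,}, the sketch below generates strings with one to seven b's and reports how many b's each pattern accepts. The helper name `b_counts` is hypothetical.

```python
import re

# Build "abc", "abbc", ... up to the string with seven b's.
candidates = ["a" + "b" * n + "c" for n in range(1, 8)]

def b_counts(pattern):
    """Number of b's in each candidate that the pattern matches in full."""
    return [s.count("b") for s in candidates if re.fullmatch(pattern, s)]

print(b_counts(r"ab{3}c"))    # [3]
print(b_counts(r"ab{3,5}c"))  # [3, 4, 5]
print(b_counts(r"ab{3,}c"))   # [3, 4, 5, 6, 7]
```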

  • ^ ----> Start of the string (start of each line when the re.MULTILINE flag is set)

Example –
^Python

  • $ ----> End of the string, or just before a trailing newline (end of each line when the re.MULTILINE flag is set)

Example –
Python$
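The anchors can be illustrated with a two-line sample string (an assumption made for this sketch). By default `^` and `$` anchor to the whole string; with `re.MULTILINE` they anchor to every line.

```python
import re

text = "Python is fun\nI like Python"

starts_with = bool(re.search(r"^Python", text))  # text begins with "Python"
ends_with = bool(re.search(r"Python$", text))    # text ends with "Python"
print(starts_with, ends_with)  # True True

# Without re.MULTILINE, ^ only matches at the very start of the string.
first_only = re.findall(r"^\w+", text)
print(first_only)  # ['Python']

# With re.MULTILINE, ^ and $ also match at the start and end of each line.
line_starts = re.findall(r"^\w+", text, flags=re.MULTILINE)
line_ends = re.findall(r"\w+$", text, flags=re.MULTILINE)
print(line_starts)  # ['Python', 'I']
print(line_ends)    # ['fun', 'Python']
```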

  • \d or [0-9] ----> any single digit

Example –
[0-9][0-9][0-9] or [0-9]{3} or \d\d\d or \d{3}
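As a quick check that the four spellings above behave the same on ASCII input, this sketch runs each of them against the same test strings. The strings "404", "40a" and "4040" are arbitrary examples.

```python
import re

patterns = [r"[0-9][0-9][0-9]", r"[0-9]{3}", r"\d\d\d", r"\d{3}"]

for p in patterns:
    # Each pattern accepts exactly three digits and nothing else.
    print(p, bool(re.fullmatch(p, "404")), bool(re.fullmatch(p, "40a")))
```

Raw strings (the `r"..."` prefix) are used so that the backslash in `\d` reaches the `re` module unchanged.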

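  • \D or [^0-9] ----> any single non-digit
  • \w or [a-zA-Z0-9_] ----> any single alphanumeric character or underscore
  • \W or [^a-zA-Z0-9_] ----> any single character that is not alphanumeric or underscore (special characters)
  • \s ----> any single whitespace character: space, \t, \n, \r, \f, \v
  • \b ----> word boundary

Note: the bracket forms are the ASCII equivalents. For ordinary (str) patterns in Python 3, \d, \w and \s also match Unicode digits, letters and whitespace unless the re.ASCII flag is set.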
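The shorthand classes can be seen in action in the sketch below. All sample strings are made up for illustration.

```python
import re

text = "Order_7 shipped:\t2 items"

non_digits = re.findall(r"\D+", "a1b22c")  # runs of non-digit characters
words = re.findall(r"\w+", text)           # \w includes the underscore
non_words = re.findall(r"\W", text)        # everything \w does not match
fields = re.split(r"\s+", text)            # split on spaces and tabs
whole_cats = re.findall(r"\bcat\b", "cat concat cat.")  # "cat" as a whole word only

print(non_digits)  # ['a', 'b', 'c']
print(words)       # ['Order_7', 'shipped', '2', 'items']
print(non_words)   # [' ', ':', '\t', ' ']
print(fields)      # ['Order_7', 'shipped:', '2', 'items']
print(whole_cats)  # ['cat', 'cat']
```

The last line shows why \b is useful: the "cat" inside "concat" is skipped because there is no word boundary in front of it.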
