Using Regular Expressions in VB.NET

Using Regular Expressions in VB.NET 2008

Regex class: System.Text.RegularExpressions.Regex

Example 1 - Getting the list of pattern matches

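' Requires "Imports System.Text.RegularExpressions" at the top of the file.
Dim regexPattern As String
Dim regex As Regex
Dim regexMatches As MatchCollection
Dim strSource As String

strSource = "~~~~~source string~~~~~"
' Matches lbl_xxx (any letter case) wrapped in single (\x27) or double (\x22) quotes.
regexPattern = "(?<lbl>([\x27]([l]|[L])([b]|[B])([l]|[L])_\w*[\x27])|([\x22]([l]|[L])([b]|[B])([l]|[L])_\w*[\x22]))"
regex = New Regex(regexPattern)
regexMatches = regex.Matches(strSource)

For j As Integer = 0 To regexMatches.Count - 1
    ' Do something specific with each matched string, e.g. regexMatches(j).Value
Next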
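
The loop above leaves the per-match work open. As a minimal sketch of one possibility, the console program below reads the named group `lbl` from each match. The sample input string and the module name are assumptions made up for illustration; the pattern is the one from Example 1.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NamedGroupSketch
    Sub Main()
        ' Hypothetical input; the original post does not show its source string.
        Dim source As String = "Find('lbl_Name') and Find(""LBL_Age"")"
        Dim pattern As String = "(?<lbl>([\x27]([l]|[L])([b]|[B])([l]|[L])_\w*[\x27])|([\x22]([l]|[L])([b]|[B])([l]|[L])_\w*[\x22]))"

        For Each m As Match In Regex.Matches(source, pattern)
            ' Groups("lbl") is the group declared by (?<lbl>...) in the pattern.
            Console.WriteLine("{0} (index {1})", m.Groups("lbl").Value, m.Index)
        Next
    End Sub
End Module
```

Because the whole pattern sits inside `(?<lbl>...)`, the group value here is the same text as `m.Value`, quotes included.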

Example 2 - Replacing pattern matches

The program creates a Regex object, passing its constructor a regular expression pattern that will identify text to replace. It calls the object's Replace method, passing it the replacement pattern.

Private Sub btnGo_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnGo.Click
    Try
        Dim reg_exp As New Regex(txtPattern.Text)
        lblResult.Text = reg_exp.Replace(txtInput.Text, _
            txtReplacementPattern.Text)
    Catch ex As Exception
        MessageBox.Show(ex.Message)
    End Try
End Sub

The trick in this example lies in the search and replacement patterns. In this example, the search pattern is "(?m)^([^,]*), (.*)$". The pieces of this expression have the following meanings:

(?m)	This is an option directive that indicates a multi-line string. This makes the ^ and $ characters match the beginning and end of a line rather than the beginning and end of the string.
^	Match the beginning of a line.
([^,]*)	Match any character other than comma any number of times. This part is enclosed in parentheses so it forms the first match group.
,	Match a comma followed by a space.
(.*)	Match any character any number of times. This part is enclosed in parentheses so it forms the second match group.
$	Match the end of the line.

The replacement pattern is "$2 $1". This says to replace the stuff that was matched with the second match group, a space, and then the first match group.

This example takes this text:

Archer, Ann
Baker, Bob
Carter, Cindy
Deevers, Dan

And converts it into this:

Ann Archer
Bob Baker
Cindy Carter
Dan Deevers
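
To try the same search and replacement patterns without a form, a console sketch like the following should work. It assumes one "Last, First" name per line, joined with line feeds only; the module and variable names are made up for illustration.

```vbnet
Imports System
Imports Microsoft.VisualBasic
Imports System.Text.RegularExpressions

Module SwapNamesSketch
    Sub Main()
        ' One "Last, First" entry per line, separated by line feeds (vbLf).
        Dim names As String = String.Join(vbLf, _
            New String() {"Archer, Ann", "Baker, Bob", "Carter, Cindy", "Deevers, Dan"})

        ' Same search and replacement patterns as in the text above.
        Dim swapper As New Regex("(?m)^([^,]*), (.*)$")
        Dim swapped As String = swapper.Replace(names, "$2 $1")

        Console.WriteLine(swapped)
    End Sub
End Module
```

One thing to watch for: in .NET, `.` matches a carriage return and `$` only matches before a line feed, so with Windows-style CR LF line endings the second group would also capture the trailing CR. Ending the pattern with `\r?$` instead of `$` is one way to handle that.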

Regular expression syntax

Metacharacters Defined
MChar	Definition
^	Start of a string.
$	End of a string.
.	Any character (except \n newline)
|	Alternation.
{...}	Explicit quantifier notation.
[...]	Explicit set of characters to match.
(...)	Logical grouping of part of an expression.
*	0 or more of previous expression.
+	1 or more of previous expression.
?	0 or 1 of previous expression; also forces minimal matching when an expression might match several strings within a search string.
\	Preceding one of the above, it makes it a literal instead of a special character. Preceding a special matching character, see below.
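
The "minimal matching" role of `?` in the table above is easiest to see side by side with the default greedy behavior. A small sketch, with a made-up input string:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module LazyQuantifierSketch
    Sub Main()
        Dim text As String = "<b>one</b> and <b>two</b>"

        ' Greedy: .* grabs as much as possible, so the match runs to the last </b>.
        Console.WriteLine(Regex.Match(text, "<b>.*</b>").Value)

        ' Lazy: .*? grabs as little as possible, so the match stops at the first </b>.
        Console.WriteLine(Regex.Match(text, "<b>.*?</b>").Value)
    End Sub
End Module
```

The first line printed is the whole input string; the second is only `<b>one</b>`.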

Character Escapes http://tinyurl.com/5wm3wl
Escaped Char	Description
ordinary characters	Characters other than . $ ^ { [ ( | ) ] } * + ? \ match themselves.
\a	Matches a bell (alarm) \u0007.
\b	Matches a backspace \u0008 if in a []; otherwise matches a word boundary (between \w and \W characters).
\t	Matches a tab \u0009.
\r	Matches a carriage return \u000D.
\v	Matches a vertical tab \u000B.
\f	Matches a form feed \u000C.
\n	Matches a new line \u000A.
\e	Matches an escape \u001B.
\040	Matches an ASCII character as octal (up to three digits); numbers with no leading zero are backreferences if they have only one digit or if they correspond to a capturing group number. (For more information, see Backreferences.) For example, the character \040 represents a space.
\x20	Matches an ASCII character using hexadecimal representation (exactly two digits).
\cC	Matches an ASCII control character; for example \cC is control-C.
\u0020	Matches a Unicode character using a hexadecimal representation (exactly four digits).
\*	When followed by a character that is not recognized as an escaped character, matches that character. For example, \* is the same as \x2A.

Character Classes http://tinyurl.com/5ck4ll
Char Class	Description
.	Matches any character except \n. If modified by the Singleline option, a period character matches any character. For more information, see Regular Expression Options.
[aeiou]	Matches any single character included in the specified set of characters.
[^aeiou]	Matches any single character not in the specified set of characters.
[0-9a-fA-F]	Use of a hyphen (-) allows specification of contiguous character ranges.
\p{name}	Matches any character in the named character class specified by {name}. Supported names are Unicode groups and block ranges. For example, Ll, Nd, Z, IsGreek, IsBoxDrawing.
\P{name}	Matches text not included in groups and block ranges specified in {name}.
\w	Matches any word character. Equivalent to the Unicode character categories [\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \w is equivalent to [a-zA-Z_0-9].
\W	Matches any nonword character. Equivalent to the Unicode categories [^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \W is equivalent to [^a-zA-Z_0-9].
\s	Matches any white-space character. Equivalent to the Unicode character categories [\f\n\r\t\v\x85\p{Z}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \s is equivalent to [ \f\n\r\t\v].
\S	Matches any non-white-space character. Equivalent to the Unicode character categories [^\f\n\r\t\v\x85\p{Z}]. If ECMAScript-compliant behavior is specified with the ECMAScript option, \S is equivalent to [^ \f\n\r\t\v].
\d	Matches any decimal digit. Equivalent to \p{Nd} for Unicode and [0-9] for non-Unicode, ECMAScript behavior.
\D	Matches any nondigit. Equivalent to \P{Nd} for Unicode and [^0-9] for non-Unicode, ECMAScript behavior.
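
The difference between the default Unicode behavior and the ECMAScript option described in the table can be checked with a short sketch. The test characters are my own choice: U+0663 (an Arabic-Indic digit, category Nd) and U+00E9 (a lowercase accented letter, category Ll).

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module EcmaScriptOptionSketch
    Sub Main()
        ' U+0663 ARABIC-INDIC DIGIT THREE is in Unicode category Nd.
        Dim arabicThree As String = Convert.ToChar(&H663).ToString()
        ' U+00E9 LATIN SMALL LETTER E WITH ACUTE is in Unicode category Ll.
        Dim eAcute As String = Convert.ToChar(&HE9).ToString()

        ' Default behavior: \d and \w follow the Unicode categories.
        Console.WriteLine(Regex.IsMatch(arabicThree, "^\d$"))                          ' True
        Console.WriteLine(Regex.IsMatch(eAcute, "^\w$"))                               ' True

        ' ECMAScript option: \d is [0-9] and \w is [a-zA-Z_0-9].
        Console.WriteLine(Regex.IsMatch(arabicThree, "^\d$", RegexOptions.ECMAScript)) ' False
        Console.WriteLine(Regex.IsMatch(eAcute, "^\w$", RegexOptions.ECMAScript))      ' False
    End Sub
End Module
```

If input should be limited to ASCII digits, writing `[0-9]` explicitly avoids depending on the option at all.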

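Special characters used in regular expressions

\	Treats the special character that follows as a literal character
^	Matches the start of the input (or of each line in multiline mode)
$	Matches the end of the input (or of each line in multiline mode)
*	Matches zero or more occurrences of the preceding element
+	Matches one or more occurrences of the preceding element; same as {1,}
?	Matches zero or one occurrence of the preceding element
.	Matches any single character (except a newline)
()	Captures the matched text as a group so it can be referenced later
|	OR (alternation)
{n}	Exactly n occurrences
{n,}	n or more occurrences
{n,m}	At least n and at most m occurrences
[xyz]	Character set
[^xyz]	Negated character set
[\b]	Matches a backspace
\b	Matches the empty string at the start or end of a word (word boundary)
\B	Matches the empty string anywhere that is not a word boundary
\cX	Matches a control character
\d	Matches a digit from 0 to 9; same as [0-9]
\f	Matches a form feed
\n	Matches a line feed
\r	Matches a carriage return
\s	Matches a whitespace character; same as [ \t\n\r\f\v]
\S	Matches any character that is not \s; same as [^ \t\n\r\f\v]
\t	Matches a tab
\v	Matches a vertical tab
\w	Matches a word character (letter, digit, or underscore)
\W	Matches a non-word character, for example a special character such as %
\n	Where n is an integer from 1 to 9, a backreference to the text matched by the nth capturing group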
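
The last row, the numbered backreference, pairs with the `()` capture group. A minimal sketch that finds doubled words; the sample sentence and module name are made up for illustration:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module BackreferenceSketch
    Sub Main()
        Dim sentence As String = "This is is a test test case"

        ' (\w+) captures a word; \1 requires the same text again after whitespace.
        Dim doubled As New Regex("\b(\w+)\s+\1\b")

        For Each m As Match In doubled.Matches(sentence)
            Console.WriteLine("repeated word: {0}", m.Groups(1).Value)
        Next
    End Sub
End Module
```

With this input the loop reports `is` and `test`.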

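Functions used with regular expressions (JavaScript methods)

exec	RegExp method that searches a string for a match
test	RegExp method that tests whether a string contains a match
match	String method that searches a string for a match
search	String method that tests a string for a match (returns the index of the match)
replace	String method that searches for a match and replaces the matched text
split	String method that splits a string into an array at each match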
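
The names in the list above are JavaScript's RegExp and String methods. In VB.NET the closest counterparts, as far as I can tell, are the shared methods on the Regex class: IsMatch (test), Match (exec, match, and search via Match.Index), Replace, and Split. A rough sketch with a made-up input string:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module RegexMethodsSketch
    Sub Main()
        Dim csv As String = "apple, banana;cherry"

        ' test -> Regex.IsMatch returns True or False.
        Console.WriteLine(Regex.IsMatch(csv, "banana"))

        ' exec / match -> Regex.Match returns the first match.
        Console.WriteLine(Regex.Match(csv, "\w+").Value)

        ' search -> Match.Index is the position of the match (check Success first).
        Dim found As Match = Regex.Match(csv, "cherry")
        If found.Success Then Console.WriteLine(found.Index)

        ' replace -> Regex.Replace swaps every match for the replacement text.
        Console.WriteLine(Regex.Replace(csv, "[,;]\s*", " | "))

        ' split -> Regex.Split cuts the input at every match.
        Dim parts As String() = Regex.Split(csv, "[,;]\s*")
        Console.WriteLine(String.Join(" / ", parts))
    End Sub
End Module
```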

Sample Expression

Pattern	Description
^\d{5}$	5 numeric digits, such as a US ZIP code.
^(\d{5}|\d{5}-\d{4})$	5 numeric digits, or 5 digits-dash-4 digits. This matches a US ZIP or US ZIP+4 format.
^(\d{5})(-\d{4})?$	Same as previous, but more efficient. Uses ? to make the -4 digits portion of the pattern optional, rather than requiring two separate patterns to be compared individually (via alternation).
^[+-]?\d+(\.\d+)?$	Matches any real number with optional sign.
^[+-]?\d*\.?\d*$	Same as above, but also matches empty string.
^(20|21|22|23|[01]\d)[0-5]\d$	Matches any 24-hour time value.
/\*.*\*/	Matches the contents of a C-style comment /* … */

--.*	Matches a SQL comment (from -- to the end of the line).
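
A quick way to sanity-check patterns like these is to run them against a few candidates. The sketch below uses the optional ZIP+4 pattern and the 24-hour time pattern from the table; the candidate strings are my own.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SamplePatternSketch
    Sub Main()
        Dim zip As New Regex("^(\d{5})(-\d{4})?$")
        Dim time24 As New Regex("^(20|21|22|23|[01]\d)[0-5]\d$")

        ' Made-up candidates: two valid ZIP codes, one that is too short,
        ' one valid time (2359) and one invalid time (2460).
        For Each candidate As String In New String() {"12345", "12345-6789", "1234", "2359", "2460"}
            Console.WriteLine("{0}: zip={1}, time={2}", _
                candidate, zip.IsMatch(candidate), time24.IsMatch(candidate))
        Next
    End Sub
End Module
```

Only "12345" and "12345-6789" pass the ZIP check, and only "2359" passes the time check.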
