Copying and Pasting Regular Expressions

Copy and paste is the simple and easy way to transfer a regular expression between RegexBuddy and your searching, editing and coding tools. You can use the regular Ctrl+C and Ctrl+V shortcut keys to copy and paste the selected part of the regex as is. Or, you can use the Copy and Paste buttons on RegexBuddy’s toolbar to transfer the regular expression in different formats. If you often use RegexBuddy in conjunction with a particular tool or application, you may want to see if it is possible to integrate RegexBuddy with that tool.

The search boxes of text editors, grep utilities, etc. do not require the regular expression to be formatted in any way, beyond being a valid regex pattern. This is exactly the way you enter the regular expression in RegexBuddy, whether you type it in or insert tokens on the Create panel. So you can simply copy and paste the regex “as is”.

If you want to use the regular expression in the source code of an application or script you are developing, the regex needs to be formatted according to the rules of your programming language. Some languages, such as Perl and JavaScript, use a special syntax reserved for regular expressions. Other languages rely on external libraries for regular expression support, requiring you to pass regexes to their function calls as strings. This is where things get a bit messy.

Flavors and Matching Modes

Matching modes like “dot matches newline” and “case insensitive” are generally not included as part of the regular expression. That means they aren’t included when copying and pasting regular expressions. The only exceptions are the JavaScript, Perl, PHP preg, and Ruby operators. For all the other string styles, you’ll need to make sure yourself that you’re using the same matching modes in your actual tool or source code as in RegexBuddy. If you want RegexBuddy to take care of this for you, use the source code snippets on the Use panel rather than direct copy and paste.
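As a minimal sketch of what setting the matching modes yourself can look like, the Python snippet below pastes a pattern “as is” and supplies the modes separately through flags. The pattern and the sample text are made up for illustration, and Python’s re module merely stands in for whatever regex library your own code uses.

```python
import re

# Pattern pasted "as is"; it carries no matching modes of its own.
pattern = r"start.*end"
text = "START\nmiddle\nEND"

# Without the modes that were active while testing, the match fails.
no_modes = re.search(pattern, text)

# "Case insensitive" and "dot matches newline" have to be set in the code.
with_modes = re.search(pattern, text, re.IGNORECASE | re.DOTALL)

print(no_modes)                # None
print(with_modes is not None)  # True
```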

In particular, free-spacing mode is something you need to pay attention to. Many string styles, including Basic, C, Java, JavaScript, and Pascal, do not support multi-line strings. When you tell RegexBuddy to copy a regular expression to the clipboard using one of these string styles, RegexBuddy will copy a concatenation of multiple strings to the clipboard, one for each line in your regular expression. Each string will end with a line break, so comments in the regex are terminated correctly. That way your regex will be formatted the same way in your source code as in RegexBuddy.
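The effect of the trailing line breaks can be sketched in Python, which also accepts a concatenation of single-line strings. This is an illustration under assumptions, not RegexBuddy’s actual output: the date pattern is invented, and re.VERBOSE is Python’s name for free-spacing mode.

```python
import re

# A free-spacing regex written as a concatenation of single-line strings.
# Each piece ends with \n so the # comment on that line stops at the line
# break instead of swallowing the rest of the regex.
pattern = (
    r"(\d{4})  # year" "\n"
    r"-        # separator" "\n"
    r"(\d{2})  # month" "\n"
)
match = re.fullmatch(pattern, "2024-05", re.VERBOSE)
print(match.groups())  # ('2024', '05')

# Without the line breaks, everything after the first # is one long comment,
# so the regex that remains is just (\d{4}) and the full match fails.
broken = r"(\d{4})  # year" r"-  # separator" r"(\d{2})  # month"
print(re.fullmatch(broken, "2024-05", re.VERBOSE))  # None
```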

When pasting the regular expression back into RegexBuddy, RegexBuddy cannot determine from the string alone whether the line breaks in the string are line breaks that your regular expression should match, or whether the line breaks are free-spacing line breaks. To solve this, RegexBuddy simply checks whether you’ve selected free-spacing or exact spacing in the drop-down list on the main toolbar. If you selected “exact spacing”, line breaks in the string are included as \r\n in the regex. If you selected “free-spacing”, line breaks are treated as whitespace that splits the regex into multiple lines in RegexBuddy. To make sure RegexBuddy pastes your regular expression correctly, set the free-spacing mode in RegexBuddy to match the free-spacing option in your source code before you paste the regex.

Copying a regular expression doesn’t convert it into the correct regular expression flavor. For example, if you’re editing a regular expression with the PCRE flavor selected and you select the Copy as JavaScript Operator command, you’ll get a PCRE-style regular expression formatted as a JavaScript operator. If you want the regular expression to be fully converted to JavaScript, first convert the regular expression on the Convert panel. After clicking Accept Conversion, you can then select Copy as JavaScript Operator to copy the converted regex as a JavaScript operator.

Copying The Regex as a String or Operator

In regular expressions, metacharacters must be escaped with a backslash to match them literally. In many programming languages, backslashes appearing in strings must be escaped with another backslash. This means that the regex \\, which matches a single backslash, becomes "\\\\" in Java or C/C++. The regex "\\", which matches a single backslash between double quotes, becomes "\"\\\\\"". How’s that for clarity?
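The double escaping can be checked with a short Python sketch. Python’s normal (non-raw) string literals escape backslashes and double quotes the same way as the Java and C strings discussed above, so the literals below are identical to the ones in the text. The variable names are only illustrative.

```python
import re

# The regex \\ (two characters) matches one literal backslash.
# In a normal string literal, each backslash is doubled once more.
escaped = "\\\\"   # four backslashes in the source, two in the string
raw = r"\\"        # a raw string keeps the regex as typed
assert escaped == raw and len(escaped) == 2
print(re.fullmatch(escaped, "\\") is not None)   # True

# The regex "\\" matches a backslash between two double quotes.
quoted = "\"\\\\\""
assert quoted == r'"\\"' and len(quoted) == 4
print(re.fullmatch(quoted, '"\\"') is not None)  # True
```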

When you generate code snippets on the Use panel, RegexBuddy automatically inserts your regexes correctly formatted as strings or operators, as required by the programming language. To use a regex you prepared in RegexBuddy without creating a code snippet, click the Copy button on the main toolbar. You can copy the regular expression to the clipboard in one of several formats. Directly under the Copy button, you can find the string style used by the programming language you’ve selected in the list of applications. If the source code template for that application uses a different string style, that one will also be listed directly under the Copy button. All other string styles will be listed under the “Copy Regex As” submenu.

As Is: Copies the regex unchanged. Appropriate for tools and applications, but not for source code.

Basic-style string: The style used by programming languages derived from Basic, including Visual Basic and Xojo (REALbasic). A double-quoted string. A double quote in the regex becomes two double quotes in the string.

C string: A string of char in C and C++. A double-quoted string. Backslashes and double quotes are escaped with a backslash. Supports escapes such as \t, \n, \r, and \xFF at the string level.

C Wide string: A string of wchar_t in C and C++. A double-quoted string prefixed with the letter L. Backslashes and double quotes are escaped with a backslash. Supports escapes such as \t, \n, \r, \xFF, and \uFFFF at the string level.

C++11 Raw string: A string of char in C++11 quoted as a Raw string. Raw strings can contain literal line breaks and unescaped quotes and backslashes. Does not support any character escapes. If the regex contains the characters )" then RegexBuddy automatically uses a longer custom delimiter to make sure the delimiter does not occur in the regex. If the regex does not contain any line breaks, quotes, or backslashes, a normal double-quoted string is copied instead, as there is no benefit to using a raw string in that case.

C++11 Wide Raw string: A string of wchar_t in C++11 quoted as a Raw string.

C# string: If the regex contains backslashes, it will be copied as a verbatim string for C# which doesn’t require backslashes to be escaped. Otherwise, it will be copied as a simple double-quoted string.

Delphi string: The style used by Delphi and other programming languages derived from Pascal. A single-quoted string. A single quote in the regex becomes two single quotes in the string.

Delphi Prism string: The style used by Delphi Prism, formerly known as Oxygene or Chrome. Either a single-quoted string on a single line, or a double-quoted string that can span multiple lines. A quote in the regex becomes two quotes in the string.

Groovy string: The Groovy programming language offers 5 string styles. Single-quoted and double-quoted strings require backslashes and quotes to be escaped. Using three single or three double quotes allows the string to span multiple lines. For literal regular expressions, the string can be delimited with two forward slashes, requiring only forward slashes to be escaped.

Java string: The style used by the Java programming language. A double-quoted string. Backslashes and double quotes are escaped with a backslash. Unicode escapes \uFFFF allowed.

JavaScript string: The string style defined in the ECMA-262 standard and used by its implementations like JavaScript, JScript and ActionScript. A single-quoted or double-quoted string. Backslashes and quotes are escaped with a backslash. Unicode escapes \uFFFF and Latin-1 escapes \xFF allowed.

JavaScript operator: A Perl-style // operator that creates a literal RegExp object in the ECMAScript programming language defined in the ECMA-262 standard, and its implementations like JavaScript, JScript and ActionScript. ECMA-262 uses mode modifiers that differ from Perl’s.

Perl-style string: The style used by Perl, where a double-quoted string is interpolated, but a single-quoted string is not. Quotes used to delimit the string, and backslashes, are escaped with a backslash.

Perl operator: A Perl m// operator for match and split actions, and an s/// operator for replace actions.

PHP string: A string for use with PHP’s ereg functions. Backslashes are only escaped when strictly necessary.

PHP ‘//’ preg string: A Perl-style // operator in a string for use with PHP’s preg functions.

PostgreSQL string: A string for PostgreSQL, delimited by double dollar characters.

PowerShell string: A string for PowerShell. Uses “here strings” for multi-line strings. Quotes and non-printables are escaped with backticks.

Python string: Unless the regex contains both single and double quote characters, the regex is copied as a Python “raw string”. Raw strings do not require backslashes to be escaped, making regular expressions easier to read. If the regex contains both single and double quotes, the regex is copied as a regular Python string, with quotes and backslashes escaped.

R string: The string style used by the R programming language. A single-quoted or double-quoted string. Backslashes and quotes are escaped with a backslash. Unicode escapes \U0010FFFF, basic Unicode escapes \uFFFF, and Latin-1 escapes \xFF allowed.

Ruby operator: A Perl-style // operator for use with Ruby. Ruby uses mode modifiers that differ from Perl’s.

Scala string: Copies the regex as a triple-quoted Scala string. This avoids having to escape backslashes and allows line breaks, which is handy for free-spacing regular expressions.

SQL string: The string style used by the SQL standard. A single-quoted string. A single quote in the regex becomes two single quotes in the string. The string can span multiple lines. Note that not all databases use this string style. E.g. MySQL uses C-style strings, and PostgreSQL uses either C-style strings or dollar-delimited strings.

Tcl word: Delimits the regular expression with curly braces for Tcl. In Tcl parlance, this is called a word.

XML: Replaces ampersands, angle brackets and quotes with XML entities like &amp;amp; suitable for pasting into XML files.
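To make the escaping rules of a few of these styles concrete, here is a rough Python sketch of three of them: quote doubling (Basic, Delphi, SQL), backslash escaping (C, Java), and XML entities. The function names are hypothetical and this is not RegexBuddy’s implementation. In particular, the backslash-escaping helper ignores line breaks and non-printable characters, which a real conversion would also have to handle.

```python
from xml.sax.saxutils import escape

def as_quote_doubling_string(regex: str, quote: str) -> str:
    """Basic, Delphi and SQL style: double each quote inside the string."""
    return quote + regex.replace(quote, quote * 2) + quote

def as_backslash_escaped_string(regex: str) -> str:
    """C and Java style: escape backslashes first, then double quotes."""
    return '"' + regex.replace("\\", "\\\\").replace('"', '\\"') + '"'

def as_xml(regex: str) -> str:
    """XML style: replace &, <, > and both quote characters with entities."""
    return escape(regex, {'"': "&quot;", "'": "&apos;"})

# Made-up sample regex containing quotes, backslashes and XML specials.
regex = r'"\d+" & <\w+>'
print(as_quote_doubling_string(regex, '"'))
print(as_backslash_escaped_string(regex))
print(as_xml(regex))
```

Escaping backslashes before quotes matters in the second helper: doing it the other way around would double the backslashes that were just added in front of the quotes.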

Pasting a String or Operator as The Regular Expression

RegexBuddy can do the opposite conversion when you want to edit a regular expression that is already part of an application’s source code. In your source code editor or IDE, select the entire string or operator that holds the regular expression. Make sure quotes and other delimiters are included. Copy it to the clipboard.

Then switch to RegexBuddy, and click the Paste button in the main toolbar. The Paste button’s menu offers the same options as the Copy menu, except that they work the other way around. If the clipboard holds the Java string "\"\\\\\"", select “Regex from Java string” and RegexBuddy will properly interpret the string as the regular expression "\\", which matches a single backslash between a pair of double quote characters. When you’re done editing the regex, select “Copy as Java string” and you can paste the updated regex as a Java string into your source code.
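The reverse conversion for a Java string can be sketched roughly as follows. The helper regex_from_java_string is hypothetical and only illustrates the idea; it handles a handful of simple escapes plus \uFFFF, whereas real Java string literals support more (octal escapes, for example).

```python
import re

# Simple escapes recognized by this sketch.
_SIMPLE = {"\\": "\\", '"': '"', "'": "'", "t": "\t", "n": "\n", "r": "\r"}

def regex_from_java_string(literal: str) -> str:
    """Hypothetical helper: turn a Java string literal back into a regex."""
    if len(literal) < 2 or literal[0] != '"' or literal[-1] != '"':
        raise ValueError("expected a double-quoted string, quotes included")

    def unescape(m):
        if m.group(1) is not None:        # \uFFFF Unicode escape
            return chr(int(m.group(1), 16))
        if m.group(2) not in _SIMPLE:     # escape this sketch doesn't know
            raise ValueError("unsupported escape: \\" + m.group(2))
        return _SIMPLE[m.group(2)]

    # Strip the delimiting quotes, then resolve each escape left to right.
    return re.sub(r"\\(?:u([0-9A-Fa-f]{4})|(.))", unescape, literal[1:-1])

# The Java string from the text, as it would sit on the clipboard.
clipboard = r'"\"\\\\\""'
regex = regex_from_java_string(clipboard)
print(regex)                                    # "\\"
print(re.fullmatch(regex, '"\\"') is not None)  # True
```

The length and quote check at the top reflects the advice above to include the delimiters when copying the string from your source code.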

There is only one item for pasting C strings. It automatically recognizes the L, R, and LR prefixes. You can use this one item to paste C strings, wide C strings, raw C++11 strings, and wide raw C++11 strings.

Just like when copying regular expressions, you need to take care that you’ve set the same matching modes in RegexBuddy as in the application or source code you’ve copied the regex from. Pasting a regular expression also doesn’t change the programming language or application selected in RegexBuddy, nor does it convert the regular expression to the selected application. Make sure to select the correct programming language or application before or after pasting the regular expression.