Insert a Unicode Property

The Insert Token button on the Create panel makes it easy to insert regular expression tokens to match various Unicode properties. See the Insert Token help topic for more details on how to build up a regular expression via this menu.

Character Preview

All of these menu items show a window that lets you select the specific property or property value that you want to insert into your regular expression. Most of these dialogs show a preview of the characters that have your chosen property or property value. If you move the mouse over the grid, you can see the hexadecimal and decimal representations of the code point that each character occupies in the Unicode standard.
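To make the hexadecimal and decimal representations concrete, here is a minimal Python sketch that prints both for a few characters. It is only an illustration and not part of the application; the function name describe_code_point is hypothetical.

```python
def describe_code_point(ch: str) -> str:
    """Return the code point of a single character in hex (U+XXXX) and decimal."""
    cp = ord(ch)  # ord() gives the Unicode code point as an integer
    return f"U+{cp:04X} (decimal {cp})"


if __name__ == "__main__":
    for ch in ["A", "é", "€"]:
        print(ch, describe_code_point(ch))
```

For the letter A this reports U+0041 (decimal 65), which are two representations of the same code point.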

The last row of the grid may have rectangles that are crossed out with thin gray lines. This simply indicates that there are no more characters that have the property to fill up the last row of the grid.

Characters that can’t be displayed by the current font appear as rectangles in the grid. If you see many rectangles instead of characters, click the Select Font button to change the grid’s font. The font selection dialog only shows fonts that can display at least some of the characters in the grid. It indicates how many characters in the grid each font can display, and lists the fonts from the highest number of supported characters to the lowest. So the fonts near the top of the list are your best choice for viewing the characters that have the selected property.

Non-printable characters are indicated with their 2-letter Unicode category names, such as “Cc” for control characters and “Cn” for unassigned code points. If all characters on the grid are non-printable then the Select Font button is disabled.
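The 2-letter category names are the General Category values defined by the Unicode standard. As a rough illustration outside the application, Python’s standard unicodedata module reports the same names. The helper name category_of is hypothetical, and U+FFFF is used as the “Cn” example on the assumption that a noncharacter is an acceptable stand-in for an unassigned code point.

```python
import unicodedata


def category_of(ch: str) -> str:
    """Return the 2-letter Unicode General Category of a single character."""
    return unicodedata.category(ch)


if __name__ == "__main__":
    samples = {
        "U+0007 (bell, a control character)": "\u0007",
        "U+FFFF (noncharacter, reported as unassigned)": "\uffff",
        "U+0041 (Latin capital letter A)": "A",
    }
    for label, ch in samples.items():
        print(label, "->", category_of(ch))
```

The exact set of unassigned code points depends on the version of the Unicode database that ships with your Python build, so other code points may be reported differently on older or newer versions.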