Safari 2.0.4 Fails on Unicode Escape Sequences in Regular Expressions

Fought with this one for quite some time today. We use a lot of client-side validation for input fields in the products I work on. While we repeat that validation on the server (as is the way with ASP.NET validation), the client-side validation is important to give the customer earlier feedback about invalid input.

Our products are written to work in a multilingual capacity so the validation expressions need to support characters above and beyond ASCII. That’s great, but it also means we have some work to do to get the regular expressions to work the same on the client as they do on the server. I’ve blogged about this issue before.

The ECMAScript standard provides Unicode escape sequences for putting these extended characters into regular expressions. So rather than putting é literally in the expression, you put the equivalent Unicode escape sequence: \u00e9.
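As a rough sketch of what that looks like in a validation expression: the `isValidWord` function and its character class below are made up for illustration, not the actual expressions from our products.

```javascript
// Hypothetical validator (illustration only): accepts lowercase ASCII
// letters plus é, with é written as the Unicode escape \u00e9 instead of
// the literal character.
function isValidWord(input) {
  var pattern = /^[a-z\u00e9]+$/;
  return pattern.test(input);
}

var results = {
  plain: isValidWord("cafe"),    // ASCII only
  accented: isValidWord("café"), // literal é in the input
  rejected: isValidWord("caf3")  // digit is outside the character class
};
```

On an engine that follows the standard, `plain` and `accented` should both be true and `rejected` false. The `accented` case is the one that depends on the engine treating \u00e9 in the pattern as é.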

Safari 2.0.4 doesn’t seem to handle Unicode escape sequences in its regular expression engine. In a plain string comparison it understands that \u00e9 is equivalent to the literal character é, but if you ask whether a regular expression containing \u00e9 matches é, it says they don’t match.

From what I can tell, there is no workaround. It just doesn’t get Unicode escape sequences in JavaScript regular expressions.

I’ve put together some tests to illustrate the point. Browsers that handle the issue correctly will read “true” for all cases; Safari 2.0.4 fails on the Regex tests.

\u00e9 == é

\u0041 == A

Regex "\u00e9" matches "é"

Regex "\u0041" matches "A"
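
For anyone who wants to try this outside the page, here is a minimal sketch of the four checks. It assumes the regex tests use regular expression literals; the function name `runUnicodeEscapeTests` and the result format are my own for this example, not necessarily what the page script does.

```javascript
// Sketch of the four checks listed above. An engine that handles Unicode
// escapes correctly should report true for every entry.
function runUnicodeEscapeTests() {
  return [
    // String comparisons: the escape and the literal are the same character.
    { label: "\\u00e9 == é", passed: "\u00e9" == "é" },
    { label: "\\u0041 == A", passed: "\u0041" == "A" },
    // Regex literals: the escape inside the pattern should match the literal.
    { label: 'Regex "\\u00e9" matches "é"', passed: /\u00e9/.test("é") },
    { label: 'Regex "\\u0041" matches "A"', passed: /\u0041/.test("A") }
  ];
}

var testResults = runUnicodeEscapeTests();
for (var i = 0; i < testResults.length; i++) {
  console.log(testResults[i].label + ": " + testResults[i].passed);
}
```

The first two entries only exercise string escapes, which is why they are useful as a control: if those pass and the regex entries fail, the problem is in the regular expression engine rather than in how the script file was decoded.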