Introduction
Early in 2024, Microsoft announced the REGEX suite of functions that they had put into the Insiders' Channel. I have chosen not to be a member of that channel because new functions showing there can be changed or even abandoned before ordinary channel members learn about them.
Anyway, I found out two days ago that I now have access to the three REGEX functions and this page discusses REGEXTRACT() and REGEXTEST() and how they have solved a rather tricky problem for me ... and, therefore, for you!
REGEX = Regular Expressions
Regular expressions are patterns of characters that match, or fail to match, sequences of characters in text.
Regular expressions, at the most basic level, allow computer users and developers to find desired pieces of text and, often, to replace those pieces of text with something that is preferred. At other times, regular expressions are used to test whether a sequence of characters that might be intended to be a credit card number or a National Insurance number has an allowed pattern of characters. Whether it’s finding existing sequences of characters or testing sequences of characters for their suitability (or not) for storage, the key aspect of regular expressions is matching a pattern against a sequence of characters.
Source: Andrew Watt (2005) Beginning Regular Expressions Wrox page 2
Note: that book is OLD but rest assured, I have worked on a series of examples that Andrew Watt presented and they still work in Excel. I am sure there are a lot more things that have been said since 2005 but for what I am presenting here, Andrew's book works!
Three New REGEX Functions
- REGEXTRACT()
- REGEXREPLACE()
- REGEXTEST()
I have spent about a whole working day on these functions now and my initial conclusions are that, at first sight, one might wonder why we need them. I tested out REGEXTEST() for a while and got the hang of it to some extent and I started to warm to the idea.
I learned a while ago that there are entire books devoted to Regular Expressions and they have been a feature of the lives of many database and data analysis wallahs for a long time!
Having perused one or two of those books, I found that a knowledge of regular expressions and how we might use these new functions could prove really worthwhile: hence this page!
What I did
REGEXTEST(text,pattern,[case_sensitivity])
The crucial aspect of this example is to understand what is meant by the pattern. In this example, I was testing the codes in the range A5:A9 to see if they matched this code:
Two out of five of those examples were reported to be TRUE, they matched the pattern.
Two countries divided by a common language! Here is an example to help with that. In the UK, we spell colour, in the USA they spell color. Imagine I am working on a file and I need to be able to identify where colour OR color is being used. Take a look:
Notice, REGEXTEST has found two cases that are TRUE and it fails with colours (D13).From a different source, this time, consider these examples:
And, finally, for a bit more variety, also from exceljet:
Work with these examples and see what you think!
REGEXTRACT(text,pattern,[return_mode],[case_sensitivity])
return_mode
A number that specifies what strings you want to extract. By
default, return mode is 0. The possible values are:
- 0: Return the first string that matches the pattern
- 1: Return all strings that match the pattern as an array
- 2: Return capturing groups from the first match as an array
Note: Capturing groups are parts of a regex pattern surrounded by parentheses "(...)". They allow you to return separate parts of a single match individually
Here is a problem I have been wrestling with for several years.
Power Query
Solution: company names
Solution: company objectives
cell D5=TEXTAFTER(B5:B104,C5:C104) ... again, a dynamic array formula that spills automatically to provide all of the answers all at once!
No comments:
Post a Comment