Regular expressions are powerful metamathematical tools, advanced techniques for matching patterns in text: something fun and something useful. They are concise chunks of cryptic characters that can search a single text or multiple texts for precise patterns. Select an input file, do one thing or very many things to the file, then drop the resulting text into an output file.
Stephen Cole Kleene was the mathematician and philosopher who introduced the concept of the regular expression. He worked with Alan Turing and other pioneering types who were intensely active in the 1930s. Their understanding of a mathematical maneuver called recursion led to breakthrough tools in logic: decisions made at superhuman speed, using processing power and memory to handle the words and numbers thrown together and called data. However, beware of the sorcerer’s apprentice phenomenon. Just bewaring.
An example: look for successive occurrences of WTF (upper or lower case) and substitute “what the fart”.
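A minimal sketch of that substitution in Python, using the standard `re` module. The function name `expand_wtf` and the sample sentence are made up for illustration, and the `\b` word boundaries are an added assumption so that only the standalone word is replaced.

```python
import re

def expand_wtf(text: str) -> str:
    # \b limits the match to the standalone word (an assumption, not in the original example).
    # re.IGNORECASE matches WTF, wtf, Wtf and so on.
    return re.sub(r"\bwtf\b", "what the fart", text, flags=re.IGNORECASE)

print(expand_wtf("WTF is this? Wtf, indeed. wtf!"))
# → what the fart is this? what the fart, indeed. what the fart!
```

Every occurrence is replaced in one pass, whatever its capitalization.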
Through recursion you can stop, go backward a certain number of characters, query the findings, and do something with them. Once you become familiar with the meta characters and the syntax, you can do a lot of useful things or destroy many useful things. So save the original file in a safe place and know where your output file ends up.
When I was a freelance translator I maintained a translation memory database that kept track of all my translations so that I might be reminded of earlier translations. The software I used was called SDL Trados; however, that was over ten years ago.
Here is one example of how I used regular expression code to insert a carriage return and linefeed whenever a blank space appeared in the original German. Essentially this created records that were one word long — the number of records was the number of words in the text. Then I queried my database for finds. A lot faster than the technique I used when learning German — looking up the words in a large-ass dictionary that I still have on the bottom shelf over there.
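The word-splitting step might look something like this in Python. This is only a sketch, not the code I actually ran back then: the function name `one_word_per_line` and the German sample are hypothetical, and collapsing a run of several spaces into one break is an assumption of mine.

```python
import re

def one_word_per_line(text: str) -> str:
    # Replace each run of spaces with a carriage return + linefeed.
    # Treating several spaces in a row as one break is an assumption.
    return re.sub(r" +", "\r\n", text.strip())

sample = "Der schnelle braune Fuchs"  # made-up sample text
records = one_word_per_line(sample).split("\r\n")
print(records)
# → ['Der', 'schnelle', 'braune', 'Fuchs']
```

Each record is now a single word, ready to be looked up one at a time.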
The same Unix tools developed in the late 1960s and 1970s remain in the electrons flowing from my screen to yours. They remind me of Arthur C. Clarke’s three laws:
- When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
- The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
- Any sufficiently advanced technology is indistinguishable from magic.
Hoping this is somewhat illuminating, or mildly amusing 🙂
Thanks for reading.
P.S. Now for something incompletely different, something inspired by Hariod Brawn’s comment below. It’s an article on The Sound (And Taste) Of Music by Layla Eplett — she brings a platter to the conversation and complements Mariano Sigman’s TED Talk:
Thanks for reading this postscript 🙂