Discovering Patterns in Language

Regular expressions are powerful metamathematical tools, advanced techniques for matching patterns in a single text or many texts: something fun and something useful. They are concise chunks of cryptic characters that describe the precise pattern you want to find. The usual routine is to select an input file, do one thing or very many things to its text, then drop the result into an output file.

Stephen Cole Kleene is the mathematician and logician who introduced the concept of the regular expression. He worked alongside Alan Turing and other pioneering types who were intensely active in the 1930s. Their understanding of a mathematical maneuver called recursion led to breakthrough tools in logic: decisions made at superhuman speed, with processing power and memory put to work on the words and numbers thrown together and called data. However, beware of the sorcerer’s apprentice phenomenon. Just bewaring.

An example: look for successive occurrences of WTF (upper or lower case) and substitute “what the fart”.
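A minimal sketch of that substitution in Python, for the curious. The function name `expand_wtf` is made up for illustration, and the `\b` word boundaries are my own assumption so that a word like "wtfoo" is left alone; drop them if you want every occurrence replaced.

```python
import re

# Match "WTF" as a standalone word, in any mix of upper and lower case.
# The \b word boundaries are an assumption, not part of the original task.
WTF_PATTERN = re.compile(r"\bwtf\b", re.IGNORECASE)


def expand_wtf(text: str) -> str:
    """Replace every standalone WTF/wtf/Wtf with the spelled-out phrase."""
    return WTF_PATTERN.sub("what the fart", text)


print(expand_wtf("WTF is a regex? wtf indeed."))
# what the fart is a regex? what the fart indeed.
```

The `re.IGNORECASE` flag is what handles "upper or lower case"; without it the pattern would only match the exact spelling given.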

Through recursion you can stop, go backward a certain number of characters, query the findings, and do something with them. Once you become familiar with the metacharacters and the syntax, you can do a lot of useful things or destroy many useful things. So save the original file in a safe place and know where your output file ends up.
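One way to follow that advice, sketched in Python under my own assumptions: the helper name `substitute_to_new_file` and its parameters are hypothetical, and the files are assumed to be UTF-8 text. The helper reads the input, applies a single substitution, and writes to a separate output path, refusing to run if the two paths are the same.

```python
import re
from pathlib import Path


def substitute_to_new_file(src: Path, dst: Path, pattern: str, replacement: str) -> int:
    """Read src, apply one regex substitution, and write the result to dst.

    Returns the number of replacements made. The source file is never
    modified, and the function refuses to use it as the output file.
    """
    if src.resolve() == dst.resolve():
        raise ValueError("output path must differ from the input path")
    text = src.read_text(encoding="utf-8")
    # re.subn returns both the new text and how many matches were replaced.
    new_text, count = re.subn(pattern, replacement, text)
    dst.write_text(new_text, encoding="utf-8")
    return count
```

Returning the replacement count is a cheap sanity check: if you expected a dozen changes and got four thousand, look at the output before trusting it.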

 

When I was a freelance translator I maintained a translation memory database that kept track of all my translations so that I might be reminded of earlier translations. The software I used was called SDL Trados; however, that was over ten years ago.

Here is one example of how I used regular expression code to insert a carriage return and linefeed whenever a blank space appeared in the original German. Essentially this created records that were one word long — the number of records was the number of words in the text. Then I queried my database for finds. A lot faster than the technique I used when learning German — looking up the words in a large-ass dictionary that I still have on the bottom shelf over there.
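The splitting step can be sketched in Python like this. It is not the Trados setup I actually used, just an illustration; the name `one_word_per_line` is invented, and treating any run of whitespace as a single break is my own assumption.

```python
import re


def one_word_per_line(text: str) -> str:
    """Replace each run of whitespace with a carriage return plus linefeed."""
    # strip() avoids an empty record at the start or end of the text.
    return re.sub(r"\s+", "\r\n", text.strip())


records = one_word_per_line("Der Hund schläft heute").split("\r\n")
print(records)       # ['Der', 'Hund', 'schläft', 'heute']
print(len(records))  # 4
```

Each record is then one word, so the number of records equals the number of words, ready to be looked up one at a time.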

mastering.regular.expressions

The same Unix tools developed in the late 1960s and 1970s remain in the electrons flowing from my screen to yours. They remind me of Arthur C. Clarke’s three laws:

  1. When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
  2. The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
  3. Any sufficiently advanced technology is indistinguishable from magic.

Arthur C. Clarke’s Three Laws

Hoping this is somewhat illuminating, or mildly amusing 🙂

Thanks for reading.

P.S. Now for something incompletely different, something inspired by Hariod Brawn’s comment below. It’s an article on The Sound (And Taste) Of Music by Layla Eplett — she brings a platter to the conversation and complements Mariano Sigman’s TED Talk:

 

SoundTasteofMusic
Layla Eplett

Thanks for reading this postscript 🙂

2 thoughts on “Discovering Patterns in Language”

    1. Hi Hariod, thanks thoroughly and much for the comment. Yes, that honeycomb is busy and buzzy and brain-freezingly inadequate. I’m going to look for a better one forthwith. Your accurate and über helpful observations, such as recommending a better font-to-background scheme than my horrid hyper-annoying white font over black background, were superbly helpful.
      Good old Jungian synchronicity is always timely and appreciated here, perhaps definitioningly so. Sigman’s talk and accompanying slides — that we have now both watched within an elapse of 48 hours — resoundingly qualify as a singular stroke of good synchronous fun and fortune.
      Wishing you a happy and far from harried pattern-matching weekend.
