CMPS 450 LAB 01: Introduction to Regular Expressions

DUE: before lab next Tuesday

resources:
regular languages & regular expressions
posix regex manpage
regex (7) manpage
RE basics
Python RE HOWTO
IEEE Std 1003.1-2001, Section 9.4 BREs and EREs
vim regexes are awesome
REs in Lex

WHAT TO SUBMIT FOR THIS LAB

You do not need to submit your work for the exercises below, but you should complete them if you are not already familiar with regular expressions. For submission you will write a rather complex regex and insert it into a Perl script that I provide. Copy the Perl script lab01 and the data file into your directory:

   $ cp /home/fac/donna/public_html/cs450/lab01_files/lab01 . 
   $ cp /home/fac/donna/public_html/cs450/lab01_files/data . 

The lab01 script must be executable, so set its permissions to 700 (for example, chmod 700 lab01). Your job is to replace the existing regex in the Perl script with a regex that displays the first 6 lines of data and none of the remaining lines. Make your regex as simple as possible. For credit, your regex CANNOT contain actual words.

Submit your script as an attachment to donna@cs.csub.edu
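
The filtering task above can be pictured with a small Python sketch. It assumes the lab01 script prints every line of data in which the regex matches; that behavior is an assumption, not something taken from the script itself. The function name matching_lines, the sample list, and the pattern are hypothetical stand-ins, and the pattern is not a solution to the lab.

```python
import re

def matching_lines(pattern, lines):
    """Return the lines in which `pattern` matches somewhere (grep-style)."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]

# Hypothetical stand-in data, NOT the contents of the lab's data file.
sample = ["alpha 12", "beta", "gamma 7", "delta"]

# Keep only the lines that end in one or more digits.
for line in matching_lines(r"[0-9]+$", sample):
    print(line)
```

Run on the stand-in data, this prints only the two lines that end in digits. The same idea applies to the lab: the regex alone decides which lines are displayed.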

Exercises

You can test your REs in vim, in Perl, or at the command line using grep. In vim you must either escape metacharacters such as +, |, (, and %, or begin your regex with \v. E.g., this regex uses '(' as a grouping (subexpression) metacharacter without escaping it:
    64,79s/\v.*(it|ing|the).*/DONE/  
When you have tested your regex in vim, press ESC and then u to undo the change. You can also use this tool: www.rubular.com. Use this file to test your solutions.
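
Besides vim, grep, and rubular, a few lines of Python can check an RE against two sets at once. The sketch below is only one possible helper: the name check, its parameters, and the toy sets are hypothetical and deliberately unrelated to the exercises. With whole_string=True it uses re.fullmatch, so the RE must match the entire string; otherwise it uses re.search, which accepts a match anywhere in the string.

```python
import re

def check(pattern, should_match, should_not_match, whole_string=False):
    """Return the items on which `pattern` gives the wrong answer."""
    regex = re.compile(pattern)
    # fullmatch: the entire string must match; search: any substring may match.
    test = regex.fullmatch if whole_string else regex.search
    failures = [s for s in should_match if not test(s)]
    failures += [s for s in should_not_match if test(s)]
    return failures

# Toy sets, not taken from the exercises.
first = ["a1", "b22", "c333"]
second = ["a", "1a", "b2c"]

print(check(r"[a-z][0-9]+", first, second, whole_string=True))  # → []
print(check(r"[a-z][0-9]+", first, second))                     # → ['b2c']
```

An empty list means the RE accepts everything in the first set and rejects everything in the second. The second call shows why the distinction matters: "b2c" contains a matching substring, so it slips through when the RE is not required to match the entire string.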
01. Write the shortest possible RE that matches all items (the entire string)
    in the first set but none in the second set. It is possible to use no more
    than 2 ASCII characters in your RE.

    pit         
    spot        
    spate       
    slap two    
    respite

    pt
    Pot
    peat
    part

 
02. Write an RE that matches all items in the first set but none in the second
    set. Write the shortest RE possible.

    rap them
    tapeth
    apth
    wrap/try
    sap tray
    87ap9th
    apothecary

    aleht
    happy them
    tarpth
    Apt
    peth
    tarreth
    ddapdg
    apples
    shape the
 
03. Give an RE that matches all items in the first set but none in the second.
    Write the shortest RE possible.

    affgfking
    rafgkahe
    bafghk
    baffgkit
    affgfking
    rafgkahe
    bafghk
    baffg kit

    fgok
    a fgk
    affgm
    afffhk
    fgok
    afg.K
    aff gm
    afffhgk
 
04. Give an RE that matches all items in the first set but none in the second.

    assumes word senses. Within
    does the clustering. In the
    but when? It was hard to tell
    he arrive." After she had
    mess! He did not let it
    it wasn't hers!' She replied
    always thought so.) Then

    in the U.S.A., people often
    John?", he often thought, but
    weighed 17.5 grams
    well ... they'd better not
    A.I. has long been a very
    like that", he thought
    but W. G. Grace never had much
 
05. Which of the following entire strings match this regex?  /a(ab)*a/

    a) abababa
    b) aaba
    c) aabbaa
    d) aba
    e) aabababa

06. Which of the following entire strings match this regex?  /ab+c?/

    a) abc
    b) ac
    c) abbb
    d) bbc

07. Which of the following entire strings match this regex?  /a.[bc]+/

    a) abc
    b) abbbbbbbb
    c) azc
    d) abcbcbcbc
    e) ac
    f) asccbbbbcbcccc

08. Which of the following entire strings match this regex?  /abc|xyz/

    a) abc
    b) xyz
    c) abc|xyz

09. Which of the following entire strings match this regex?  /[a-z]+[\.\?!]/

    a) battle!
    b) Hot
    c) green
    d) swamping.
    e) jump up.
    f) undulate?
    g) is.?

10. Which of the following entire strings match this regex?  /[a-zA-Z]*[^,]=/

    a) Butt=
    b) BotHEr,=
    c) Ample
    d) FIdDlE7h=
    e) Brittle =
    f) Other.=

11. Which of the following entire strings match this regex?
    /[a-z][\.\?!]\s+[A-Z]/
    Note: \s matches any whitespace character.

    a) A. B
    b) c! d
    c) e f
    d) g. H
    e) i? J
    f) k L

12. Which of the following entire strings match this regex?
    /(very )+(fat )?(tall|ugly) cat/

    a) very fat cat
    b) fat tall cat
    c) very very fat ugly cat
    d) very very very tall cat

13. Which of the following entire strings match this regex?  /<[^>]+>/

    a) <an xml tag>
    b) <opentag> <closetag>
    c) </closetag>
    d) <>
    e) <with attribute=77>
 
14. In addition to finding patterns, REs are useful for text processing. For
    example, write an RE for vim that takes a line that looks like this:

    Spade, Samantha J.

    and converts it to this:

    Samantha Spade
 
15. Write a regex in vim that will prefix a single-digit number at the
    beginning of a line with a '0'; i.e., take this:

    1. Question 1.
    2. Question 2.
    ...
    10. Question 10.

    and make it look like this:

    01. Question 1.
    02. Question 2.
    ...
    10. Question 10.
 
16. Write a regex in vim that will delete all empty lines in a file.
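
The substitution exercises above rely on capturing parts of a match and reusing them in the replacement. As a hedged illustration of that idea outside vim, the Python sketch below reorders the fields of a date with re.sub; the function name reorder_date and the date example are hypothetical and unrelated to the exercise data. In vim the same mechanism is available in :s, where groups are written \( \) (or plain parentheses after \v) and reused as \1, \2, and so on.

```python
import re

def reorder_date(text):
    """Rewrite every YYYY-MM-DD date in `text` as DD/MM/YYYY."""
    # Three capture groups: (year)-(month)-(day). The replacement
    # refers back to them by number, in a different order.
    return re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", text)

print(reorder_date("due 2024-01-31"))  # → due 31/01/2024
```

Text outside the match is left untouched, so only the matched portion is rewritten.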
 
For the next questions, read the IBM Tutorial on Lex.
17. The Lex RE (ab|cd+)?(ef)* matches such strings as abefef, efefef, cdef, or
    cddd, but not abc, abcd, or abcdef. (T/F)

 
18. The Lex RE [\^A-Za-z][^0-9]* matches what?
 
19. The Lex RE ab?c matches either ac or abc. (T/F)
 
20. The Lex ERE [abc-f].+(z|\.)$ matches what?