One important question about LZ compression is "Is it good?" And the answer is "yes", but it isn't necessarily immediately obvious why.
Join me below the fold for some bit-counting and fun with variable length coding.
Tuesday, 19 August 2014
Bonus Workshop Questions - Week 2
Getting caught up! Sorry about the delay.
6. Write a regular expression to solve the following problem:
(e) Find long words whose letters are in alphabetical order.
7. Practice using awk, or alternatively, bash-scripting with grep and sed. For example, write a program that finds all of the email addresses in enron-headers.txt. (You might contrast your observations with enron-emails.txt.) How many words are there in Fathers and Sons by Ivan Turgenev (turgenev.txt)? How many instances are there of the same word repeated twice? How many words in the dictionary (words.txt; it’s actually an inflectional lexicon) have their letters in alphabetical order? How many of the nine letter words (nines.txt)? [All of these are available on the CIS servers, i.e. nutmeg and dimefox, connection instructions on the LMS.]
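As a rough starting point, here is one way the patterns in questions 6(e) and 7 could be sketched. It uses Python's re module rather than awk or grep, purely for illustration. The function names, the MIN_LENGTH cut-off for "long", and the simplified email pattern are all assumptions made for this sketch, not part of the workshop sheet, and real email addresses are messier than this pattern allows.

```python
import re
import string

# Letters in alphabetical order: any number of a's, then b's, ... then z's.
# This builds the pattern "a*b*c*...z*".
ALPHABETICAL = re.compile("".join(c + "*" for c in string.ascii_lowercase))

# Hypothetical cut-off for what counts as a "long" word.
MIN_LENGTH = 6


def is_alphabetical(word, min_length=MIN_LENGTH):
    """True if the word is long enough and its letters never go backwards."""
    word = word.lower()
    return len(word) >= min_length and ALPHABETICAL.fullmatch(word) is not None


# The same word twice in a row, separated by whitespace (case-insensitive).
REPEATED = re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)


def count_repeats(text):
    """Count non-overlapping 'word word' repetitions."""
    return sum(1 for _ in REPEATED.finditer(text))


# Deliberately simplified email pattern (an assumption, not a full RFC match).
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def find_emails(text):
    """Return every substring that looks like an email address."""
    return EMAIL.findall(text)


def count_words(text):
    """Whitespace-delimited word count, similar in spirit to wc -w."""
    return len(text.split())
```

To run these over the workshop files, you would read each file's contents (for example with open("words.txt").read().split()) and apply the matching function; the equivalent grep pattern for the first one is the same a*b*...z* string anchored with ^ and $.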
Solutions below the fold.
Monday, 18 August 2014