Vigenere cipher: more moderately secure cryptography

Posted:


Sometimes you need to secure a message could be intercepted by party for whom the message is not meant. One moderately good way to do this is using the Vigenere cipher.

The Vigenere cipher is a kind of caesar or subsistution cipher. It uses a sequence of letters called a Key to determine the substitution. Compare this to the constant shift function of caesar. In Caesar crypt text, all occurences of a particular clear text letter will be the same subsistute letter (i.e. ‘Z’ always stands in for ‘E’). In Vigenere, the substitute for a clear text letter changes depending on where that letter occurs. Neat trick, eh?

The Key is a shared secret known only to those parties authorized to read the message. The encoded text cannot be cracked in the same way that a caesar inciphered text can be, although there are sophisticated avenues of attack known to the NSA and other professionals.

Vigenere works by using a tabula recta, which is is a table of the alphabet in rows and columns, like the following:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z 
B C D E F G H I J K L M N O P Q R S T U V W X Y Z A 
C D E F G H I J K L M N O P Q R S T U V W X Y Z A B 
D E F G H I J K L M N O P Q R S T U V W X Y Z A B C 
E F G H I J K L M N O P Q R S T U V W X Y Z A B C D 
F G H I J K L M N O P Q R S T U V W X Y Z A B C D E 
G H I J K L M N O P Q R S T U V W X Y Z A B C D E F 
H I J K L M N O P Q R S T U V W X Y Z A B C D E F G 
I J K L M N O P Q R S T U V W X Y Z A B C D E F G H 
J K L M N O P Q R S T U V W X Y Z A B C D E F G H I 
K L M N O P Q R S T U V W X Y Z A B C D E F G H I J 
L M N O P Q R S T U V W X Y Z A B C D E F G H I J K 
M N O P Q R S T U V W X Y Z A B C D E F G H I J K L 
N O P Q R S T U V W X Y Z A B C D E F G H I J K L M 
O P Q R S T U V W X Y Z A B C D E F G H I J K L M N 
P Q R S T U V W X Y Z A B C D E F G H I J K L M N O 
Q R S T U V W X Y Z A B C D E F G H I J K L M N O P 
R S T U V W X Y Z A B C D E F G H I J K L M N O P Q 
S T U V W X Y Z A B C D E F G H I J K L M N O P Q R 
T U V W X Y Z A B C D E F G H I J K L M N O P Q R S 
U V W X Y Z A B C D E F G H I J K L M N O P Q R S T 
V W X Y Z A B C D E F G H I J K L M N O P Q R S T U 
W X Y Z A B C D E F G H I J K L M N O P Q R S T U V 
X Y Z A B C D E F G H I J K L M N O P Q R S T U V W 
Y Z A B C D E F G H I J K L M N O P Q R S T U V W X 
Z A B C D E F G H I J K L M N O P Q R S T U V W X Y 

Notice the letters in column 2 are shifted 1 from column 1. This is important, as you’ll see in a moment.

To encode a message, you need a key. The key needs to be as long as the clear text message. Repeat the key as needed to make the key long enough. For example, if my key is “ORANGE” and my message is “IAMASECRETAGENT”, then you would need to expanded the key to “ORANGEORANGEORA”. To encipher the message, walk through each letter of the clear text. For each letter, find the column that starts with that letter of the clear text and look down that column for the row that starts with the corresponding letter of the Key. The letter you find there is the cipher text substitution for that letter. So, the first letter in this example is “I” with a key letter of “O”, which means that the substitution is “W”. Here’s the completed example:

Clear text: IAMASECRETAGENT
Key: ORANGEORANGEORA
Crypt text: WRMNYIQIEGGKSET

In the real world, messages will comprise more characters than just the uppercase letters of the alphabet. There are two ways to handle this when implementing the Vigenere cipher in code. Either reject all characters that are not part of the tablua recta or simply make the subsistute character the same as the input. The latter technique is the one I favor, but that may make the crypt text easier for someone to crack. For example, what do think this crypt text is refering to?

zT4N://eYSqnA.MxDEh7CmcGPT.jEA/aR/yO/?Hy=BRe7hS


Surely, this is a URL. Further, one can safely assume that “zT4N” is “http” in clear text. That’s a lot of context to give away.

Implementing this complex cipher in Perl is straight forward. The first thing to do is to build the tablua rect:

sub build_matrix {
    my @Wheel = ((0..9), ('A'..'Z'), ('a'..'z'));

    my $M = [[]];
    for (my $y=0; $y < @Wheel; $y++) {
    for (my $x=0; $x < @Wheel; $x++) {
        $M->[$y]->[$x] = $Wheel[($y + $x) % @Wheel];
    }
    }
    return $M;
}

Notice that my alphabet is composed of numbers and both cases of letters. This set of characters is contained in the @Wheel array. By looping through the array twice, it is possible to create a two dimensional array. A careful look at the code reveals our old friend the shift function from the caesar cipher. Each column is shifted by one.

You may not be familiar with the modulus operator %. The modulus operator returns the remainder of an integer division. For example, 13 % 5 is 3. Modulus operations have the interesting property that they always return values between 0 and one less than the second operand. Of course, this fits neatly in with arrays that begin indexes their elements at zero.

One last note. Even though this function returns the tablua recta, the caller stores the array in a globally visible variable called $M. This global is referenced in the encoding function below.

sub enc {
    my ($plain, $key) = @_;

    # Find the row of the cipher
    for (my $y=0; $y < @$M; $y++) {
    if ($M->[$y]->[0] eq $key) {
        # Find the column of plain text
        for (my $x=0; $x < @{$M->[$y]}; $x++) {
        if ($M->[0]->[$x] eq $plain) {
            return $M->[$y]->[$x];
        }
        }
    }
    }
    return $plain;
}

This function expects a clear text letter and the corresponding key letter. The row of the key letter is found in the tablua recta (i.e. $M) and then the column of the clear text letter is found. Whatever letter is found in $M at that location is returned to the caller. Notice that if the plain letter cannot be found in the tablua recta, that character is returned as the subsistution.

What remains is to show the main part of the program that calls these functions:

my @key_text = split //, $key;
my @plain_text = split//, $message;

my $new = "";
for (my $i=0; $i < @plain_text; $i++) {
    $new .= enc($plain_text[$i], $key_text[$i % @key_text]);
}
print "$new\n";

Here, the Key and the message are hard coded in $key and $message respectively. Each of these is split into arrays of characters. For every character in the plain_text array, the encoding function is called and the cipher text is built and later printed.

Notice the return of the modulus operator. Instead of trying to expand the key to match the length of the message, a simple modulus operation is used. This saves a bit on memory and it is easier to implement than the literal expansion of the key.

To decrypt the message, replace the enc() call in the main line with dec(), shown below:

sub dec {
    my ($cipher, $key) = @_;

    # Find the row of the key
    for (my $y=0; $y < @$M; $y++) {
    if ($M->[$y]->[0] eq $key) {
        # Find the column of the cipher 
        for (my $x; $x < @{$M->[0]}; $x++) {
        if ($M->[$y]->[$x] eq $cipher) {
            return $M->[0]->[$x];
        }
        }
    }
    }
    return $cipher;
}

Notice that it is the inverse operation of enc(). It is passed a cipher letter and the corresponding key. The row of the key is found and the column that contains the cipher letter is found. Then the first letter of that corresponding column it found, which represents the clear text letter. If no match is found, the cipher text is returned.

Of course, you don’t need to use this implementation. There is one on CPAN already, although it uses only ‘A’..’Z’ for its alphabet and strips off other characters.

Happy enciphering.