Exploring Cryptography through Shannon Entropy, Primitive Roots, and Cipher Analysis


Introduction

Cryptography is a critical field in cybersecurity, ensuring secure communication and data protection. Understanding the mathematical foundations and statistical properties of cryptographic algorithms enhances our ability to analyze their security. This article explores Shannon entropy, primitive roots, and the application of both to cipher analysis.


Theory and Research

Shannon Entropy and Diversity Measures

Shannon Entropy is a measure of the unpredictability or randomness in a set of possible outcomes. Introduced by Claude Shannon in 1948, it quantifies the average amount of information carried by each outcome, measured in bits when the logarithm is taken to base 2.

The entropy \( H \) of a discrete random variable \( X \) with possible values \( \{ x_1, x_2, ..., x_n \} \) and probability mass function \( P(X = x_i) \) is defined as:

\( H(X) = -\sum_{i=1}^{n} P(x_i) \log_2 P(x_i) \)
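The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the article's own implementation: the function name `shannon_entropy` is an assumption, and probabilities are estimated from observed symbol counts. It uses \( \log_2 (1 / P(x_i)) \), which equals \( -\log_2 P(x_i) \).

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Return the Shannon entropy, in bits per symbol, of a sequence.

    Probabilities are estimated as relative frequencies of the symbols.
    """
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty sequence carries no information
    # Symbols that never occur are absent from `counts`, which matches
    # the usual convention 0 * log2(0) = 0.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


print(shannon_entropy("ABCD"))  # four equally likely symbols: 2.0
print(shannon_entropy("AAAA"))  # a single certain symbol: 0.0
```

A uniform distribution over \( n \) symbols gives the maximum value \( \log_2 n \), while a distribution concentrated on one symbol gives 0.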

Other Diversity Measures:

Primitive Roots

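A primitive root modulo \( p \) (where \( p \) is a prime number) is an integer \( g \) such that for every integer \( a \) coprime to \( p \), there exists an integer \( k \) satisfying:

\( g^k \equiv a \pmod{p} \)

Equivalently, the powers \( g^1, g^2, ..., g^{p-1} \) run through every nonzero residue modulo \( p \) exactly once.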
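One way to test this property without enumerating every power is to check that \( g^{(p-1)/q} \not\equiv 1 \pmod{p} \) for each prime factor \( q \) of \( p - 1 \). The sketch below takes that approach; the helper names are illustrative assumptions, and the code assumes the caller passes a prime \( p \).

```python
def prime_factors(n):
    """Return the set of distinct prime factors of n (trial division)."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive_root(g, p):
    """Return True if g is a primitive root modulo p (p assumed prime)."""
    if g % p == 0:
        return False
    order = p - 1
    # g generates all nonzero residues iff no proper divisor (p-1)/q
    # of the group order already sends g to 1.
    return all(pow(g, order // q, p) != 1 for q in prime_factors(order))


def primitive_roots(p):
    """List all primitive roots modulo p in the range 1..p-1."""
    return [g for g in range(1, p) if is_primitive_root(g, p)]


print(primitive_roots(7))  # [3, 5]
```

For example, the powers of 3 modulo 7 are 3, 2, 6, 4, 5, 1, which cover every nonzero residue, whereas the powers of 2 cycle through only 2, 4, 1.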


Application and Practice

Part 1: Caesar Cipher and Frequency Analysis

Step 1: Compile a Text and Create Letter Frequency Distribution

We selected several web pages to compile a sufficiently large piece of text (e.g., an article with 10,000 characters). We extracted the text, removed any non-alphabetic characters, and converted all letters to uppercase.

Letter Frequency Distribution:

We calculated the frequency of each letter in the text to obtain the original distribution.
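The cleaning and counting steps described above might look like the following sketch. The function names `clean_text` and `letter_frequencies` are assumptions made for illustration, and the sample string stands in for the compiled web text.

```python
import string
from collections import Counter


def clean_text(raw):
    """Keep only ASCII letters and convert them to uppercase."""
    return "".join(ch.upper() for ch in raw if ch in string.ascii_letters)


def letter_frequencies(text):
    """Return the relative frequency of each letter A-Z in cleaned text."""
    counts = Counter(text)
    total = len(text)
    return {
        letter: (counts[letter] / total if total else 0.0)
        for letter in string.ascii_uppercase
    }


sample = clean_text("Hello, World! 123")  # hypothetical stand-in text
freqs = letter_frequencies(sample)
print(sample)      # HELLOWORLD
print(freqs["L"])  # 0.3
```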

Step 2: Apply the Caesar Cipher with a Random Shift

We chose a random shift value, say shift = 7. The Caesar cipher shifts each letter in the plaintext by the shift value, wrapping around the alphabet if necessary.

Encryption Function:

For each letter \( L \):

\( E(L) = ( \text{Index}(L) + \text{shift} ) \mod 26 \)

Where \( \text{Index}(L) \) maps \( A = 0, B = 1, ..., Z = 25 \).

Encrypted Text:

We applied the shift to each letter to produce the ciphertext.
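A minimal sketch of the encryption function, assuming the input has already been reduced to uppercase letters A-Z as in Step 1. The function name `caesar_encrypt` is an illustrative choice.

```python
def caesar_encrypt(plaintext, shift):
    """Shift each uppercase letter by `shift` positions, wrapping past Z."""
    encrypted = []
    for ch in plaintext:
        index = ord(ch) - ord("A")  # A = 0, B = 1, ..., Z = 25
        encrypted.append(chr((index + shift) % 26 + ord("A")))
    return "".join(encrypted)


print(caesar_encrypt("HELLO", 7))  # OLSSV
print(caesar_encrypt("XYZ", 7))    # EFG (wraps around the alphabet)
```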

Step 3: Decrypt the Message Using Frequency Analysis

Frequency Analysis:

Decryption Process:

Decrypting the Ciphertext:

We applied the inverse shift to the ciphertext to recover the plaintext: \( D(L) = ( \text{Index}(L) - \text{shift} ) \mod 26 \).
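One simple way to carry out this attack is sketched below. It rests on an assumption not stated in the text: that the most frequent ciphertext letter corresponds to E, the most common letter in typical English. This heuristic works best on long texts and can fail on short or unusual ones. The function names are illustrative, and the ciphertext is assumed to be non-empty.

```python
from collections import Counter


def caesar_shift(text, shift):
    """Shift uppercase letters by `shift` (negative values undo a shift)."""
    return "".join(chr((ord(ch) - 65 + shift) % 26 + 65) for ch in text)


def guess_shift(ciphertext, expected_top="E"):
    """Guess the shift by aligning the most common ciphertext letter
    with `expected_top` (assumed to be the most common plaintext letter)."""
    top_cipher_letter = Counter(ciphertext).most_common(1)[0][0]
    return (ord(top_cipher_letter) - ord(expected_top)) % 26


def decrypt_by_frequency(ciphertext):
    """Return the guessed shift and the text recovered with it."""
    shift = guess_shift(ciphertext)
    return shift, caesar_shift(ciphertext, -shift)


plaintext = "MEETMEATTHEGREENTREE"  # hypothetical message, E is most common
ciphertext = caesar_shift(plaintext, 7)
print(decrypt_by_frequency(ciphertext))  # (7, 'MEETMEATTHEGREENTREE')
```

Because there are only 26 possible shifts, a wrong guess can be corrected by trying the next most likely plaintext letters or by checking all shifts directly.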


Part 2 (Optional): Modular Exponentiation Encryption

Step 1: Convert Letters to Numeric Representation

We mapped each letter in the original text to a number:

\( A = 0, B = 1, ..., Z = 25 \)

Step 2: Encode Using Modular Exponentiation

Chose parameters:

Encryption Function:

For each numeric value \( k \):

\( E(k) = N^k \mod P \)

We calculated the encoded value for each letter.
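The encoding step can be sketched with Python's built-in three-argument `pow`. The parameter values below are illustrative assumptions only, not the values used in this exercise: \( P = 29 \) is prime and \( N = 2 \) is a primitive root modulo 29, so the 26 letter values map to 26 distinct outputs.

```python
N = 2   # hypothetical base: a primitive root modulo P
P = 29  # hypothetical prime modulus, chosen so that P - 1 >= 26


def encode_letter(letter, base=N, modulus=P):
    """Encode one uppercase letter as base**k mod modulus, with A=0..Z=25."""
    k = ord(letter) - ord("A")
    return pow(base, k, modulus)  # built-in modular exponentiation


def encode_text(text, base=N, modulus=P):
    """Encode each letter of cleaned uppercase text separately."""
    return [encode_letter(ch, base, modulus) for ch in text]


print(encode_text("ABF"))  # [1, 2, 3]
```

Choosing a primitive root as the base matters here: if the base had a multiplicative order smaller than 26, two different letters would share the same encoded value and the mapping could not be inverted.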

Observations


Comparing Frequency Analysis and RSA Encryption

Caesar Cipher:

RSA Encryption (Simplified with Modular Exponentiation):


Visualization and Entropy Calculation

Letter Frequency Distributions

Shannon Entropy Calculations

Entropy Values (Hypothetical):

\( H_{\text{original}} = 4.18 \) bits

\( H_{\text{Caesar}} = 4.17 \) bits

\( H_{\text{mod\_exp}} = 5.21 \) bits
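When reproducing such values, two properties of letter-level entropy are useful checks. A one-to-one letter substitution such as the Caesar cipher only relabels the symbols, so the ciphertext entropy equals the plaintext entropy, and any distribution over at most 26 distinct symbols is bounded by \( \log_2 26 \approx 4.70 \) bits. The sketch below illustrates both, using a hypothetical sample string and illustrative function names.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy in bits per symbol, from relative frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def caesar_encrypt(text, shift):
    """Caesar-shift cleaned uppercase text."""
    return "".join(chr((ord(ch) - 65 + shift) % 26 + 65) for ch in text)


sample = "THEQUICKBROWNFOXJUMPSOVERTHELAZYDOG"  # hypothetical sample text
h_plain = shannon_entropy(sample)
h_caesar = shannon_entropy(caesar_encrypt(sample, 7))
upper_bound = math.log2(26)  # maximum entropy for a 26-symbol alphabet

print(math.isclose(h_plain, h_caesar))  # True: relabeling keeps entropy
print(h_plain <= upper_bound)           # True
```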


Findings and Discussion

Part 1 Findings

Part 2 Findings

Importance of Statistical Analysis in Cryptography


Conclusion

This exercise highlighted the significance of mathematical concepts in cryptography. By applying these concepts, we gain a deeper understanding of the strengths and weaknesses of different encryption methods, reinforcing the importance of robust cryptographic practices in cybersecurity.


Note: The practical implementations involve collecting text data, performing encryption and decryption, and calculating statistical measures. Tools such as Python or JavaScript can be used to automate these tasks and visualize the results.