This program implements a file encryption utility in 8086 assembly language, fulfilling the requirements of a computer architecture assignment. It reads a text file, converts all letters to uppercase, analyzes letter frequencies, derives an encryption key based on frequency ranking, applies a substitution cipher to both letters and digits, and writes the encrypted output to cipher.txt.
Design Overview
The solution is structured into modular procedures:
- INPUT: Reads a filename from the user and loads the file content into memory.
- CAPITALOUTPUT: Converts lowercase letters (a–z) to uppercase (A–Z) in-place and displays the result.
- CNTOUTPUT: Counts occurrences of each uppercase letter using nested loops. The key is the letter that has exactly two other letters with strictly higher frequencies, which identifies the third most frequent letter without a full sort.
- ENCRYPTION: Applies distinct transformations:
  - Digits (0–9) are substituted using a fixed lookup table (NUMTABLE).
  - Letters are shifted forward by the key value modulo 26 (Caesar cipher).
- SAVE: Writes the encrypted buffer to cipher.txt.
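As an illustration of the SAVE step, the sketch below writes a buffer to cipher.txt through the DOS INT 21h handle functions (3Ch create, 40h write, 3Eh close). The source does not show SAVE itself, so the use of these DOS calls, the MASM/TASM simplified segment directives, and the names OUTNAME and ARTLEN are all assumptions.

```asm
; Sketch only: assumes DOS, MASM/TASM syntax, and DS already set to the data segment.
.MODEL SMALL
.DATA
OUTNAME DB 'cipher.txt', 0      ; hypothetical name; DOS expects an ASCIIZ string
ARTICLE DB 768 DUP(?)           ; encrypted buffer
ARTLEN  DW 0                    ; hypothetical: number of valid bytes in ARTICLE
.CODE
SAVE PROC
        MOV     AH, 3Ch         ; create the file, or truncate it if it exists
        XOR     CX, CX          ; normal file attributes
        MOV     DX, OFFSET OUTNAME
        INT     21h
        JC      SV_FAIL         ; CF set: file could not be created
        MOV     BX, AX          ; AX holds the new file handle
        MOV     AH, 40h         ; write CX bytes from DS:DX to handle BX
        MOV     CX, ARTLEN
        MOV     DX, OFFSET ARTICLE
        INT     21h             ; write errors are ignored in this sketch
        MOV     AH, 3Eh         ; close the handle (still in BX)
        INT     21h
SV_FAIL:
        RET
SAVE ENDP
END
```

Reading the input file would follow the same pattern with function 3Dh (open) and 3Fh (read), which returns the number of bytes actually read in AX.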
Data Structures
Key data segments include:
- ARTICLE: Buffer holding up to 768 bytes of file content.
- CNTBUF: Stores counts for A–Z in packed BCD format (e.g., 'A', count_high, count_low).
- NUMTABLE: Digit substitution mapping: "7591368024" (i.e., '0'→'7', '1'→'5', etc.).
- KEY: Holds the derived shift value (0–25).
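The layout described above could be declared roughly as follows. This is a sketch in MASM/TASM syntax, not the author's actual data segment: the run-time initialization of CNTBUF and the procedure name INIT_CNTBUF are assumptions made for illustration.

```asm
; Sketch only: assumes DS already points at the data segment.
.MODEL SMALL
.DATA
ARTICLE  DB 768 DUP(?)          ; file contents, up to 768 bytes
CNTBUF   DB 26 DUP(0, 0, 0)     ; 26 entries of 3 bytes: letter, count_high, count_low
NUMTABLE DB '7591368024'        ; index = digit value, byte = substitute digit
KEY      DB 0                   ; shift value, 0..25
.CODE
; Hypothetical helper: writes 'A'..'Z' into the first byte of each entry
; and clears both count bytes.
INIT_CNTBUF PROC
        MOV     DI, OFFSET CNTBUF
        MOV     AL, 'A'
        MOV     CX, 26
IC_LOOP:
        MOV     [DI], AL
        MOV     BYTE PTR [DI+1], 0
        MOV     BYTE PTR [DI+2], 0
        ADD     DI, 3           ; advance to the next 3-byte entry
        INC     AL              ; next letter
        LOOP    IC_LOOP
        RET
INIT_CNTBUF ENDP
END
```

With 3-byte entries, the count bytes of letter number i (0 for A) start at CNTBUF + 3*i + 1, which matches the stride of 3 used in the key derivation snippet.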
Algorithm Details
Letter Case Conversion
Each byte in ARTICLE is checked. If it falls in 'a'–'z', 32 is subtracted to convert to uppercase.
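A minimal sketch of this loop is shown below. The register conventions (SI for the buffer address, CX for the byte count) and the name TO_UPPER are assumptions; the source's CAPITALOUTPUT procedure also displays the result, which is omitted here.

```asm
; Sketch only: in: DS:SI -> buffer, CX = number of bytes. Converts in place.
.MODEL SMALL
.CODE
TO_UPPER PROC
        JCXZ    TU_DONE         ; nothing to do for an empty buffer
TU_LOOP:
        MOV     AL, [SI]
        CMP     AL, 'a'
        JB      TU_NEXT         ; below 'a': leave unchanged
        CMP     AL, 'z'
        JA      TU_NEXT         ; above 'z': leave unchanged
        SUB     AL, 32          ; 'a' - 'A' = 32
        MOV     [SI], AL
TU_NEXT:
        INC     SI
        LOOP    TU_LOOP
TU_DONE:
        RET
TO_UPPER ENDP
END
```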
Frequency Counting
For each letter A–Z (outer loop), the entire buffer is scanned (inner loop). Matching characters increment the corresponding counter in CNTBUF. The DAA instruction maintains BCD format during increments.
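The inner loop for a single letter might look like the sketch below. It is not the author's code: the register assignments and the name COUNT_LETTER are assumptions, and how the result is laid out in CNTBUF is not shown in the source. One detail worth noting is that DAA relies on the flags produced by an ADD, so the counter is bumped with ADD AL, 1 rather than INC, which leaves CF unchanged.

```asm
; Sketch only: in:  DS:SI -> buffer, CX = number of bytes, BL = letter to count
;              out: AX = count as 4-digit packed BCD (AH = high byte, AL = low byte)
.MODEL SMALL
.CODE
COUNT_LETTER PROC
        XOR     AX, AX          ; count = 0000
        JCXZ    CL_DONE
CL_LOOP:
        CMP     [SI], BL
        JNE     CL_NEXT
        ADD     AL, 1           ; ADD sets AF and CF, which DAA needs
        DAA                     ; AL back to packed BCD; CF = 1 on wrap 99h -> 00h
        XCHG    AL, AH          ; XCHG does not change the flags
        ADC     AL, 0           ; propagate the decimal carry into the high byte
        DAA
        XCHG    AL, AH          ; restore AH = high byte, AL = low byte
CL_NEXT:
        INC     SI
        LOOP    CL_LOOP
CL_DONE:
        RET
COUNT_LETTER ENDP
END
```

The outer loop would call this once per letter with BL running from 'A' to 'Z', resetting SI and CX each time.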
Key Derivation
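The key is the letter whose frequency rank is third highest. For each letter, the algorithm counts how many other letters have strictly greater frequencies. When this count equals 2, the current letter is selected as the key. Ties matter here: if two letters share the second-highest count (say the counts are 10, 8, 8, 5), no letter has exactly two letters above it, and the snippet under Code Highlights does not indicate how that case is handled.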
Encryption Logic
- Digits: Subtract '0' to get an index, then use NUMTABLE for substitution.
- Letters: Subtract 'A', add the key, take modulo 26, then add 'A' back.
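The two rules above can be sketched for a single character as follows. The procedure name ENC_CHAR, the use of XLAT, and the example value of KEY are assumptions; passing other characters through unchanged is also an assumption, though it is consistent with the spaces preserved in the sample output.

```asm
; Sketch only: in: AL = character (already uppercase), DS -> data segment
;              out: AL = encrypted character; BX is overwritten
.MODEL SMALL
.DATA
NUMTABLE DB '7591368024'        ; '0'->'7', '1'->'5', ...
KEY      DB 13                  ; example shift value, 0..25
.CODE
ENC_CHAR PROC
        CMP     AL, '0'
        JB      EC_LETTER
        CMP     AL, '9'
        JA      EC_LETTER
        SUB     AL, '0'         ; digit value 0..9 becomes the table index
        MOV     BX, OFFSET NUMTABLE
        XLAT                    ; AL = byte at DS:[BX + AL]
        RET
EC_LETTER:
        CMP     AL, 'A'
        JB      EC_DONE         ; not a letter: leave unchanged
        CMP     AL, 'Z'
        JA      EC_DONE
        SUB     AL, 'A'         ; 0..25
        ADD     AL, KEY         ; at most 25 + 25 = 50
        CMP     AL, 26
        JB      EC_NOWRAP
        SUB     AL, 26          ; one subtraction is enough for modulo 26
EC_NOWRAP:
        ADD     AL, 'A'
EC_DONE:
        RET
ENC_CHAR ENDP
END
```

Because both the letter offset and the key are below 26, their sum is below 52, so a single conditional subtraction replaces a DIV-based modulo.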
Code Highlights
; Key derivation snippet
; DI is assumed to point at the count byte of the current letter's CNTBUF entry.
; Only the byte at offset +1 of each 3-byte entry is compared here.
LKEY:
        MOV     AL, [DI]                ; Current letter's frequency
        MOV     BH, 0                   ; Counter for letters with higher freq
        MOV     SI, OFFSET CNTBUF + 1   ; Count byte of the first entry
        MOV     AH, 26                  ; Inner loop counter
CMPKEY:
        CMP     AL, [SI]                ; Compare with another letter's freq
        JAE     NEXT_LETTER             ; Current >= other: do not count it
        INC     BH                      ; Current < other: one more letter ranks higher
NEXT_LETTER:
        ADD     SI, 3                   ; Move to next letter's count
        DEC     AH
        JNZ     CMPKEY
        CMP     BH, 2                   ; Check if exactly two letters are more frequent
        JE      KEYGET                  ; If yes, use this letter as key
Execution Flow
- Prompt user for input filename.
- Load file contents into ARTICLE.
- Convert to uppercase and display.
- Count letter frequencies and compute key.
- Encrypt buffer contents.
- Write encrypted data to cipher.txt.
Sample Output
Given an input file containing:
AlexNet is a convolutional neural network trained on over a million images...
The program outputs the uppercase version, the frequency table (e.g., A53B21...), and the encrypted text (e.g., NYRKARG VF N PBAIBYHGVBANY...), then saves the ciphertext to disk. The sample ciphertext is consistent with a shift of 13 (A→N, L→Y, E→R).