The SHA-1 algorithm produces a hash value of length 160 bits (= 20 bytes). This is the successor of the SHA algorithm.
Beside RIPMED-160 hash algorithm, SHA-1 (published by the NIST in 1991) is the most commonly used cryptographic hash function.
The algorithm is described in FIPS180-1, an extract from which is provided below.
1. MESSAGE PADDING
The SHA-1 is used to compute a message digest for a message or data file that is provided as input. The message or data file should be considered to be a bit string. The length of the message is the number of bits in the message (the empty message has length 0). If the number of bits in a message is a multiple of 8, for compactness we can represent the message in hex. The purpose of message padding is to make the total length of a padded message a multiple of 512. The SHA-1 sequentially processes blocks of 512 bits when computing the message digest. The following specifies how this padding shall be performed. As a summary, a "1" followed by m "0"s followed by a 64-bit integer are appended to the end of the message to produce a padded message of length 512 * n. The 64-bit integer is l, the length of the original message. The padded message is then processed by the SHA-1 as n 512-bit blocks.
The padded message will contain 16 * n words for some n > 0. The padded message is regarded as a sequence of n blocks M1, M2,..., Mn, where each Mi contains 16 words and M1 contains the first characters (or bits) of the message.
2. FUNCTIONS USED
A sequence of logical functions f0, f1,..., f79 is used in the SHA-1. Each ft, 0 <= t <= 79, operates on three 32-bit words B, C, D and produces a 32-bit word as output. ft(B,C,D) is defined as follows: for words B, C, D.
ft(B,C,D) = (B AND C) OR ((NOT B) AND D) ( 0 <= t <= 19)
ft(B,C,D) = B XOR C XOR D (20 <= t <= 39)
ft(B,C,D) = (B AND C) OR (B AND D) OR (C AND D) (40 <= t <= 59)
ft(B,C,D) = B XOR C XOR D (60 <= t <= 79).
3. CONSTANTS USED
A sequence of constant words K(0), K(1),..., K(79) is used in the SHA-1. In hex these are given by
K = 5A827999 ( 0 <= t <= 19)
Kt = 6ED9EBA1 (20 <= t <= 39)
Kt = 8F1BBCDC (40 <= t <= 59)
Kt = CA62C1D6 (60 <= t <= 79).
4. COMPUTING THE MESSAGE DIGEST
The message digest is computed using the final padded message. The computation uses two buffers, each consisting of five 32-bit words, and a sequence of eighty 32-bit words. The words of the first 5-word buffer are labelled A,B,C,D,E. The words of the second 5-word buffer are labelled H0, H1, H2, H3, H4. The words of the 80-word sequence are labelled W0, W1,..., W79. A single word buffer TEMP is also employed.
To generate the message digest, the 16-word blocks M1, M2,..., Mn defined in Section 4 are processed in order. The processing of each Mi involves 80 steps.
Before processing any blocks, the {Hi} are initialized as follows: in hex,
H0 = 67452301
H1 = EFCDAB89
H2 = 98BADCFE
H3 = 10325476
H4 = C3D2E1F0.
Now M1, M2,..., Mn are processed. To process Mi, we proceed as follows:
After processing Mn, the message digest is the 160-bit string represented by the 5 words
H0 H1 H2 H3 H4.