New URL: http://www.iki.fi/~kartturi/matikka/kl10exmp.txt 2.15 Programming Examples (for the PDP-10 central processors (K[A/I/L/S]10)) Excerpt from: DECsystem-10 DECSYSTEM-20 Processor Reference Manual AA-H391A-TK Fifth Edition, July 1980 (First Edition, May 1968) Chapter 2, User Operations, pages 2-112 - 2-120 Before continuing to more system-related subjects, let us consider some simple programs that demonstrate the use of a variety of the instructions described thus far. Processor Identification The instruction repertoires of all PDP-10 processors and the 166 processor used in the PDP-6 are very similar, and most programs require no changes to run on any of them. Because of minor differences and the fact that some instructions are not available on the earlier machines, a program that is to be compatible with all should have some way of distinguishing which machine it is running on. This simple test suffices. JFCL 17,.+1 ;Clear flags JRST .+1 ;Change PC JFCL 1,PDP6 ;PDP-6 has PC Change flag MOVNI AC,1 ;Others do not, make AC all 1s AOBJN AC,.+1 ;Increment both halves JUMPN AC,KA10 ;KA10 if AC = 1000000 (carry BLT AC,0 ;between halves) JUMPE AC,KI10 ;KI10 if AC = 0 MOVEI AC,1 ;KL10 or KS10 if AC = 1,,1 SETZ AC+1, ;Big binary integer MOVEI AC+3,1 ;One digit byte EXTEND AC,[CVTBDO] ;Convert will abort TLNE AC+3,200000 ;Test effect on N JRST KL10 ;KL10 if N set JRST KS10 ;KS10 if N unaffected Parity Parity procedures are used regularly to check the accuracy of stored information. Parity generation and checking is generally handled automatically by memory and high speed, block-oriented peripheral devices, but must be handled by the program for character-oriented devices. Consider 8-bit characters, for which the program carries out two prcedures: for output it generates a parity bit from seven data bits to produce an 8-bit character with parity; following input it checks the parity of the eight bits received. In either case however, the program can simply find the parity of an 8-bit character, by regarding the seven output data bits as eight including an irrelevant extra bit. The two procedures then differ only in the final action. In the first case the program uses the result to adjust the eighth bit for correct parity, whereas in the second it checks the result for an indication of error. Assuming the character is right justified in accumulator A and the rest of A is clear, as it would be were the character placed in A by a load-byte instruction or a DATAI, the simplest and quickest procedure would be to use A to index an XCT into a table, each of whose locations contains an instruction that adjusts the parity for output or jumps to a routine for erroneous input. This procedure would normally be unacceptable because of the very large memory requirements. However the table can be reduced to sixteen entries without excessive loss in speed, by exclusive oring the left and right halves of the character and indexing on the result (parity is invariant under the exclusive OR function, which essentially disposes of pairs of 1s). This example, which uses a second accumulator T for character manipulation, requires six memory references to generate odd parity. (Numbers of memory references and locations given do not include those for the POPJ, which we will regard as subroutine overhead. Similarly every example also requires that the program give a PUSHJ to get to the subroutine). PARITY: MOVEI T,(A) ;Copy character in T LSH T,-4 ;Line up halves XORI T,(A) ;Reduce paritywise to 4 bits ANDI T,17 ;Wipe out unwanted bits XCT PARTAB(T) ;Execute indicated table item POPJ P, PARTAB: XORI A,200 ;0 - change high bit JFCL ;1 - no-op JFCL ;2 XORI A,200 ;3 JFCL ;4 XORI A,200 ;5 XORI A,200 ;6 JFCL ;7 JFCL ;10 XORI A,200 ;11 XORI A,200 ;12 JFCL ;13 XORI A,200 ;14 JFCL ;15 JFCL ;16 XORI A,200 ;17 To handle even parity, interchange the JFCLs and XORIs in the table, or change the MOVEI T,(A) to MOVEI T,200(A). The next example does exactly the same thing but substitutes a little more computation for use of a table. In other words it takes a little more time (7 1/2 memory references average) but less than half the memory. PARITY: MOVEI T,200(A) ;Copy character with high bit LSH T,-4 ;complemented, then fold copy into 4 XORI T,(A) ;bits with opposite parity TRCE T,14 ;Are left two both 0 ? TRNN T,14 ;Or both 1 ? XORI A,200 ;Yes, change high bit TRCE T,3 ;Are right two both 0 ? TRNN T,3 ;Or both 1 ? XORI A,200 ;Yes, change for even, restore for odd POPJ P, Note that this example does not require the rest of A to be clear. For even parity change the address in the MOVEI from 200 to 0. Finally let us consider the extreme of substituting computation for memory. Starting with the character abcdefgh right justified in A, we first copy it in T and then duplicate it twice to the left producing abc def gha bcd efg hab cde fgh where the bits (in positions 12-35) are grouped corresponding to the octal digits in the word. Anding this with 001 001 001 001 001 001 001 001 retains only the least significant bit in each 3-bit set, so we can represent the result by cfadgbeh where each letter represents an octal digit having the same value (0 or 1) as the bit originally represented by the same letter. Multiplying this by 11111111 (octal) generates the following partial products: c f a d g b e h c f a d g b e h c f a d g b e h c f a d g b e h c f a d g b e h c f a d g b e h c f a d g b e h c f a d g b e h Since any digit is at most 1, there can be no carry out of any column with fewer than eight digits unless there is a carry into it. Hence the octal digit produced by summing the center column (the one containing all the bits of the character) is even or odd as the sum of the bits is even or odd. Thus its least significant bit (bit 14 of the low order word in the product) is the parity of the character, 0 if even, 1 if odd. The above may seem a very complicated procedure to do something trivial, but it is effected by this quite simple sequence: PARITY: MOVEI T,(A) ;Copy in T IMULI T,200401 ;Duplicate twice AND T,ONES ;Pick LSBs IMUL T,ONES ;Generate product TLNN T,10 ;Is bit 14 odd ? XORI A,200 ;No, change parity POPJ P, . . . ONES: 11111111 This procedure uses a minimum of both memory references and memory space, but takes considerably more time because the instructions themselves are slow. The following table shows the trade-off of memory references against memory space for the above four procedures. The time is proportional to the number of references except in case 4. References Locations 1. 2 257 2. 6 21 3. 7 1/2 9 4. 7 1/2 7 Reversing Order of Digits Suppose we wish to reverse the order of the digits in the 6-bit character abcdef, where the letters represent the bits of the character. We first duplicate it three times to left and shift the result left one place producing a bcd efa bcd efa bcd efa bcd ef0 where the bits are grouped corresponding to the octal digits in the word. Anding this with 1 000 100 100 010 010 000 001 000 gives a 000 e00 b00 0f0 0c0 000 00d 000 Now it just so happens this number is configured such that the residues of the values of its bits modulo 2^8 - 1 are in exactly the opposite order from the bits of the original character and have the binary orders of magnitude 0-5. In other words this number is equal to the sum of the numbers in the upper row below, and dividing each of these summands by 255 gives the remainder listed in the lower row. Dividend f * 2^13 e * 2^20 d * 2^3 c * 2^10 b * 2^17 a * 2^24 Remainder f * 2^5 e * 2^4 d * 2^3 c * 2^2 b * 2^1 a * 2^0 The remainder in a division is equal to the sum, modulo the divisor, of the remainders left by dividing each of the dividend summands by the same divisor. And the sum of the terms in the lower row is obviously fedcba. The above procedure is implemented by this sequence (due to Schroeppel; HAKMEM 140, page 78 (Artificial Intelligence Memorandum, No. 239, February 29, 1972, MIT Artificial Intelligence Laboratory)). where the character is right justified in accumulator A (with the rest of A clear), and its reverse appears right justified in accumulator A+1. IMUL A,[2020202] ;4 copies shifted left one AND A,[104422010] ;Pick bits for reverse IDIVI A,377 ;Divide by 2^8 - 1 To reverse eight bits we can use a similar procedure (also due to Schroeppel) where again the original character is right justified in A and its reverse appears right justified in A+1. But this time we cannot manage the manipulation within a single length word, so we must use multiply, divide and a pair of ANDs. MUL A,[100200401002] ;5 copies in A and A+1 AND A+1,[20420420020] ;Pick bits for reverse via ANDI A,41 ;residues mod 2^10 - 1 DIVI A,1777 ;Divide by 2^10 - 1 Counting Ones Suppose we wish to count the number of 1s in a word. We could of course check every bit in the word. But there is a quicker way if we remember that in any word and its twos complement the rightmost 1 is in the same position, both words are all 0s to the right of this 1, and no corresponding bits are the same to the left (the parts of both words at the left of the rightmost 1 are complements). Hence using the negative of a word as a mask for the word in a test instruction selects only the rightmost 1 for modification. The example uses three accumulators: the word being tested (which is lost) is in T, the count is kept in CNT, and the mask created in each step is stored in TEMP. MOVEI CNT,0 ;Clear CNT MOVN TEMP,T ;Make mask to select rightmost 1 TDZE T,TEMP ;Clear rightmost 1 in T AOJA CNT,.-2 ;Increase count and jump back ... ;Skip to here if no 1s left in T CNT is increased by one every time a 1 is deleted from T. After all 1s have been removed, the TDZE skips. The preceding example uses little memory, but contains a loop so the time it takes is proportional to the number of 1s. The next example takes more memory but is constant; hence it is slower than the above when there are few 1s (less than eight), but is much faster when there are many. The word, which is lost, is in accumulator A, and the answer appears in accumulator A+1 (for convenience we let B = A+1). The routine (due to Gosper, Mann and Leonard; Ibid, item 169, page 79) has three distinct parts and is based on the fact that in a binary word, counting 1s is equivalent to calculating the sum of the digits. The first part, of seven instructions, manipulates the octal digits of the word so as to replace each digit by the number of 1s in it. Taking D as an octal digit and [x] as the largest integer contained in x, the algorithm used to make the substitution is D - [D/2] - [D/4] Of course the computer always acts in binary terms regardless of programmer interpretation. In this case the procedure carried out on each 3-bit piece abc is abc - ab - a The instructions effect this algorithm by shifting a copy of the word right one place, masking out the LSB of each shifted octal digit to prevent it from interfering with the next digit at the right (i.e. to isolate the digits), and subtracting the shifted word from the original. The same process is then repeated, this time masking out what was originally the middle bit in each digit. That this algorithm gives the correct substitution is evident from the following table, in which it is seen that the bottom number in a given column is the sum of the bits in the octal digit given at the top of the column. Original digit 0 1 2 3 4 5 6 7 Subtract 0 0 1 1 2 2 3 3 = 0 1 1 2 2 3 3 4 Subtract 0 0 0 0 1 1 1 1 Number of 1s 0 1 1 2 1 2 2 3 We have now replaced the original word with a set of twelve numbers, whose sum is equal to the number of 1s in the original. The next three instructions add together pairs of adjacent numbers so as to replace the twelve by six whose sum is still the same. Since these new numbers are isolated in 6-bit pieces of the word, we shall revise our point of view, and regard them as digits in a number in base 64. Now any number is simply the sum of the values of its digits, i.e., of its digits each multiplied by an appropriate power of the base. Dividing each such summand by 1 less than the base gives the digit itself as remainder. Hence the third part of the routine just divides our 6-digit number by 63, producing in B a remainder that is the sum of the remainders from the individual digits, i.e., that is the sum of the digits. (In general terms this is the statement that the sum S of the digits in any number N in base b mod (b-1) -- provided b is deliberately chosen such that S < b-1. The condition holds here of course as the number of 1s in a PDP-10 word is at most 36. And it is in fact to make this condition hold that the routine converts from base 8 to base 64). MOVE B,A ;Copy in B LSH B,-1 ;Right one AND B,[333333,,333333] ;Mask out LSBs SUB A,B ;D - [D/2] LSH B,-1 ;Right one again AND B,[333333,,333333] ;Mask out middle bits SUBB A,B ;D - [D/2] - [D/4]; two copies LSH B,-3 ;Shift right one octal digit ADD A,B ;Add numbers in digit pairs AND A,[070707,,070707] ;Throw out extra pair sums IDIVI A,77 ;Divide by 63, sum in B If it is known that the 1s in the word are entirely contained within bits 22-35 (the right fourteen bits), we can use the following somewhat shorter routine, which is faster than the loop for more than seven 1s. It first treats the number in quaternary, replacing each digit with the number of 1s in it, and then converts from quaternary to hexadesimal. MOVEI B,(A) LSH B,-1 ANDI B,12525 ;Mask out LSBs SUBB A,B ;D - [D/2]; two copies LSH B,-2 ;Right one quaternary digit ANDI A,31463 ;Mask out some of digits in A ANDI B,31463 ;The rest in B ADDI A,(B) ;Now combine digit pairs IDIVI A,17 ;Divide by 15, sum in B Note that the pair of ANDIs gets rid of one out of each set of two identical bit pairs before adding. This is done because there can be digit overflow, i.e. a resulting hexadecimal digit can have more than two significant bits. Number Conversion In the standard algorithm for converting a number N to its equivalent in base b, one performs the series of divisions N/b = q1 + r1/b r1 < b q1/b = q2 + r2/b r2 < b q2/b = q3 + r3/b r3 < b . . . q /b = 0 + r /b r < b n-1 n n The number in base b is then rn...r3r2r1. For example, the octal equivalent of 61 decimal is 75: 61/8 = 7 + 5/8 7/8 = 0 + 7/8 The following decimal print routine converts a 36-bit positive integer in accumulator T to decimal and types it out. The contents of T and T+1 are destroyed. The routine is called by a PUSHJ P,DECPNT where P is the stack pointer. DECPNT: IDIVI T,12 ;^O12 = ^D10 PUSH P,T+1 ;Save remainder SKIPE T ;All digits formed ? PUSHJ P,DECPNT ;No, compute next one DECPN1: POP P,T ;Yes, take out in opposite order ADDI T,60 ;Convert to ASCII (60 is code for 0) JRST TTYOUT ;Type out This routine repeats the division until it produces a zero quotient. Hence it suppresses leading zeros, but since it is executed at least once it outputs one "0" if the number is zero. The TTYOUT routine returns with a POPJ P, to DECPN1 until all digits are typed, then to calling program. In section 0 space can be saved in the stack by storing the computed digits in the left halves of the locations that contain the jump addresses. This is accomplished in the decimal print routine by changing PUSH P,T+1 to HRLM T+1,(P) and POP P,T to HLRZ T,(P) The routine can handle a 36-bit unsigned integer if the IDIVI T,12 is replaced by LSHC T,-^D35 ;Shift right 35 bits into T+1 LSH T+1,-1 ;Vacate the T+1 sign bit DIVI T,12 ;Divide double length integer by 10 *End-Of-Excerpt*