/* perm19sum: by Antti Karttunen, Jan 30 - Feb 04 2005. Form the "circular permutation sum": T = p(1)*p(2) + p(2)*p(3) + p(3)*p(4) +... + p(18)*p(19) + p(19)*p(1). Note an important optimization, shortening the execution time by n: because all the circular rotations of that sum result the same value, we can fix one element of the permutation (e.g. 1 or the highest), and permute only the elements 1..n-1. (So also do the reversals, but finding all the equivalence classes of the bracelets is a more involved operation, and would yield at most twofold improvement.) See Leroy Quet, Emeric Deutsch, Robert G. Wilson v, et al. at SeqFan mailing list, January 2x - 28 2005. Here we have used 12!/10 = 47900160 as an "approximation" for the FPGA's default speed of 50 million clock cycles per second, and 87360 as an "approximation" for the number of seconds in one day (86400), so our value is 960 seconds, i.e. 16 minutes longer. (Maybe we are running this in Mars?) n n! ~ secs with 50 MHz ~ days with 50 MHz 10 3 628 800 11 39 916 800 < 1s 12 479 001 600 10 13 6 227 020 800 130 14 87 178 291 200 1 820 ~ half an hour 15 1 307 674 368 000 27 300 5/16 16 20 922 789 888 000 436 800 5 17 355 687 428 096 000 7 425 600 85 ~ 3 kk 18 6 402 373 705 728 000 133 660 800 1 530 ~ 4 yr 19 121 645 100 408 832 000 2 539 555 200 29 070 ~ 80 yr 20 2 432 902 008 176 640 000 50 791 104 000 581 400 21 51 090 942 171 709 440 000 1 066 613 184 000 12 209 400 22 1 124 000 727 777 607 680 000 23 465 490 048 000 268 606 800 23 25 852 016 738 884 976 640 000 539 706 271 104 000 6 177 956 400 Note that the absolute maximum upper bound for such sums of permutations of {1..n} is n^3, which from 10 to 21 grows like this: 1000 1331 1728 2197 2744 3375 4096 4913 5832 6859 8000 9261. Note that 18^3 = 5832, which well fits into 13 bits. In reality, max,min sequences for the first 10 values run as (computed by Robert G. Wilson v with Mathematica): {{1, 1}, {4, 4}, {11, 11}, {25, 21}, {48, 37}, {82, 58}, {129, 87}, {191, 123}, {270, 169}, {368, 224}} We rewrite the terms of: p(1)*p(2) + p(2)*p(3) + p(3)*p(4) +... + p(n-1)*p(n) + p(n)*p(1). as: p1*(pn+p2) + p3*(p2+p4) + p5*(p4+p6) + p7*(p6+p8) + p9*(p8+p10) + p11*(p10+p12) + p13*(p12+p14) + p15*(p14+p16) + ... + p(n-1)*(p(n-2)+pn). Because we compute the sum also for the smaller permutations than those of {1..19}, we actually compute this: p1*(pn+p2) + p3*(p2+p4) + p5*(p4+p6) + p7*(p6+p8) + p9*(p8+p10) + p11*(p10+p12) + p13*(p12+p14) + p15*(p14+p16) + p17*(p16+p18). + p19*(p18+p1). where n is the size of permutation (in range [3,18]), and the term pn is given by the calling module, duplicating the information in one of p1 - p18. When n is less than 18, the "calling module" should give the inputs p(n+1) - p(18) as zeros, so that the extraneous multiplications and additions do not affect the result. To see how this works: For n=1 we get p1*p1 = 1*1 = 1, computed as 1*(1+0) + 0*(0+0) + ... For n=2 we get p1*p2 + p2*p1. (e.g. 1*2 + 2*1 = 4) computed as p1*(p2+p2) + 0*(p2+0) + ... = p2*p1 + p2*p1 For n=3 we get p1*p2 + p2*p3 + p3*p1 (e.g. 1*2 + 2*3 + 3*1 = 1*3 + 3*2 + 2*1 = 11) computed as p1*(p3+p2) + p3*(p2+0) + 0*(0+0) + ... = p1*p3 + p1*p2 + p3*p2 For n=4 we get p1*p2 + p2*p3 + p3*p4 + p4*p1 computed as p1*(p4+p2) + p3*(p2+p4) + 0*(p4+0) + ... = p1*p4 + p1*p2 + p3*p2 + p3*p4 ... For n=17 we get p1*p2 + p2*p3 + p3*p4 + ... + p16*p17 + p17*p1 computed as p1*(p17+p2) + p3*(p2+p4) + p5*(p4+p6) + ... + p17*(p16+0) + 0*(0+p1) For n=18 we get p1*p2 + p2*p3 + p3*p4 + ... + p16*p17 + p17*p18 + p18*p1 computed as p1*(p18+p2) + p3*(p2+p4) + p5*(p4+p6) + ... + p17*(p16+p18) + 0*(p18+p1) For n=19 we get p1*p2 + p2*p3 + p3*p4 + ... + p16*p17 + p17*p18 + p18*p19 + p19*p1 computed as p1*(0+p2) + p3*(p2+p4) + p5*(p4+p6) + ... + p17*(p16+p18) + p19*(p18+p1) */ // Note that the pn should duplicate the last non-zero term, // unless n is 19, in which case pn should be zero ! module perm19sum(output [13:0] psum, input [4:0] p1, input [4:0] p2, input [4:0] p3, input [4:0] p4, input [4:0] p5, input [4:0] p6, input [4:0] p7, input [4:0] p8, input [4:0] p9, input [4:0] p10, input [4:0] p11, input [4:0] p12, input [4:0] p13, input [4:0] p14, input [4:0] p15, input [4:0] p16, input [4:0] p17, input [4:0] p18, input [4:0] p19, input [4:0] pn); // Note that 17*18 = 306, and even 22*23 = 506, which both fit // into nine bits (i.e. the most significant bit is bit-8.) // However, 16*(17+18) = 560, (16+18)*17 = 578, (16+17)*18 = 594 // (17+18)*19 = 665, (18+19)*20 = 740, (19+20)*21 = 819, (20+21)*22 = 902 // (21+22)*23 = 989, // which needs ten bits. // (And (22+23)*24 = 1080 needs eleven). parameter w = 10; wire [w-1:0] m1 = p1 * (p2+pn); wire [w-1:0] m3 = p3 * (p2+p4); wire [w-1:0] m5 = p5 * (p6+p4); wire [w-1:0] m7 = p7 * (p6+p8); wire [w-1:0] m9 = p9 * (p10+p8); wire [w-1:0] m11 = p11 * (p10+p12); wire [w-1:0] m13 = p13 * (p14+p12); wire [w-1:0] m15 = p15 * (p14+p16); wire [w-1:0] m17 = p17 * (p18+p16); wire [w-1:0] m19 = p19 * (p18+p1); // Each summand is ten bits, there are four levels in the summation // so the absolute max size of psum is 14 bits. // (In reality it fits to much smaller space). assign psum = ((((m1+m3)+(m5+m7)) + ((m9+m11)+(m13+m15))) + (m17+m19)); endmodule