Double Product combine1: Maximum use of data abstraction: Best: 10.73 (12%), Overall Best: 10.74 40-most: 10.75 cycles/element
Double Product combine2: Take vec_length() out of loop: Best: 10.78 (6%), Overall Best: 10.89 40-most: 10.98 cycles/element
Double Product combine3: Array reference to vector data: Best: 4.45 (2%), Overall Best: 4.48 40-most: 4.49 cycles/element
Double Product combine3w: Update *dest within loop only with write: Best: 4.44 (2%), Overall Best: 4.47 40-most: 4.49 cycles/element
Double Product combine4: Array reference, accumulate in temporary: Best: 4.47 (14%), Overall Best: 4.48 40-most: 4.48 cycles/element
Double Product combine4b: Include bounds check in loop: Best: 4.47 (2%), Overall Best: 4.49 40-most: 4.50 cycles/element
Double Product combine4p: Pointer reference, accumulate in temporary: Best: 4.48 (20%), Overall Best: 4.48 40-most: 4.51 cycles/element
Double Product combine5: Array code, unrolled by 2: Best: 4.47 (2%), Overall Best: 4.48 40-most: 4.50 cycles/element
Double Product combine5p: Pointer code, unrolled by 2, for loop: Best: 4.25 (2%), Overall Best: 4.47 40-most: 4.50 cycles/element
Double Product unroll2aw: Array code, unrolled by 2, while loop: Best: 4.47 (8%), Overall Best: 4.48 40-most: 4.52 cycles/element
Double Product unroll3a: Array code, unrolled by 3: Best: 4.47 (8%), Overall Best: 4.48 40-most: 4.53 cycles/element
Double Product unroll4a: Array code, unrolled by 4: Best: 4.43 (2%), Overall Best: 4.48 40-most: 4.53 cycles/element
Double Product unroll5a: Array code, unrolled by 5: Best: 4.47 (2%), Overall Best: 4.48 40-most: 4.53 cycles/element
Double Product unroll6a: Array code, unrolled by 6: Best: 4.47 (8%), Overall Best: 4.48 40-most: 4.53 cycles/element
Double Product unroll7a: Array code, unrolled by 7: Best: 4.47 (6%), Overall Best: 4.48 40-most: 4.52 cycles/element
Double Product unroll8a: Array code, unrolled by 8: Best: 4.46 (2%), Overall Best: 4.48 40-most: 4.51 cycles/element
Double Product unroll9a: Array code, unrolled by 9: Best: 4.41 (2%), Overall Best: 4.48 40-most: 4.51 cycles/element
Double Product unroll10a: Array code, unrolled by 10: Best: 4.48 (16%), Overall Best: 4.48 40-most: 4.51 cycles/element
Double Product unroll16a: Array code, unrolled by 16: Best: 4.45 (2%), Overall Best: 4.48 40-most: 4.50 cycles/element
Double Product unroll2: Pointer code, unrolled by 2: Best: 4.43 (2%), Overall Best: 4.48 40-most: 4.51 cycles/element
Double Product unroll3: Pointer code, unrolled by 3: Best: 4.42 (2%), Overall Best: 4.48 40-most: 4.52 cycles/element
Double Product unroll4: Pointer code, unrolled by 4: Best: 4.47 (8%), Overall Best: 4.48 40-most: 4.53 cycles/element
Double Product unroll8: Pointer code, unrolled by 8: Best: 4.47 (6%), Overall Best: 4.48 40-most: 4.53 cycles/element
Double Product unroll16: Pointer code, unrolled by 16: Best: 4.46 (2%), Overall Best: 4.48 40-most: 4.53 cycles/element
Double Product combine6: Array code, unrolled by 2, Superscalar x2: Best: 2.23 (6%), Overall Best: 2.24 40-most: 2.29 cycles/element
Double Product unroll4x2a: Array code, unrolled by 4, Superscalar x2: Best: 2.23 (6%), Overall Best: 2.25 40-most: 2.29 cycles/element
Double Product unroll8x2a: Array code, unrolled by 8, Superscalar x2: Best: 2.23 (2%), Overall Best: 2.25 40-most: 2.29 cycles/element
Double Product unroll3x3a: Array code, unrolled by 3, Superscalar x3: Best: 1.49 (14%), Overall Best: 1.50 40-most: 1.52 cycles/element
Double Product unroll4x4a: Array code, unrolled by 4, Superscalar x4: Best: 1.11 (2%), Overall Best: 1.13 40-most: 1.16 cycles/element
Double Product unroll5x5a: Array code, unrolled by 5, Superscalar x5: Best: 0.95 (10%), Overall Best: 0.95 40-most: 0.98 cycles/element
Double Product unroll6x6a: Array code, unrolled by 6, Superscalar x6: Best: 0.92 (2%), Overall Best: 0.97 40-most: 0.98 cycles/element
Double Product unroll7x7a: Array code, unrolled by 7, Superscalar x7: Best: 0.95 (6%), Overall Best: 0.96 40-most: 1.01 cycles/element
Double Product unroll8x4a: Array code, unrolled by 8, Superscalar x4: Best: 1.14 (4%), Overall Best: 1.16 40-most: 1.21 cycles/element
Double Product unroll8x8a: Array code, unrolled by 8, Superscalar x8: Best: 0.95 (2%), Overall Best: 0.97 40-most: 1.03 cycles/element
Double Product unroll9x9a: Array code, unrolled by 9, Superscalar x9: Best: 0.96 (2%), Overall Best: 0.99 40-most: 1.04 cycles/element
Double Product unroll10x10a: Array code, unrolled by 10, Superscalar x10: Best: 0.95 (2%), Overall Best: 0.95 40-most: 1.02 cycles/element
Double Product unroll2x6a: Array code, unrolled by 12, Superscalar x6: Best: 0.88 (2%), Overall Best: 0.97 40-most: 1.01 cycles/element
Double Product unroll12x12a: Array code, unrolled by 12, Superscalar x12: Best: 0.94 (4%), Overall Best: 0.95 40-most: 1.00 cycles/element
Double Product unroll16x16a: Array code, unrolled by 16, Superscalar x16: Best: 0.92 (4%), Overall Best: 0.94 40-most: 0.97 cycles/element
Double Product unroll20x20a: Array code, unrolled by 20, Superscalar x20: Best: 1.17 (10%), Overall Best: 1.17 40-most: 1.21 cycles/element
Double Product unroll8x2: Pointer code, unrolled by 8, Superscalar x2: Best: 2.24 (2%), Overall Best: 2.25 40-most: 2.28 cycles/element
Double Product unroll8x4: Pointer code, unrolled by 8, Superscalar x4: Best: 1.14 (12%), Overall Best: 1.15 40-most: 1.17 cycles/element
Double Product unroll8x8: Pointer code, unrolled by 8, Superscalar x8: Best: 0.95 (2%), Overall Best: 0.96 40-most: 1.01 cycles/element
Double Product unroll9x3: Pointer code, unrolled by 9, Superscalar x3: Best: 1.49 (2%), Overall Best: 1.51 40-most: 1.56 cycles/element
Double Product unrollx2as: Array code, Unroll x2, Superscalar x2, noninterleaved: Best: 2.23 (2%), Overall Best: 2.25 40-most: 2.29 cycles/element
Double Product combine7: Array code, unrolled by 2, different associativity: Best: 2.23 (6%), Overall Best: 2.25 40-most: 2.29 cycles/element
Double Product unroll3aa: Array code, unrolled by 3, Different Associativity: Best: 1.48 (2%), Overall Best: 1.49 40-most: 1.55 cycles/element
Double Product unroll4aa: Array code, unrolled by 4, Different Associativity: Best: 1.13 (16%), Overall Best: 1.14 40-most: 1.18 cycles/element
Double Product unroll5aa: Array code, unrolled by 5, Different Associativity: Best: 0.94 (2%), Overall Best: 0.96 40-most: 1.01 cycles/element
Double Product unroll6aa: Array code, unrolled by 6, Different Associativity: Best: 0.85 (2%), Overall Best: 0.97 40-most: 1.00 cycles/element
Double Product unroll7aa: Array code, unrolled by 7, Different Associativity: Best: 0.95 (2%), Overall Best: 0.97 40-most: 1.00 cycles/element
Double Product unroll8aa: Array code, unrolled by 8, Different Associativity: Best: 0.96 (2%), Overall Best: 0.97 40-most: 1.01 cycles/element
Double Product unroll9aa: Array code, unrolled by 9, Different Associativity: Best: 0.95 (8%), Overall Best: 0.98 40-most: 1.00 cycles/element
Double Product unroll10aa: Array code, unrolled by 10, Different Associativity: Best: 0.94 (6%), Overall Best: 0.97 40-most: 1.00 cycles/element
Double Product unroll12aa: Array code, unrolled by 12, Different Associativity: Best: 0.92 (2%), Overall Best: 0.97 40-most: 1.02 cycles/element
Double Product simd_v1: SSE code, 1*VSIZE-way parallelism: Best: 15.52 (2%), Overall Best: 15.56 40-most: 15.60 cycles/element
Double Product simd_v2: SSE code, 2*VSIZE-way parallelism: Best: 12.57 (2%), Overall Best: 12.77 40-most: 12.85 cycles/element
Double Product simd_v4: SSE code, 4*VSIZE-way parallelism: Best: 5.25 (2%), Overall Best: 5.33 40-most: 5.38 cycles/element
Double Product simd_v8: SSE code, 8*VSIZE-way parallelism: Best: 3.77 (2%), Overall Best: 3.85 40-most: 3.86 cycles/element
Double Product simd_v10: SSE code, 10*VSIZE-way parallelism: Best: 3.42 (2%), Overall Best: 3.45 40-most: 3.51 cycles/element
Double Product simd_v12: SSE code, 12*VSIZE-way parallelism: Best: 3.56 (2%), Overall Best: 3.62 40-most: 3.65 cycles/element
Double Product simd_v2a: SSE code, 2*VSIZE-way parallelism, reassociate: Best: 11.91 (2%), Overall Best: 11.92 40-most: 11.96 cycles/element
Double Product simd_v4a: SSE code, 4*VSIZE-way parallelism, reassociate: Best: 5.71 (10%), Overall Best: 5.72 40-most: 5.75 cycles/element
Double Product simd_v8a: SSE code, 8*VSIZE-way parallelism, reassociate: Best: 3.41 (2%), Overall Best: 3.61 40-most: 3.64 cycles/element
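
The log names its loop transformations only by label. As a rough illustration of three of them, here is a minimal C sketch of what "accumulate in temporary" (combine4), "unrolled by 2, Superscalar x2" (combine6) and "unrolled by 2, different associativity" (combine7) could look like for a double-precision product. The function names and the plain-array interface are hypothetical stand-ins; the benchmark's actual source and vector type are not shown in this log.

```c
#include <stddef.h>

/* Hypothetical sketch: names and signatures are illustrative,
   not taken from the benchmark source. */

/* combine4-style: one accumulator, so every multiply waits on the
   previous one. */
double product_single(const double *data, size_t n) {
    double acc = 1.0;
    for (size_t i = 0; i < n; i++)
        acc = acc * data[i];
    return acc;
}

/* combine6-style: unrolled by 2 with two independent accumulators,
   giving two separate multiply chains. */
double product_2x2(const double *data, size_t n) {
    double acc0 = 1.0, acc1 = 1.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        acc0 = acc0 * data[i];      /* even-indexed elements */
        acc1 = acc1 * data[i + 1];  /* odd-indexed elements */
    }
    for (; i < n; i++)              /* leftover element, if n is odd */
        acc0 = acc0 * data[i];
    return acc0 * acc1;
}

/* combine7-style: unrolled by 2, reassociated so that the product
   data[i] * data[i + 1] does not depend on acc. */
double product_2x1a(const double *data, size_t n) {
    double acc = 1.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2)
        acc = acc * (data[i] * data[i + 1]);
    for (; i < n; i++)              /* leftover element, if n is odd */
        acc = acc * data[i];
    return acc;
}
```

In the log, the single-accumulator variants cluster near 4.47 cycles/element, while the two-accumulator and reassociated variants both land near 2.23. That pattern is consistent with the chain of dependent multiplies being the bottleneck, although the log alone does not establish the cause. Because floating-point multiplication is not associative, the second and third functions may round differently from the first on general inputs.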