| OLD | NEW |
| 1 ; | 1 ; |
| 2 ; Copyright (c) 2011 The WebM project authors. All Rights Reserved. | 2 ; Copyright (c) 2011 The WebM project authors. All Rights Reserved. |
| 3 ; | 3 ; |
| 4 ; Use of this source code is governed by a BSD-style license | 4 ; Use of this source code is governed by a BSD-style license |
| 5 ; that can be found in the LICENSE file in the root of the source | 5 ; that can be found in the LICENSE file in the root of the source |
| 6 ; tree. An additional intellectual property rights grant can be found | 6 ; tree. An additional intellectual property rights grant can be found |
| 7 ; in the file PATENTS. All contributing project authors may | 7 ; in the file PATENTS. All contributing project authors may |
| 8 ; be found in the AUTHORS file in the root of the source tree. | 8 ; be found in the AUTHORS file in the root of the source tree. |
| 9 ; | 9 ; |
| 10 | 10 |
| (...skipping 51 matching lines...) | (...skipping 51 matching lines...) |
| 62 usub8 r6, r5, r4 ; calculate difference with reversed operands | 62 usub8 r6, r5, r4 ; calculate difference with reversed operands |
| 63 pld [r2, r3, lsl #1] | 63 pld [r2, r3, lsl #1] |
| 64 sel r6, r6, lr ; select bytes with negative difference | 64 sel r6, r6, lr ; select bytes with negative difference |
| 65 | 65 |
| 66 ; calculate partial sums | 66 ; calculate partial sums |
| 67 usad8 r4, r7, lr ; calculate sum of positive differences | 67 usad8 r4, r7, lr ; calculate sum of positive differences |
| 68 usad8 r5, r6, lr ; calculate sum of negative differences | 68 usad8 r5, r6, lr ; calculate sum of negative differences |
| 69 orr r6, r6, r7 ; differences of all 4 pixels | 69 orr r6, r6, r7 ; differences of all 4 pixels |
| 70 ; calculate total sum | 70 ; calculate total sum |
| 71 adds r8, r8, r4 ; add positive differences to sum | 71 adds r8, r8, r4 ; add positive differences to sum |
| 72 subs r8, r8, r5 ; substract negative differences from sum | 72 subs r8, r8, r5 ; subtract negative differences from sum |
| 73 | 73 |
| 74 ; calculate sse | 74 ; calculate sse |
| 75 uxtb16 r5, r6 ; byte (two pixels) to halfwords | 75 uxtb16 r5, r6 ; byte (two pixels) to halfwords |
| 76 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords | 76 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords |
| 77 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) | 77 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) |
| 78 | 78 |
| 79 ; 2nd 4 pixels | 79 ; 2nd 4 pixels |
| 80 ldr r4, [r0, #4] ; load source pixels a, row N | 80 ldr r4, [r0, #4] ; load source pixels a, row N |
| 81 ldr r6, [r0, #5] ; load source pixels b, row N | 81 ldr r6, [r0, #5] ; load source pixels b, row N |
| 82 ldr r5, [r9, #4] ; load source pixels c, row N+1 | 82 ldr r5, [r9, #4] ; load source pixels c, row N+1 |
| (...skipping 21 matching lines...) | (...skipping 21 matching lines...) |
| 104 usub8 r6, r5, r4 ; calculate difference with reversed operands | 104 usub8 r6, r5, r4 ; calculate difference with reversed operands |
| 105 sel r6, r6, lr ; select bytes with negative difference | 105 sel r6, r6, lr ; select bytes with negative difference |
| 106 | 106 |
| 107 ; calculate partial sums | 107 ; calculate partial sums |
| 108 usad8 r4, r7, lr ; calculate sum of positive differences | 108 usad8 r4, r7, lr ; calculate sum of positive differences |
| 109 usad8 r5, r6, lr ; calculate sum of negative differences | 109 usad8 r5, r6, lr ; calculate sum of negative differences |
| 110 orr r6, r6, r7 ; differences of all 4 pixels | 110 orr r6, r6, r7 ; differences of all 4 pixels |
| 111 | 111 |
| 112 ; calculate total sum | 112 ; calculate total sum |
| 113 add r8, r8, r4 ; add positive differences to sum | 113 add r8, r8, r4 ; add positive differences to sum |
| 114 sub r8, r8, r5 ; substract negative differences from sum | 114 sub r8, r8, r5 ; subtract negative differences from sum |
| 115 | 115 |
| 116 ; calculate sse | 116 ; calculate sse |
| 117 uxtb16 r5, r6 ; byte (two pixels) to halfwords | 117 uxtb16 r5, r6 ; byte (two pixels) to halfwords |
| 118 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords | 118 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords |
| 119 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) | 119 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) |
| 120 | 120 |
| 121 ; 3rd 4 pixels | 121 ; 3rd 4 pixels |
| 122 ldr r4, [r0, #8] ; load source pixels a, row N | 122 ldr r4, [r0, #8] ; load source pixels a, row N |
| 123 ldr r6, [r0, #9] ; load source pixels b, row N | 123 ldr r6, [r0, #9] ; load source pixels b, row N |
| 124 ldr r5, [r9, #8] ; load source pixels c, row N+1 | 124 ldr r5, [r9, #8] ; load source pixels c, row N+1 |
| (...skipping 21 matching lines...) | (...skipping 21 matching lines...) |
| 146 usub8 r6, r5, r4 ; calculate difference with reversed operands | 146 usub8 r6, r5, r4 ; calculate difference with reversed operands |
| 147 sel r6, r6, lr ; select bytes with negative difference | 147 sel r6, r6, lr ; select bytes with negative difference |
| 148 | 148 |
| 149 ; calculate partial sums | 149 ; calculate partial sums |
| 150 usad8 r4, r7, lr ; calculate sum of positive differences | 150 usad8 r4, r7, lr ; calculate sum of positive differences |
| 151 usad8 r5, r6, lr ; calculate sum of negative differences | 151 usad8 r5, r6, lr ; calculate sum of negative differences |
| 152 orr r6, r6, r7 ; differences of all 4 pixels | 152 orr r6, r6, r7 ; differences of all 4 pixels |
| 153 | 153 |
| 154 ; calculate total sum | 154 ; calculate total sum |
| 155 add r8, r8, r4 ; add positive differences to sum | 155 add r8, r8, r4 ; add positive differences to sum |
| 156 sub r8, r8, r5 ; substract negative differences from sum | 156 sub r8, r8, r5 ; subtract negative differences from sum |
| 157 | 157 |
| 158 ; calculate sse | 158 ; calculate sse |
| 159 uxtb16 r5, r6 ; byte (two pixels) to halfwords | 159 uxtb16 r5, r6 ; byte (two pixels) to halfwords |
| 160 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords | 160 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords |
| 161 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) | 161 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) |
| 162 | 162 |
| 163 ; 4th 4 pixels | 163 ; 4th 4 pixels |
| 164 ldr r4, [r0, #12] ; load source pixels a, row N | 164 ldr r4, [r0, #12] ; load source pixels a, row N |
| 165 ldr r6, [r0, #13] ; load source pixels b, row N | 165 ldr r6, [r0, #13] ; load source pixels b, row N |
| 166 ldr r5, [r9, #12] ; load source pixels c, row N+1 | 166 ldr r5, [r9, #12] ; load source pixels c, row N+1 |
| (...skipping 21 matching lines...) | (...skipping 21 matching lines...) |
| 188 add r2, r2, r3 ; set dst_ptr to next row | 188 add r2, r2, r3 ; set dst_ptr to next row |
| 189 sel r6, r6, lr ; select bytes with negative difference | 189 sel r6, r6, lr ; select bytes with negative difference |
| 190 | 190 |
| 191 ; calculate partial sums | 191 ; calculate partial sums |
| 192 usad8 r4, r7, lr ; calculate sum of positive differences | 192 usad8 r4, r7, lr ; calculate sum of positive differences |
| 193 usad8 r5, r6, lr ; calculate sum of negative differences | 193 usad8 r5, r6, lr ; calculate sum of negative differences |
| 194 orr r6, r6, r7 ; differences of all 4 pixels | 194 orr r6, r6, r7 ; differences of all 4 pixels |
| 195 | 195 |
| 196 ; calculate total sum | 196 ; calculate total sum |
| 197 add r8, r8, r4 ; add positive differences to sum | 197 add r8, r8, r4 ; add positive differences to sum |
| 198 sub r8, r8, r5 ; substract negative differences from sum | 198 sub r8, r8, r5 ; subtract negative differences from sum |
| 199 | 199 |
| 200 ; calculate sse | 200 ; calculate sse |
| 201 uxtb16 r5, r6 ; byte (two pixels) to halfwords | 201 uxtb16 r5, r6 ; byte (two pixels) to halfwords |
| 202 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords | 202 uxtb16 r7, r6, ror #8 ; another two pixels to halfwords |
| 203 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) | 203 smlad r11, r5, r5, r11 ; dual signed multiply, add and accumulate (1) |
| 204 subs r12, r12, #1 | 204 subs r12, r12, #1 |
| 205 smlad r11, r7, r7, r11 ; dual signed multiply, add and accumulate (2) | 205 smlad r11, r7, r7, r11 ; dual signed multiply, add and accumulate (2) |
| 206 | 206 |
| 207 bne loop | 207 bne loop |
| 208 | 208 |
| 209 ; return stuff | 209 ; return stuff |
| 210 ldr r6, [sp, #40] ; get address of sse | 210 ldr r6, [sp, #40] ; get address of sse |
| 211 mul r0, r8, r8 ; sum * sum | 211 mul r0, r8, r8 ; sum * sum |
| 212 str r11, [r6] ; store sse | 212 str r11, [r6] ; store sse |
| 213 sub r0, r11, r0, lsr #8 ; return (sse - ((sum * sum) >> 8)) | 213 sub r0, r11, r0, lsr #8 ; return (sse - ((sum * sum) >> 8)) |
| 214 | 214 |
| 215 ldmfd sp!, {r4-r12, pc} | 215 ldmfd sp!, {r4-r12, pc} |
| 216 | 216 |
| 217 ENDP | 217 ENDP |
| 218 | 218 |
| 219 c80808080 | 219 c80808080 |
| 220 DCD 0x80808080 | 220 DCD 0x80808080 |
| 221 | 221 |
| 222 END | 222 END |
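
For readers less familiar with the ARMv6 SIMD instructions, the accumulation and return logic visible in the diff can be approximated by a scalar C model. This is only a sketch under stated assumptions, not the project's reference code: the name `variance16x16_model` and its parameters are hypothetical, `lr` is assumed to hold zero at the `sel`/`usad8` steps, and the half-pixel interpolation (which sits in the skipped lines) is not modeled, so `pred` is assumed to contain already-interpolated pixels.

```c
#include <stdint.h>

/* Hypothetical scalar model of the accumulation shown in the assembly.
 * Assumption: `pred` already holds the interpolated prediction pixels;
 * the interpolation itself is in the skipped lines and is not modeled. */
uint32_t variance16x16_model(const uint8_t *pred, int pred_stride,
                             const uint8_t *dst, int dst_stride,
                             uint32_t *sse_out)
{
    int32_t sum = 0;   /* plays the role of r8  */
    uint32_t sse = 0;  /* plays the role of r11 */

    for (int row = 0; row < 16; ++row) {
        /* The assembly handles each row as four groups of 4 pixels. */
        for (int x = 0; x < 16; x += 4) {
            uint32_t pos = 0;  /* sum of positive differences (usad8 r4) */
            uint32_t neg = 0;  /* sum of negative differences (usad8 r5) */

            for (int i = 0; i < 4; ++i) {
                int p = pred[x + i];
                int d = dst[x + i];
                /* usub8 + sel keep the difference only in the direction
                 * where it is non-negative; the other lane becomes 0. */
                uint32_t mag = (uint32_t)(p >= d ? p - d : d - p);
                if (p >= d)
                    pos += mag;
                else
                    neg += mag;
                /* orr merges both lanes into |diff|; smlad squares and
                 * accumulates it. */
                sse += mag * mag;
            }

            sum += (int32_t)pos;  /* add positive differences to sum */
            sum -= (int32_t)neg;  /* subtract negative differences from sum */
        }
        pred += pred_stride;
        dst += dst_stride;
    }

    *sse_out = sse;  /* str r11, [r6] */

    /* mul r0, r8, r8 keeps the low 32 bits; lsr #8 divides by 256 pixels. */
    uint32_t sq = (uint32_t)sum * (uint32_t)sum;
    return sse - (sq >> 8);
}
```

Under these assumptions, two identical 16x16 blocks give a return value of 0 and an `sse` of 0, and a single pixel that differs by 16 gives an `sse` of 256 and a return value of 255.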