docs/filtering.md - Issue 1703653002: Add all wiki pages as MarkDown documents + README.md

Side by Side Diff: docs/filtering.md

Issue 1703653002: Add all wiki pages as MarkDown documents + README.md (Closed) Base URL: https://chromium.googlesource.com/libyuv/libyuv@master

Patch Set: Updated after review and fixed links Created 4 years, 10 months ago

Use n/p to move between diff chunks; N/P to move between comments. Draft comments are only viewable by you.

Jump to:

View unified diff | Download patch

OLD	NEW
(Empty)
	1 # Introduction

	2

	3 This document discusses the current state of filtering in libyuv. An emphasis on maximum performance while avoiding memory exceptions, and minimal amount of cod e/complexity. See future work at end.

	4

	5 # LibYuv Filter Subsampling

	6

	7 There are 2 challenges with subsampling

	8

	9 1. centering of samples, which involves clamping on edges

	10 2. clipping a source region

	11

	12 Centering depends on scale factor and filter mode.

	13

	14 # Down Sampling

	15

	16 If scaling down, the stepping rate is always src_width / dst_width.

	17

	18 dx = src_width / dst_width;

	19

	20 e.g. If scaling from 1280x720 to 640x360, the step thru the source will be 2.0, stepping over 2 pixels of source for each pixel of destination.

	21

	22 Centering, depends on filter mode.

	23

	24 Point downsampling takes the middle pixel.

	25

	26 x = dx >> 1;

	27

	28 For odd scale factors (e.g. 3x down) this is exactly the middle. For even scale factors, this rounds up and takes the pixel to the right of center. e.g. scale of 4x down will take pixel 2.

	29

	30 Bilinear filter, uses the 2x2 pixels in the middle.

	31

	32 x = dx / 2 - 0.5;

	33

	34 For odd scale factors (e.g. 3x down) this is exactly the middle, and point sampl ing is used.

	35 For even scale factors, this evenly filters the middle 2x2 pixels. e.g. 4x down will filter pixels 1,2 at 50% in both directions.

	36

	37 Box filter averages the entire box so sampling starts at 0.

	38

	39 x = 0;

	40

	41 For a scale factor of 2x down, this is equivalent to bilinear.

	42

	43 # Up Sampling

	44

	45 Point upsampling use stepping rate of src_width / dst_width and a starting c oordinate of 0.

	46

	47 x = 0;

	48 dx = src_width / dst_width;

	49

	50 e.g. If scaling from 640x360 to 1280x720 the step thru the source will be 0.0, s tepping half a pixel of source for each pixel of destination. Each pixel is repl icated by the scale factor.

	51

	52 Bilinear filter stretches such that the first pixel of source maps to the fi rst pixel of destination, and the last pixel of source maps to the last pixel of destination.

	53

	54 x = 0;

	55 dx = (src_width - 1) / (dst_width - 1);

	56

	57 This method is not technically correct, and will likely change in the future.

	58

	59 * It is inconsistent with the bilinear down sampler. The same method could be u sed for down sampling, and then it would be more reversible, but that would prev ent specialized 2x down sampling.

	60 * Although centered, the image is slightly magnified.

	61 * The filtering was changed in early 2013 - previously it used:

	62

	63 x = 0;

	64 dx = (src_width - 1) / (dst_width - 1);

	65

	66 Which is the correct scale factor, but shifted the image left, and extruded the last pixel. The reason for the change was to remove the extruding code from the low level row functions, allowing 3 functions to sshare the same row functions - ARGBScale, I420Scale, and ARGBInterpolate. Then the one function was ported t o many cpu variations: SSE2, SSSE3, AVX2, Neon and 'Any' version for any number of pixels and alignment. The function is also specialized for 0,25,50,75%.

	67

	68 The above goes still has the potential to read the last pixel 100% and last pixe l + 1 0%, which may cause a memory exception. So the left pixel goes to a fract ion less than the last pixel, but filters in the minimum amount of it, and the m aximum of the last pixel.

	69

	70 dx = FixedDiv((src_width << 16) - 0x00010001, (dst << 16) - 0x00010000);

	71

	72 Box filter for upsampling switches over to Bilinear.

	73

	74 # Scale snippet:

	75

	76 #define CENTERSTART(dx, s) (dx < 0) ? -((-dx >> 1) + s) : ((dx >> 1) + s)

	77 #define FIXEDDIV1(src, dst) FixedDiv((src << 16) - 0x00010001, \

	78 (dst << 16) - 0x00010000);

	79

	80 // Compute slope values for stepping.

	81 void ScaleSlope(int src_width, int src_height,

	82 int dst_width, int dst_height,

	83 FilterMode filtering,

	84 int* x, int* y, int* dx, int* dy) {

	85 assert(x != NULL);

	86 assert(y != NULL);

	87 assert(dx != NULL);

	88 assert(dy != NULL);

	89 assert(src_width != 0);

	90 assert(src_height != 0);

	91 assert(dst_width > 0);

	92 assert(dst_height > 0);

	93 if (filtering == kFilterBox) {

	94 // Scale step for point sampling duplicates all pixels equally.

	95 *dx = FixedDiv(Abs(src_width), dst_width);

	96 *dy = FixedDiv(src_height, dst_height);

	97 *x = 0;

	98 *y = 0;

	99 } else if (filtering == kFilterBilinear) {

	100 // Scale step for bilinear sampling renders last pixel once for upsample .

	101 if (dst_width <= Abs(src_width)) {

	102 *dx = FixedDiv(Abs(src_width), dst_width);

	103 x = CENTERSTART(dx, -32768);

	104 } else if (dst_width > 1) {

	105 *dx = FIXEDDIV1(Abs(src_width), dst_width);

	106 *x = 0;

	107 }

	108 if (dst_height <= src_height) {

	109 *dy = FixedDiv(src_height, dst_height);

	110 y = CENTERSTART(dy, -32768); // 32768 = -0.5 to center bilinear.

	111 } else if (dst_height > 1) {

	112 *dy = FIXEDDIV1(src_height, dst_height);

	113 *y = 0;

	114 }

	115 } else if (filtering == kFilterLinear) {

	116 // Scale step for bilinear sampling renders last pixel once for upsample .

	117 if (dst_width <= Abs(src_width)) {

	118 *dx = FixedDiv(Abs(src_width), dst_width);

	119 x = CENTERSTART(dx, -32768);

	120 } else if (dst_width > 1) {

	121 *dx = FIXEDDIV1(Abs(src_width), dst_width);

	122 *x = 0;

	123 }

	124 *dy = FixedDiv(src_height, dst_height);

	125 y = dy >> 1;

	126 } else {

	127 // Scale step for point sampling duplicates all pixels equally.

	128 *dx = FixedDiv(Abs(src_width), dst_width);

	129 *dy = FixedDiv(src_height, dst_height);

	130 x = CENTERSTART(dx, 0);

	131 y = CENTERSTART(dy, 0);

	132 }

	133 // Negative src_width means horizontally mirror.

	134 if (src_width < 0) {

	135 x += (dst_width - 1) *dx;

	136 dx = -dx;

	137 src_width = -src_width;

	138 }

	139 }

	140

	141 # Future Work

	142

	143 Point sampling should ideally be the same as bilinear, but pixel by pixel, round to nearest neighbor. But as is, it is reversible and exactly matches ffmpeg at all scale factors, both up and down. The scale factor is

	144

	145 dx = src_width / dst_width;

	146

	147 The step value is centered for down sample:

	148

	149 x = dx / 2;

	150

	151 Or starts at 0 for upsample.

	152

	153 x = 0;

	154

	155 Bilinear filtering is currently correct for down sampling, but not for upsamplin g.

	156 Upsampling is stretching the first and last pixel of source, to the first and la st pixel of destination.

	157

	158 dx = (src_width - 1) / (dst_width - 1);<br>

	159 x = 0;

	160

	161 It should be stretching such that the first pixel is centered in the middle of t he scale factor, to match the pixel that would be sampled for down sampling by t he same amount. And same on last pixel.

	162

	163 dx = src_width / dst_width;<br>

	164 x = dx / 2 - 0.5;

	165

	166 This would start at -0.5 and go to last pixel + 0.5, sampling 50% from last pixe l + 1.

	167 Then clamping would be needed. On GPUs there are numerous ways to clamp.

	168

	169 1. Clamp the coordinate to the edge of the texture, duplicating the first and la st pixel.

	170 2. Blend with a constant color, such as transparent black. Typically best for f onts.

	171 3. Mirror the UV coordinate, which is similar to clamping. Good for continuous tone images.

	172 4. Wrap the coordinate, for texture tiling.

	173 5. Allow the coordinate to index beyond the image, which may be the correct data if sampling a subimage.

	174 6. Extrapolate the edge based on the previous pixel. pixel -0.5 is computed fro m slope of pixel 0 and 1.

	175

	176 Some of these are computational, even for a GPU, which is one reason textures ar e sometimes limited to power of 2 sizes.

	177 We do care about the clipping case, where allowing coordinates to become negativ e and index pixels before the image is the correct data. But normally for simpl e scaling, we want to clamp to the edge pixel. For example, if bilinear scaling from 3x3 to 30x30, we’d essentially want 10 pixels of each of the original 3 pi xels. But we want the original pixels to land in the middle of each 10 pixels, at offsets 5, 15 and 25. There would be filtering between 5 and 15 between the original pixels 0 and 1. And filtering between 15 and 25 from original pixels 1 and 2. The first 5 pixels are clamped to pixel 0 and the last 5 pixels are cla mped to pixel 2.

	178 The easiest way to implement this is copy the original 3 pixels to a buffer, and duplicate the first and last pixels. 0,1,2 becomes 0, 0,1,2, 2. Then implemen t a filtering without clamping. We call this source extruding. Its only necess ary on up sampling, since down sampler will always have valid surrounding pixels .

	179 Extruding is practical when the image is already copied to a temporary buffer. It could be done to the original image, as long as the original memory is resto red, but valgrind and/or memory protection would disallow this, so it requires a memcpy to a temporary buffer, which may hurt performance. The memcpy has a per formance advantage, from a cache point of view, that can actually make this tech nique faster, depending on hardware characteristics.

	180 Vertical extrusion can be done with a memcpy of the first/last row, or clamping a pointer.

	181

	182

	183 The other way to implement clamping is handle the edges with a memset. e.g. Rea d first source pixel and memset the first 5 pixels. Filter pixels 0,1,2 to 5 to 25. Read last pixel and memset the last 5 pixels. Blur is implemented with th is method like this, which has 3 loops per row - left, middle and right.

	184

	185 Box filter is only used for 2x down sample or more. Its based on integer sized boxes. Technically it should be filtered edges, but thats substantially slower (roughly 100x), and at that point you may as well do a cubic filter which is mor e correct.

	186

	187 Box filter currently sums rows into a row buffer. It does this with

	188

	189 Mirroring will use the same slope as normal, but with a negative.

	190 The starting coordinate needs to consider the scale factor and filter. e.g. box filter of 30x30 to 3x3 with mirroring would use -10 for step, but x = 20. widt h (30) - dx.

	191

	192 Step needs to be accurate, so it uses an integer divide. This is as much as 5% of the profile. An approximated divide is substantially faster, but the inaccur acy causes stepping beyond the original image boundaries. 3 general solutions:

	193

	194 1. copy image to buffer with padding. allows for small errors in stepping.

	195 2. hash the divide, so common values are quickly found.

	196 3. change api so caller provides the slope.

OLD	NEW

« no previous file with comments | « docs/environment_variables.md ('k') | docs/formats.md » ('j') | no next file with comments »