The dangers behind image resizing (2021)

306 points by qwertyforce, over 2 years ago

24 comments

planede, over 2 years ago

Problems with image resizing are a much deeper rabbit hole than this. Some important talking points:

1. The form of interpolation (this article).
2. The colorspace used for doing the arithmetic for interpolation. You most likely want a linear colorspace here.
3. Clipping. Resizing is typically done in two phases, resizing first in one direction and then in the other (x then y, or y then x). If the kernel has values outside the range [0, 1] (like Lanczos) and the intermediate results only capture the range [0, 1], you might get clipping in the intermediate image, which can cause artifacts.
4. Quantization and dithering.
5. If you have an alpha channel, using pre-multiplied alpha for interpolation arithmetic.

I'm not trying to be exhaustive here. ImageWorsener's page has a nice reading list [1].

[1] https://entropymine.com/imageworsener/
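
To make points 2 and 5 concrete, here is a minimal sketch in plain Python. It assumes the standard sRGB transfer function and straight (non-premultiplied) RGBA input; all function names are made up for illustration and are not taken from any library. It averages a black and a white pixel with and without linearization, then averages a fully transparent red pixel with an opaque blue one with and without premultiplication.

```python
def srgb_to_linear(c):
    """Decode an sRGB-encoded value in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c):
    """Encode a linear-light value in [0, 1] back to sRGB."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def average_gamma_naive(a, b):
    """Average two 8-bit sRGB values directly on the encoded numbers."""
    return round((a + b) / 2)


def average_linear(a, b):
    """Average two 8-bit sRGB values in linear light, then re-encode."""
    lin = (srgb_to_linear(a / 255) + srgb_to_linear(b / 255)) / 2
    return round(linear_to_srgb(lin) * 255)


def average_straight_alpha(p, q):
    """Average two (r, g, b, a) pixels channel by channel."""
    return tuple((x + y) / 2 for x, y in zip(p, q))


def average_premultiplied(p, q):
    """Average in premultiplied form, then convert back to straight alpha."""
    def premul(px):
        r, g, b, a = px
        return (r * a, g * a, b * a, a)

    r, g, b, a = average_straight_alpha(premul(p), premul(q))
    if a == 0:
        return (0.0, 0.0, 0.0, 0.0)
    return (r / a, g / a, b / a, a)


# Black + white: the naive average is darker than the linear-light one.
naive_gray = average_gamma_naive(0, 255)
linear_gray = average_linear(0, 255)

# Invisible red + opaque blue: straight averaging leaks red into the result.
transparent_red = (1.0, 0.0, 0.0, 0.0)
opaque_blue = (0.0, 0.0, 1.0, 1.0)
straight = average_straight_alpha(transparent_red, opaque_blue)
premult = average_premultiplied(transparent_red, opaque_blue)

print(naive_gray, linear_gray)
print(straight, premult)
```

The straight-alpha average comes out purple even though the red pixel was fully transparent, while the premultiplied version stays pure blue at half opacity.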

version_five, over 2 years ago

I'd argue that if your ML model is sensitive to the anti-aliasing filter used in image resizing, you've got bigger problems than that, unless the filter is actually making a visible change that spoils whatever the model is supposed to be looking for. To use the standard cat/dog example, filter choice or resampling choice is not going to change what you've got a picture of, and if your model is classifying based on features that change with resampling, it's not trustworthy.

If one is concerned about this, one could intentionally vary the resampling or deliberately add different blurring filters during training to make the model robust to these variations.
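
A rough sketch of that augmentation idea, on a 1-D row of samples to keep it short. The helper names and the three toy kernels are assumptions for illustration; a real pipeline would call its image library's resize with a randomly chosen filter instead.

```python
import random


def downscale_by_2(row, kernel):
    """Halve a 1-D row of samples with the named kernel (hypothetical helper)."""
    out = []
    for i in range(0, len(row) - 1, 2):
        if kernel == "nearest":
            out.append(row[i])                      # point sample, no filtering
        elif kernel == "box":
            out.append((row[i] + row[i + 1]) / 2)   # average each pair
        elif kernel == "triangle":
            left = row[i - 1] if i > 0 else row[i]  # clamp at the left edge
            out.append((left + 2 * row[i] + row[i + 1]) / 4)
        else:
            raise ValueError(f"unknown kernel: {kernel}")
    return out


def augment(row, rng):
    """Pick a resampling kernel at random for this training sample."""
    kernel = rng.choice(["nearest", "box", "triangle"])
    return kernel, downscale_by_2(row, kernel)


rng = random.Random(0)  # seeded so runs are repeatable
row = [0, 10, 20, 30, 40, 50, 60, 70]
samples = [augment(row, rng) for _ in range(20)]
```

Each training sample then sees the same content through a different resampling path, which is the robustness being argued for.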

brucethemoose2, over 2 years ago

For those going down this rabbit hole, perceptual downscaling is state of the art, and the closest thing we have to a Python implementation is here (with a citation of the original paper): https://github.com/WolframRhodium/muvsfunc/blob/master/muvsfunc.py#L3671

Other supposedly better CUDA/ML filters give me strange results.

account42, over 2 years ago

> The definition of scaling function is mathematical and should never be a function of the library being used.

Horseshit. Image resizing or any other kind of resampling is essentially always about filling in missing information. There is no mathematical model that will tell you for certain what the missing information is.

jcynix, over 2 years ago

Now that's an interesting topic for photographers who like to experiment with anamorphic lenses for panoramas.

An anamorphic lens (optically) "squeezes" the image onto the sensor, and afterwards the digital image has to be "desqueezed" (i.e. upscaled in one axis) to give you the "final" image, which in turn is downscaled to be viewed on either a monitor or a printout.

But the resulting images I've seen so far nevertheless look good. I think that's because natural images don't have that many pixel-level details. And we mostly see downscaled images on the web or in YouTube videos anyway ...

thrdbndndn, over 2 years ago

I'm shocked. I didn't even know this was a thing.

By that I mean, I know what bilinear/bicubic/Lanczos resizing algorithms are, and I know they should at least give acceptable results (compared to nearest neighbor).

But I didn't know famous libraries (especially OpenCV, which is a computer vision library!) could have such poor results.

Also, a side note: IIRC bilinear and bicubic have constants in the equation. So technically, when you're comparing different implementations, you need to make sure these input parameters are the same. But this shouldn't excuse the extremely poor results in some.
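
On the constants point: for bicubic, the free constant is usually the `a` in the Keys cubic convolution kernel, and values of -0.5 and -0.75 are both in common use. A small sketch (function names are made up here, not from any library) showing that the same sample position gets different weights depending on `a`:

```python
def cubic_kernel(x, a):
    """Keys cubic convolution kernel with free parameter a."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x**3 - (a + 3) * x**2 + 1
    if x < 2:
        return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
    return 0.0


def tap_weights(t, a):
    """Weights for the four neighbours of a sample at fractional offset t."""
    return [cubic_kernel(t - k, a) for k in (-1, 0, 1, 2)]


# Same position halfway between two pixels, two choices of the constant.
weights_a = tap_weights(0.5, -0.5)
weights_b = tap_weights(0.5, -0.75)
print(weights_a)
print(weights_b)
```

Both sets of weights sum to 1, but the outer taps are more negative with a = -0.75, so that choice sharpens (and rings) a bit more. Two "bicubic" resizers can therefore disagree without either being buggy.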

godshatter, over 2 years ago

If their worry is the differences between algorithms in libraries across different execution environments, shouldn't they either find a library they like that can be called from all such environments or, if no such library exists, just write their own using their favorite algorithm? Why make all libraries do this the same way? Which one is undeniably correct?

JackFr, over 2 years ago

Hmmm. With respect to feeding an ML system, are visual glitches and artifacts important? Wouldn't the most important thing be to use a transformation which preserves as much information as possible and captures relevant structure? If the intermediate picture doesn't look great, who cares, as long as the result is good?

Oops. Just thought about generative systems. Never mind.

IYasha, over 2 years ago

So, what are the dangers? (What's the point of the article?) That you'll get a different model from the same originals processed by different algorithms?

Comparing resizing algorithms is nothing new, the importance of adequate input data is obvious, and differences in which image processing algorithms are available are also understandable. Clickbaity.

ricardobeat, over 2 years ago

Was hoping to see libvips in the comparison, which is widely used.

I wonder why it's not adopted by any of these frameworks?

intrasight, over 2 years ago

I was sort of expecting them to describe this danger of resizing: one can feed a piece of an image into one of these new massive ML models and get back the full image, with things that you didn't want to share. Like cropping out my ex.

Is ML sort of like a universal hologram in that respect?

pallas_athena, over 2 years ago

If you upscale (with interpolation) some sensitive image (think security camera), could that be dismissed in court as it "creates" new information that wasn't there in the original image?

hgomersall, over 2 years ago

The bigger problem is that the pixel domain is not a very good domain to be operating in. How many hours of training and thousands of images are used to essentially learn about Gabor filters?

biscuits1, over 2 years ago

This article throws a red flag on proving negative(s). This is impossible with maths. The void is filled by human subjectivity. In a graphical sense, "visual taste."

mythz, over 2 years ago

What are some good image upscaler libraries that exist? I'm assuming the high quality ones would need to use some AI model to fill in missing detail.

erulabs, over 2 years ago

Image resizing is one of those things that most companies seem to build in-house over and over. There are several hosted services, but obviously sending your users' photos to a 3rd party is pretty weak. For those of us looking for a middle ground: I've had great success with imgproxy (https://github.com/imgproxy/imgproxy), which wraps libvips and is well maintained.

singularity2001, over 2 years ago

funny that they use tf and pytorch in this context without even mentioning their fantastic upsampling capabilities

est, over 2 years ago

Are there any hacks/studies to maximize the downsampling errors?

E.g. an image that looks totally different at the original size vs. at 224x224.
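
One way this can work, as a toy sketch: if the resizer keeps every `stride`-th sample without filtering, you only need to overwrite those samples. This assumes a resizer that samples exactly at index `i * stride` (real libraries differ in where they sample), and the function names are invented for the example.

```python
def nearest_downscale(row, stride):
    """Keep every stride-th sample (point sampling, no low-pass filter)."""
    return row[::stride]


def box_downscale(row, stride):
    """Average each block of `stride` samples."""
    return [sum(row[i:i + stride]) / stride
            for i in range(0, len(row) - stride + 1, stride)]


def embed(cover, payload, stride):
    """Overwrite only the samples that nearest_downscale will keep."""
    out = list(cover)
    for i, value in enumerate(payload):
        out[i * stride] = value
    return out


stride = 8
cover = [200] * 64                            # what a human sees at full size
payload = [0, 255, 0, 255, 0, 255, 0, 255]    # what the model would see
crafted = embed(cover, payload, stride)

changed = sum(1 for c, o in zip(crafted, cover) if c != o)
seen_by_model = nearest_downscale(crafted, stride)
seen_with_box = box_downscale(crafted, stride)

print(changed, "of", len(cover), "samples changed")
print(seen_by_model)
print(seen_with_box)
```

Only one sample in eight is touched, yet the point-sampled result is exactly the payload. With the box filter the payload is diluted by the seven untouched neighbours and the result stays close to the cover.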

WithinReason, over 2 years ago

torch.nn.functional.interpolate has an "antialias" switch that's off by default
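
For anyone wondering what that switch buys you, a plain-Python illustration (this is not PyTorch's implementation, just the idea, with made-up function names): downscaling one-pixel stripes by 2 with and without a low-pass step.

```python
def point_sample(row, factor, offset=0):
    """Downscale by keeping one sample per block, with no filtering."""
    return row[offset::factor]


def antialiased(row, factor):
    """Downscale by averaging each block (a simple box low-pass filter)."""
    return [sum(row[i:i + factor]) / factor
            for i in range(0, len(row) - factor + 1, factor)]


stripes = [0, 1] * 8                    # finest possible black/white stripes
even = point_sample(stripes, 2, 0)      # sampling lands on the black pixels
odd = point_sample(stripes, 2, 1)       # shift by one pixel: all white
smooth = antialiased(stripes, 2)        # mid gray regardless of phase

print(even)
print(odd)
print(smooth)
```

Without the filter, a one-pixel shift flips the output from all black to all white. With it, the stripes average to gray, which is what a smaller version of that image should look like.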

dark-star, over 2 years ago

downscaling images introduces artifacts and throws away information! news at 5!

AtNightWeCode, over 2 years ago

Thought this article was going to be about DDOS...

fIREpOK, over 2 years ago

I favored cropping even back in 2021

thr0wnawaytod4y, over 2 years ago

Came here for a new ImageTragick but got actual resizing problems

cynicalsecurity, over 2 years ago

Finally someone said it.