Image resizing is a much deeper rabbit hole than this. Some important talking points:<p>1. The form of interpolation (this article).<p>2. The colorspace used for the interpolation arithmetic. You most likely want a linear colorspace here.<p>3. Clipping. Resizing is typically done in two passes, first in the x direction and then in y (not necessarily in that order). If the kernel has negative lobes (like Lanczos), intermediate results can fall outside the range [0, 1]; if the intermediate image can only hold [0, 1], those values get clipped, which can cause artifacts.<p>4. Quantization and dithering.<p>5. If you have an alpha channel, using pre-multiplied alpha for the interpolation arithmetic.<p>I'm not trying to be exhaustive here. ImageWorsener's page has a nice reading list[1].<p>[1] <a href="https://entropymine.com/imageworsener/" rel="nofollow">https://entropymine.com/imageworsener/</a>
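A minimal sketch of points 2 and 3 combined, assuming OpenCV and NumPy (the function names here are my own, not from any library): decode sRGB to linear light, resample in float so Lanczos overshoot isn't clamped between passes, and only clip and re-encode at the very end.

```python
import numpy as np
import cv2

def srgb_to_linear(x):
    # Inverse sRGB transfer function, x in [0, 1]
    return np.where(x <= 0.04045, x / 12.92, ((x + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(x):
    # Forward sRGB transfer function, x in [0, 1]
    return np.where(x <= 0.0031308, x * 12.92, 1.055 * x ** (1 / 2.4) - 0.055)

def resize_linear_light(img_u8, size):
    """Resize in linear light. `size` is (width, height), OpenCV convention."""
    x = srgb_to_linear(img_u8.astype(np.float32) / 255.0)
    # Float intermediate can hold the out-of-range overshoot from the
    # negative lobes of Lanczos; nothing is clipped between passes.
    y = cv2.resize(x, size, interpolation=cv2.INTER_LANCZOS4)
    y = linear_to_srgb(np.clip(y, 0.0, 1.0))  # clip once, at the end
    return np.round(y * 255.0).astype(np.uint8)
```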
I'd argue that if your ML model is sensitive to the anti-aliasing filter used in image resizing, you've got bigger problems than that, unless it's actually making a visible change that spoils whatever the model is supposed to be looking for. To use the standard cat/dog example, the choice of filter or resampler is not going to change what you've got a picture of, and if your model is classifying based on features that change with resampling, it's not trustworthy.<p>If one is concerned about this, one could intentionally vary the resampling or deliberately add different blurring filters during training to make the model robust to these variations.
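A rough sketch of that augmentation idea, assuming OpenCV; the function name, the filter list, and the 0.3 blur probability are arbitrary illustrative choices:

```python
import random
import cv2

# Randomize the resampling filter (and occasionally add a mild blur) so the
# model cannot latch onto the artifacts of any single library or filter.
INTERPOLATIONS = [cv2.INTER_NEAREST, cv2.INTER_LINEAR, cv2.INTER_AREA,
                  cv2.INTER_CUBIC, cv2.INTER_LANCZOS4]

def random_resize(img, size):
    out = cv2.resize(img, size, interpolation=random.choice(INTERPOLATIONS))
    if random.random() < 0.3:
        k = random.choice([3, 5])               # random odd kernel size
        out = cv2.GaussianBlur(out, (k, k), 0)  # sigma derived from k
    return out
```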
For those going down this rabbit hole, perceptual downscaling is state of the art, and the closest thing we have to a Python implementation is here (with a citation of the original paper): <a href="https://github.com/WolframRhodium/muvsfunc/blob/master/muvsfunc.py#L3671">https://github.com/WolframRhodium/muvsfunc/blob/master/muvsf...</a><p>Other supposedly better CUDA/ML filters give me strange results.
> The definition of scaling function is mathematical and should never be a function of the library being used.<p>Horseshit. Image resizing, like any other kind of resampling, is essentially always about filling in missing information. There is no mathematical model that will tell you for certain what the missing information is.
Now that's an interesting topic for photographers who like to experiment with anamorphic lenses for panoramas.<p>An anamorphic lens optically "squeezes" the image onto the sensor, and afterwards the digital image has to be "desqueezed" (i.e. upscaled along one axis) to give you the final image, which in turn is downscaled to be viewed on a monitor or in a printout.<p>Still, the resulting images I've seen so far look good. I think that's because natural images don't contain that much pixel-level detail, and we mostly see downscaled images anyway, on the web or in YouTube videos ...
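For concreteness, a toy desqueeze step, assuming OpenCV and a 2x anamorphic squeeze (real raw converters do this internally, with their own filters):

```python
import cv2

def desqueeze(img, squeeze_factor=2.0):
    # Upscale the horizontal axis only, leaving vertical resolution untouched.
    h, w = img.shape[:2]
    return cv2.resize(img, (int(round(w * squeeze_factor)), h),
                      interpolation=cv2.INTER_LANCZOS4)
```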
I'm shocked. I didn't even know this was a thing.<p>By that I mean, I know what the bilinear/bicubic/Lanczos resizing algorithms are, and I know they should give at least acceptable results (compared to nearest-neighbour).<p>But I didn't know that famous libraries (especially OpenCV, which is a computer vision library!) could produce such poor results.<p>Also, a side note: IIRC bicubic has a free constant in its kernel. So technically, when comparing different implementations, you need to make sure this parameter is the same. But that shouldn't excuse the extremely poor results from some of them.
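To make that concrete: bicubic is usually Keys' cubic convolution kernel, which has a free parameter a; I believe OpenCV uses a = -0.75 while Pillow uses a = -0.5, so two "bicubic" resizes can legitimately differ. A sketch of the kernel:

```python
import numpy as np

def keys_cubic(x, a=-0.5):
    """Keys' cubic convolution kernel; `a` is the free constant referred to
    above (commonly -0.5 or -0.75, depending on the library)."""
    x = np.abs(x)
    return np.where(
        x < 1, (a + 2) * x**3 - (a + 3) * x**2 + 1,
        np.where(x < 2, a * (x**3 - 5 * x**2 + 8 * x - 4), 0.0),
    )
```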
If their worry is differences between algorithms across libraries in different execution environments, shouldn't they either find a library they like that can be called from all of those environments, or, if no single library covers them all, just write their own using their favorite algorithm? Why make all libraries do this the same way? Which one is undeniably correct?
Hmmm. With respect to feeding an ML system, are visual glitches and artifacts important? Wouldn't the most important thing be to use a transformation that preserves as much information as possible and captures the relevant structure? If the intermediate picture doesn't look great, who cares, as long as the result is good.<p>Oops. Just thought about generative systems. Never mind.
So, what are the dangers? (What's the point of the article?) That you'll get a different model when the same originals are processed by different algorithms?<p>Comparing resizing algorithms is nothing new, the importance of adequate input data is obvious, and differences in which image-processing algorithms are available are also understandable. Clickbaity.
I was sort of expecting them to describe this danger of resizing: you can feed a piece of an image into one of these new massive ML models and get back the full image, with things you didn't want to share, like the ex you cropped out.<p>Is ML sort of like a universal hologram in that respect?
If you upscale (with interpolation) some sensitive image (think security-camera footage), could that be dismissed in court on the grounds that it "creates" new information that wasn't there in the original image?
The bigger problem is that the pixel domain is not a very good domain to be operating in. How many hours of training and thousands of images are spent essentially relearning Gabor filters?
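For reference, a hand-built Gabor filter of the sort being "relearned", assuming OpenCV; the parameter values are arbitrary, and the input is a stand-in image:

```python
import cv2
import numpy as np

# Oriented band-pass (Gabor) kernel: (ksize, sigma, theta, lambda, gamma, psi)
kernel = cv2.getGaborKernel((31, 31), 4.0, 0.0, 10.0, 0.5, 0.0)

image = np.random.rand(128, 128).astype(np.float32)  # stand-in grayscale image
filtered = cv2.filter2D(image, -1, kernel)            # edge/texture response
```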
This article throws a red flag on proving a negative, which is impossible with maths alone. The void gets filled by human subjectivity: in a graphical sense, "visual taste."
What are some good image-upscaling libraries? I'm assuming the high-quality ones would need to use some AI model to fill in missing detail.
Image resizing is one of those things that most companies seem to build in-house over and over. There are several hosted services, but obviously sending your users' photos to a 3rd party is pretty weak. For those of us looking for a middle ground: I've had great success with imgproxy (<a href="https://github.com/imgproxy/imgproxy">https://github.com/imgproxy/imgproxy</a>), which wraps libvips and is well maintained.
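If you'd rather call libvips directly than run imgproxy, a minimal sketch with pyvips (filenames, width, and JPEG quality are placeholder values):

```python
import pyvips

# Width-constrained thumbnail, roughly the resize operation imgproxy wraps.
image = pyvips.Image.thumbnail("input.jpg", 800)
image.write_to_file("output.jpg", Q=85)
```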