I found this interesting and tried the question with the top models from Anthropic, OpenAI, Google, and Mistral, all of which gave the wrong result.
But if you preface the question with "Of these two decimal numbers ", the answers change and the results are correct.
I suspect what we are seeing is that the models handle the numbers as version numbers rather than decimal numbers.
This is disappointing and confusing, but IMO it also underlines that giving them context about what you are trying to get them to do is worthwhile.
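
To illustrate the version-number vs. decimal-number ambiguity I suspect is at play, here is a rough Python sketch. The two values ("9.9" and "9.11") and the function names are my own hypothetical choices for illustration, not the numbers from the original question, and this says nothing about what the models actually do internally.

```python
from decimal import Decimal


def larger_as_decimal(a: str, b: str) -> str:
    """Return whichever string is larger when both are read as decimal numbers."""
    return a if Decimal(a) > Decimal(b) else b


def larger_as_version(a: str, b: str) -> str:
    """Return whichever string is larger when both are read as dotted version numbers."""
    def key(s: str) -> tuple:
        # Compare component by component, so "9.11" becomes (9, 11).
        return tuple(int(part) for part in s.split("."))

    return a if key(a) > key(b) else b


# Hypothetical example values, chosen only to show the ambiguity.
x, y = "9.9", "9.11"
print(larger_as_decimal(x, y))  # 9.9
print(larger_as_version(x, y))  # 9.11
```

The same pair of strings gives a different "larger" value depending on which reading you pick, which would fit with the extra context changing the answer.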