Some good basic info, but there are some inaccuracies too. WAV is not a lossless format; it's a container (RIFF), and it can hold many different audio encodings, including compressed ones, even MP3. You can have PCM inside WAV, which is indeed lossless, but you're not going to see that in the wild too often. Going with 16 kHz is also questionable, since most readily available pre-existing datasets were recorded at 8 kHz (which is what telephony codecs mostly use).
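
If you want to check what a given .wav file actually holds, a rough sketch like the one below reads the format tag and sample rate straight from the header's `fmt ` chunk. The function name `read_wav_format` and the small `FORMAT_TAGS` table are my own, made up for illustration; the table lists only a handful of the registered tags, so treat it as incomplete.

```python
import struct

# A few common WAVE format tags (illustrative subset, not the full registry).
FORMAT_TAGS = {
    0x0001: "PCM",
    0x0003: "IEEE float",
    0x0006: "G.711 A-law",
    0x0007: "G.711 mu-law",
    0x0055: "MPEG Layer 3 (MP3)",
    0xFFFE: "WAVE_FORMAT_EXTENSIBLE",
}


def read_wav_format(path):
    """Return (format_tag, channels, sample_rate) from the 'fmt ' chunk.

    Hypothetical helper: walks the RIFF chunks by hand instead of using
    the stdlib `wave` module, which is aimed at PCM data.
    """
    with open(path, "rb") as f:
        head = f.read(12)
        if len(head) < 12:
            raise ValueError("file too short to be RIFF/WAVE")
        riff, _, wave_id = struct.unpack("<4sI4s", head)
        if riff != b"RIFF" or wave_id != b"WAVE":
            raise ValueError("not a RIFF/WAVE file")
        while True:
            chunk_header = f.read(8)
            if len(chunk_header) < 8:
                raise ValueError("no 'fmt ' chunk found")
            chunk_id, size = struct.unpack("<4sI", chunk_header)
            if chunk_id == b"fmt ":
                fmt_head = f.read(8)
                if len(fmt_head) < 8:
                    raise ValueError("truncated 'fmt ' chunk")
                # First 8 bytes: format tag, channel count, sample rate.
                tag, channels, rate = struct.unpack("<HHI", fmt_head)
                return tag, channels, rate
            # Skip other chunks; chunk bodies are padded to an even length.
            f.seek(size + (size & 1), 1)
```

Calling `read_wav_format("some_file.wav")` (the path is a placeholder) gives back a tuple you can look up in `FORMAT_TAGS`, for example with `FORMAT_TAGS.get(tag, "unknown")`. A tag of 1 at 8000 Hz means plain PCM at the telephony rate; a tag of 6 or 7 means the file holds G.711 rather than linear PCM.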