See also djb’s crit-bit trees, which are a tighter and simpler version of PATRICIA trees - <a href="https://cr.yp.to/critbit.html" rel="nofollow">https://cr.yp.to/critbit.html</a> - and agl‘s nice literate code explanation of how they work - <a href="https://github.com/agl/critbit/" rel="nofollow">https://github.com/agl/critbit/</a><p>And my qp tries, which are smaller and faster than crit-bit trees <a href="https://dotat.at/prog/qp/" rel="nofollow">https://dotat.at/prog/qp/</a>
Radix trees are often used in routing tables. On Linux, for IPv4, there is an additional level of compression to avoid the cost of many intermediate nodes. I've described the algorithm used on Linux here: <a href="https://vincent.bernat.im/en/blog/2017-ipv4-route-lookup-linux#route-lookup-in-a-trie" rel="nofollow">https://vincent.bernat.im/en/blog/2017-ipv4-route-lookup-lin...</a>
An obvious way to save space that is not mentioned is of course not to have alphabet sized pointer arrays in each node and if the alphabet is big, for example Unicode characters, it is not really an option to begin with. In that case you would use some kind of dictionary to store the pointers to child nodes. You can also dynamically switch between different node implementations depending on the number of child nodes.
Compressing tree data is not trivial. E.g. using per-node heap allocation could take many times the space of the actual data (memory used by the allocator, metainformation, alignment padding, etc.). Using a custom allocator can reduce the memory usage dramatically.