Are there any alternatives to Kafka that are also modeled on the "message queue as a log" concept? In particular, I'd like to be able to re-consume arbitrary ranges of already-processed log/event/message data, and to trust the log as the ultimate deterministic source of truth for the state of the system.

The reason I don't just use Kafka is that its "quorum"-style scaling is overkill for our needs: the app is for internal use only and will likely never exceed 1000 simultaneous users. Also, I've heard that ZooKeeper (required to use Kafka) has its own technical overhead that I'd like to avoid dealing with if possible.
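To make the two requirements concrete (re-reading arbitrary offset ranges, and deriving state deterministically from the log), here is a minimal in-memory sketch. It is not Kafka's API or any real library's; `EventLog`, `replay`, and `apply_balance` are hypothetical names chosen for illustration, and a real system would persist the log to disk.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class EventLog:
    """Hypothetical append-only log; an event's offset is its list index."""
    _events: List[Any] = field(default_factory=list)

    def append(self, event: Any) -> int:
        """Append an event and return the offset it was stored at."""
        self._events.append(event)
        return len(self._events) - 1

    def read(self, start: int = 0, end: Optional[int] = None) -> List[Any]:
        """Return events in the half-open offset range [start, end)."""
        return list(self._events[start:end])


def replay(log: EventLog, apply: Callable[[Any, Any], Any], initial: Any,
           start: int = 0, end: Optional[int] = None) -> Any:
    """Fold a pure `apply` function over a range of the log.

    Because `apply` has no side effects, replaying the same range
    always produces the same state.
    """
    state = initial
    for event in log.read(start, end):
        state = apply(state, event)
    return state


def apply_balance(balance: int, event: tuple) -> int:
    """Example reducer: events are (kind, amount) tuples."""
    kind, amount = event
    return balance + amount if kind == "deposit" else balance - amount


log = EventLog()
log.append(("deposit", 100))
log.append(("withdraw", 30))
log.append(("deposit", 5))

full_state = replay(log, apply_balance, 0)             # whole log
partial_state = replay(log, apply_balance, 0, 1, 3)    # offsets 1 and 2 only
```

Here `full_state` is 75 and `partial_state` is -25. Consumers in this model only track an offset, so "reprocessing" is just reading from an earlier offset again.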
Conceptually, how do these types of pub/sub messaging systems work at scale? How does the number of subscribers affect how efficiently updates are distributed to them? Does the server push messages to all subscribers simultaneously, or is there some logic that might result in one subscriber receiving an update sooner than another? And from a networking/ports perspective, is the publishing server opening up a ton of ports to handle the communication, or is this handled some other way?
Great post, but one thing to note is that the number of partitions DOES NOT need to equal the number of consumers. A consumer can reasonably consume multiple partitions, and most do. On the other hand, the partition count does put an upper bound on the number of active consumers in one group: each partition is consumed by at most one consumer in the group, so any consumers beyond the partition count have nothing to read.
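A rough sketch of why the bound holds, using a simplified round-robin assignment. This is not Kafka's actual assignor logic (Kafka ships several assignment strategies); `assign_partitions` is a hypothetical helper that only illustrates the "each partition goes to exactly one consumer in the group" rule.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Give each partition to exactly one consumer, round-robin.

    Simplified illustration only; real assignors also handle
    rebalancing, multiple topics, and stickiness.
    """
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    if not consumers:
        return assignment
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# Fewer consumers than partitions: each consumer reads several partitions.
few_consumers = assign_partitions(4, ["a", "b"])

# More consumers than partitions: the extra consumer is assigned nothing.
many_consumers = assign_partitions(2, ["a", "b", "c"])
```

With 4 partitions and 2 consumers, `few_consumers` is `{"a": [0, 2], "b": [1, 3]}`. With 2 partitions and 3 consumers, `many_consumers` is `{"a": [0], "b": [1], "c": []}`, so consumer `c` sits idle.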