Problem
We have a set of items with given keys. Multiple items may have the same key. We also have a set of peers, each of them being able to store a number of items. The peers do not have any special rights over items: any peer can store any item.
Our goal is to have as many items with the same key "meet" each other. We don't care if they meet items with different keys. By meeting we mean at some point in time being located in the same peer. We do not care whether keys are always at the same peer, we just want, over time, that items meet others with the same key.
To put it differently: the items want to speed-date each other. There are many rooms (peers) that can hold only a finite number of items at a time. Items only want to date others with a shared interest (key). Items want meet as many other items with the same key as possible yet minimize the time spent travelling ...
The evaluation criteria are:
number of meetings -> should be high
number of messages -> should be low
load -> should be evenly distributed
Input data:
- large number of keys
- keys are very unevenly distributed
Possible solutions
Random
Move items around randomly, hope they will meet each other. Performs well for (c), bad for (a), (b). See http://larkc.eu/marvin/random.html
DHT
Place all items on the node responsible for the key. Performs well for (a), (b) bad for (c), considering that the distribution of keys is very uneven. Note that standard load balancing for DHTs will not help us much, since items with the same key will still reside in different replica nodes. See http://larkc.eu/marvin/dht.html
SpeedDating
We have cooked up something in the middle where items meet a lot without disturbing load balancing. See http://larkc.eu/marvin/speeddate.html
Question: does this look like an existing problem? Does it look like publish-subscribe? Are there existing solutions that we may have re-invented here?
