Abstract
The recent rise in the popularity of micro-blogging has been accompanied by concerns over censorship and anonymity in centralized systems. Thus, we consider the problem of implementing a micro-blogging social network over an unstructured Peer-to-Peer network. The problem can be decomposed into two sub-problems, dissemination (also known as replication) and retrieval, which are coupled. For example, the more a blog post is disseminated, the fewer nodes need to be queried in order to retrieve it with a high probability. Both dissemination and retrieval incur bandwidth costs. In this paper, we investigate the optimal replication of data, in the sense of minimizing bandwidth, and the balance between the number of nodes a micro-blog post is replicated to and the number of nodes that must be queried. Minimizing the system bandwidth is critical if our proposed system is to scale from small to larger networks. Our theoretical, probabilistic analysis predicts that micro-blog posts should be replicated onto approximately 20% and 6% of nodes in networks of 10,000 and 100,000 nodes respectively in order to minimize the overall bandwidth of the system.
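
The coupling between dissemination and retrieval described in the abstract can be illustrated with a minimal sketch. It is not the paper's model: it assumes replicas sit on `r` of `n` nodes chosen uniformly at random, that a reader queries distinct nodes uniformly at random, and that total bandwidth is a weighted sum of replicas and queries. The function names and the weights `post_cost` and `query_cost` are hypothetical, so the sketch should not be expected to reproduce the 20% and 6% figures reported above.

```python
def miss_probability(n: int, r: int, q: int) -> float:
    """Chance that q distinct randomly queried nodes hold none of r replicas.

    Assumes uniform random placement and querying (an illustrative
    assumption, not taken from the paper).
    """
    p = 1.0
    for i in range(q):
        remaining_without_replica = n - r - i
        if remaining_without_replica <= 0:
            return 0.0  # every node without a replica has been queried
        p *= remaining_without_replica / (n - i)
    return p


def min_queries(n: int, r: int, target: float = 0.99) -> int:
    """Smallest number of queries that finds a replica with probability >= target."""
    if not 1 <= r <= n:
        raise ValueError("r must be between 1 and n")
    miss = 1.0
    for q in range(1, n + 1):
        miss *= max(n - r - (q - 1), 0) / (n - (q - 1))
        if 1.0 - miss >= target:
            return q
    return n


def best_replication(n: int, target: float = 0.99,
                     post_cost: float = 1.0, query_cost: float = 1.0):
    """Sweep r and return (r, q, cost) minimizing the hypothetical weighted cost."""
    best = None
    for r in range(1, n + 1):
        q = min_queries(n, r, target)
        cost = post_cost * r + query_cost * q
        if best is None or cost < best[2]:
            best = (r, q, cost)
    return best
```

Calling `best_replication(1000)` sweeps every replication level for a 1,000-node network under these assumptions. Raising `query_cost` relative to `post_cost`, for example to reflect a post being read many times, shifts the minimum toward heavier replication.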
Original language | English |
---|---|
Title of host publication | Distributed Computing Systems Workshops (ICDCSW), 2013 IEEE 33rd International Conference on |
Number of pages | 6 |
Publisher | IEEE Computer Society Press |
Publication date | 2013 |
Pages | 184-189 |
ISBN (Print) | 978-1-4799-3247-4 |
Publication status | Published - 2013 |
Externally published | Yes |