OPTIMIZING ACQUAINTANCE SELECTION IN A PDMS

JIAN XU

University of British Columbia, Department of Computer Science, 201-2366 Main Mall, Vancouver, BC V6T 1Z4, Canada

and

RACHEL POTTINGER

University of British Columbia, Department of Computer Science, 201-2366 Main Mall, Vancouver, BC V6T 1Z4, Canada

https://doi.org/10.1142/S0218843011002183

Abstract

In a Peer Data Management System (PDMS), autonomous peers share semantically rich data. For queries to be translated across peers, a peer must provide a mapping to other peers in the PDMS; peers connected by such mappings are called acquaintances. To maximize PDMS query answering performance, a peer needs to optimize its choice of acquaintances. This paper investigates the acquaintance selection problem and introduces a novel framework for solving it. Our framework includes two selection schemes that effectively and efficiently estimate mapping effectiveness. The "one-shot" scheme clusters peers and estimates the improvement in query answering based on cluster properties. The "two-hop" scheme makes its estimates from locally available information over multiple rounds. Our empirical study shows that both schemes effectively help acquaintance selection and scale to large PDMSs.
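The abstract does not give the details of either scheme, so the following is only a toy sketch of the general idea behind round-based selection from local information, not the authors' algorithm. It assumes that a peer can learn about the acquaintances of its current acquaintances (peers two hops away) and that some benefit estimate is available for each candidate. The function names `two_hop_candidates` and `select_round`, the `benefit` callback, and the example scores are all hypothetical.

```python
from typing import Callable, Dict, List, Set

Graph = Dict[str, Set[str]]  # peer -> set of its current acquaintances


def two_hop_candidates(peer: str, acquaintances: Graph) -> Set[str]:
    """Peers reachable through an acquaintance that are not yet acquaintances."""
    direct = acquaintances.get(peer, set())
    reachable: Set[str] = set()
    for a in direct:
        reachable |= acquaintances.get(a, set())
    # Exclude the peer itself and peers it is already mapped to.
    return reachable - direct - {peer}


def select_round(
    peer: str,
    acquaintances: Graph,
    benefit: Callable[[str, str], float],  # hypothetical estimator, supplied by caller
    k: int = 1,
) -> List[str]:
    """Pick the k candidates with the highest estimated benefit in one round."""
    candidates = two_hop_candidates(peer, acquaintances)
    # Sort by descending benefit; break ties by name for determinism.
    ranked = sorted(candidates, key=lambda c: (-benefit(peer, c), c))
    return ranked[:k]


# Illustrative data only: A knows B; B also knows C and D.
acquaintances: Graph = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B"},
    "D": {"B"},
}
scores = {"C": 0.2, "D": 0.7}  # made-up benefit estimates
chosen = select_round("A", acquaintances, lambda p, c: scores[c], k=1)
print(chosen)
```

Running further rounds would mean adding the chosen peers to the peer's acquaintance set and calling `select_round` again, so that the candidate pool grows from information that is still only locally available.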

Vol. 20, No. 01
