Title: Querying, Exploring and Mining Geo-textual Data
Day 1:
Session 1: Querying Static Geo-textual Data
Session 2: Querying Geo-textual Data Streams
Session 3: Exploring Geo-textual Data
Day 2:
Session 1: Location Extraction
Session 2: User Mobility Behavior Modeling
Session 3: Location Recommendation
Gao CONG is currently an Assistant Professor in the School of Computer Engineering, Nanyang Technological University (NTU). Before joining NTU, he was an Assistant Professor at Aalborg University, Denmark (2008-2009). Before that, he worked as a researcher at Microsoft Research Asia, China. From 2004 to 2006, he worked as a postdoctoral research fellow in the database group at the University of Edinburgh. He earned his Ph.D. in Computer Science from the National University of Singapore in 2004. His current research interests include databases, data mining, text mining, and information retrieval. He is active in the database and data mining research communities. His work has been published in premier database, data mining, information retrieval and natural language processing conferences, such as ACM SIGMOD, VLDB, ICDE, ACM KDD, WWW, ACM SIGIR, ACM CIKM, and ACL. He has also served as a PC member for these conferences.
Title: Parallelizing Sequential Graph Computations
This talk presents GRAPE, a parallel system for graph computations. GRAPE differs from previous systems in its ability to parallelize existing sequential graph algorithms, without the need to recast the algorithms into a new model. Underlying GRAPE are a simple programming model and a principled approach, based on a simultaneous fixpoint computation with partial evaluation and incremental computation. We show that sequential graph algorithms can be "plugged into" GRAPE and get parallelized. As long as the sequential algorithms are correct, their GRAPE parallelization is guaranteed to converge to correct answers under a monotone condition. Moreover, MapReduce, BSP and PRAM can be optimally simulated on GRAPE. In addition to the ease of programming, GRAPE achieves performance comparable to that of state-of-the-art graph systems. We will also demonstrate an application of GRAPE in social media marketing.
This is joint work with Jingbo Xu, Yinghui Wu, Wenyuan Yu, Jiaxin Jiang, Zeyu Zheng, Bohan Zhang, Yang Cao and Chao Tian.
Professor Wenfei Fan is the Chair of Web Data Management in the School of Informatics, University of Edinburgh, UK, and the director of the International Research Center on Big Data, Beihang University, China. Prior to his move to the UK, he worked for Bell Labs, Lucent Technologies in the USA. He received his PhD from the University of Pennsylvania, USA, and his MS and BS from Peking University, China. Professor Fan is a Fellow of the Royal Society of Edinburgh, UK, a Fellow of the ACM, USA, a National Professor of the 1000-Talent Program and a Yangtze River Scholar, China. He is a recipient of the Alberto O. Mendelzon Test-of-Time Award of ACM PODS 2015 and 2010, ERC Advanced Fellowship in 2015, the Roger Needham Award in 2008, the Outstanding Overseas Young Scholar Award in 2003, the NSF Career Award in 2001, and several Best Paper Awards (SIGMOD 2017, VLDB 2010, ICDE 2007, and Computer Networks 2002). His current research interests include database theory and systems, in particular big data, data quality, data integration, distributed query processing, query languages, recommender systems, social networks and Web services.
Title: Approximate Query Processing in Database Systems
Abstract
I received a BA from the Mathematics Department at UCSD, an MSc from the Computer Science and Engineering Department at OSU (my advisor at OSU was Renee Miller, who is now at Toronto), and a PhD from the College of Computing at Georgia Tech (my advisor at Georgia Tech was Ed Omiecinski). I am the recipient of a 2008 Alfred P. Sloan Foundation Research Fellowship, a National Science Foundation CAREER award, a 2007 ACM SIGMOD Best Paper Award, and a 2017 IEEE ICDE Best Paper Award. I have been at Rice since January 2009, and I was on the faculty of the computer science department at the University of Florida from 2002 through August 2010.
Title:
Abstract
Floris Geerts holds a research professor position at the University of Antwerp, Belgium. Before that, he held a senior research fellow position in the database group at the University of Edinburgh, and a postdoc position in the data mining group at the University of Helsinki. He received his PhD in 2001 from the University of Hasselt, Belgium. His research interests include the theory and practice of databases and the study of data quality in particular. He has received several best paper awards (ICDM 2001, ICDE 2007, ADBIS 2015) and was a recipient of the 2015 Alberto O. Mendelzon Test of Time Award (PODS 2015). He is an associate editor of ACM TODS, was general chair of EDBT/ICDT 2015 and will be PODS PC chair in 2017.
Title: Do we really understand SQL?
SQL has been with us for ages, and is the tool of choice for data scientists, but do we really understand it? Even in the core fragment there are many examples that debunk myths taught to us by database textbooks. Part of the reason is the lack of a formal semantics of SQL. The Standard, written in natural language, can hardly be viewed as such, and indeed different vendors interpret it in different ways. All attempts to provide a proper semantics made too many simplifying assumptions and failed to capture the behavior of the language.
In this talk I'll explain some of the unexpected behavior of SQL that makes formal semantics a challenge. I will then provide such a semantics for the core language (essentially the fragment corresponding to relational algebra) and then discuss one application of it: whether SQL really needs a 3-valued logic for handling nulls.
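As a small illustration of the kind of null-related surprise the abstract alludes to (this example is not taken from the talk), the sketch below runs three queries that textbooks often present as equivalent ways of computing "R minus S". The tables R and S and their contents are made up for illustration, and SQLite is used only because it ships with Python; other systems may differ in details.

```python
import sqlite3

# Hypothetical two-table setup; S contains a NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE R (a INTEGER);
    CREATE TABLE S (a INTEGER);
    INSERT INTO R VALUES (1), (2);
    INSERT INTO S VALUES (2), (NULL);
""")

# Formulation 1: NOT IN. The comparison 1 = NULL evaluates to "unknown",
# so the WHERE condition is never true for a = 1.
not_in = conn.execute(
    "SELECT a FROM R WHERE a NOT IN (SELECT a FROM S) ORDER BY a"
).fetchall()

# Formulation 2: NOT EXISTS. The subquery returns no row for a = 1,
# so NOT EXISTS is true and the row is kept.
not_exists = conn.execute(
    "SELECT a FROM R WHERE NOT EXISTS "
    "(SELECT 1 FROM S WHERE S.a = R.a) ORDER BY a"
).fetchall()

# Formulation 3: EXCEPT, which compares whole rows rather than
# evaluating a three-valued condition.
except_rows = conn.execute(
    "SELECT a FROM R EXCEPT SELECT a FROM S ORDER BY a"
).fetchall()

print(not_in)       # []
print(not_exists)   # [(1,)]
print(except_rows)  # [(1,)]
```

The three formulations agree when S has no nulls; a single NULL in S is enough to make the NOT IN version return nothing.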
Leonid Libkin is Professor of Foundations of Data Management in the School of Informatics at the University of Edinburgh. He was previously a Professor at the University of Toronto and a member of research staff at Bell Laboratories in Murray Hill. He received his PhD from the University of Pennsylvania in 1994. His main research interests are in the areas of data management and applications of logic in computer science. He has written five books and about 200 technical papers (including 12 in JACM). His awards include a Marie Curie Chair Award, a Royal Society Wolfson Research Merit Award, and five Best Paper Awards. He has chaired programme committees of major database conferences (ACM PODS, ICDT) and was the conference chair of the 2010 Federated Logic Conference. He has given many invited conference talks and has served on multiple program committees and editorial boards. He is an ACM fellow, a fellow of the Royal Society of Edinburgh, and a member of Academia Europaea.
Title: Advances in Big Graph Processing
Graphs are an important part of Big Data and are widely used for modelling complex structured data, with a broad spectrum of applications such as bioinformatics, web search, social networks, and road networks. Over the last decade, tremendous research efforts have been devoted to many fundamental problems in managing and analysing graph data. In this talk, I will first give an overview of our recent research efforts in processing big graphs, including scalable processing theory and techniques, distributed computation, and system frameworks. We will also look to the future of the area.
Xuemin Lin is a UNSW Scientia Professor in the School of Computer Science and Engineering at the University of New South Wales. Currently, he is the head of the database research group in the School of Computer Science and Engineering at UNSW. Xuemin was elevated to Fellow of IEEE in Nov 2015. Xuemin received his PhD in Computer Science from the University of Queensland (Australia) in 1992 and his BSc in Applied Math from Fudan University (China) in 1984. During 1984-1988, he studied for a PhD in Applied Math at Fudan University. Xuemin started teaching in the Computer Science Department at the University of Western Australia in Nov 1994, after a two-year research fellow appointment at the University of Queensland. He joined the School of Computer Science and Engineering at the University of New South Wales in Nov 1997. In 2005, he was a visiting researcher at Microsoft Research Asia and visited Tokyo University as a JSPS fellow. He has been a concurrent Professor at East China Normal University since 2009 and an adjunct Chair Professor at Fudan University since 2016. Xuemin is currently an Associate Editor-in-Chief of IEEE Transactions on Knowledge and Data Engineering and an associate editor of World Wide Web Journal. He was an associate editor of ACM Transactions on Database Systems (2008-2014) and an associate editor of IEEE Transactions on Knowledge and Data Engineering (Feb 2013 - Jan 2015).
Title: Graph Processing: The Power of RDBMS
To support analytics on massive graphs such as online social networks, RDF, and the Semantic Web, many new graph algorithms have been designed to query graphs for specific problems, and many distributed graph processing systems have been developed to support graph querying by programming.
In this talk, we focus on RDBMS, which has been well studied over decades for managing large datasets, and we revisit the issue of how RDBMS can support graph processing at the SQL level. Our work is motivated by two facts: in real applications, many relations stored in an RDBMS are closely related to a graph and need to be used together to query the graph, and an RDBMS can query and manage data while the data is updated over time.
To support graph processing, we propose 4 new relational algebra operations: MM-join, MV-join, anti-join, and union-by-update. Here, MM-join and MV-join are join operations between two matrices and between a matrix and a vector, respectively, followed by aggregation over groups, given that a matrix or vector can be represented by a relation. Both deal with the semiring by which many graph algorithms can be supported. The anti-join removes nodes/edges in a graph when they are unnecessary for subsequent computation. The union-by-update addresses value updates, for example to compute PageRank. The 4 new relational algebra operations can be defined by the 6 basic relational algebra operations with group-by-&-aggregation. We revisit SQL recursive queries and show that the 4 operations, together with others, are ensured to have a fixpoint, following the techniques studied in Datalog, and enhance the recursive WITH clause in SQL'99. We conduct extensive performance studies to test graph algorithms using large real graphs in 3 major RDBMSs. We show that RDBMSs are capable of dealing with graph processing in reasonable time. The focus of this work is at the SQL level. There is high potential to improve the efficiency by main-memory RDBMSs, efficient join proce.
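The idea of a matrix-vector join followed by group-wise aggregation can be sketched in plain SQL. The example below is only an illustration under stated assumptions, not the operators as defined in the talk: the relation names E(src, dst, w) for a matrix stored as an edge list and V(id, val) for a vector are hypothetical, and SQLite is used only because it ships with Python. Swapping the aggregate and the combining operation changes the semiring.

```python
import sqlite3

# Hypothetical encoding: matrix as edge list E, vector as relation V.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE E (src INTEGER, dst INTEGER, w REAL);
    CREATE TABLE V (id INTEGER, val REAL);
    INSERT INTO E VALUES (1, 2, 1.0), (1, 3, 4.0), (2, 3, 2.0);
    INSERT INTO V VALUES (1, 0.0), (2, 1.0);
""")

# (+, *) semiring: ordinary matrix-vector product,
# y[dst] = SUM over src of w[src, dst] * x[src].
sum_product = conn.execute("""
    SELECT E.dst, SUM(E.w * V.val)
    FROM E JOIN V ON E.src = V.id
    GROUP BY E.dst
    ORDER BY E.dst
""").fetchall()

# (min, +) semiring: one relaxation step of shortest paths,
# reading V.val as the current distance to node V.id.
min_plus = conn.execute("""
    SELECT E.dst, MIN(V.val + E.w)
    FROM E JOIN V ON E.src = V.id
    GROUP BY E.dst
    ORDER BY E.dst
""").fetchall()

print(sum_product)  # [(2, 0.0), (3, 2.0)]
print(min_plus)     # [(2, 1.0), (3, 3.0)]
```

Iterating such a step until the result stops changing is where a recursive query and a fixpoint guarantee come in; that part is not shown here.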
Dr. Jeffrey Xu Yu is a Professor in the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong. His current main research interests include keyword search in relational databases, graph mining, graph query processing, and graph pattern matching. Dr. Yu has served on over 300 organization committees and program committees of international conferences/workshops, including as PC Co-chair of APWeb’04, WAIM’06, APWeb/WAIM’07, WISE’09, PAKDD’10, DASFAA’11, ICDM’12, NDBC’13, ADMA’14, CIKM’15, and BigComp'17. Dr. Yu served as Information Director and a member of the ACM SIGMOD executive committee (2007-2011), an associate editor of IEEE Transactions on Knowledge and Data Engineering (2004-2008), an associate editor of VLDB Journal (2007-2013), and the chair of the steering committee of the Asia Pacific Web Conference (2013-2016). Currently, he serves as an associate editor of ACM Transactions on Database Systems, WWW Journal, the International Journal of Cooperative Information Systems, the Journal of Information Processing, and the Journal on Health Information Science and Systems. Jeffrey Xu Yu is a member of ACM, a senior member of IEEE, and a member of IEEE Computer Society.
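VLDB China Database School
Advanced Innovation Center for Big Data and Brain Computing
Beihang University
China Computer Federation (CCF) Technical Committee on Databases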
Tel:010-82338234;
E-mail:dongxh@buaa.edu.cn
Tel:010-82338234;
E-mail:mayunyan@act.buaa.edu.cn