RECENT RESULTS ON CARDINALITY ESTIMATION AND INFORMATION THEORETIC INEQUALITIES

Hung Q. Ngo
Author affiliations

Authors

  • Hung Q. Ngo Relational AI

DOI:

https://doi.org/10.15625/1813-9663/37/3/16129

Keywords:

Shannon-type inequalities, Cardinality Estimation, Conjunctive Queries, Information Theory

Abstract

I would like to dedicate this little exposition to Prof. Phan Dinh Dieu, one of the giants and pioneers of Mathematics in Computer Science in Vietnam.

In the past 15 years or so, new and exciting connections between fundamental problems in database theory and information theory have emerged. There are several angles one can take to describe this connection. This paper takes one such angle, influenced by the author's own bias and research results. In particular, we will describe how the cardinality estimation problem -- a corner-stone problem for query optimizers -- is deeply connected to information theoretic inequalities. Furthermore, we explain how inequalities can also be used to derive a couple of classic geometric inequalities such as the Loomis-Whitney inequality.

A purpose of the article is to introduce the reader to these new connections, where theory and practice meet in a wonderful way. Another objective is to point the reader to a research area with many new open questions.

 

Metrics

Metrics Loading ...

References

S. Abiteboul, R. Hull, and V. Vianu, Foundations of Databases. Addison-Wesley, 1995.

M. Abo Khamis, P. G. Kolaitis, H. Q. Ngo, and D. Suciu, Bag query containment and

information theory," in Proceedings of the 39th ACM SIGMOD-SIGACT-SIGAI Symposium

on Principles of Database Systems, PODS 2020, Portland, OR, USA, June 14-19, 2020,

D. Suciu, Y. Tao, and Z. Wei, Eds. ACM, 2020, pp. 95{112. [Online]. Available:

https://doi.org/10.1145/3375395.3387645

||, Decision problems in information theory," in 47th International Colloquium on

Automata, Languages, and Programming, ICALP 2020, July 8-11, 2020, Saarbrucken, Germany

(Virtual Conference), ser. LIPIcs, A. Czumaj, A. Dawar, and E. Merelli, Eds., vol. 168.

Schloss Dagstuhl - Leibniz-Zentrum fur Informatik, 2020, pp. 106:1{106:20. [Online]. Available:

https://doi.org/10.4230/LIPIcs.ICALP.2020.106

M. Abo Khamis, H. Q. Ngo, and A. Rudra, FAQ: questions asked frequently," in Proceedings

of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems,

PODS 2016, San Francisco, CA, USA, June 26 - July 01, 2016, T. Milo and W. Tan, Eds.

ACM, 2016, pp. 13{28. [Online]. Available: http://doi.acm.org/10.1145/2902251.2902280

M. Abo Khamis, H. Q. Ngo, and D. Suciu, Computing join queries with functional

dependencies," in Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on

Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 - July

, 2016, T. Milo and W. Tan, Eds. ACM, 2016, pp. 327{342. [Online]. Available:

http://doi.acm.org/10.1145/2902251.2902289

||, What do shannon-type inequalities, submodular width, and disjunctive datalog have to

do with one another?" in Proceedings of the 36th ACM SIGMOD-SIGACT-SIGAI Symposium

on Principles of Database Systems, PODS 2017, Chicago, IL, USA, May 14-19, 2017,

E. Sallinger, J. V. den Bussche, and F. Geerts, Eds. ACM, 2017, pp. 429{444. [Online].

Available: http://doi.acm.org/10.1145/3034786.3056105

CARDINALTITY ESTIMATION AND INFORMATION INEQUALITIES 15

N. Alon, On the number of subgraphs of prescribed type of graphs with a given number

of edges," Israel J. Math., vol. 38, no. 1-2, pp. 116{130, 1981. [Online]. Available:

http://dx.doi.org.gate.lib.bualo.edu/10.1007/BF02761855

A. Atserias, M. Grohe, and D. Marx, Size bounds and query plans for relational joins," in

FOCS. IEEE Computer Society, 2008, pp. 739{748.

B. Bollobas and A. Thomason, Projections of bodies and hereditary properties of

hypergraphs," Bull. London Math. Soc., vol. 27, no. 5, pp. 417{424, 1995. [Online]. Available:

https://doi-org.gate.lib.bualo.edu/10.1112/blms/27.5.417

W. Cai, M. Balazinska, and D. Suciu, Pessimistic cardinality estimation: Tighter upper

bounds for intermediate join cardinalities," in Proceedings of the 2019 International Conference

on Management of Data, SIGMOD Conference 2019, Amsterdam, The Netherlands, June 30

- July 5, 2019, P. A. Boncz, S. Manegold, A. Ailamaki, A. Deshpande, and T. Kraska, Eds.

ACM, 2019, pp. 18{35. [Online]. Available: https://doi.org/10.1145/3299869.3319894

T. Chan, Recent progresses in characterising information inequalities," Entropy, vol. 13, no. 2,

pp. 379{401, 2011. [Online]. Available: https://doi-org.gate.lib.bualo.edu/10.3390/e13020379

T. H. Chan and R. W. Yeung, On a relation between information inequalities and group

theory," IEEE Transactions on Information Theory, vol. 48, no. 7, pp. 1992{1995, 2002.

[Online]. Available: http://dx.doi.org/10.1109/TIT.2002.1013138

F. R. K. Chung, R. L. Graham, P. Frankl, and J. B. Shearer, Some intersection theorems for

ordered sets and graphs," J. Combin. Theory Ser. A, vol. 43, no. 1, pp. 23{37, 1986. [Online].

Available: http://dx.doi.org.gate.lib.bualo.edu/10.1016/0097-3165(86)90019-1

T. M. Cover and J. A. Thomas, Elements of information theory, 2nd ed. Hoboken, NJ: Wiley-

Interscience [John Wiley & Sons], 2006.

I. Csiszar, Axiomatic characterizations of information measures," Entropy, vol. 10, no. 3, pp.

{273, 2008. [Online]. Available: https://doi.org/10.3390/e10030261

E. Friedgut, Hypergraphs, entropy, and inequalities," Amer. Math. Monthly, vol. 111, no. 9,

pp. 749{760, 2004. [Online]. Available: https://doi-org.gate.lib.bualo.edu/10.2307/4145187

E. Friedgut and J. Kahn, On the number of copies of one hypergraph in another," Israel J.

Math., vol. 105, pp. 251{256, 1998. [Online]. Available: http://dx.doi.org.gate.lib.bualo.edu/

1007/BF02780332

G. Gottlob, S. T. Lee, G. Valiant, and P. Valiant, Size and treewidth bounds

for conjunctive queries," J. ACM, vol. 59, no. 3, p. 16, 2012. [Online]. Available:

http://doi.acm.org/10.1145/2220357.2220363

H. Harmouch and F. Naumann, Cardinality estimation: An experimental survey,"

Proc. VLDB Endow., vol. 11, no. 4, pp. 499{512, 2017. [Online]. Available: http:

//www.vldb.org/pvldb/vol11/p499-harmouch.pdf

A. Hertzschuch, C. Hartmann, D. Habich, and W. Lehner, Simplicity done right for join

ordering," in 11th Conference on Innovative Data Systems Research, CIDR 2021, Virtual

Event, January 11-15, 2021, Online Proceedings. www.cidrdb.org, 2021. [Online]. Available:

http://cidrdb.org/cidr2021/papers/cidr2021 paper01.pdf

HUNG Q. NGO

S. Jukna, Extremal combinatorics, 2nd ed., ser. Texts in Theoretical Computer Science. An

EATCS Series. Springer, Heidelberg, 2011, with applications in computer science. [Online].

Available: https://doi-org.gate.lib.bualo.edu/10.1007/978-3-642-17364-6

V. Leis, B. Radke, A. Gubichev, A. Kemper, and T. Neumann, Cardinality estimation

done right: Index-based join sampling," in 8th Biennial Conference on Innovative Data

Systems Research, CIDR 2017, Chaminade, CA, USA, January 8-11, 2017, Online Proceedings.

www.cidrdb.org, 2017. [Online]. Available: http://cidrdb.org/cidr2017/papers/p9-leis-cidr17.

pdf

L. H. Loomis and H. Whitney, An inequality related to the isoperimetric inequality," Bull.

Amer. Math. Soc, vol. 55, pp. 961{962, 1949.

F. Matus, Innitely many information inequalities," in IEEE International Symposium on

Information Theory, ISIT 2007, Nice, France, June 24-29, 2007. IEEE, 2007, pp. 41{44.

[Online]. Available: https://doi.org/10.1109/ISIT.2007.4557201

T. Milo and W. Tan, Eds., Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium

on Principles of Database Systems, PODS 2016, San Francisco, CA, USA, June 26 - July 01,

ACM, 2016. [Online]. Available: http://doi.acm.org/10.1145/2902251

H. Q. Ngo, Worst-case optimal join algorithms: Techniques, results, and open problems," in

Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database

Systems, Houston, TX, USA, June 10-15, 2018, J. V. den Bussche and M. Arenas, Eds. ACM,

, pp. 111{124. [Online]. Available: https://doi.org/10.1145/3196959.3196990

H. Q. Ngo, E. Porat, C. Re, and A. Rudra, Worst-case optimal join algorithms: [extended

abstract]," in PODS, 2012, pp. 37{48.

H. Q. Ngo, C. Re, and A. Rudra, Skew strikes back: new developments in the theory

of join algorithms," SIGMOD Record, vol. 42, no. 4, pp. 5{16, 2013. [Online]. Available:

http://doi.acm.org/10.1145/2590989.2590991

J. Radhakrishnan, Entropy and counting," in Computational Mathematics, Modelling and Algorithms,

J. C. Misra, Ed. Narosa Pub House, 2003, pp. 146{168.

R. W. Yeung, Information Theory and Network Coding, 1st ed. Springer Publishing Company,

Incorporated, 2008.

Z. Zhang and R. W. Yeung, On characterization of entropy function via information

inequalities," IEEE Transactions on Information Theory, vol. 44, no. 4, pp. 1440{1452, 1998.

[Online]. Available: http://dx.doi.org/10.1109/18.681320

Downloads

Published

26-09-2021 — Updated on 07-10-2021

Versions

How to Cite

[1]
H. Q. Ngo, “RECENT RESULTS ON CARDINALITY ESTIMATION AND INFORMATION THEORETIC INEQUALITIES”, JCC, vol. 37, no. 3, p. 223–238, Oct. 2021.

Issue

Section

SPECIAL ISSUE DEDICATED TO THE MEMORY OF PROFESSOR PHAN DINH DIEU - PART A