A week on coding and information

Day one

Today is the first day of the Latin American Workshop on Coding and Information, a satellite event of the celebrated International Congress of Mathematicians which will be held in Rio de Janeiro on August (you know, Fields medals and stuff). The workshop is devoted to discussing questions of mathematical foundations of secure and reliable data storage and transmission, including applications to the quantum world, cryptography, and biology.

How to solve a sudoku effectively?

Number of indexed papers on “Field Theory and Polynomials”, based on the MathSciNet database. The boom on Finite Fields coincide with developments in Cryptography in the late 60s.

In the Finite Fields world, we can only operate with a finite number of elements (wait, isn’t it always the case in the digital world?), and the name of the game is how to make arithmetical operations consistent (weird things may happen with arithmetics, such as the Freshmen’s dream). The obvious examples of Finite Fields are Boolean fields (where only the elements {0,1} are allowed), but it doesn’t stop there — virtually any combinatorial problems can be turn into finite fields. Prof Panario explained some of them, including how to share a secret key throughout a compromised link, relations to combinatorial designs and how to solve sudokus with Latin squares (which unfortunately have nothing to do with Latin America, as patiently explained Prof Panario after my embarrassingly out-of-scope question).

A 3x3 Graeco-Latin square — all rows and columns contain all latin and greek letters only once. Can you construct a 6x6 one?

There are still beautiful open problems in this well-established field. A top example is the Prime-Power conjecture on the existence of certain latin squares, where brute force computational methods fall short.

That’s IT

Proceeding on the concepts of conditional entropy, Prof Costa discussed the problems with drawing conclusions from marginalised distributions. One should exercise great care with marginal distributions so as not to fall into traps such as Bertrand’s box paradox.

Prof Costa and a hands-on explanation of Bertrand’s paradox.

Further down the road, Information Theory can provide powerful tools for analysing data dependent systems. Some ubiquitous examples are the Data Processing inequality, Fano’s inequality and Jensen’s Inequality. If you happen to have missed Prof Costa’s talk and don’t know these, you might be in need of an IT book (NB: entirely free advertising).

Quote attributed to Tom Cover (citation needed), lead information theorist, author of the essential “Elements on Information Theory” (Wiley)

With these tools, along with coding techniques, one can essentially derive all important results that establish the ultimate limits of point-to-point data transmission and storage. The case of multiple users is significantly harder, and is the topic of Prof Costa’s second lecture on Tuesday…

For the second part of school, click here.

NB: Claude Shannon’s recently released biography includes testimonials by true information theorists and is a must to get familiarised with the story of the field.

Data science. All things data governance, machine learning and open data.