Mozilla recently announced a new release of its **Common Voice** speech dataset (July 2020). It caught my attention because **Turkish** is among the languages for which voice data is collected. As someone who contributed to a similar project years ago, I took a closer look and **came across something that surprised me!** **Turkish**, spoken by nearly **80 million people** worldwide, is represented in this dataset with less data than, never mind English, even a language like **Catalan**, which is spoken by **10 times fewer** people: the dataset contains **24 times** more Catalan data than Turkish data, and **86 times** more English data!

Looking at how many hours have been recorded for these languages, the picture is again striking:

If we ask how many distinct voices are available for various languages in this openly licensed speech dataset, the picture looks like this:

This raises two questions:

- Why is Turkish so underrepresented in this dataset, even though it is spoken by far more people than Catalan?

but more importantly:

- What can we do to increase the amount of Turkish voice data and contribute to the Mozilla Common Voice project?

By the way, Catalan has also overtaken Dutch in this respect, so the situation does not seem to be specific to Turkish:

Below is an original piece created by a **G**enerative **A**dversarial Deep Learning **N**etwork (**GAN**, in particular the famous **MuseGAN** network architecture). The MuseGAN deep learning network was able to create this after training for **only 1000 epochs** on a moderate laptop for **2 hours** (without using **GPUs**), based on the data set at https://github.com/czhuang/JSB-Chorales-dataset (a set of **229 Bach chorales**). In other words, this is definitely not representative of what Deep Learning can achieve at its best, because such a system can easily be trained for longer on much more powerful systems (see further examples below). The focus of these examples is the fact that you can also start to experiment with deep learning systems that start to **model** musical aspects **without explicit musical teaching**, hard-coded rules in software, etc.

You can click on the image below to visit SoundCloud and listen to the MP3 file generated by MuseScore.

Among the **actual Bach chorales** in the data set, the “closest” one to the artificially generated example (“close” in the sense of Euclidean distance) can be seen below.
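To make the notion of "closest" concrete, here is a minimal sketch of a nearest-neighbour search by Euclidean distance. It assumes each piece has already been flattened into an equal-length list of numbers (for example MIDI pitches per time step); that encoding and the function names `euclidean_distance` and `closest_index` are illustrative assumptions, not the code used for the example above.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_index(generated, dataset):
    """Index of the dataset entry with the smallest distance to `generated`."""
    return min(
        range(len(dataset)),
        key=lambda i: euclidean_distance(generated, dataset[i]),
    )


# Toy data: three "chorales" of three notes each (hypothetical MIDI pitches).
toy_dataset = [[60, 62, 64], [60, 60, 60], [72, 74, 76]]
toy_generated = [61, 62, 63]
nearest = closest_index(toy_generated, toy_dataset)
```

Euclidean distance on raw note values is a crude similarity measure: a transposed copy of the same chorale would count as far away, so "closest" here should not be read as "most musically similar".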

Running Foster’s examples, changing only the “**noise vector**” that encodes the chord-related aspects and keeping the “style”, “melody”, and “groove” noise vectors the same for the **gan.generator.predict** function, we get:

Similarly, changing only the style noise vector, a sample output is:

Changing only the melody noise vector, an example result is:

and finally, changing only the groove noise vector:
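The experiment above (resample one noise input, keep the other three fixed) can be sketched as follows. This is only an illustration under assumptions: the latent size `Z_DIM`, the track count `N_TRACKS`, the dictionary layout, and the helper names are hypothetical, and the trained generator itself is not called here; in the real setup the four inputs would be passed to `gan.generator.predict`.

```python
import random

Z_DIM = 32     # assumed latent dimension, not taken from the original code
N_TRACKS = 4   # assumed number of voices (four-part chorales)


def sample_vector(rng, size):
    """Draw `size` independent standard-normal values."""
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def sample_noise(rng):
    """Sample all four noise inputs (assumed layout: per-track melody/groove)."""
    return {
        "chords": sample_vector(rng, Z_DIM),
        "style": sample_vector(rng, Z_DIM),
        "melody": [sample_vector(rng, Z_DIM) for _ in range(N_TRACKS)],
        "groove": [sample_vector(rng, Z_DIM) for _ in range(N_TRACKS)],
    }


def vary_one(noise, name, rng):
    """Return a copy of `noise` where only the entry `name` is resampled."""
    varied = dict(noise)                  # other entries are reused unchanged
    varied[name] = sample_noise(rng)[name]
    return varied


rng = random.Random(0)
base = sample_noise(rng)
chords_changed = vary_one(base, "chords", rng)
```

Comparing the generator's output for `base` and `chords_changed` then isolates the effect of the chord noise, and the same call with `"style"`, `"melody"`, or `"groove"` covers the other three cases.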

In the beginning, I wrote that there’s no such thing as “enough Bach”. So you can compare the previous examples with a Google Doodle dedicated to a very similar task: harmonizing a few bars of notes in the style of Bach chorales, using deep learning neural networks. Careful readers will notice that I took the first voice of the original Bach composition above to see how Google’s deep learning system would “complete” it by adding 3 more voices below. You can listen to it at http://g.co/doodle/prvuk

Of course, Google isn’t the only technology giant dedicating brain and compute resources to furthering the legacy of Bach. Another big player, Sony Computer Science Laboratories, also worked on polyphonic music generation in the style of Bach, demonstrated by their DeepBach system. You can read more about DeepBach by Flow Machines at http://www.flow-machines.com/history/projects/deepbach-polyphonic-music-generation-bach-chorales/ and about interactive music generation at Sony at https://csl.sony.fr/project/interactive-music-generation/

Now I wonder what **Kemal Ebcioğlu** would think about these new approaches and results, because in his PhD thesis published in **1986**, he describes his CHORAL system, built as a purely symbolic system using BSL, a new logic programming language that he also designed to encode constraints and rules for Bach-like harmonization of four-part chorales. I also wonder what David Cope thinks, because he’s the creator of one of the most famous symbolic music composition systems, named “Emily Howell”; you can see one of the compositions generated by Emily below:

I’m sure the new year of 2020 will bring us even more surprises at the intersection of music, AI and Deep Learning, because as you read this, music researchers, performers and programmers are busy applying such methods to Beethoven’s unfinished 10th symphony, to be premiered in Bonn on 28 April, 2020.

Meanwhile, thanks to available data sets, open source deep learning libraries and applications, and relatively cheap and easily available computing facilities such as CoLab and similar ones from AWS and Microsoft Azure, we’ll continue to experiment with even more interesting examples of computer-generated music and other types of art.

The example is about the famous German mathematician **Gauss**: He wrote a remarkable book in 1798, humbly titled “Disquisitiones Arithmeticae” (“Arithmetical Investigations”). The book was first published in 1801, and only **6 years later** it was translated into **French** and published in **1807** as “**Recherches arithmétiques**”.

The translator of this important book was **Antoine Charles Marcelin Poullet-Delisle**, a **math teacher** at a **high school**: Lycée d’Orléans. Another French *high school teacher*, Louis Poinsot, wrote a long review of the translation in a **daily newspaper** on Saturday, **21 March 1807**. Like the French translator of Gauss’s book, Poinsot was a mathematics teacher; he taught at Lycée Bonaparte in Paris.

The archives of the daily newspaper where Poinsot published his review of “**Recherches arithmétiques**” are available online at DigiNole Home » FSU Digital Library » Napoleonic Collections » Le Moniteur universel » Moniteur universel

And you can read the review on the second page of the newspaper:

Below you can find the full text that I converted from the online archive using Google OCR (Optical Character Recognition), followed by the automatic English translation (if you want to fix the mistakes of the automatic English translation, please feel free to share your corrections):

Sciences.

Recherches arithmétiques , par M. C. Fr. Gauss (de Brunswick) , traduites par A. C. M. Poullet-Delisle , professeur de mathématiques au Lycée d’Orléans. — Paris 1807.

Au titre modeste de cet ouvrage, quelques personnes pourraient croire d’abord que l’auteur a’y considère que les questions les plus simples. Mais ces Recherches arithmétiques sont au fond , des recherches très-savantes sur les proprietes generales des nombres , et sur l’analyse indéterminée à laquelle elles se lient : matière profonde, inépuisable, la plus neuve et la plus ardue peut-être de toutes les parties des mathématiques.

L’arithmétique élémentaire n’est guères autre chose, comme on sait, que l’art de la numération qui peut s’établir d’une infinité de manières, suivant l’échelle ou la base que l’on veut choisir. Mais les nombres, considérés en eux-mêmes ont des propriétés qui ne dépendent point du tout de la manière dont on les représente. Ainsi il y a des nombres qui ne peuvent etre divises par aucun autre , et qu’on appelle premiers ou simples, parce que tous les autres s’en composent par la multiplication : il y a les nombres figures qu’on nomme triangles, quarrés, pentagones , hexagones, etc. selon que leurs unités considérées comme des points, par exemple, pourraient être arrangées symétriquement en triangle , en quarré, en pentagone, etc. Il y a les di différentes *puissances *des nombres qu’on produit en les multipliant par eux-mêmes, et une foule d’autres formés par diverses lois et par toutes les combinaisons régulières de celles-là. Or tous ces nombres et leurs propriétés demeurent toujours les mêmes dans tous les systêmes possibles de numération; et de-là résulte un certain genre de spéculations mathématiques d’où naissent plusieurs vérités ou théorèmes qui, avec le peu qu’on ait trouvé jusqu’ici de l’art de se conduire dans ces recherches, constituent cette arithmétique transcendante qu’on nomme aujourd’hui la théorie des nombres.

Ni les bornes, ni la nature de cette feuille ne nous permettraient des développements bien étendus sur cette matière. Pour ceux d’ailleurs qui voudraient l’approfondir, les détails seraient tou. jours insuffisants, et pour les autres, superflus. Nous serons seulement observer que les propriétés qu’on étudie aujourd’hui dans les nombres, n’ont aucun rapport à ces vertus merveilleuses qu’y cherchaient les anciens philosophes. Elles ne regardent que leurs simples manieres d’être les uns à l’égard des autres, comme sous le point de vue des diviseurs qu’ils peuvent avoir, des résidus qu’ils laissent; et en général, on les peut rapporter à certaines possibilités ou imposbilités de décomposer les nombres formés par une certaine loi, en d’autres formés par une loi différente, ou par la même. Elles peuvent donc se partager naturellement en deux grandes classes; les propriétés positives et les propriétes négatives, dont on peut se faire une idée par les théorêmes suivans :

*Un nombre quelconque est loujours décomposable en trois triangles, ou en quatre quarres , ou en cinq pentagones , ou en sit hexcugones, el ainsi à l’infini* : zéro pouvant quelquefois être compté dans cette décomposition comme un nombre figure.)

*Une puissance quelconque au-dessus de la seconde, ne peut jamais se décomposer en deux puissances semblables. Ainsi deux cubes ne peuvent former un cube ; deux bi-quarrés, un biquarré, et de même à l’infini.*

L’élégante simplicité de ces théorèmes, leur expression si claire et si bien terminée qu’on les voit à plein du premier coup d’œil; et cependant l’extrême difficulté qui se fait bientôt sentir quand on en veut essayer la démonstration rigoureuse, peuvent expliquer le singulier attrait des questions de ce genre , et l’espèce de passion avec laquelle on s’y livre, dès qu’une fois on a commencé de s’en occuper. Cette science d’ailleurs reste presque toujours nouvelle et promet à chaque pas des découvertes. Les progrès de l’algèbre et l’invention des nouveaux calculs Ont mis successivement les géomètres, en état de vaincre les plus hautes dificultés de la géoméwie et de la mécanique. Un cleve aujourd’hui résout facilement des problemes qui ont arrêté les plus vigoureux génies ; et c’est ce qui fait que l’on ne peut exactement comparer les géometres séparés par d’un peu longs intervalles. Mais ces méthodes qui nous ont si bien fait suivre et mesurec toutes les affections des gran deurs continues, ne nous ont presque rien appris sur le calcul et les combinaisons des quantités discretes. On ne sait guere y appliquer jusque ici plus d’analyse que n’en avait Fermat il y a deux siécles ; et comme ce génie s’était fait sans doute , pour y pénétrer , quelqu’art nouveau dont le secret ne nous est point parvenu, personne n’a pu rétablir encore toutes les démonstrations perdues de la plupart de ses théorêmes, et entre autres, celles des deux précédens qui sont dûs à cet homme extraordinaire. Ainsi, quand les autres branches des mathématiques se sont élevées par degrés à la perfection, et que les géometres y ont si bien réalisé cette allégorie, qu’un enfant monté sur les épaules of Hercule voit plus loin que lui, la doctrine des nombres, malgré leurs travaux, est restée, pour ainsi dire, immobile ; comme pour étre, dans tous les tems , l’épreuve de leurs forces et la mesure de la pénétration de leur esprit.

C’est pourquoi M. Gauss, par un ouvrage aussi profond et aussi neuf que ses *Recherches arithmétiques *, s’annonce certainement comine une des meilleures têtes mathématiques de l’Europe. Il paraît, dans sa préface, qu’en 1795, où il tourna pour la premiere fois ses réflexions ciu coté des nombres, il n’avait encore aucune idée de ce qui avait été fait avant lui sur cette matiere, même par les modernes Il avait donc presqu’achevé les quatre premieres sections de son livre, sans avoir vu ni les mémoires d’Euler et de Lagrange, qui, entr’autres découvertes ont résolu un assez grand nombre des problêmes de Fermat; ni les mémoires de M. Legendre, si digne de les suivre dans cette camiere, et de nous donner cet excellent ouvrage ou, avec beaucoup de choses nouvelles qui lui appartiennent, il rassemble et met en ordre tout ce qui avait paru jusque-là sur cette importante théorie. M. Gauss se trouvant ensuite à même de lire les écrits de ces géomètres célèbres, ne tarda point à reconnaître qu’il avait employé int plus grande partie de ses méditations à des choses faites depuis long-tems. *Mais animé d’une nouvelle ardeur, je m’efforçai, dit-il, en suivant les pas de ces hommes de génie, de cultiver plus avant le champ de l’arithmetique, et talle a été l’origine des sections V , VI et VII*; lesquelles composent plus des trois quarts du volume. Quoique nous ne puissions entrer ici, ni dans l’histoire de la science qui remonte à Diophante et même à Euclide, ni dans la discussion des choses qui appartiennent aux différens géomètres, et particulièrement à M. Gauss, nous ne pouvons néanmoins passer sous silence le résultat aussi nouveau qu’inattendu , qu’on trouve à la VII section de ses Recherches sur la théorie des divisions égales du cercle, ou de l’inscription des polygones réguliers. On avait lieu de croire depuis Euclide, que les divisions du cercle par les nombres deux, trois, cinq, et celles qui en résultent, étaient les seules possibles par la regle et le compas. M. 
Gauss démontre qu’on peut encore exécuter par les mêmes moyens les divisions par dix-sept et par tous les nombres qui, comme deux, trois, cinq et dix-sept sont formés d’une puissance de deux, plus un ; et sont en même tems premiers. Quant aux autres divisions, elles dépendent d’opérations supérieures à celles du compas, ou d’équations qui passent le second degré. M. Gauss assigne les moindres degrés où elles se peuvent réduire ; et nous pouvons , dit-il, démontrer en toute rigueur que ces équations ne sauraient être évitées ni abaissées : et quoique les limites de cet ouvrage ne nous permettent pas d’en développer ici la démonstration, nous avons cru devoir en avertir, pour éviter que quelqu’un n’essayát l’autres divisions géométriques que celles données par notre théorie , et n’employant inutilement son tems à cette recherche.

On pourrait s’étonner d’abord de trouver dans ce livre des problèmes de géométrie, et de les voir résolus par les nombres. Mais dans les sciences mathématiques toutes les vérités, se tiennent par une chaîne nécessaire. Aucune idée n’y peut éclore sans éclairer la plupart des théories qui, à leur tour, perfectionnent les arts qui leur répondent; et la découverte de M. Gauss ne fait que confirmer cette vérité si souvent reconnue par les géometres : que leurs spéculations les plus frivoles en apparence , développent à la fin quelqu’utilité nouvelle qu’on n’y avait point cherchée, et qu’on voit toujours leurs méditations, au moins innocentes durant leur vie , tourner avec le tems au profit des arts et à l’avantage de la société.

Ce que nous venons de dire sur les *Recherches arithmétiques* peut donner une idée de la profondeur et de la difficulté de l’ouvrage. Il ne fallait donc pas un moins bon géometre, ni un moins habile traducteur que M. Delisle, pour le bien rendre. Car il ne s’agit pas ici une traduction ordinaire, qui ne demande que la connaissance de deux langues. Il faut sans doute une intelligence Tare, pour comprendre à fond des démonstrations aussi délicates : une attention bien soutenue pour suivre des énumérations si laborieuses, examiner si elles sont complettes, afin de se pénétrer des raisonnemens de l’ameur, de faire saillir dans le discours les points ou la difficulté se forme, et les points où elle se dénoue; et de s’approprier enfin tellement les choses qu’elles paraissent non seulement écrites, mais pensées dans la langue même du traducteur.

M. Delisle doit être récompensé de son travail et de son zèle par le succès. Mais il le sera plus encore par les idées qu’une telle lecture n’a pu manquer de lui faire naître. On voit même, dans son modeste avertissement, qu’il avait d’abord été tenté de joindre au texte des remarques étendues, ou, en éclaircissant la matière, il aurait pu naturellement placer les considérations nouvelles qui se sont offertes a son esprit dans le cours de cette traduction. Mais, à l’exception d’un très-petit nombre de notes très-simples et qui se justifient d’elles-mêmes, il n’a voulu donner que l’ouvrage de M. Gauss tel qu’il est ; aimant mieux attendre , pour son propre travail, que le tems et la méditation l’ait mûri davantage, et rendu plus digne des regards et de l’attention des géometres.

M. Delisle a fait hommage de sa traduction à M. *Laplace*; hommage bien naturel à plus d’un titre. Ce géometre, en effet, à la gloire brillante de s’être élevé des monumens dans les sciences, et d’en étendre chaque jour le domaine, veut unir la gloire plus douce encore d’accueillir tous les travaux utiles, d’encourager les jeunes talens, et de seconder leurs efforts dans une carrière aussi difficile. C’est une chose dont il semble qu’on ne devrait guere avoir à louer les bommes d’un ordre supérieur ; puisque, n’ayant d’autre objet que la science qu’ils aiment, ils ne font par-la qu’avancer leur ouvrage, et se préparer des successeurs capables de les apprécier et d’être touchés de leur gloire. Toutefois les exemples en sont moins nombreux qu’on ne pourrait le desirer; et nous nous félicitons, comme M. Delisle, d’avoir aussi cette occasion d’être l’interprète de la reconnaissance des jeunes géomètres, pour un homme qui verse à-la-fois de si vives lumieres sur les sciences , et répand une si noble émulation parmi ceux qui les cultivent.

L. POINSOT, professeur de mathématiques *au Lycée-Bonaparte.*

Recherches arithmétiques, by Mr. C. Fr. Gauss (of Brunswick), translated by A. C. M. Poullet-Delisle, professor of mathematics at Lycée d’Orléans. – Paris 1807.

In the modest title of this book, some people might think that the author considers the simplest questions. But these arithmetical investigations are at bottom, very scholarly research on the general properties of numbers, and on the indeterminate analysis to which they are bound: deep, inexhaustible matter, the newest and most arduous perhaps of all parts of mathematics.

Elementary arithmetic is nothing more, as we know, than the art of numeration, which can be established in an infinite number of ways, according to the scale or base that we wish to choose. But numbers, considered in themselves, have properties which do not depend at all upon the manner in which they are represented. Thus there are numbers which can not be divided by any other, and which are called prime or simple, because all the others are composed by the multiplication: there are the numbers figures which one names triangles, squares , pentagons, hexagons, etc. according to whether their units considered as points, for example, could be arranged symmetrically in triangles, squares, pentagons, etc. There are the different powers of numbers which are produced by multiplying them by themselves, and a crowd of others formed by various laws and by all the regular combinations of these. Now all these numbers and their properties remain the same in all possible systems of numeration; and from there arises a certain kind of mathematical speculation from which are born several truths or theorems which, with the little that has hitherto been found in the art of conducting oneself in these investigations, constitute this transcendental arithmetic today names the number theory.

Neither the bounds nor the nature of this sheet would allow us to develop well on this subject. For those who would like to deepen, the details would be all. insufficient days, and for others, superfluous. We shall only observe that the properties studied today in numbers have nothing to do with those wonderful virtues sought by ancient philosophers. They look only at their simple ways of being towards each other, as from the point of view of the divisors they may have, of the residues they leave; and, in general, they may be related to certain possibilities or impossibilities of decomposing the numbers formed by a certain law, in others formed by a different law, or by the same. They can, therefore, naturally be divided into two great classes; the positive properties and the negative properties, which we can get an idea from the following theorems:

*Any number is always decomposable into three triangles, or four quarters, or five pentagons, or sit hexagons, and so to infinity: zero can sometimes be counted in this decomposition as a number figure.)*

*Any power above the second, can never be broken down into two similar powers. Thus two cubes can not form a cube; two bi-square, one biquarré, and likewise to infinity.*

The elegant simplicity of these theorems, their expression so clear and so well finished that we see them at full sight at first glance; and yet the extreme difficulty which is soon felt when one wishes to try the rigorous demonstration of it, may explain the singular attraction of questions of this kind, and the kind of passion with which one gives oneself to it, as soon as we started to take care of it. This science, moreover, remains almost always new and promises discoveries at every step. The progress of algebra and the invention of new calculations have successively put geometers in a position to overcome the highest difficulties of geometry and mechanics. A key today easily resolves problems that have arrested the most vigorous geniuses; and this is why we can not exactly compare the geometers separated by a few long intervals. But these methods, which have been so well followed and measured by all the affections of continuous granulators, have taught us very little about the calculation and combinations of discrete quantities. It is scarcely possible to apply to it till now more analysis than Fermat did two centuries ago; and as this genius had, no doubt, entered to enter any new art, the secret of which has not reached us, no one has been able to re-establish all the lost demonstrations of most of his theorems, and, among others, those of the two preceding which are due to this extraordinary man. Thus, when the other branches of mathematics have been gradually elevated to perfection, and the geometricians have so well realized this allegory, that a child mounted on the shoulders of Hercules sees farther than him, the doctrine of numbers, in spite of their labors, remained, so to speak, motionless; as if to be, at all times, the test of their strength and the measure of the penetration of their minds.

That is why Mr. Gauss, by a work as profound and as new as his Arithmetical Investigations, is certainly one of the best mathematical heads of Europe. It appears, in his preface, that in 1795, when he turned for the first time his reflections on numbers, he had no idea of what had been done before him on this matter, even by the modernists. He had thus almost completed the first four sections of his book, without having seen either the memoirs of Euler and Lagrange, who, among other discoveries, solved quite a large number of Fermat’s problems; neither the memoirs of M. Legendre, so worthy of following them in this picture, and of giving us this excellent work, or, with many new things belonging to him, he gathers and puts in order all that had hitherto appeared on this important theory. Mr. Gauss, being then in a position to read the writings of these famous geometers, was not long in recognizing that he had used most of his meditations for things long since. But animated by a new ardor, I strove, he said, following the steps of these men of genius, to cultivate further the field of arithmetic, and he was the origin of sections V, VI and VII; which make up more than three quarters of the volume. Although we can not enter here, either in the history of science going back to Diophantus or even to Euclid, or in the discussion of things belonging to different geometers, and particularly to Mr. Gauss, we can not, however, pass over in silence the new and unexpected result found in the seventh section of his Research on the Theory of Equal Divisions of the Circle, or the inscription of regular polygons. It had been reason to believe since Euclid, that the divisions of the circle by the numbers two, three, five, and those which result from it, were the only ones possible by the rule and the compass. Mr. 
Gauss shows that we can still execute by the same means the divisions by seventeen and by all the numbers which, like two, three, five, and seventeen, are formed of a power of two, plus one; and are at the same time first. As for the other divisions, they depend on operations superior to those of the compass, or equations which pass the second degree. Mr. Gauss assigns the least degrees where they can be reduced; and we can, he says, show with all rigor that these equations can not be avoided or lowered: and although the limits of this work do not allow us to develop here the demonstration, we thought it necessary to warn, to avoid that no one should attempt other geometrical divisions than those given by our theory, and not employing his time unnecessarily in this search.

One might wonder at first to find in this book problems of geometry, and to see them solved by numbers. But in the mathematical sciences all truths stand by a necessary chain. No idea can blossom without enlightening most theories which, in their turn, perfect the arts which answer them; and the discovery of Mr. Gauss only confirms this truth so often recognized by geometers: that their most frivolous speculations appear to develop in the end any new utility There was no point in it, and one always sees their meditations, at least innocent during their life, turn with the time for the benefit of the arts and for the benefit of society.

What we have just said about arithmetic research can give an idea of the depth and difficulty of the work. It was not necessary, therefore, to have a less good geometer or a less skillful translator than M. Delisle, to render it well. For it is not an ordinary translation here, which requires only the knowledge of two languages. It is doubtless necessary to have a rare intelligence, to understand thoroughly such delicate demonstrations: a well-sustained attention to follow such laborious enumerations, to examine whether they are complete, in order to penetrate the arguments of the lover, to project into the discourse the points where the difficulty is formed, and the points at which it resolves itself; and finally to appropriate things so much that they seem not only written but thought in the translator’s own language.

Mr. Delisle must be rewarded for his work and his zeal for success. But it will be even more so by the ideas that such a reading could not fail to bring to birth. We even see, in his modest warning, that he had at first been tempted to add extensive remarks to the text, or, in clarifying the subject, he could naturally have placed the new considerations which were offered to his mind in the during this translation. But with the exception of a very small number of very simple notes, which justify themselves, he only wanted to give the work of Mr. Gauss as he is; Loving better, for his own work, that time and meditation have matured him more, and made him more worthy of the attention and attention of geometers.

Mr. Delisle paid tribute to his translation to Mr. Laplace; tribute very natural in more ways than one. This geometer, in fact, to the brilliant glory of having raised monuments in the sciences, and of extending the domain every day, wants to unite the still gentler glory of receiving all useful works, of encouraging young talents, and second their efforts in such a difficult career. It is something of which it seems that one should hardly have to praise the men of a superior order; since, having no other object than the science they love, they do not advance its work, and prepare successors capable of appreciating them and being touched by their glory. However, the examples are less numerous than might be desired; and we, like Mr. Delisle, welcome this opportunity to be the interpreter of the recognition of young geometricians, for a man who pours so bright light on the sciences, and spreads a noble emulation among those who cultivate them.

L. POINSOT, professor of mathematics at Lycée-Bonaparte.
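The result Poinsot singles out, that the circle can be divided by ruler and compass into a prime number of parts when that prime is "a power of two, plus one", is easy to explore numerically. The sketch below simply lists the primes of the form 2^k + 1 up to a chosen exponent; the function names and the cut-off are illustrative choices, and the code checks the arithmetic only, not Gauss's constructibility proof.

```python
def is_prime(n):
    """Trial-division primality test, adequate for the small numbers used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def primes_power_of_two_plus_one(max_exponent):
    """Primes of the form 2**k + 1 for k = 0 .. max_exponent."""
    return [2**k + 1 for k in range(max_exponent + 1) if is_prime(2**k + 1)]


candidates = primes_power_of_two_plus_one(16)
# -> [2, 3, 5, 17, 257, 65537]
```

The list starts with the four numbers named in the review (two, three, five and seventeen, with 2 = 2^0 + 1) and continues with 257 and 65537.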

- Can simply adding a “**newline**” character **change the translation of a word**?

This sounds weird, because for a human being, the obvious reaction would be:

- What does that even mean? Probably you’ve accidentally hit ENTER or something, and that can’t possibly affect the meaning of a word, why do you even ask that?

Well, if the translation system in question is based on **statistical natural language processing** and neural network algorithms such as **deep learning**, then things get a little more complex. Let’s first look at a sentence without any superfluous newline inserted:

and now, let’s hit ENTER right after the Dutch word “afzetzone”, to see the translation change magically:

The point here is not if the word “afzetzone” is translated correctly, but rather, how come its translation changes by simply adding one more “white space” after the word.
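A small sketch can show why the extra character is not "nothing" from a program's point of view. It makes no claim about how Google Translate processes its input; it only illustrates, with a hypothetical helper `code_points`, that the two inputs are different character sequences even though naive whitespace splitting yields the same word.

```python
def code_points(text):
    """Return the Unicode code points of `text`, one integer per character."""
    return [ord(ch) for ch in text]


without_newline = "afzetzone"
with_newline = "afzetzone\n"

# Whitespace-based word splitting hides the difference between the two inputs...
same_words = without_newline.split() == with_newline.split()

# ...but at the character level the second input has one extra symbol (code 10).
extra = code_points(with_newline)[len(without_newline):]
```

Any system that consumes raw characters, or sub-word units derived from them, receives two distinct inputs here, so a different output is at least possible.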

If you’re a lay person, you’ll probably be baffled by this example, and if you’re an NLP expert, specializing in deep learning techniques, you’ll probably scratch your head and then smile, and if you’re one of the scientists or engineers actually working on the Google Translate software’s debugging, well, then you might give a different reaction.

All in all, keep in mind that in today’s technological landscape, there are super **complex systems** behind **simple interfaces**, and such “**glitches**” barely scratch the surface, providing a small, opaque glimpse into a popular Artificial Intelligence product.

It is largely because of lack of knowledge of what statistics is that the person untrained in it trusts himself with a tool quite as dangerous as any he may pick out from the whole armamentarium of scientific methodology. –Edwin B. Wilson (1927), quoted in Stephen M. Stigler, *The Seven Pillars of Statistical Wisdom*.

Imagine you’re responsible for testing some aspects of a complex software product, and one of your colleagues comes up with the following request:

- Hey, can you write a **self-contained** function that tests the results of software component X, and returns **TRUE** if the **data set** generated by X is **normally distributed**, and **FALSE** otherwise?

What’s a poor software developer to do?

Well, you cherish the fond memories of your first statistics class that you took more than 20 years ago, and say: “I’ll plot a histogram of the data, and see if it’s normal!”

But of course, in less than a second you realize that **manual visual inspection of a plot** will not make **an automated test**, not at all! So as a brilliant software developer with a math background, you say, “easy, I’ll just grab my secret weapon, that is, Python and its SciPy library, to smash through this little statistical challenge!” You’re happy that you can stand on the shoulders of giants, and use a well-documented, simple function such as scipy.stats.normaltest.

But then, real life happens, and for whatever business and technological reasons, you realize that you can’t make third-party libraries such as SciPy a part of your test suite on a whim. Not for a single test at least. And after all, don’t you like to have **self-contained code, with minimal dependencies?** You say to yourself, “let’s get down to the basics, and build a short, simple, correct, and self-contained function that tests for normality.”
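To give a feel for what such a self-contained function could look like, here is a minimal sketch using only the Python standard library. It swaps in the Jarque-Bera test, chosen for brevity; this is not the test that `scipy.stats.normaltest` implements. The names `jarque_bera` and `looks_normal` and the default `alpha=0.05` are illustrative assumptions, and the p-value relies on a large-sample chi-square approximation, so it is unreliable for small data sets.

```python
import math
from statistics import NormalDist


def jarque_bera(data):
    """Return (JB statistic, asymptotic p-value) for a sequence of numbers."""
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 data points")
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    if m2 == 0:
        raise ValueError("data has zero variance")
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2                       # equals 3 for a normal distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under the null hypothesis JB is asymptotically chi-square with 2 degrees
    # of freedom, whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


def looks_normal(data, alpha=0.05):
    """True if normality is NOT rejected at level alpha (which is not proof)."""
    _, p_value = jarque_bera(data)
    return p_value >= alpha


# Deterministic toy inputs: evenly spaced normal quantiles and a uniform grid.
n = 1000
normal_like = [NormalDist().inv_cdf((i + 0.5) / n) for i in range(n)]
uniform_like = [i / n for i in range(n)]
```

Note what the boolean means: `True` only says the test found no evidence against normality at the chosen level, and with enough data points even tiny, harmless deviations will produce `False`.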

So, the first question is… given a bunch of data points, how do you test for normality, and what’s the correct algorithm? Well, actually, that should have been the **second** question, the first one being the essential question:

- Dear colleague, **WHY** do you want to test for normality?

Let’s assume that you’ve exhausted your colleague with Five Whys, and it’s clear why you need that test.

So, back to normality testing: how do you test for normality? Again, the first question should’ve been different:

- Can you test for normality? In other words, do we have an algorithm that we can trust for sure to return **TRUE**, if the data set in question has the normal distribution of values?

As a statistically-savvy software developer, you know you have to tread carefully now, because you’re in the domain of fundamental statistics, and you don’t want to write a function that can mislead people with results that can be easily misinterpreted.

So you do your research; after all, it’s the 21st century, and both the R Project and Python have solid statistical libraries. Knowing that R people are a little more sensitive about statistical correctness, both in terms of implementation and interpretation, you reach for one of the normality tests in R, and read its documentation:

- SnowsPenultimateNormalityTest: “The **theory** for this test is based on the **probability** of getting a rational number from a truly continuous distribution defined on the reals. The main goal of this test is to quickly give a **p-value** for those that feel it necessary to test the **uninteresting** and **uninformative** *null hypothesis* that the data represents an exact normal, and allows the user to then move on to **much more important questions**, like “is the data **close enough to the normal to use normal theory inference?**”. After running this test (or **better instead of running this and any other test of normality**) you should ask yourself **what it means to test for normality** and **why you would want to do so**. Then plot the data and explore the interesting/useful questions.”

“Thank you very much!” you say, “but I’m not in the business of plotting data, right? I need to automate this stuff, no way to involve manual visual inspection by a human!”. So you continue your research, only to come across this discussion:

- “Normality tests don’t do what most think they do. Shapiro–Wilk test, Anderson-Darling test, and others are **null hypothesis tests AGAINST** the assumption of normality. These **should not be used to determine whether to use normal theory statistical procedures**. In fact they are of virtually no value to the data analyst. Under what conditions are we interested in rejecting the null hypothesis that the data are normally distributed? I have never come across a situation where a normal test is the right thing to do. When **the sample size is small**, even big departures from normality are **not detected**, and when your **sample size is large**, even the **smallest deviation from normality will lead to a rejected null**.”

So now, you’re even more confused. And to make things more interesting, you come across this discussion: “Is normality testing ‘essentially **useless**’?”:

- “It’s not an argument. It is a (a bit strongly stated) fact that formal normality tests **always reject on the huge sample sizes** we work with today. It’s even easy to prove that when n gets large, even the **smallest deviation from perfect normality** will lead to a **significant result**. And as every data set has some degree of randomness, no single data set will be a perfectly normally distributed sample. But in applied statistics the question is not whether the data/residuals … are perfectly normal, but normal enough for the assumptions to hold.”
- “When thinking about whether normality testing is ‘essentially useless’, one first has to think about what it is supposed to be useful for. Many people (well… at least, many scientists) **misunderstand the question the normality test answers**. The question normality tests answer: Is there **convincing evidence of any deviation from the Gaussian ideal?** With **moderately large real data sets**, the answer is **almost always yes**. The question scientists often expect the normality test to answer: Do the data deviate enough from the Gaussian ideal to “forbid” use of a test that assumes a Gaussian distribution? Scientists often want the normality test to be the referee that decides when to abandon conventional (ANOVA, etc.) tests and instead analyze transformed data or use a rank-based non-parametric test or a re-sampling or bootstrap approach. For this purpose, normality tests are not very useful.”
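The sample-size effect described in these quotes can be illustrated with a few lines of Python. This is only a sketch under one assumption: the well-known large-sample approximation that, for normal data, the standard error of the sample skewness is roughly sqrt(6/n). The function name `skewness_z_score` and the fixed skewness value of 0.1 are made up for illustration and do not come from the quoted discussions.

```python
import math

def skewness_z_score(skewness, n):
    """z-score of an observed skewness, using the large-sample
    approximation SE(skewness) ~ sqrt(6 / n) under normality."""
    return skewness / math.sqrt(6.0 / n)

# The same tiny, fixed departure from symmetry at growing sample sizes.
for n in (100, 10_000, 1_000_000):
    z = skewness_z_score(0.1, n)
    verdict = "rejected" if abs(z) > 1.96 else "not rejected"
    print(f"n={n:>9}: z={z:6.2f} -> normality {verdict} at the 5% level")
```

The z-score grows with the square root of n, so any fixed deviation, however small, eventually crosses whatever significance threshold you pick.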

“Maybe plotting a histogram and looking at it would be really easier after all, even if not automated!” you sigh to yourself. But then, surprise! You can’t trust your ‘eyes’, as in, “is the following normally distributed, or what shall I trust after all?”:

And you give up after reading “If my histogram shows a bell-shaped curve, can I say my data is normally distributed?”.

Finally, you accept ‘reality’, and implement a function that uses skewness and kurtosis, taking into account the size of your data set.
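A minimal, dependency-free sketch of what such a function might look like is shown here. It is not the actual function from the test suite: the name `is_roughly_normal`, the sample-size boundaries (50 and 300), and the cutoffs (1.96, 3.29, skewness 2, excess kurtosis 4) are rule-of-thumb values assumed for illustration, and the standard errors use the large-sample approximations sqrt(6/n) and sqrt(24/n).

```python
import math
from statistics import NormalDist

def is_roughly_normal(data):
    """Heuristic: True if skewness and excess kurtosis are close to 0.

    All cutoffs below are assumed rule-of-thumb values, not a standard.
    """
    n = len(data)
    if n < 3:
        raise ValueError("need at least 3 data points")
    mean = sum(data) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in data) / n
    if m2 == 0:
        return False  # constant data has no spread at all
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0

    if n >= 300:
        # Large samples: z-scores would flag tiny deviations,
        # so judge the raw values instead (assumed cutoffs).
        return abs(skew) <= 2.0 and abs(excess_kurt) <= 4.0

    # Small and medium samples: compare z-scores with a cutoff
    # that is stricter for small n (assumed cutoffs).
    cutoff = 1.96 if n < 50 else 3.29
    z_skew = skew / math.sqrt(6.0 / n)
    z_kurt = excess_kurt / math.sqrt(24.0 / n)
    return abs(z_skew) <= cutoff and abs(z_kurt) <= cutoff

# Deterministic demo data: evenly spaced quantiles of a standard normal
# and of an exponential distribution (clearly skewed).
normal_like = [NormalDist().inv_cdf((i + 0.5) / 200) for i in range(200)]
skewed = [-math.log(1 - (i + 0.5) / 200) for i in range(200)]
print(is_roughly_normal(normal_like), is_roughly_normal(skewed))
```

Note that this answers “is the data close enough to normal by these thresholds?” rather than “is the data normal?”, and that the large-sample branch is deliberately lenient, so the cutoffs should be tuned to whatever component X actually needs.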

**Resources for those who want to learn more about Normality Testing**

- Normality test according to Wikipedia
- A Gentle Introduction to Normality Tests in Python
- Testing for Normality — Applications with Python
- Statistical notes for clinical researchers: assessing normal distribution (2) using skewness and kurtosis
- Normality Tests for Statistical Analysis: A Guide for Non-Statisticians
- Testing for Normality
- 68–95–99.7 rule
- Is the Shapiro-Wilk test only applicable to smaller sample sizes?
- scipy.stats.normaltest documentation

The organizer of the conference, Louis Couffignal, was also a mathematician and cybernetics pioneer, who had already published a book titled “*Les machines à calculer. Leurs principes. Leur évolution.*” (Calculating machines. Their principles. Their evolution.) in **1933**. Another highlight from the conference was El Ajedrecista (The Chess Player), designed by Spanish civil engineer and mathematician Leonardo Torres y Quevedo. There was also a presentation based on practical experiences with the Z4 computer, designed by Konrad Zuse, and operated at ETH Zurich. The presenter was none other than Eduard Stiefel, inventor of the conjugate gradient method, among other things.

The field of AI has come a long way since 1951, and it is safe to say it’s going to penetrate into more aspects of our lives and technologies. It’s also safe to say that, like many technological and scientific endeavors, progress in AI is the result of many bright minds in many different countries, and generally the USA and the UK are regarded as the places that contributed a lot. But it’s also important to recognize lesser-known facts such as this Paris conference in 1951, and realize the **strong tradition** in **Europe**: not only the academic, research and development track, but also the strong industrial and business tracks. Historical artifacts in languages other than English necessarily get less recognition, but they should be a reason to cherish the **diversity** and **variety**. I believe all of these aspects combined should guide **Europe** in its quest for advancing the state of the art in AI, in terms of software, hardware, and combined systems.

This article is heavily based on and inspired by the following article by Herbert Bruderer, a retired lecturer in didactics of computer science at ETH Zürich: “The Birthplace of Artificial Intelligence?”

The 13 koans I could find are:

- Encourage flow.
- Design for failure.
- It’s not fully shipped until it’s fast.
- Practicality beats purity.
- Keep it logically awesome.
- Mind your words, they are important.
- Non-blocking is better than blocking.
- Approachable is better than simple.
- Favor focus over features.
- Half measures are as bad as nothing at all.
- Responsive is better than fast.
- Avoid administrative distraction.
- Anything added dilutes everything else.

I guess now is the time to meditate on my short program to see if I’m aligned with both The Zen of Python and The Zen of GitHub.

**UPDATE**: 14th koan discovered by Erwin Baeyens (in comments)

14. Speak like a human.

But sometimes, you just want to have an overview and see everything summed up in a single place, preferably an Emacs buffer so you can also play with it and hack it. Of course, your GNU/Linux, macOS, or MS Windows will happily show you all the available fonts, and let you filter out fixed width ones suitable for programming. Emacs itself can also do something very similar. But as I said, why not have something according to your taste?

With a bit of Emacs Lisp, it seems not that difficult, at least on GNU/Linux:

The result of running compare-monospace-font-families can be seen in the following screenshot:

Unfortunately, trying to do something similar for MS Windows is not that straightforward. The closest I could come is the following, and it lists some fonts that are not fixed width:

I hope you found this information useful. If you want to learn more about Emacs and fonts, feel free to visit the following web pages, too:

- Emacs, fonts and fontsets
- Good fonts for Emacs
- X logical font description
- GNU Emacs Manual | Text Properties
- GNU Emacs Manual | Fontsets
- GNU Emacs Manual | Low-Level Font Representation
- GNU Emacs Manual | Face Attributes
- https://unix.stackexchange.com/questions/363365/command-to-list-all-monospace-fonts-known-to-fontconfig/363368

But first things first, let’s see how Google Translate translated a very ordinary Dutch sentence into English:

Interesting! It is obvious that my son’s teacher didn’t have anything to do with a **grinding table (!)**, and even if he did, I don’t think he’d involve his class with such interesting hobbies. Of course, he meant the “multiplication table for 3”.

Then I wanted to see what the giant search engine, Google Search itself, knows about the Dutch word “maaltafel”. And I immediately saw that Google Search knows very well that “maaltafel” in Dutch means “multiplication table” in English. Not only that, but on the first page of search results, you can see the expected Dutch expression occurring 47 times. Nothing surprising here:

Back to Google Translate that’s powered by state-of-the-art automatic translation, relying on cutting edge deep learning techniques, and tons of data that Google can afford. It’s as if Google Translate isn’t aware of the existing context surrounding the word! It is as if Google Translate doesn’t, or can’t care for the context, because looking at the word itself, we see:

Interestingly, Google Translate suggests the “more frequent” “version” of the expression, and as expected, is relying on real world data and statistics.

But if you write it as it’s suggested, you get:

Please keep in mind that Dutch and English belong to the same family of languages. So, it’s not like I’m trying to translate between two languages that belong to totally unrelated families such as Turkish and English.

But, what’s the **nature** of the error and other errors that would belong to this **class of errors**? What does the system “**know**” (not only about this particular Dutch word), but what’s **its knowledge about its knowledge**? In other words, what’s the **meta-knowledge** of Google Translate, and can we even meaningfully talk about this?

Apart from a human being explicitly labeling this as a mistake, can Google Translate learn that it made a mistake? What about the context?

Can Google itself make Google Translate learn from Google Search? They belong to the same company after all. And we know that Google, as well as Microsoft, have been working on **semantic knowledge graphs** for a long time (employing the brightest and hardest-working minds from the industry and academia), enabling them to have explicit and logical structures that also power their search engines. Before AI takes over, enslaves humanity, and puts most of the workforce out of work, maybe we should start by integrating the “smart” services of a big company, by making them learn from each other and from the experience of different domains managed by the same company? How difficult can it be? We’ll see if one day Google will have enough money to solve this. Until then, maybe we should cut through the hype, and re-read what various artificial intelligence and cognitive science researchers have to say, e.g. Douglas Hofstadter.

Maybe we should also continue to keep a critical perspective of statistical and black-box deep learning approaches to fundamental domains of human reasoning, and insist on methods for more explicit, causal automated reasoning systems that can tell something about themselves, provide us humans with a way to tell them their mistakes in a reasoned, structured way, and be able to deal with analogies, applying lessons learned from their mistakes to similar cases in similar classes.

“A simple system may or may not work. A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.” — John Gall

This law is essentially an argument in favour of underspecification: it can be used to explain the success of systems like the World Wide Web and Blogosphere, which grew from simple to complex systems incrementally, and the failure of systems like CORBA, which began with complex specifications. Gall’s Law has strong affinities to the practice of agile software development.

The second law is known as Sowa’s law of standards:

“Whenever a major organization develops a new system as an official standard for X, the primary result is the widespread adoption of some simpler system as a de facto standard for X.” — John F. Sowa

Like Gall’s law, The Law of Standards is essentially an argument in favor of under-specification. Examples include:

- The introduction of PL/I resulting in FORTRAN and COBOL becoming the de facto standards for scientific and business programming, respectively
- The introduction of Algol-68 resulting in Pascal becoming the de facto standard for academic programming
- The introduction of the Ada language resulting in C becoming the de facto standard for DoD programming
- The introduction of OS/2 resulting in Windows becoming the de facto standard for desktop OS
- The introduction of X.400 resulting in SMTP becoming the de facto standard for electronic mail
- The introduction of X.500 resulting in LDAP becoming the de facto standard for directory services