How to confuse Google Translate by simply adding a newline?

When you have the most popular and successful computer-based translation service in the world, used by millions of people every day, it’s inevitable that some very interesting cases will be discovered. Let’s take the following question:

  • Can simply adding a “newline” character change the translation of a word?

This sounds weird, because for a human being, the obvious reaction would be:

  • What does that even mean? Probably you’ve accidentally hit ENTER or something, and that can’t possibly affect the meaning of a word, why do you even ask that?

Well, if the translation system in question is based on statistical natural language processing and neural network algorithms such as deep learning, then things get a little more complex. Let’s first look at a sentence without any superfluous newline inserted:

And now, let’s hit ENTER right after the Dutch word “afzetzone”, to see the translation change magically:

The point here is not whether the word “afzetzone” is translated correctly, but rather why its translation changes when we simply add one more whitespace character after the word.

If you’re a lay person, you’ll probably be baffled by this example. If you’re an NLP expert specializing in deep learning techniques, you’ll probably scratch your head and then smile. And if you’re one of the scientists or engineers actually working on debugging the Google Translate software, well, then you might have a different reaction. 😉

All in all, keep in mind that in today’s technological landscape, there are super complex systems behind simple interfaces. Such “glitches” barely scratch the surface, offering only a small, opaque glimpse into a popular Artificial Intelligence product.

Normality Testing: is it normal?

It is largely because of lack of knowledge of what statistics is that the person untrained in it trusts himself with a tool quite as dangerous as any he may pick out from the whole armamentarium of scientific methodology. –Edwin B. Wilson (1927), quoted in Stephen M. Stigler, The Seven Pillars of Statistical Wisdom.

Imagine you’re responsible for testing some aspects of a complex software product, and one of your colleagues comes up with the following request:

  • Hey, can you write a self-contained function that tests the results of software component X, and returns TRUE if the data set generated by X is normally distributed, and FALSE otherwise?

What’s a poor software developer to do?

Well, you cherish the fond memories of your first statistics class that you took more than 20 years ago, and say: “I’ll plot a histogram of the data, and see if it’s normal!”

But of course, in less than a second you realize that manual visual inspection of a plot will not make an automated test, not at all! So as a brilliant software developer with a math background, you say, “easy, I’ll just grab my secret weapon, that is, Python and its SciPy library, to smash through this little statistical challenge!” You’re happy that you can stand on the shoulders of giants, and use a well-documented, simple function such as scipy.stats.normaltest.
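
For illustration, here is a minimal sketch of what such a naive TRUE/FALSE function might look like with scipy.stats.normaltest. The function name looks_normal, the significance level alpha=0.05, and the synthetic samples are assumptions made for this example; they are not part of the original request. Also keep in mind that a large p-value only means the test failed to reject normality, which is not the same as proving that the data is normal.

```python
import numpy as np
from scipy import stats


def looks_normal(data, alpha=0.05):
    """Return True if the D'Agostino-Pearson test does not reject normality.

    alpha is an assumed significance level, chosen only for illustration.
    """
    # normaltest combines skewness and kurtosis into a single statistic;
    # the null hypothesis is that the sample comes from a normal distribution.
    statistic, p_value = stats.normaltest(data)
    return bool(p_value >= alpha)


# Synthetic data standing in for the output of "component X" (hypothetical).
rng = np.random.default_rng(42)
normal_sample = rng.normal(loc=0.0, scale=1.0, size=1000)
skewed_sample = rng.exponential(scale=1.0, size=1000)

print("normal sample passes:", looks_normal(normal_sample))
print("skewed sample passes:", looks_normal(skewed_sample))
```

A strongly skewed sample such as the exponential one is rejected, but even a sample that really is drawn from a normal distribution will be flagged in roughly a fraction alpha of runs, so a test like this is only a starting point.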

What was the state of AI in Europe almost 70 years ago?

When it comes to the history of Artificial Intelligence (AI), even a simple Internet search will tell you that the defining event was “The Dartmouth Summer Research Project on Artificial Intelligence”, a summer workshop held in 1956 at Dartmouth College in the United States. What is less known is that five years before Dartmouth, in 1951, there was a conference in Europe. The conference, held in Paris, was “Les machines à calculer et la pensée humaine” (Calculating machines and human thinking). It can easily be considered the earliest major conference on Artificial Intelligence. Supported by the Rockefeller Foundation, its participant list included intellectual giants of the field such as Warren Sturgis McCulloch, Norbert Wiener, Maurice Vincent Wilkes, and others.

The organizer of the conference, Louis Couffignal, was also a mathematician and cybernetics pioneer, who had already published a book titled “Les machines à calculer. Leurs principes. Leur évolution.” (Calculating machines. Their principles. Their evolution.) in 1933. Another highlight of the conference was El Ajedrecista (The Chess Player), designed by the Spanish civil engineer and mathematician Leonardo Torres y Quevedo. There was also a presentation based on practical experiences with the Z4 computer, designed by Konrad Zuse and operated at ETH Zurich. The presenter was none other than Eduard Stiefel, co-inventor of the conjugate gradient method, among other things.

The field of AI has come a long way since 1951, and it is safe to say it’s going to penetrate into more aspects of our lives and technologies. It’s also safe to say that, like many technological and scientific endeavors, progress in AI is the result of many bright minds in many different countries, with the USA and the UK generally regarded as the places that contributed a lot. But it’s also important to recognize lesser-known facts such as this Paris conference in 1951, and to realize the strong tradition in Europe: not only the academic, research and development track, but also the strong industrial and business tracks. Historical artifacts in languages other than English inevitably receive less recognition, but they should be a reason to cherish diversity and variety. I believe all of these aspects combined should guide Europe in its quest to advance the state of the art in AI, in terms of software, hardware, and combined systems.

This article is heavily based on and inspired by the following article by Herbert Bruderer, a retired lecturer in didactics of computer science at ETH Zürich: “The Birthplace of Artificial Intelligence?”

Lost in Google Translate: How Unreasonable Effectiveness of Data can Sometimes Lead Us Astray

I recently received an e-mail in Dutch from the Belgian teacher of my 7.5-year-old son, and even though my Dutch is more than enough to understand what his teacher wrote, I also wanted to check it with Google Translate, out of habit and because of my professional/academic background. This led to an interesting discovery and made me think once again about artificial intelligence, deep learning, automatic translation, statistical natural language processing, knowledge representation, commonsense reasoning and linguistics.

But first things first, let’s see how Google Translate translated a very ordinary Dutch sentence into English:

Interesting! It is obvious that my son’s teacher didn’t have anything to do with a grinding table (!), and even if he did, I don’t think he’d involve his class with such interesting hobbies. 🙂 Of course, he meant the “multiplication table for 3”.

Then I wanted to see what the giant search engine, Google Search itself, knows about the Dutch word “maaltafel”. I immediately saw that Google Search knows very well that “maaltafel” in Dutch means “multiplication table” in English. Not only that, but on the first page of search results you can see the expected Dutch expression occurring 47 times. Nothing surprising here:


A visit to the largest computer museum in the world: The Heinz Nixdorf MuseumsForum

It all started more than seven years ago, when I read a short article in the January 2010 issue of Communications of the ACM, titled “Great Computing Museums of the World (Part One)”.

“The Heinz Nixdorf MuseumsForum (HNF) in Paderborn, Germany, is the world’s largest computer museum. The museum, which is also an established conference center, showcases the history of information technology—beginning with cuneiform writing and going right through to the latest developments in robotics, artificial intelligence, and ubiquitous computing.

The multimedia journey through time takes visitors through 5,000 years of history, starting with the origins of numbers and writing in Mesopotamia in 3000 B.C. and covering the entire cultural history of writing, calculating, and communications. Alongside typewriters and calculating machines, the exhibition shows punched card systems, a fully functioning automatic telephone exchange system from the 1950s, components from the earliest computer (which filled a whole room), over 700 pocket calculators, and the first PCs. Work environments from different centuries are also staged in the exhibition.

The exhibition highlights include fully functioning replicas of the Leibniz calculating machine and the Hollerith tabulating machine, a Thomas Arithmometer dating from 1850, a Jacquard loom operated with punched tape, components of the ENIAC from 1945, the on-board computer from the Gemini space capsule, the Apple 1, a LEGO Turing machine, and Europe’s largest collection of cipher machines. One of the current attractions at HNF is the world’s most famous automaton: Wolfgang von Kempelen’s chess playing machine, the Chess Turk, which dates from the 18th century.”

I was more than impressed, and wanted to visit Paderborn to see the world’s largest computer museum. I knew it was just a few hours away by car from Antwerp, but I kept postponing the trip for various reasons. I didn’t want to go there alone, and I knew I needed someone like-minded enough to accompany me on this “nerdy” journey. Finally, last week, a physicist / data scientist friend of mine and I decided to go there, notwithstanding the weather conditions and the very snowy German highways.

I think this is the only museum where digital relics from my childhood and youth (1980s and 1990s) are considered as museum-worthy as replicas of 5,000-year-old Sumerian tablets! 🙂 It was pure joy and fascination to visit the halls of the museum, and to be guided by very thematic, knowledgeable and gentle robots. One of them, Victoria, was a sight to be seen! The other one was also great, and you can watch “him” in action:

After the course: Tales from the Genome, Introduction to Genetics & A Few Resources

Now that I’ve finished the Tales from the Genome, Introduction to Genetics course, I’d like to note some of the related resources. Some of the links are related to a company that sponsored the course; it is the same company whose genetic analysis kit I used to learn more about my genome and the mutations I have. Unfortunately, in the meantime I have also learned that they were forced to stop selling their kits. Luckily, I already had my results before that happened.
