
How to activate hotplugged / newly added RAM in Linux?


These days I’m busy helping one of our clients build a data platform for their renewable energy project in their own data center using Nutanix. I asked their tech support to upgrade the RAM and CPU cores of one of the virtual machines that was already running Debian GNU/Linux.

Should I buy this htop t-shirt, or go on a vacation? 😉

When they informed me that they had increased the number of CPU cores and the amount of RAM on the Nutanix side, I rebooted the server. To my surprise, even though I could see the correct number of CPU cores in htop, the amount of RAM seemed to have stayed the same! Where was the missing RAM? The Nutanix management system showed that the requested amount of RAM was allocated to the server, but unlike the newly added CPU cores, we simply couldn’t see the expected amount of RAM from within the virtual machine running Debian GNU/Linux.

After a brief investigation, we discovered that this had to do with the memory hotplug mechanism of the Linux kernel: lsmem listed the memory ranges, with the ones corresponding to the missing amount marked as offline.

I found out that it was possible to bring the offline memory ranges online (and vice versa) using the chmem utility, e.g.:
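The same online/offline state is also exposed by the kernel through sysfs, so it can be inspected without lsmem or chmem. Below is a minimal Python sketch, not the command from the original post. It assumes the standard sysfs layout `/sys/devices/system/memory/memoryN/state`; the function names are made up for this illustration, and writing to `state` requires root.

```python
from pathlib import Path

# Standard sysfs location of memory blocks on kernels built with memory hotplug.
SYSFS_MEMORY = Path("/sys/devices/system/memory")


def memory_block_states(root=SYSFS_MEMORY):
    """Return {block_name: state} for every memoryN directory under root."""
    states = {}
    for block in sorted(Path(root).glob("memory[0-9]*")):
        state_file = block / "state"
        if state_file.is_file():
            # The file holds a single word such as "online" or "offline".
            states[block.name] = state_file.read_text().strip()
    return states


def offline_blocks(root=SYSFS_MEMORY):
    """List the memory blocks that are not currently online."""
    return [name for name, state in memory_block_states(root).items()
            if state != "online"]


def bring_online(block_name, root=SYSFS_MEMORY):
    """Ask the kernel to online one block (hypothetical helper; needs root)."""
    (Path(root) / block_name / "state").write_text("online")
```

Calling `offline_blocks()` on the affected machine should list the blocks that lsmem reports as offline; `bring_online("memoryN")` has the same effect as echoing `online` into that block's `state` file.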

Read the rest of this entry »
 
Posted on June 17, 2021 in Debian, Linux, sysadmin

A new data structure in town: Maple Tree


Thanks to a recent post on lwn.net, I learned about a new data structure: Maple Tree. Apparently, it’s been in development for the last 1.5 years: “The Maple Tree is a new data structure for Linux that provides an efficient way to store index ranges which map to a single pointer. It is RCU-safe and optimised for modern CPUs. For this application, it outperforms both the existing rbtree and radix tree data structures. The API is inspired by the XArray, and is significantly easier to use than the rbtree. This talk will cover the details of the implementation and show examples of users.”
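To make “index ranges which map to a single pointer” concrete, here is a toy Python sketch of that interface only. It is not the Maple Tree: it uses a sorted list with `bisect` instead of the kernel's RCU-safe tree nodes, and the names `RangeMap`, `store_range` and `load` are hypothetical, chosen just for this illustration.

```python
import bisect


class RangeMap:
    """Toy map from non-overlapping inclusive ranges [first, last] to a value."""

    def __init__(self):
        self._firsts = []   # sorted start indices, used for binary search
        self._entries = []  # parallel list of (first, last, value)

    def store_range(self, first, last, value):
        if first > last:
            raise ValueError("first must not exceed last")
        i = bisect.bisect_left(self._firsts, first)
        # The previous range must end before `first`,
        # and the next range must start after `last`.
        if i > 0 and self._entries[i - 1][1] >= first:
            raise ValueError("range overlaps an existing entry")
        if i < len(self._firsts) and self._firsts[i] <= last:
            raise ValueError("range overlaps an existing entry")
        self._firsts.insert(i, first)
        self._entries.insert(i, (first, last, value))

    def load(self, index):
        """Return the value whose range contains index, or None."""
        i = bisect.bisect_right(self._firsts, index) - 1
        if i >= 0:
            first, last, value = self._entries[i]
            if index <= last:
                return value
        return None
```

For example, after `m.store_range(10, 19, "a")`, every lookup from `m.load(10)` to `m.load(19)` returns the same object, while `m.load(20)` returns `None`.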

This is what I could find about this up-and-coming “Maple Tree” data structure for enhancing Linux performance:

The Linux Maple Tree – Matthew Wilcox, Oracle
Read the rest of this entry »
 
Posted on February 15, 2021 in Linux, Programming

Unix and Women


I’ve recently come across the names of two women who were active during the birth and early days of Unix, back in the 1970s and 1980s. For future reference, I wanted to note down information about these pioneering women.

“For many people, writing is painful and editing one’s own prose is difficult, tedious, and error-prone. It is often hard to see which parts of a document are difficult to read or how to transform a wordy sentence into a more concise one. It is even harder to discover that one overuses a particular linguistic construct. The system of programs described here helps writers to evaluate documents and to produce better written and more readable prose. The system consists of programs to measure surface features of text that are important to good writing style as well as programs to do some of the tedious jobs of a copy editor. Some of the surface features measured are readability, sentence and word length, sentence type, word usage, and sentence openers. The copy editing programs find spelling errors, wordy phrases, bad diction, some punctuation errors, double words, and split infinitives.”

“Computer aids for writers”, Lorinda Cherry, ACM SIGPLAN Notices, April 1981

Lorinda Cherry and Nina McDonald worked on Writer’s Workbench, among other things, at Bell Labs in the 1970s. I wish the utilities that made up Writer’s Workbench were still available and actively developed as free and open source software, maybe via GitHub (all I could find was this discussion on Hacker News).

According to M. Douglas McIlroy, Lorinda Cherry also contributed to another operating system: Plan 9.

Curious readers of the history of computing can learn more about these women in the following online resources:

Read the rest of this entry »
 
Posted on February 2, 2021 in Programming, History

Truth, correctness and utility: an example from Information Theory


I’ve come across the following while doing research on the “data processing inequality”:

From page 19 of “Elements of Information Theory”, Second Edition, 2006, Thomas M. Cover and Joy A. Thomas

As also stated in Scholarpedia’s “Mutual information” article, “Kullback-Leibler divergence is not a true distance: it is not symmetric, and it does not obey the triangle inequality (Cover and Thomas, 1991). It is not hard to show that D_KL(P(z) || Q(z)) is non-negative, and zero if and only if P(z) = Q(z).”
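The asymmetry is easy to check numerically. The following Python sketch computes the divergence in both directions for two small discrete distributions; the distributions `p` and `q` are arbitrary values picked for illustration and do not come from the book or the Scholarpedia article.

```python
import math


def kl_divergence(p, q):
    """D_KL(P || Q) in bits for two discrete distributions of equal length."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # P puts mass where Q has none
            total += pi * math.log2(pi / qi)
    return total


p = [0.5, 0.5]  # illustrative values only
q = [0.9, 0.1]

d_pq = kl_divergence(p, q)  # roughly 0.74 bits
d_qp = kl_divergence(q, p)  # roughly 0.53 bits
print(d_pq, d_qp, kl_divergence(p, p))
```

Both values are non-negative and the divergence of a distribution from itself is zero, but swapping the arguments changes the result, which is exactly why it fails to be a distance in the metric sense.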

I found this a striking example of a term that is not literally true and is mathematically wrong, while the concept remains “useful”, as stated by Cover and Thomas, as long as you are experienced and well aware of what you’re doing.

Further Reading:

 
Posted on February 2, 2021 in Books, Math

Diacritics restoration: can we do better using neural networks and deep learning? Perspectives from a 10-year-old open source project


People who need to write correctly in languages that have letters with various diacritics, such as ‘ğ’, ‘ş’, ‘ö’, ‘ı’, etc., can struggle with US or UK standard QWERTY keyboards because those layouts lack such letters. If you also need to switch between languages such as English and Turkish, you know what I mean.

Possible forms of diacritic restoration in Turkish for “aci”. Source: “Diacritic Restoration Using Recurrent Neural Network” by Ayşenur Genç Uzun

The process of taking a piece of writing without correct spelling (one that uses standard ASCII characters, without proper diacritics) and replacing the relevant letters with the correct ones is known as “diacritics restoration” or “diacritics reconstruction” (or, colloquially, “deASCIIfication”). About 10 years ago, I wrote a Python program to help people with this: Turkish Deasciifier, a port of the Emacs Lisp code developed by Prof. Deniz Yüret. There’s also a web interface at http://turkceyap.appspot.com.
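To give a feel for why the problem is hard, the Python sketch below only enumerates the possible Turkish spellings of an ASCII word, as in the “aci” example above. It is not the Turkish Deasciifier's algorithm: the `VARIANTS` table and the `candidate_forms` name are assumptions made for this illustration, lowercase letters only. Choosing the right candidate from context is the actual restoration task.

```python
from itertools import product

# ASCII letter -> letters it may stand for in Turkish (lowercase only).
VARIANTS = {
    "c": "cç",
    "g": "gğ",
    "i": "iı",
    "o": "oö",
    "s": "sş",
    "u": "uü",
}


def candidate_forms(ascii_word):
    """Return every spelling obtained by optionally restoring each letter."""
    choices = [VARIANTS.get(ch, ch) for ch in ascii_word]
    return ["".join(combo) for combo in product(*choices)]


print(candidate_forms("aci"))  # ['aci', 'acı', 'açi', 'açı']
```

The number of candidates doubles with every ambiguous letter, so a restoration system needs a statistical or rule-based way to pick one instead of listing them all.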

Read the rest of this entry »
 
Posted on October 22, 2020 in Linguistics, Programming, python, Science

What is Engineering? Perspectives from “The Sciences of the Artificial”


If you are an engineer, or an engineering manager responsible for designing complex, software-intensive systems, you will find a lot of food for thought in the following quotes from “The Sciences of the Artificial” by Nobel laureate and Turing Award recipient Herbert A. Simon. You might notice that the term ‘software’ never appears in the following quotations, and the word ‘program’ is mentioned only twice. Yet the issues, concerns, methods, and line of reasoning proposed by Simon can be used to attack the core challenges facing software engineers working on different systems and in diverse domains. I believe these quotes, as well as most of the rest of the book, deserve a deep, critical reading by generations of engineers.

“There is nothing special that needs to be said here about resource conservation—cost minimization, for example, as a design criterion. Cost minimization has always been an implicit consideration in the design of engineering structures, but until a few years ago it generally was only implicit, rather than explicit. More and more cost calculations have been brought explicitly into the design procedure, and a strong case can be made today for training design engineers in that body of technique and theory that economists know as “cost-benefit analysis.””

Read the rest of this entry »
 
Posted on October 6, 2020 in business, Management, Programming, Science

The Mozilla Common Voice Data Set and Turkish: Isn’t Something Odd Here?


Nowadays AI (Artificial Intelligence) applications continue to permeate every area of our lives: voice interfaces and smart assistants show up in many places. As in other areas of machine-learning-based AI, access to high-quality, validated, labeled open data sets matters in speech and voice technologies too, both to boost scientific work and innovation and to energize start-ups. It is hard to expect small start-ups to compete right away with the Internet and technology giants on this front. That is why projects such as Mozilla Common Voice, which collect data in this field and share it under open licenses, play an important role. Data sets that are high in quality and large in volume are a very important starting point for speech recognition (Speech to Text), speech synthesis (TTS, Text to Speech), and so on.

Mozilla recently announced a new release of the Common Voice speech data set (July 2020). It caught my attention because Turkish is among the languages for which voice data is collected. As someone who contributed to a similar project years ago, I took a closer look and ran into something that surprised me! Turkish, spoken by almost 80 million people worldwide, is represented in this data set by less data than even a language like Catalan, which has 10 times fewer speakers, never mind English: the data set contains 24 times more Catalan data than Turkish data, and 86 times more English data!

Read the rest of this entry »

 
Posted on July 3, 2020 in Linguistics, Programming, Science

Generative Deep Learning and Bach, a Good Fit?


If you’re like me, you know that there’s never “enough Bach” in one’s life and you can always tap into infinite musical curiosities based on Bach. Using Artificial Intelligence methods such as deep learning to “train” computers for music composition is one of the fascinating recent trends in this area, and applying these automated, statistical methods to Bach chorales is an active topic of research with interesting results. The book by David Foster, “Generative Deep Learning – Teaching Machines to Paint, Write, Compose, and Play“, has a chapter dedicated to using generative deep learning methods such as MuseGAN for music composition, and explains how such “generative” models can be trained on Bach’s real polyphonic compositions to output new musical pieces in the style of Bach.

Below is an original piece created by a Generative Adversarial Network (GAN), in particular the well-known MuseGAN architecture. The MuseGAN network was able to create this after training for only 1000 epochs, which took 2 hours on a modest laptop (without using GPUs), based on the data set at https://github.com/czhuang/JSB-Chorales-dataset (a set of 229 Bach chorales). In other words, this is definitely not representative of the best that deep learning can achieve, because such a system can easily be trained for longer on much more powerful systems (see further examples below). The point of these examples is that you, too, can start to experiment with deep learning systems that begin to model musical aspects without explicit musical teaching, hard-coded rules in software, etc.

You can click on the image below to visit SoundCloud and listen to the MP3 file generated by MuseScore.

Example created by MuseGAN by randomly applying normally distributed noise vectors – Click to listen on SoundCloud

Among the actual Bach chorales in the data set, the “closest” one to the artificially generated example (“close” in the sense of Euclidean distance) can be seen below. Read the rest of this entry »
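Finding the “closest” chorale in the Euclidean sense can be sketched in a few lines of Python. This is an illustration, not the code used for the post: it assumes each piece has already been flattened into a numeric vector of the same length (for example, a sequence of MIDI pitch numbers), and the toy vectors below are made up.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_index(generated, dataset):
    """Index of the dataset vector nearest to the generated one."""
    return min(range(len(dataset)),
               key=lambda i: euclidean_distance(generated, dataset[i]))


# Made-up pitch sequences standing in for flattened scores.
generated = [60, 62, 64, 65]
dataset = [
    [60, 60, 60, 60],
    [60, 62, 64, 67],
    [72, 71, 69, 67],
]
print(closest_index(generated, dataset))  # 1
```

Euclidean distance on raw note values is a crude similarity measure, since it knows nothing about harmony or rhythm, but it is enough to check that a generated piece is not simply a copy of a training example.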

 
Posted on December 17, 2019 in Math, Music, Programming

The Level of High School Mathematics Education in France 220 Years Ago


Whenever new PISA (Programme for International Student Assessment) results are announced, or some journalist writes a piece on the latest state of the French baccalauréat exams, many people take a critical look at educational matters and make comparisons. I think a little example from the dusty pages of the history of mathematics can shed some light on the level of high school education in France back in the 1800s, that is, almost 220 years ago. Who knows, it might even give some inspiration to people who want to check their standards.

The example is about the famous German mathematician Gauss: he wrote a remarkable book in 1798, humbly titled “Disquisitiones Arithmeticae” (“Arithmetical Investigations”). The book was first published in 1801, and only 6 years later, in 1807, it was translated into French and published as “Recherches arithmétiques”.

The translator of this important book was Antoine Charles Marcelin Poullet-Delisle, a math teacher at a high school: Lycée d’Orléans. Another French high school teacher, Louis Poinsot, wrote a long review of the translation in a daily newspaper on Saturday, 21 March 1807. Like the French translator of Gauss’s book, Poinsot was a high school mathematics teacher; he taught at Lycée Bonaparte in Paris.

The archives of the daily newspaper where Poinsot published his review of “Recherches arithmétiques” are available online at DigiNole Home » FSU Digital Library » Napoleonic Collections » Le Moniteur universel » Moniteur universel

And you can read the review on the second page of the newspaper: Read the rest of this entry »

 
Posted on December 11, 2019 in Math, History

How to confuse Google Translate by simply adding a newline?


When you have the most popular and successful computer-based translation service in the world, used by millions of people every day, it’s inevitable that very interesting cases will be discovered. Let’s take the following question:

  • Can simply adding a “newline” character change the translation of a word?

This sounds weird, because for a human being, the obvious reaction would be:

  • What does that even mean? Probably you’ve accidentally hit ENTER or something, and that can’t possibly affect the meaning of a word, why do you even ask that?

Well, if the translation system in question is based on statistical natural language processing and neural network algorithms such as deep learning, then things get a little more complex. Let’s first look at a sentence without any superfluous newline inserted:

and now, let’s hit ENTER right after the Dutch word “afzetzone”, to see the translation change magically:

The point here is not whether the word “afzetzone” is translated correctly, but rather how its translation can change simply by adding one more “white space” after the word.

If you’re a lay person, you’ll probably be baffled by this example; if you’re an NLP expert specializing in deep learning techniques, you’ll probably scratch your head and then smile; and if you’re one of the scientists or engineers actually working on debugging the Google Translate software, well, then you might have a different reaction. 😉

All in all, keep in mind that in today’s technological landscape, there are super complex systems behind simple interfaces, and such “glitches” barely scratch the surface, providing a small, opaque glimpse into a popular Artificial Intelligence product.

About the author: Emre is the co-founder & CTO of TM Data ICT Solutions in Belgium. You can read more about him on the About page of this blog.

 
Posted on November 8, 2019 in Linguistics, Programming, Science