For some readers this is old news, but I’ve just discovered the Zen of GitHub API. It immediately reminded me of The Zen of Python, and of course I wanted to collect the full list of GitHub’s Zen koans. So I wrote a short Python program to do the job.
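The program itself is not reproduced in this excerpt, so what follows is only a minimal sketch of the idea, not the original code. It assumes that the `https://api.github.com/zen` endpoint returns one random Zen sentence as plain text per request; the names `fetch_zen` and `collect_koans` and the `attempts` parameter are made up for illustration.

```python
import urllib.request

# Assumption: this endpoint answers with one random Zen sentence as plain text.
ZEN_URL = "https://api.github.com/zen"


def fetch_zen(url=ZEN_URL):
    """Return one Zen sentence as text (performs a network request)."""
    request = urllib.request.Request(url, headers={"User-Agent": "zen-collector"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8").strip()


def collect_koans(fetch=fetch_zen, attempts=30):
    """Call `fetch` repeatedly and return the distinct sentences, sorted."""
    seen = set()
    for _ in range(attempts):
        seen.add(fetch())
    return sorted(seen)
```

Calling `collect_koans()` and printing each item gives the sentences seen so far. Since every request returns a random sentence, a single run may well miss some of them, and unauthenticated requests to the GitHub API are rate limited, so it makes sense to keep `attempts` modest.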
When using Emacs, I don’t spend much time thinking about fonts. Like the majority, I pick my favorite fixed-width, monospace font and get on with it. Every now and then I hear about some cool new font for reading lots of source code and technical writing, and I might give it a try, but that’s the end of it.
But sometimes, you just want to have an overview and see everything summed up in a single place, preferably an Emacs buffer so you can also play with it and hack it. Of course, your GNU/Linux, macOS, or MS Windows will happily show you all the available fonts, and let you filter out the fixed-width ones suitable for programming. Emacs itself can also do something very similar. But as I said, why not have something according to your taste?
With a bit of Emacs Lisp, it doesn’t seem that difficult, at least on GNU/Linux:
The result of running compare-monospace-font-families can be seen in the following screenshot:
I’ve recently received an e-mail in Dutch from the Belgian teacher of my 7.5-year-old son, and even though my Dutch is more than enough to understand what his teacher wrote, I also wanted to check it with Google Translate out of habit and because of my professional/academic background. This led to an interesting discovery and made me think once again about artificial intelligence, deep learning, automatic translation, statistical natural language processing, knowledge representation, commonsense reasoning and linguistics.
But first things first, let’s see how Google Translate translated a very ordinary Dutch sentence into English:
Interesting! It is obvious that my son’s teacher didn’t have anything to do with a grinding table (!), and even if he did, I don’t think he’d involve his class with such interesting hobbies. 🙂 Of course, he meant the “multiplication table for 3”.
Then I wanted to see what the giant search engine, Google Search itself, knows about the Dutch word “maaltafel”. And I immediately saw that Google Search knows very well that “maaltafel” in Dutch means “multiplication table” in English. Not only that, but on the first page of search results, you can see the expected Dutch expression occurring 47 times. Nothing surprising here:
The first is known as Gall’s law for systems design:
“A simple system may or may not work. A complex system that works is invariably found to have evolved from a simple system that worked. A complex system designed from scratch never works and cannot be patched up to make it work. You have to start over with a working simple system.” — John Gall
This law is essentially an argument in favour of underspecification: it can be used to explain the success of systems like the World Wide Web and Blogosphere, which grew from simple to complex systems incrementally, and the failure of systems like CORBA, which began with complex specifications. Gall’s Law has strong affinities to the practice of agile software development.
At first glance, counting the lines in a text file is super straightforward. You simply run `wc` (word count) with the `--lines` option. And that’s exactly what I’ve been doing for more than 20 years. But what I read recently made me question whether there are faster and more efficient ways to do that. Nowadays, with very large and fast storage, you can easily have text files of 1 GB, 10 GB, or even 100 GB. On top of that, your laptop has at least 2 cores, or maybe 4, which means 8 logical cores with hyper-threading. On a powerful server, it’s not surprising at all to have 16 or more CPU cores. So can this simple text processing be made more efficient, burning as many cores as are available and utilizing them to their maximum, to return the line count in a fraction of the time?
Here’s what I found:
- Parallel processing with unix tools
- Use multiple CPU Cores with your Linux commands — awk, sed, bzip2, grep, wc, etc.
- GNU Parallel example: Processing a big file using more CPUs
In the first link above, I came across an interesting utility:
- turbo-linecount: Super fast Line Counter
Apparently, the author of turbo-linecount decided to implement his solution in C++ (for Linux, macOS, and MS Windows). He uses memory mapping to map the text file into memory, and multi-threading to start threads that count the number of newlines (`\n`s) in different chunks of the memory region that corresponds to the file contents, finally returning the sum total of newlines as the line count. Even though there are some issues with that system, I think it’s still very interesting. Actually my initial reaction was “how come this nice utility is still not a standard package in most GNU/Linux distributions such as Red Hat, Debian, Ubuntu, etc.?”.
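To make the idea concrete, here is a rough Python sketch of the same approach: map the file into memory, split it into chunks, count the newlines of each chunk in a thread pool, and sum the results. This is not the turbo-linecount code. The function name `count_newlines` and the `workers` and `chunk_size` parameters are made up for illustration, and because of Python's global interpreter lock the threads here may not give a real speedup, so take it as an illustration of the structure rather than a benchmark candidate.

```python
import mmap
import os
from concurrent.futures import ThreadPoolExecutor


def count_newlines(path, workers=4, chunk_size=1 << 20):
    """Count newline bytes in the file at `path`, one chunk per task."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # an empty file cannot be memory-mapped

    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:

            def count_chunk(start):
                # Slicing copies the chunk out of the mapping as bytes;
                # a slice that runs past the end is simply shorter.
                return mm[start:start + chunk_size].count(b"\n")

            with ThreadPoolExecutor(max_workers=workers) as pool:
                # Chunks do not overlap, so the partial counts just add up.
                return sum(pool.map(count_chunk, range(0, size, chunk_size)))
```

Like `wc --lines`, this counts newline characters, so a last line without a trailing newline is not counted.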
Maybe we’ll have better options soon. Or maybe we already do? Let me know if there are better ways for this simple, yet frequently used operation.
My seven-year-old son will visit the opera house in Antwerp today with his classmates and teachers; he will see behind the scenes and talk with the people who work there. While dropping him off at school this morning, we talked a little about music and about what they would do today. After I dropped him off, some memories inevitably came back to me, and at the same time the harsh reality I live in as of 2018 came to mind.
I grew up in Istanbul, one of the oldest cities in the world, surrounded by an immensely rich historical, cultural, and archaeological heritage. It was in Istanbul that I started going to the opera and the ballet with my friends, first as a high school student and then as a university student. In the 1990s and early 2000s, an opera ticket was often cheaper than a cinema ticket. As some readers know well, the place I went to was the Atatürk Kültür Merkezi (Atatürk Cultural Center). It was an important part of our collective memory. For a long time I didn’t know what state it was in, and I have learned that as of 2018 it looks like the picture below. This is how we treat collective memory:
My seven-year-old son will visit the opera house in Antwerp today, together with his classmates and teachers, as part of his school activities. We talked about music and today’s activity as I was driving him to school this morning. This took me on a trip down memory lane, and back to the harsh realities of the world I live in as of 2018.
I grew up in Istanbul, one of the oldest cities in the world, with a very rich and complex historical, cultural, and archaeological heritage. In my city, I used to go to the opera and ballet, first as a high school student and then as a university student. In fact, opera tickets were generally cheaper than cinema tickets back in the 1990s and early 2000s. The opera house was named “Atatürk Cultural Center”. It was an important part of our collective memory. It has been demolished recently, and this is how it looks as of 2018. This is how collective memory is treated: