I’m in the process of creating the ultimate USB multi-boot stick for personal and professional use. So far I have experimented a little with Parted Magic, Clonezilla, and grml, and I want to note down the useful sites:
Monthly Archives: August 2010
I’m happy to see that the Turkish Deasciifier Firefox add-on is being downloaded every day and used regularly by people who like or need it. I plan to add some features, but currently I’m waiting for the Jetpack SDK developers to solve some technical problems.
PS: Those fancy graphics are part of the Mozilla add-on management pages, and the charting is done by the Simile Timeline component.
Here’s a short list compiled by Ahmet A. Akın:
- Unsupervised Search for The Optimal Segmentation for Statistical Machine Translation (Coşkun Mermer, Ahmet Akın)
- Syntax-to-Morphology Mapping in Factored Phrase-Based Statistical Machine Translation from English to Turkish (Reyyan Yeniterzi, Kemal Oflazer)
- Annotating Subordinators in the Turkish Discourse Bank (Deniz Zeyrek, Ümit Turan, Cem Bozşahin, Ruket Cakıcı, Ayışığı Sevdik-Çallı, Işın Demirşahin, Berfin Aktaş, İhsan Yalçınkaya, Hale Ögel)
- A Stochastic Finite-State Morphological Parser for Turkish (Haşim Sak, Tunga Güngör, Murat Saraçlar)
- Collocation Extraction in Turkish Texts Using Statistical Methods (Senem Kumova Metin, Bahar Karaoğlan)
- A Freely Available Morphological Analyzer for Turkish (Çağrı Çöltekin)
“TRmorph is a relatively complete morphological analyzer for Turkish. It is implemented using SFST, and uses a lexicon based on (but heavily modified) the word list from Zemberek spell checker. The morphological analyzer is distributed under the GPL.
To use the analyzer you need SFST. As well as the full source code, a compiled fsa, suitable to be used with SFST’s fst-mor or fst-infl, is included. A UNIX makefile is provided for easy compilation from the sources (see the included README file for details). The analyzer is fairly complete, however, it may not be easy on unaccustomed eyes. Documentation and cleanup work is going on, you may want to visit soon to get a newer version.”
For details and a live demo see http://www.let.rug.nl/~coltekin/trmorph/ and http://www.let.rug.nl/~coltekin/papers/coltekin-lrec2010.pdf
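The quoted instructions boil down to a very short session. Here is a sketch only — the automaton file name (`trmorph.a`) and the exact invocations are assumptions based on SFST’s standard tools, so check the README shipped with TRmorph:

```shell
# Hypothetical analysis session with SFST tools (assumes SFST is installed
# and the compiled automaton is named trmorph.a -- both are assumptions,
# see the TRmorph README for the real file names):
#
#   $ echo 'evlerde' | fst-mor trmorph.a    # analyze words read from stdin
#   $ fst-infl trmorph.a wordlist.txt       # batch analysis of a word list
#
# Both tools print the morphological analyses found for each input word,
# and report a failure for words not covered by the lexicon.
```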
For some relevant natural language processing resources please see Resources for Turkish morphological processing, Morphological Disambiguation of Turkish Text with Perceptron Algorithm and http://denizyuret.blogspot.com/2006/11/turkish-resources.html.
One of the most famous types of fish in Turkey, namely Hamsi (anchovy), can now become very famous in the world of cryptography and security as well: a Turkish mathematician pursuing a Ph.D. at K.U. Leuven has proposed a cryptographic hash algorithm of the same name to NIST.
As we’re getting very close to the finals I’m eagerly waiting to hear about the winner.
PS: It is also a pleasure to see the name of Alp Öztarhan, a colleague of mine, who seems to have implemented the first version of the algorithm.
I’ve just realized that the default filters installed with fail2ban in Ubuntu GNU/Linux do not help you when you use Digest Authentication with Apache. In order to have the most basic measure against brute-force attacks on a digest-authentication-enabled web service, you need to modify
/etc/fail2ban/filter.d/apache-auth.conf. I have tried the suggestion given at the fail2ban wiki (http://www.fail2ban.org/wiki/index.php/Talk:Apache) and it seems to work:
Once you add the suggested line to the apache-auth.conf file, try entering wrong username / password combinations when you are presented with the authentication window, and then check whether fail2ban detects them (I’m assuming your log files are at their usual locations):
$ fail2ban-regex /var/log/apache2/error.log /etc/fail2ban/filter.d/apache-auth.conf
If it reports success and you can see that the relevant IP addresses are matched, you can restart your fail2ban server and enjoy one more level of protection.
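To see what such a failregex is actually matching, you can approximate it with plain grep before involving fail2ban at all. The log line and the pattern below are illustrative assumptions (your Apache error-log format may differ); in a real failregex, fail2ban replaces a `<HOST>` placeholder with an IP-capturing group where the pattern below hard-codes one:

```shell
# Hypothetical Apache error-log line for a failed digest login
# (the exact format is an assumption; check your own error.log):
line='[Sun Aug 15 12:00:00 2010] [error] [client 192.0.2.7] Digest: user bob: password mismatch: /secure/'

# grep -E with an explicit IP pattern approximates what a digest-auth
# failregex would match; -o prints only the matched portion:
echo "$line" | grep -oE '\[client [0-9.]+\] Digest: user .*: password mismatch'
```

If the grep prints the `[client …]` fragment with the offending IP, an analogous failregex should match the same lines once added to apache-auth.conf.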
I’ll try to attend the international CALL (Computer Assisted Language Learning) conference at the University of Antwerp for the next three days:
Keynote speakers Antonie Alm (University of Otago, New Zealand), Maarten Vansteenkiste (Ghent University, Belgium) and Ema Ushioda (Warwick University, United Kingdom) will provide an overview of literature on motivation, an introduction to Self-Determination Theory and a presentation of the L2 SELF model.
Here are some highlighted topics from the conference:
* the impact of ICT on motivation;
* designing for motivation;
* the role of ICT in the analysis of motivation;
* the relationship between motivation and proficiency level;
* learning styles;
* teacher motivation.
Some more live carillon performances, this time from Ghent. Close your eyes to stay away from the distraction of the videos and enjoy the exciting, rich timbre of the bells:
Ancient Symbols, Computational Linguistics, and the Reviewing Practices of the General Science Journals
The strongest criticism yet has been leveled against one of the most controversial and recently popular pieces of research that used computers to make sense of ancient symbols. The issue was made famous by WIRED’s “Artificial Intelligence Cracks Ancient Mystery” article. Richard Sproat’s forceful criticism of misusing statistical methods to detect whether a sequence of symbols constitutes a language is worth reading: “Ancient Symbols, Computational Linguistics, and the Reviewing Practices of the General Science Journals”.
– UPDATE: Rao’s answer to this criticism can be read at Rebuttal of Sproat, Farmer, et al.’s supposed “refutation”. Also see http://indusresearch.wikidot.com/script –
“Few archaeological finds are as evocative as artifacts inscribed with symbols. Whenever an archaeologist finds a potsherd or a seal impression that seems to have symbols scratched or impressed on the surface, it is natural to want to ‘read’ the symbols. And if the symbols come from an undeciphered or previously unknown symbol system it is common to ask what language the symbols supposedly represent and whether the system can be deciphered.
Of course the first question that really should be asked is whether the symbols are in fact writing. A writing system, as linguists usually define it, is a symbol system that is used to represent language. Familiar examples are alphabets such as the Latin, Greek, Cyrillic, or Hangul alphabets, alphasyllabaries such as Devanagari or Tamil, syllabaries such as Cherokee or Kana, and morphosyllabic systems like Chinese characters. But symbol systems that do not encode language abound: European heraldry, mathematical notation, labanotation (used to represent dance), and Boy Scout merit badges are all examples of symbol systems that represent things, but do not function as part of a system that represents language. Whether an unknown system is writing or not is a difficult question to answer.