
Category Archives: python

Normality Testing: is it normal?


It is largely because of lack of knowledge of what statistics is that the person untrained in it trusts himself with a tool quite as dangerous as any he may pick out from the whole armamentarium of scientific methodology. –Edwin B. Wilson (1927), quoted in Stephen M. Stigler, The Seven Pillars of Statistical Wisdom.

Imagine you’re responsible for testing some aspects of a complex software product, and one of your colleagues comes up with the following request:

  • Hey, can you write a self-contained function that tests the results of software component X, and returns TRUE if the data set generated by X is normally distributed, and FALSE otherwise?

What’s a poor software developer to do?

Well, you cherish the fond memories of your first statistics class that you took more than 20 years ago, and say: “I’ll plot a histogram of the data, and see if it’s normal!”

But of course, in less than a second you realize that manual visual inspection of a plot will not make an automated test, not at all! So, as a brilliant software developer with a math background, you say: “Easy, I’ll just grab my secret weapon, that is, Python and its SciPy library, to smash through this little statistical challenge!” You’re happy that you can stand on the shoulders of giants and use a well-documented, simple function such as scipy.stats.normaltest.
Read the rest of this entry »
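As a rough sketch of what that first attempt might look like (this is not the code from the full post), the function below wraps scipy.stats.normaltest. The name looks_normal and the 0.05 threshold are assumptions made for illustration. Keep in mind that a large p-value only means the test failed to reject normality; it does not prove that the data are normal.

```python
import numpy as np
from scipy import stats


def looks_normal(data, alpha=0.05):
    """Return True if normaltest does NOT reject normality at level alpha.

    Hypothetical helper: the name and the default alpha are assumptions.
    scipy.stats.normaltest returns a (statistic, pvalue) pair.
    """
    statistic, p_value = stats.normaltest(data)
    return bool(p_value >= alpha)


# Clearly skewed sample data: the test should reject normality here.
rng = np.random.default_rng(42)
skewed = rng.exponential(scale=1.0, size=1000)
print(looks_normal(skewed))
```

The boolean answer hides the p-value and the sample size, which is exactly the kind of shortcut the opening quote warns about.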

 

Posted on September 11, 2019 in Math, Programlama, python, Science

 


Zen of GitHub and Python


For some readers it’s old news, but I’ve just discovered the Zen of GitHub API. It immediately reminded me of The Zen of Python, and of course I wanted to compile a list of GitHub’s own Zen koans. Therefore I wrote a short Python program to do the job: Read the rest of this entry »
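A minimal sketch of how such a program might work (not the code from the full post): GitHub's https://api.github.com/zen endpoint returns one koan as plain text per request, so calling it repeatedly and keeping the unique answers builds up a list. The function names, the User-Agent string, and the number of attempts are assumptions made for illustration.

```python
import urllib.request

ZEN_URL = "https://api.github.com/zen"


def fetch_zen(url=ZEN_URL):
    """Fetch a single koan as plain text (performs a network request)."""
    request = urllib.request.Request(
        url, headers={"User-Agent": "zen-collector-sketch"}  # hypothetical name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8").strip()


def collect_koans(fetch=fetch_zen, attempts=50):
    """Call fetch() repeatedly and return the unique koans, sorted."""
    koans = set()
    for _ in range(attempts):
        koans.add(fetch())
    return sorted(koans)


# Example usage (commented out because it needs network access):
# for koan in collect_koans(attempts=20):
#     print(koan)
```

Unauthenticated requests to the GitHub API are rate-limited, so the number of attempts is best kept small. Passing fetch as a parameter makes the collection logic easy to try without touching the network.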

 
2 Comments

Posted on June 4, 2019 in Programlama, python

 


GODISNOWHERE: A look at a famous question using Python, Google and natural language processing


Are there any commonalities among human intelligence, Bayesian probability models, corpus linguistics, and religion? This blog entry presents a piece of light reading for people interested in a combination of those topics.
You have probably heard the famous question:

       “What do you see below?”

            GODISNOWHERE

The stream of letters can be broken down into English words in two different ways, either as “God is nowhere” or as “God is now here.” You can find an endless set of variations on this theme on the Internet, but I will deal with this example in the context of computational linguistics and big data processing.


When I first read the beautiful book chapter titled “Natural Language Corpus Data”, written by Peter Norvig for the book “Beautiful Data”, I decided to run an experiment using Norvig’s code. In that chapter, Norvig showed a very concise Python program that ‘learned’ how to break down a stream of letters into English words, in other words, a program capable of ‘word segmentation’.

Norvig’s code, coupled with Google’s language corpus, is powerful and impressive; it is able to take a character string such as

“wheninthecourseofhumaneventsitbecomesnecessary”

and return a correct segmentation:


‘when’, ‘in’, ‘the’, ‘course’, ‘of’, ‘human’, ‘events’, ‘it’, ‘becomes’, ‘necessary’
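To give a feel for the idea, here is a small sketch in the spirit of unigram word segmentation. It is not Norvig's code: the word counts below are invented toy numbers rather than Google's corpus data, and the function names are chosen for illustration. Each candidate split is scored by the product of its word probabilities (summed as logarithms), unknown words are penalized heavily, and memoization avoids recomputing the same suffix.

```python
from functools import lru_cache
from math import log10

# Hypothetical toy counts, invented for this sketch (not real corpus data).
COUNTS = {
    "when": 60, "in": 200, "the": 300, "course": 20, "of": 180,
    "human": 25, "events": 20, "it": 120, "becomes": 25, "necessary": 50,
}
TOTAL = sum(COUNTS.values())


def log_prob(word):
    """Log10 probability of a word under the toy unigram model."""
    if word in COUNTS:
        return log10(COUNTS[word] / TOTAL)
    # Unknown words get a tiny probability that shrinks with their length.
    return log10(10 / (TOTAL * 10 ** len(word)))


def splits(text, max_len=20):
    """All (first, rest) pairs where first has between 1 and max_len letters."""
    return [(text[:i], text[i:]) for i in range(1, min(len(text), max_len) + 1)]


@lru_cache(maxsize=None)
def segment(text):
    """Return the most probable segmentation of text as a tuple of words."""
    if not text:
        return ()
    candidates = [(first,) + segment(rest) for first, rest in splits(text)]
    return max(candidates, key=lambda words: sum(log_prob(w) for w in words))


print(" ".join(segment("wheninthecourseofhumaneventsitbecomesnecessary")))
# prints: when in the course of human events it becomes necessary
```

With a vocabulary this small the answer is forced; the interesting cases are the ambiguous ones, where the result depends entirely on the corpus counts.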

But how would it deal with “GODISNOWHERE”? Let’s try it out in a GNU/Linux environment: Read the rest of this entry »

 
2 Comments

Posted on March 1, 2014 in Linguistics, Programlama, python

 


Is Semantic Web and Linked Data Good Enough? SPARQL & DBPedia vs. Python & IMDbPY


Semantic Web & Linked Data: Technology of the future? Hopefully.

The inspiration for this short article is a simple question my wife asked while we were enjoying a recent episode of Continuum, a Canadian science-fiction series:

Q1. Hey, Emre, isn’t the girl who is playing Kiera’s grandmother the same girl who played Rosie Larsen in The Killing?

I said that I believed so and that it would be very easy to get the definitive answer via IMDb. Before I finished my sentence, though, it occurred to me that this would be a nice test to evaluate the current state of the Semantic Web and Linked Data. After all, how difficult would it be to query the wonderful world of linked data with a couple of SPARQL queries, and even go further by asking the following question:

Q2. Who are the actors that performed both in The Killing and Continuum?

What will be the state of the Semantic Web and Linked Data 65 years from now?


After all, the Semantic Web and Linked Data, coupled with DBpedia, can easily tell us the actors that starred in Hoffa and The Shining, right? Simply run the following SPARQL query using http://live.dbpedia.org/sparql:

Read the rest of this entry »
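The query itself is in the full post. Purely as an illustration of the kind of query involved, the sketch below builds a SPARQL query for the actors that two films share, plus a request URL for the endpoint named above. Several things here are assumptions: that the intended question is the intersection of the two casts (by analogy with Q2), that the films are linked to actors through DBpedia's dbo:starring property, that the resource URIs are the ones shown, and that the endpoint accepts a format parameter. The function names are hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://live.dbpedia.org/sparql"  # endpoint mentioned in the post

# Assumed resource URIs; check them against DBpedia before relying on them.
HOFFA = "http://dbpedia.org/resource/Hoffa"
THE_SHINING = "http://dbpedia.org/resource/The_Shining_(film)"

QUERY_TEMPLATE = """PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT DISTINCT ?actor WHERE {{
  <{film_a}> dbo:starring ?actor .
  <{film_b}> dbo:starring ?actor .
}}"""


def build_query(film_a, film_b):
    """Return a SPARQL query selecting actors listed for both films."""
    return QUERY_TEMPLATE.format(film_a=film_a, film_b=film_b)


def build_url(query, endpoint=ENDPOINT):
    """Return a GET URL that submits the query and asks for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


query = build_query(HOFFA, THE_SHINING)
print(query)
print(build_url(query))
```

The sketch only builds the request; actually fetching the URL and comparing the answer with IMDb is where the state of the linked data gets tested.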

 
6 Comments

Posted on July 11, 2012 in Programlama, python

 


Yet another reason to like open source and GitHub: A short GitHub badge story


GitHub Badge is a nice little web utility that gives a 10,000-foot overview of your GitHub activity. It has a very high data-ink ratio, showing the programming languages you employ frequently, as well as some valuable statistics about your source code repositories. It is being developed out in the open by fellow hackers berkerpeksag and BYK.

GitHub Badge


This cute little application was the subject of well-deserved enthusiasm when I first saw it, and one of my immediate reactions was: “Hey, why don’t you add sparklines? Now, that would be cool, right?” Apparently the developers listened and went on to spread more information visualization love. Well, now, don’t you consider this yet another reason to be fond of open source and interactive development, which is becoming more popular every single day thanks to the success of systems such as GitHub?

Oh, by the way, no, I’m not on vacation. Really. Err, maybe regarding my GitHub projects but… (Maybe I should open another issue for that 😉)

 

Posted on January 30, 2012 in General, Programlama, python

 

Turkish Deasciifier: Google App Engine version ready


Turkish Deasciifier - Google App Engine version


The Google App Engine version of Turkish Deasciifier is ready. You can try it at http://turkceyap.appspot.com/

This is version 0.1 and is the actual Python implementation. I have created this so that people who don’t want to install a Firefox add-on or Google Chrome extension can give it a try, too.

For related news and updates you can check http://ileriseviye.org/blog/?tag=turkish-deasciifier

 

Posted on July 31, 2010 in Linguistics, Programlama, python

 


turkish-deasciifier: Added to Softpedia


I’ve just received an e-mail from the Softpedia Editorial Team about my Python implementation of Deniz Yüret’s Turkish deasciifier:

Congratulations,

Turkish Deasciifier, one of your products, has been added to Softpedia’s database of software programs for Linux. It is featured with a description text, screenshots, download links and technical details on this page:
http://linux.softpedia.com/get/Text-Editing-Processing/Others/Turkish-Deasciifier-58739.shtml

The description text was created by our editors, using sources such as text from your product’s homepage, information from its help system, the PAD file (if available) and the editor’s own opinions on the program itself.

For related posts please visit http://ileriseviye.org/blog/?tag=turkish-deasciifier

 

Posted on July 24, 2010 in Programlama, python

 
