Book Review: Python for Data Analysis by Wes McKinney; O’Reilly Media

This review is about this item: Python for Data Analysis

=================================================================

I started reading this particular book feeling sceptical. Although most O’Reilly books I’ve read deliver, this one promises to introduce you to a vast field: Python’s various uses in data analysis. Does it deliver? Certainly!

Let me be more specific. In the interest of full disclosure, I should note that I got this book for free via O’Reilly’s Blogger Review program. I have some experience with Python and, ever since I was first exposed to it, I kept reading that Python is very powerful in data analysis, be it scientific computing, financial computing (up to a point, of course) or other areas. So, naturally, I wanted a book that would let me study Python’s use in this field. What hooked me even more is that this particular book was written by an expert in the field: its author is also the author of the pandas library. Now that I have finally got through it, here are my comments:

– First, the book gives you some information on why the data analysis field matters. For instance, it gives the example of using data analysis to prepare data sets that feed a machine learning algorithm.
– The book has short and concise (and above all, easy to follow) code examples that demonstrate the point of the text very quickly.
– The book provides several realistic use cases of the demonstrated content, so that you can get a good idea of what data analysis is all about.
– Covers (in varying degrees) XML parsing, interaction with HTML and databases. It even makes a small reference to MongoDB!
– It also covers string manipulation (including regular expressions) which is very nice!
– Has a whole chapter dedicated to plotting and visualization.
– Has several chapters on NumPy and pandas!
– Has a great chapter focusing on date and time data manipulation and the relevant modules in the Python library.
– Although this book is best read if you already have some Python knowledge and want to extend it, it also has an appendix that goes through the essentials of the Python programming language, so even Python beginners should feel comfortable with it.

Overall, I recommend this book if you want to get a good idea about Python’s usage in Data Analysis, whether you are a Python novice or a Python expert.

Happy new year!

I just wanted to wish all my readers a happy new year! I would also like to apologize for my long absence from writing. I will make up for it: this year I will present a new kind of article, book reviews!

Stay tuned guys ;P 

NlightNFotis:

I liked this man’s views on C++ so much that I wanted to share them with you guys.

Originally posted on Making the Complex Simple:

I love C++.

C++ taught me how to really write code.

Back in the day I would study the intricacies of the language, Standard Template Library, and all the nuances of memory management and pointer arithmetic.

Those were some seriously good times. I remember reading Scott Meyers’ Effective C++ book series over and over again. Each time I would learn something new or grasp more of how to use C++.

I’m saying all this just to let you know that I don’t hate C++.  I love C++.

There are plenty of excellent developers I know today that still use C++ and teach others how to use it and there is nothing at all wrong with that.

So what is the problem then?

The new message is wrong

C++11 just came out recently and it seems like there is this big resurgence in interest in C++ development.


Now don’t…


Pythonic Magic: Lambda expressions

All Justin Bieber needs is a Beauty and a Beat. Personally, I am happy with a Beauty and Python. Yeah, honestly, Python offers one so much that it’s difficult to ask for more (except for speed maybe, but that’s a tradeoff for Python’s amazing expressiveness, and people are working on it as we speak).

So, as you may already have understood, in this second article of my featured Pythonic Magic series, I am going to talk about lambda expressions. But what are lambda expressions, one might wonder. Lambda expressions are (in my opinion) one of the most powerful features of Python, but people tend to misunderstand them or ignore them altogether. Which is bad. ‘Cause they are awesome!

Lambda expressions are essentially a language construct that allows one to create anonymous (that is, not bound to any name) functions (or rather, function objects) at runtime. That’s it! Really! I will admit that their use may be hard for some new Python programmers to grasp, but once you get the hang of it, it’s really not that difficult. Before we go any further, it would be useful to provide an example of what a lambda expression looks like.

Suppose we want to create a function that returns the square of its input. In Python, we would do something like this:


>>> def square(x):
...     return x * x
...

Pretty straightforward if you ask me. Nothing magical in this. Now let’s take a look at the output the above function produces:


>>> square(3)
9
>>> square(2.2)
4.840000000000001

Again, nothing magical (if you ignore the stray digit after the run of zeros there, it’s all normal. That behavior is also normal: 2.2 has no exact binary floating point representation, so the result carries a tiny rounding error). Now that we got this, it’s time to see how the above could be implemented using a lambda expression.


>>> g = lambda x: x ** 2 # ** is the exponent operator in Python. I could have also written x * x, but I refrained from doing so, for clarity's sake
>>> g
<function <lambda> at 0x7f17656045f0>
>>> g(2)
4

Very nice if you ask me. As far as the lambda expression is concerned, it can do pretty much anything that the named function can do, as long as its body is a single expression. And as you can see in my code snippet, using the lambda keyword, you get a function object. Having a function object means that the object is callable (I mean, obviously, it’s a function), which in Python means that you can bind it to any name and still call it using parentheses, as in g(2).
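To make the "function object" point a bit more concrete, here is a minimal sketch. The names also_square and operations are made up for illustration; nothing here goes beyond plain Python.

```python
# A lambda is an ordinary function object, so it can be bound to
# several names, stored in containers, and passed around.
square = lambda x: x ** 2

also_square = square          # a second name for the same object
print(also_square is square)  # True: both names refer to one function object
print(callable(square))       # True

# Function objects can live inside data structures too.
# The operations dict and its keys are illustrative names only.
operations = {
    "square": square,
    "cube": lambda x: x ** 3,
}
print(operations["cube"](3))  # 27

# Unlike a def, the lambda itself carries no name of its own.
print(square.__name__)        # <lambda>
```

Binding a lambda to a name, as done here, is mostly useful for demonstration; when a function needs a name anyway, a plain def is usually the clearer choice.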

If it is still kind of obscure to you, then maybe it is worth trying to explain it a little better. Lambda expressions have this general form:

lambda [parameter list]: expression

This is essentially the keyword lambda, followed by a number of parameters, a colon, and an expression, the value of which gets returned when you call the lambda function object.


>>> g = lambda x: x ** 2 # x is the parameter; the expression x ** 2 computes its square, and that value gets returned when the object is called

Now, I may like to define it with one parameter, but there is really nothing stopping you from defining it with more than one. For example, take a look at the following:


>>> add = lambda x, y: x + y # named add rather than sum, so that the built-in sum() is not shadowed
>>> add(4, 2)
6

>>> # Essentially the same as
... def add(x, y):
...     return x + y
...
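Lambdas tend to show up most often as small throwaway arguments to other functions. The following is only a sketch of that usage: the sample data and the variable names are chosen for illustration, while sorted(), map() and filter() are regular Python built-ins.

```python
# Sample data for this sketch: (language, year of first release).
languages = [("Python", 1991), ("C", 1972), ("Java", 1995)]

# sorted() accepts a key function; a lambda saves defining a named helper.
by_year = sorted(languages, key=lambda pair: pair[1])
print(by_year)  # [('C', 1972), ('Python', 1991), ('Java', 1995)]

# map() and filter() also take a function as their first argument.
squares = list(map(lambda x: x ** 2, [1, 2, 3]))
evens = list(filter(lambda x: x % 2 == 0, range(10)))
print(squares)  # [1, 4, 9]
print(evens)    # [0, 2, 4, 6, 8]
```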

Of course, I have barely scratched the surface. However, I believe that lambdas are a very powerful feature which, although not terribly useful all the time, can be a killer feature when you really need it. If you feel like you want to know more about lambda expressions, I formed a neat list with various links that explain more than I do here:

Pythonic Magic: List Comprehensions

Disclaimer: This is the beginning of a series of articles in which I will pick some of the features of Python that I like the most and explain why I like them, along with a description and some usage of the feature at hand.

Disclaimer: I do not claim to be an expert in Python (or any other language for that matter). Anything that follows is my own personal opinion, and may even be wrong, as it depends on my understanding of the matter. However, I am doing my best to make sure that I do not include wrong information.

—————————————————————————————————————-

You may have heard that the Python programming language is easy to both learn and use. You should slap those that told you so, for blatantly lying to you. Python is not easy to learn. It is *very* easy. *Very very easy*. As a matter of fact, it’s so easy that even babies could learn Python. Hyperbole, but you get the point.

I am not going to go to great lengths about why you should learn Python. As a matter of fact, a Google search can give you many reasons to do so. However, the ultimate reason for one to learn Python is simply wanting to do so. Why you would want to is an entirely different matter, which I will not discuss here, or I will lose track of the original subject of this article. Which is: wait for it… List Comprehensions! Who could have imagined!

Before I go on to explain what list comprehensions are, maybe I should describe what lists in Python are. Lists are, well… lists. Pretty surprising, huh? No, on a more serious note, a list in Python is a data structure that acts as a container for any number of objects, in a given order. If you have already been introduced to programming via another language, you can think of a list as a variable-length array of objects. In fact, lists are much more powerful than that, in keeping with the Python tradition of making sure that you have all the tools you may possibly need right out of the box. Lists (and the other data structures that Python provides by default, such as dictionaries, tuples and other sequences) are so powerful and versatile that 99% of the time you will not need anything other than what Python already provides. Below is a demonstration of lists in Python.

>>> my_list = [3, 4, 5] # this creates a list with 3 integer objects
>>> my_list
[3, 4, 5]

>>> my_list = [] # this creates an empty list
>>> my_list
[]

>>> my_list.append(5) # this adds 5 to the list
>>> my_list
[5]

>>> my_list.insert(0, 3) # this inserts 3 at the first position; Python, like many other languages, is 0-indexed, so list indices begin at 0 instead of 1
>>> my_list
[3, 5]

>>> my_list.remove(3) # this removes the first item whose value is 3
>>> my_list
[5]

>>> item = my_list.pop() # removes the last object from the list, and returns it to be stored in item
>>> item
5

>>> my_list
[]

This pretty much sums up the basic usage of lists in Python. I won’t go any further, because that is beyond the scope of this article. If you would like to read more about lists and their usage, I recommend that you study the section on lists in the official Python tutorial.
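For the curious, here is a small sketch of a few more everyday list operations (indexing, slicing, membership tests). The list called things and its values are arbitrary, picked only for illustration.

```python
# A list keeps its items in order and can grow or shrink.
# things is an illustrative name; the values are arbitrary.
things = [3, "four", 5.0]      # items of different types can share a list

print(len(things))             # 3
print(things[0])               # 3      (first item, index 0)
print(things[-1])              # 5.0    (negative indices count from the end)
print(things[0:2])             # [3, 'four']  (a slice is a new list)
print("four" in things)        # True

things.extend([6, 7])          # append several items at once
print(things)                  # [3, 'four', 5.0, 6, 7]
```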

Now that we have talked about Lists, it’s time to talk about List Comprehensions. List Comprehensions are simply a sophisticated way to create Lists that have specific contents. If that doesn’t help you, maybe we should look at an example.

Let’s say we want to create a list of the even numbers from 0 up to (but not including) 10. Normally you would do something like this:

>>> my_list = [] # constructing an empty list
>>> for number in range(10):
...     if number % 2 == 0:
...         my_list.append(number)
...
>>> my_list
[0, 2, 4, 6, 8]

List comprehensions allow you to do just the same thing, only many times easier (and sexier, by the way):

>>> even_nums = [x for x in range(10) if x % 2 == 0]
>>> even_nums
[0, 2, 4, 6, 8]

Holy floppy McBaggies. Did you see that? Human words fail to describe the Beauty and the Epic-ness in this.
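Roughly speaking, a list comprehension has the shape [expression for item in iterable if condition], where the if part is optional. The sketch below shows the transform-only and transform-plus-filter variants next to an equivalent loop; the variable names are illustrative only.

```python
# General shape: [expression for item in iterable if condition]

# Transform every item: squares of 0..4.
squares = [n * n for n in range(5)]
print(squares)                 # [0, 1, 4, 9, 16]

# Transform and filter together: squares of the odd numbers below 10.
odd_squares = [n * n for n in range(10) if n % 2 == 1]
print(odd_squares)             # [1, 9, 25, 49, 81]

# The same result written as an explicit loop, for comparison.
loop_result = []
for n in range(10):
    if n % 2 == 1:
        loop_result.append(n * n)
print(loop_result == odd_squares)  # True
```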

You might be wondering now: how could I possibly use this to get the most out of it? Well, to be honest, there are many ways you could use this feature. Yours truly uses list comprehensions to create lists out of strings (which are immutable, meaning they cannot be changed in place) to allow for easier manipulation of those. For example:

>>> name = 'fotis'

>>> name[1] = 'u'
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment

>>> name_list = [letter for letter in name] # creating a list out of name's characters
>>> name_list
['f', 'o', 't', 'i', 's']
>>> name_list[1] = 'u'

>>> name = ''.join(name_list)
>>> name
'futis'

As you can see, we are talking about an extremely powerful feature that one can use to create lists whose items are computed on the fly. As for the possible use cases? Only your imagination can limit you.
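As a small sketch of what "computed on the fly" can look like, each item in a comprehension may be any expression, including a conditional one. The words list and the other names below are made up for illustration.

```python
# Sketch: computing list items on the fly from a string.
# The variable names and sample values are illustrative only.
name = "fotis"

# Replace the character at one position while building the list.
letters = ["u" if index == 1 else letter for index, letter in enumerate(name)]
print("".join(letters))        # futis

# Each item can be any expression, e.g. a method call on the element.
words = ["beauty", "and", "python"]
capitalized = [word.capitalize() for word in words]
print(capitalized)             # ['Beauty', 'And', 'Python']

# list(name) is a shorter spelling when no per-item work is needed.
print(list(name) == [letter for letter in name])  # True
```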

Why I believe that Python should be taught over C as a first programming language in universities.

It is (or maybe was) a worldwide trend for universities and colleges around the globe to introduce students to the C programming language as their first programming language. While some schools have now changed to Java instead, some are still teaching C as the first language one gets exposed to.

While there is nothing wrong with C (disclaimer: I love C, and I also consider it a must for software developers to learn, whether they work at a high or a low level), I firmly believe that it is not suited to such a role. C, being a medium level programming language, is fairly high level (this, of course, is relative) while still managing to stay relatively close to the metal. C, of course, derives much of its power from this single fact. Hardware interaction, while not useful to every single programmer, is still very useful for some tasks. One area I can think of, due to my work, is computer infrastructure projects. Compilers, virtual machines and more can all benefit from being close to the metal and interacting directly with the computer’s hardware. I am sure this is not all there is to it, but the aforementioned examples can serve to demonstrate the need for low level interaction.

However, the tradeoff for this power is that low level facilities get in your way when you program in C. Pointers quickly come to mind. No matter how powerful pointers are, most people struggle to use them effectively. This can be seen in most high level languages today, where raw pointers are pretty much extinct. Not only that, but to properly understand C you need at least some knowledge of computer architecture. While you can still use C without such knowledge, it’s not uncommon to see students learn that a C integer on x86 is 4 bytes, but not be able to explain why if asked. Not to mention that, because students are often taught C without being told about the C standard library, they are reduced to array manipulation exercises that quickly get repetitive and tedious, and don’t really add much to their knowledge. The end result is that students feel incapable of delivering anything beyond the trivial. Not to mention that there are schools (at least here in Greece) that teach C programming on Windows.

On the other hand, higher level languages such as JavaScript and Python are, at least in my opinion, better suited to an introductory course. That is because these languages are high level enough to abstract away the underlying architecture, so that programming concepts can be demonstrated without requiring knowledge of it. This way, algorithms, data structures and more can be taught without the language’s idioms getting in the way. What’s more, these languages are known to have a gentler learning curve and, last but not least, they come with a great standard library that students can use to deliver something useful and see real world applications develop faster than in C. What’s not to love?
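To give a rough idea of what "delivering something useful with the standard library" can mean, here is a hedged sketch of a word counter in Python. The sample sentence is made up; collections.Counter is part of Python’s standard library.

```python
from collections import Counter

# Made-up sample text for this sketch.
text = "the quick brown fox jumps over the lazy dog the end"

# Split into words and count them; no manual memory management needed.
counts = Counter(text.split())

print(counts["the"])           # 3
print(counts.most_common(1))   # [('the', 3)]
```

A comparable C program would typically need a hand-written hash table or sorting step plus explicit buffer handling, which is exactly the kind of detail that can distract a first-time student from the underlying idea.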

While I personally believe that a serious programmer should know at least one of the so-called medium level languages (C, C++), even if they learn one only to understand more about the way the machine works, I firmly believe that such languages should not be used in introductory computer science courses.

So how do you feel about that? Do you think that C should be taught in introductory computer science courses? Or do you feel (like me) that other languages are better suited, and why?

Hello World!

While this blog of mine is not what one could describe as new (it has existed for some months, but I had nearly forgotten about it), I decided to make an introductory post, after deleting the previous one.

The reason the previous one was deleted is that I repurposed the blog to contain pretty much anything I feel like. Yep, you got it right, it will be a personal blog, and its contents will range from programming, software engineering and computer science (a lot of those, apparently) to pretty much anything I feel like talking about, really.

Oh, and by the way, I feel like now is as good a time as any to introduce myself. I am Fotis ‘NlightNFotis’ Koutoulakis, a young, Greek, computer science student. I also loooovvvvveee programming. In my free time I find myself contributing (well, more like trying to contribute) to open source projects.

Well, that is it for me and the blog for now. I hope I am off to a good start, and I hope you find reading my blog a joy, or at least somewhat informative.