In a previous post I talked a little about the type of analytical work I used to do in my previous life as a consultant at Cluster (then DiamondCluster, then Mercer, now OliverWyman). Most of this work was done using SAS, a statistical package I personally dislike.
The reasons for using SAS were mostly historical. It was used by one of our first clients in this area, and then a certain myth developed around the tool: it has been proven to work, so it has to be used. Consultants are a little bit risk-averse.
The whole concept is built around disk-based datasets: you read them, you sort them, you transpose them, and you always move from one dataset to another. It is actually not so different from a relational database and, for all my dislike, SAS provides data manipulation capabilities that are at least comparable to, and quite often better than (if harder to use), those of any database you may find.
That explains why people use SAS and are quite happy with it, and they have sound reasons. SAS provides an unparalleled ability to perform all types of manipulations and statistical operations on very large datasets. In the right environment, that is, with the right people, like some of those I met in the US who were extremely knowledgeable about statistics and had spent a lot of time programming in this environment, you have a winning combination.
But I still dislike the tool for three main reasons:
-I find it error-prone (the syntax is horrible), and I would not like to be judged on all the SAS code I had to produce.
-It is extremely programmer-unfriendly (did I mention that the syntax is horrible?). It is always hard to debug code written by somebody else, especially by people not trained in basic programming practices, but SAS makes it even harder.
-SAS is not for casual programmers. If you are hiring people to offer them a career in consultancy, they will burn out if you force them into this. And finding good SAS programmers for contracting is not easy. To me this is a good enough reason to look for alternatives.
So if I am to replace SAS I need to find something that gives me similar flexibility and a more productive environment. I have to say that so far I have not found it in one package, so I am in the process of building it from what is available.
My current setup involves three different elements:
-A database, currently MySQL, though I will probably move to Postgres.
-A statistical package, currently R. It is as good as SAS, and I like the fact that I can focus on the statistics and forget the data part (done in the database).
-An external programming environment to act as glue. I am currently using AppleScript and Perl (actually the Perl part has been done by one of my colleagues, Sergio). If I feel geeky enough I may even go to Objective-C on my Mac.
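The split between the three elements can be sketched in a few lines. This is an illustration only, not the actual project code: Python stands in for the Perl and AppleScript glue, the standard-library sqlite3 module stands in for MySQL, and the statistics module stands in for R. The table `usage_records`, its columns and the sample rows are all made up.

```python
import sqlite3
import statistics

# Hypothetical records: (subscriber_id, month, minutes_used). Not real data.
ROWS = [
    (1, "2008-01", 120.0),
    (1, "2008-02", 100.0),
    (2, "2008-01", 300.0),
    (2, "2008-02", 340.0),
    (3, "2008-01", 50.0),
    (3, "2008-02", 70.0),
]


def average_per_subscriber(rows):
    """Data step: let the database group and aggregate the raw records."""
    conn = sqlite3.connect(":memory:")  # in-memory stand-in for a database server
    conn.execute(
        "CREATE TABLE usage_records "
        "(subscriber_id INTEGER, month TEXT, minutes_used REAL)"
    )
    conn.executemany("INSERT INTO usage_records VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT subscriber_id, AVG(minutes_used) "
        "FROM usage_records GROUP BY subscriber_id ORDER BY subscriber_id"
    ).fetchall()
    conn.close()
    return result


def summarize(averages):
    """Statistics step: work only on the small, already aggregated result."""
    values = [avg for _, avg in averages]
    return {"mean": statistics.mean(values), "median": statistics.median(values)}


# Glue: run the data step, then hand its output to the statistics step.
averages = average_per_subscriber(ROWS)
summary = summarize(averages)
print(averages)
print(summary)
```

The point of the arrangement is that the heavy lifting (grouping, sorting, joining) stays in the database, and the statistical code only ever sees a small result set.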
My bet is that it will be easier to find people who know how to deal with SQL databases (a commodity nowadays) and a language like Perl. R is also quite popular. I also like the fact that all of these are publicly available software, which gives me a broad pool of talent to tap (and access to the sources if required). I also love the fact that all of this runs on my Mac, a much nicer environment to work in. I know people think Apple machines are mostly for "creative" guys who do media production, but they can be used as a scientific workstation quite easily. This is much better than processing the data in MS Access, as I had to do a few times.
Of course my laptop is not the best environment to run a database server of several gigabytes, but this setup scales easily.
I will take a look at how the different elements are doing in a later post.
Sunday, March 16, 2008
Friday, March 14, 2008
Springtime in Europe
I have spent the last few weeks helping my partners in Europe with a project involving one of the largest mobile operators where we are looking at how to improve the handset strategy.
This is another striking difference between Europe and the ME in how companies operate. In the ME people are used to paying in full for their mobile phones, whereas in Europe handsets have been subsidized almost from the beginning. That means operators make a substantial upfront investment (up to 100 euros) to capture a new customer and expect to recover the cost of the handset over the following months. The whole thing has grown; just do the maths: 100 euros per subscriber times 10 million handsets per year (or 60 million if you look at the whole operation) and you are quickly talking serious money.
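Doing the maths explicitly gives a sense of scale. The sketch below uses only the round numbers quoted above, and treats the 100 euros as an upper bound per handset rather than an actual average.

```python
SUBSIDY_PER_HANDSET_EUR = 100          # "up to 100 euros", so an upper bound
HANDSETS_PER_YEAR = 10_000_000         # handsets per year
HANDSETS_PER_YEAR_WHOLE = 60_000_000   # the whole operation


def yearly_subsidy(handsets, subsidy_per_handset=SUBSIDY_PER_HANDSET_EUR):
    """Upper bound on the yearly handset subsidy, in euros."""
    return handsets * subsidy_per_handset


single = yearly_subsidy(HANDSETS_PER_YEAR)
whole = yearly_subsidy(HANDSETS_PER_YEAR_WHOLE)
print(f"{single:,} euros per year")
print(f"{whole:,} euros per year for the whole operation")
```

That is up to one billion euros a year, or up to six billion across the whole operation.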
As operators face increasing pressure on voice tariffs, they are starting to look more and more into content-related services to provide extra revenue. New services (games, music, video...) require handsets with more advanced capabilities that are, not surprisingly, more expensive. Mobile operators need to understand how these new features translate into additional revenue through the consumption of these content-rich services.
One way of doing this would be to ask people what they want and why, and then tailor the offer to meet their needs. The problem with this approach is that people are not good at forecasting their own consumption (that is why we spend so much effort designing pricing plans that give the perception of low tariffs while protecting the revenue of the operator).
The way we prefer is to look at the actual data patterns. We have extracted quite a large sample of their historical usage data and crossed it with the information about the handset used. This involves processing a large volume of data in relatively sophisticated ways (at least compared with what consultants usually do).
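As a rough illustration of what crossing usage data with handset information means, the sketch below joins two tiny tables and compares average content revenue by handset capability. Everything in it is hypothetical: the table and column names, the sample rows, and the use of Python's built-in sqlite3 module in place of a real database server.

```python
import sqlite3

# Hypothetical handset data: (subscriber_id, model, has_music_player).
HANDSETS = [
    (1, "basic-a", 0),
    (2, "music-b", 1),
    (3, "music-b", 1),
    (4, "basic-a", 0),
]

# Hypothetical usage data: (subscriber_id, month, content_revenue_eur).
USAGE = [
    (1, "2008-01", 1.0), (1, "2008-02", 3.0),
    (2, "2008-01", 6.0), (2, "2008-02", 8.0),
    (3, "2008-01", 4.0), (3, "2008-02", 6.0),
    (4, "2008-01", 0.0), (4, "2008-02", 0.0),
]


def revenue_by_capability(handsets, usage):
    """Join usage to handsets and average content revenue per capability group."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE handsets (
            subscriber_id INTEGER PRIMARY KEY,
            model TEXT,
            has_music_player INTEGER
        );
        CREATE TABLE usage_records (
            subscriber_id INTEGER,
            month TEXT,
            content_revenue_eur REAL
        );
        """
    )
    conn.executemany("INSERT INTO handsets VALUES (?, ?, ?)", handsets)
    conn.executemany("INSERT INTO usage_records VALUES (?, ?, ?)", usage)
    rows = conn.execute(
        """
        SELECT h.has_music_player,
               COUNT(DISTINCT u.subscriber_id),
               AVG(u.content_revenue_eur)
        FROM usage_records AS u
        JOIN handsets AS h ON h.subscriber_id = u.subscriber_id
        GROUP BY h.has_music_player
        ORDER BY h.has_music_player
        """
    ).fetchall()
    conn.close()
    return rows


result = revenue_by_capability(HANDSETS, USAGE)
for has_music, subscribers, avg_revenue in result:
    print(has_music, subscribers, avg_revenue)
```

Each output row gives the capability flag, the number of subscribers in the group and their average monthly content revenue.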
In the past we used SAS for this, but I have never been happy about that: the syntax is horrible, the licencing process is totally unfriendly and there is a serious scarcity of qualified programmers. So this time I decided to give some alternative software a go. So far we are using MySQL and R, running of course on my Mac.
I will post a more detailed assessment once the project is over, but so far I am quite happy.
Thursday, January 17, 2008
Small things that make your life easier
This week I got a visit from Cristobal, one of my business partners; you can find him at growstartup.blogspot.com. After a number of meetings we had a few hours to try to enhance his Mac experience. I realized that he is (happily) oblivious to most of the software available for his MacBook, so I decided to help him choose a few (mostly free) utilities that could help him.
We work a lot with PDF documents; they are in fact one of the few truly interoperable, multiplatform standards you can find. They work perfectly for distributing a read-only document; unfortunately, not everybody is smart enough to use them properly when some interaction is needed. Most of the forms I need to fill in are just plain PDF files, designed to be printed and faxed back. This is neither ecological nor efficient, and there are some alternatives.
FormulatePro is a nice free utility that allows you to annotate and modify any PDF file. I use it to open the form, fill it in, then Print as PDF and fax it back. I have never experienced any problem with it (I am running Tiger).
If, like me, you have not yet migrated to Leopard, you may find this of interest: PDFMergeX is another PDF-related utility that allows you to merge several PDF documents and print them as a single document.
I find it ironic that people send documents as PDF because they expect you not to modify them, and here I am advising on how to alter them.
Another free utility that can save you time is Shrook, a news reader that helps you get the news you find relevant in a more compact way. I find it much better than going through email digests or visiting each website to get the news. One thing I am still looking for is more Middle East-related channels. If you know of any, please let me know.
Finally, I also advised Cristobal to use Adium, my favorite messaging application. It supports most if not all the protocols you may wish for and is very nicely integrated with Growl.
All of these tools are free and all of them do what they are supposed to do. As a matter of principle I try to keep my system as simple as possible, and that involves using as few pieces of software as possible and keeping them as small as possible.
I would rather have 5 highly focused utilities than one large piece of software that does 20 different things. Life is too short to have to deal with malfunctioning computers.