Friday Aug 27, 2010
What is data science?
"The future belongs to the companies and people that turn data into products," says Mike Loukides on O'Reilly Radar. His article "What is data science?" is an interesting read and can be found at http://oreil.ly/dknxJV.
I've used some of the tools mentioned, such as the Python programming language and the Beautiful Soup library, to clean up HTML. They have allowed me to deliver analytics that combine data from multiple sources on the internet to make predictions about the future. In one customer assignment, I combined Australian Federal Government data with local State-based population projections to create a wealth of data about future market shares. This was all done with Python and Beautiful Soup on my MacBook Pro. I didn't need my own database or data warehouse, as I was working with thousands of pieces of summary data that were readily available over the internet.
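As a rough illustration of that kind of work, here is a minimal sketch that pulls figures out of two HTML tables with Beautiful Soup and combines them. Everything in it is an assumption made for the example: the HTML fragments, the region names, the numbers, the helper names (table_rows, to_int, project_customers) and the simple approach of scaling by population growth. It is not the analysis from the customer assignment.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragments standing in for two published summary tables.
FEDERAL_HTML = """
<table>
  <tr><th>Region</th><th>Customers 2010</th></tr>
  <tr><td>North</td><td>1,200</td></tr>
  <tr><td>South</td><td>800</td></tr>
</table>
"""

STATE_HTML = """
<table>
  <tr><th>Region</th><th>Population 2010</th><th>Population 2020</th></tr>
  <tr><td>North</td><td>10,000</td><td>12,000</td></tr>
  <tr><td>South</td><td>8,000</td><td>8,400</td></tr>
</table>
"""


def table_rows(html):
    """Return each data row of the first table as a list of cell strings."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find("table").find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # header rows use <th>, so they yield no <td> cells
            rows.append(cells)
    return rows


def to_int(text):
    """Convert a figure such as '1,200' to an integer."""
    return int(text.replace(",", ""))


def project_customers(federal_html, state_html):
    """Scale current customer counts by projected population growth.

    This scaling rule is an assumption chosen for the example only.
    """
    customers = {row[0]: to_int(row[1]) for row in table_rows(federal_html)}
    projection = {}
    for region, pop_now, pop_future in table_rows(state_html):
        growth = to_int(pop_future) / to_int(pop_now)
        projection[region] = round(customers[region] * growth)
    return projection


if __name__ == "__main__":
    print(project_customers(FEDERAL_HTML, STATE_HTML))
```

With real pages, the HTML would be downloaded first and the row handling would need to cope with whatever layout each source uses.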
In other activities, I've used Apache Hadoop and its MapReduce framework to process Australian financial market statistics covering the full trading-day history of all 2000+ companies listed on the ASX. I've also recently investigated Apache Mahout and its machine learning capabilities, and I am learning Apache Pig and Apache Hive to store and process data on top of Apache Hadoop.
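To give a feel for the MapReduce model, here is a small sketch in plain Python that mimics the map, shuffle and reduce steps in a single process. It does not use the Hadoop API. The tickers, prices, volumes and function names are all hypothetical, and the statistics computed (highest close and total volume per ticker) are just an example of the sort of per-company summary such a job might produce.

```python
from collections import defaultdict

# Hypothetical records: (ticker, trading date, closing price, volume).
TRADES = [
    ("AAA", "2010-08-25", 10.00, 1000),
    ("BBB", "2010-08-25", 5.00, 500),
    ("AAA", "2010-08-26", 10.50, 1500),
    ("BBB", "2010-08-26", 4.75, 700),
]


def map_trade(record):
    """Map step: emit (ticker, (close, volume)) for one trading day."""
    ticker, _date, close, volume = record
    yield ticker, (close, volume)


def shuffle(pairs):
    """Group emitted values by key, as the framework does between steps."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_ticker(ticker, values):
    """Reduce step: highest close and total volume for one ticker."""
    highest_close = max(close for close, _ in values)
    total_volume = sum(volume for _, volume in values)
    return ticker, (highest_close, total_volume)


def run_job(records):
    """Run map, shuffle and reduce in sequence over all records."""
    pairs = (pair for record in records for pair in map_trade(record))
    return dict(reduce_ticker(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    for ticker, (high, volume) in sorted(run_job(TRADES).items()):
        print(ticker, high, volume)
```

On a real cluster the framework handles the grouping and spreads the map and reduce work across machines, which is what lets the same pattern scale to a full trading history.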
All this software is free and open source, and scales to process large volumes of data on commodity infrastructure.
However, strong analysis and programming skills are required. I'm also working on advancing my knowledge of the statistics pertinent to these endeavours. In the past I've found the O'Reilly book Programming Collective Intelligence to be excellent.
I agree with the Hal Varian quote also mentioned in Mike's post: "The ability to take data -- to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it -- that's going to be a hugely important skill in the next decades."
Tags trends data visualization datascience analytics statistics | Comments 0
Saturday Nov 10, 2007
Trends for 2008 corporate web sites
Our current company web site draws praise from the people who read it as a great example of a corporate web site. A few pages get some hits, mainly from people trying to find out about us, our services or our contact information. In contrast, our blog server gets a significantly higher number of hits, not only from the search engines but also from our Atom/RSS syndicated content, which is propagated to other aggregators.
The difference is so great that our organisation has started working on a new web site for 2008 and beyond.
The aims of this web site so far are to:
- showcase our delivery capabilities through example;
- highlight recent blog entries for those that may not know of our blog server;
- leverage our investment in social software;
- promote our company, our success, our services and our products;
- use web 2.0 style components to create a richer experience;
- deliver more meaningful and precise memes in a shorter time frame;
- create an interactive brochure about our organisation;
- reduce reliance on "www" type addresses;
- move to a more dynamic and richer experience;
- maintain just enough content on the page for Search Engines;
- direct people to other means of learning about us as people (living mammals) and engaging with us besides the "traditional" brochureware site; and
- show what we believe to be the corporate web site style for the 2008 to 2010 time frame.
To achieve the above, we are currently experimenting with a one-page interactive design using tools from Google Code, MooTools and the Dojo Toolkit, along with Flash animation (reverting to static images for those corporates that have not enabled it). The one-page interactive design is really different and does look cool (it is not ready to show people just yet).
To a large extent, our current production site has been optimised for speed on lower-bandwidth connections. According to Google Analytics, those connections represent less than 5% of visitors. So the new site will load more slowly for the dialup guys, but c'mon, they should upgrade to something better anyway (hope I'm not being discriminatory).
Consideration was given to using our blog server as our web site, with a couple of extra pages, but I don't think corporates are ready for that (or maybe it is us who are not ready?). So our experiment with the aforementioned aims seems to be shaping up nicely.
Tags interactive corporate trends business brochure web | Comments 0
