
Python for Data Science

I have been looking at more online resources for Python for Data Science.

There are many good resources available.

Of course, the main tools are NumPy, pandas, Matplotlib, and scikit-learn, which has some amazing tools.

Kaggle, for instance, has data science content, but it is good to install a local setup like the Jupyter Notebook to speed things up, as the Kaggle editor can lag and take some time to run even on small data sets.

The newer DataCamp has some neat tutorials and a simple app for doing daily exercises on your mobile device.

Here is the Python Data Science Handbook. Really useful.

A short tutorial: Learn Python for Data Science, a fun read.

A list of cool data science tutorials is here, and here is another on how to get started with Python for data science.

Will add more later.

