Data Mining Syllabus – PyMathCamp

Demand for Data science talent is exploding. McKinsey estimates that by 2018, a 500,000 strong workforce of data scientists will be needed in US alone. The resulting talent gap must be filled by a new generation of data scientists. The term data scientist is quite ambiguous. The Center for Data Science at New York University describe data science as,

the study of the generalizable extraction of knowledge from data [using] mathematics, machine learning, artificial intelligence, statistics, databases and optimization, along with a deep understanding of the craft of problem formulation to engineer effective solutions

Data science.

Data science.

As you can see, a data scientist is a professional with a multidisciplinary profile. Optimizing the value of data is dependent on the skills of the data scientists who process the data.

Intellij.my is offering these essentials with PyMathCamp. This course is your stepping stone to become a data scientist. Key concepts in data acquisition, preparation, exploration and visualization along with examples on how to build interactive data science solutions are presented using Ipython notebooks.
You will learn to write Python code and apply data science techniques to many field of interest, for example in finance, robotic, marketing, gaming, computer vision, speech recognition and many more. By the end of this course, you will know how to build machine learning models and derive insights from data science.

The course is organized into 11 chapters. The major components of PyMathCamp are:

1) Data management (extract, transform, load, storing, cleaning and transformation)

We begin with studying data warehousing and OLAP, data cubes technology and multidimensional databases. (Chapter 2, 3 and 4)

2) Data Mining (machine learning technology, math and statistics)

Descriptive statistics are applied for data exploration. Mining Frequent Patterns, Association and Correlations. We will also learn more on the different types of machine learning methodology through python programming. (Chapter 5)

3) Data Analysis/Prescription (classification, regression, clustering, visualization)

At this stage, we are ready to dive into data modelling with different types of machine learning methods. PyMathcamp includes many different machine learning techniques to analyse and mine data, including linear regression, logistic regression, support vector machines, ensembling and clustering among numerous others. Model construction and validation are studied. This rigorous data modelling process is further enhanced with graphical visualisation. The end result will lead to insight for intelligent decision making. (Chapter 6 and 7)

Source: Pethuru (2014)

Source: Pethuru (2014)

Encapsulating data science intelligence and investing in modelling is vital for any organization to be successful.

Hence, we will use our data mining knowledge gained from the above chapters to analyse, extract and mine different types of data for value. Or more specifically spatial and spatiotemporal data, object, multimedia, text, time series and web data. (Chapter 8, 9 and 10)

After spending a few months learning and programming with PyMathCamp, we will end the course by updating you with the latest applications and trends of data mining. (Chapter 11)

In conclusion, PyMathCamp is the perfect course for student who might not have the rigorous technical and programming background required to do data science on their own.

Credit to: Joe Choong

“Future belongs to those who figure out how to collect and use data successfully.” 

Muhammad Nurdin, CEO of IntelliJ.

button

Leave a Reply

Your email address will not be published. Required fields are marked *