Data Mining Syllabus – PyMathCamp

Demand for data science talent is exploding. McKinsey estimates that by 2018, a 500,000-strong workforce of data scientists will be needed in the US alone. The resulting talent gap must be filled by a new generation of data scientists. The term "data scientist" is quite ambiguous. The Center for Data Science at New York University describes data science as:

the study of the generalizable extraction of knowledge from data [using] mathematics, machine learning, artificial intelligence, statistics, databases and optimization, along with a deep understanding of the craft of problem formulation to engineer effective solutions

Data science.

As you can see, a data scientist is a professional with a multidisciplinary profile. Getting the most value out of data depends on the skills of the data scientists who process it.

Intellij.my is offering these essentials with PyMathCamp. This course is your stepping stone to becoming a data scientist. Key concepts in data acquisition, preparation, exploration and visualization, along with examples of how to build interactive data science solutions, are presented using IPython notebooks.
You will learn to write Python code and apply data science techniques to many fields of interest, such as finance, robotics, marketing, gaming, computer vision, speech recognition and many more. By the end of this course, you will know how to build machine learning models and derive insights from data.

The course is organized into 11 chapters. The major components of PyMathCamp are:

1) Data management (extraction, transformation, loading, storage and cleaning)

We begin by studying data warehousing and OLAP, data cube technology and multidimensional databases. (Chapters 2, 3 and 4)

2) Data Mining (machine learning technology, math and statistics)

Descriptive statistics are applied for data exploration, followed by mining frequent patterns, associations and correlations. We will also learn more about the different types of machine learning methods through Python programming. (Chapter 5)

3) Data Analysis/Prescription (classification, regression, clustering, visualization)

At this stage, we are ready to dive into data modelling with different types of machine learning methods. PyMathCamp includes many machine learning techniques for analysing and mining data, including linear regression, logistic regression, support vector machines, ensembling and clustering, among numerous others. Model construction and validation are studied. This rigorous data modelling process is further enhanced with graphical visualisation. The end result is insight for intelligent decision making. (Chapters 6 and 7)
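To give a feel for what "model construction and validation" means in practice, here is a minimal sketch in plain Python. It is not taken from the course material: the data points are made up, and the helper names `fit_line` and `mean_squared_error` are our own. It fits a straight line to training data with ordinary least squares and then checks the error on data held back for validation.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def mean_squared_error(xs, ys, slope, intercept):
    """Average squared gap between the observed and the predicted values."""
    return mean((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data: the model is built on the training part only.
train_x, train_y = [1, 2, 3, 4, 5], [3, 5, 7, 9, 11]
# Data held back to validate the model on points it has never seen.
valid_x, valid_y = [6, 7], [13, 15]

slope, intercept = fit_line(train_x, train_y)
validation_error = mean_squared_error(valid_x, valid_y, slope, intercept)
print(slope, intercept, validation_error)
```

A low validation error suggests the model generalizes beyond the data it was built on. Real data is noisy, so the error will rarely be this small.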

Source: Pethuru (2014)

Encapsulating data science intelligence and investing in modelling is vital for any organization to be successful.

Hence, we will use the data mining knowledge gained from the chapters above to analyse and mine different types of data for value: specifically spatial and spatiotemporal data, object data, multimedia, text, time series and web data. (Chapters 8, 9 and 10)

After spending a few months learning and programming with PyMathCamp, we will end the course by updating you on the latest applications and trends in data mining. (Chapter 11)

In conclusion, PyMathCamp is the perfect course for students who might not have the rigorous technical and programming background required to do data science on their own.

Credit to: Joe Choong

“Future belongs to those who figure out how to collect and use data successfully.” 

Muhammad Nurdin, CEO of IntelliJ.

Reinvent Blockchain with Artificial Intelligence

Blockchain is the future of finance. It is the future of how all transactions are going to work. We are talking about money, business and the world. Blockchain decentralizes transactions across nodes run by market participants such as individuals, businesses, banks and securities firms. It is state-of-the-art technology. Blockchain thinking is therefore needed now to keep transactions efficient as their volume grows.

Bittersweet of Blockchain.

Blockchain will be used to create smarter, more efficient systems for supply chains, gaming, multimedia rights management, car rental, real estate sales and purchases, government proof of identity and insurance record management.

Openness, speed, security, privacy, reliability, accountability and transparency will be the key values of blockchain thinking. The only bitter thing about blockchain is that middlemen and intermediaries will be cut out.

Auto-compliance.

We believe that AI is making the world a better place. We are interested in the role of AI in blockchain.

Blockchain powers the second generation of the internet; the first generation brought us the era of the internet of information. The second generation demands compliance in business, which has opened a new door for innovation to make all human affairs wiser.

You should care because you would want to know where your meat came from. As an artist, you would want to know who currently holds your trademarks and IP. As a donor, you can know where your donations go. As a Muslim, you can know where and how your zakaat is managed.

Now, you know the capabilities of this revolution. Blockchain is indeed an intense creation.

Permissionless & Permissioned Blockchain.

Blockchain was originally permissionless, publicly shared and decentralized, meaning it is detached from the central monetary system. Historically, Bitcoin is one of the most famous cryptocurrencies and the ideal one for technology fundamentalists, but not for economists. Security firms have claimed that participants in permissionless blockchains are a danger to society, because the firms have a difficult time tracking suspicious activity.

Second, the ledger is shared and acts as a source of truth for the businesses on the blockchain. This means the shared ledger can record all transactions among participants across the business network, and every copy of the records is exact and replicable.

Today, permissioned and private blockchains also exist. Permissioned means participants can only see the documents that require their consensus, agreement or permission. Transactions need to be validated by the parties involved in order to reach consensus. This lowers the risk of faulty transactions, since interference can occur in many places at exactly the same time.

A blockchain network will not just be a public ledger; it will also cater for private parties, where only authorized parties are allowed to join the transactions. Records can be protected with a digital signature, and to seal a record the permissioned blockchain generates a private and public key pair. Participants can therefore only see what they are allowed to see. The version history of unique IDs for customers, invoices and reference numbers will appear clearly and transparently in all transactions, which are unchangeable and final.
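Why are records in a ledger "unchangeable and final"? The core trick is that every block stores a hash of the block before it. Below is a minimal sketch of that idea in Python. It is only an illustration under our own assumptions: the function names, the invoice fields and the sample values are hypothetical, and digital signatures, permissions and consensus are left out entirely.

```python
import hashlib
import json


def block_hash(index, transactions, previous_hash):
    """SHA-256 digest over the block contents and the previous block's hash."""
    payload = json.dumps(
        {"index": index, "transactions": transactions, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_block(chain, transactions):
    """Add a block that is linked to the current last block by its hash."""
    previous_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    chain.append({
        "index": index,
        "transactions": transactions,
        "previous_hash": previous_hash,
        "hash": block_hash(index, transactions, previous_hash),
    })


def is_valid(chain):
    """Recompute every hash and check that each block points at the one before."""
    expected_previous = "0" * 64
    for block in chain:
        if block["previous_hash"] != expected_previous:
            return False
        recomputed = block_hash(block["index"], block["transactions"], block["previous_hash"])
        if block["hash"] != recomputed:
            return False
        expected_previous = block["hash"]
    return True


# Hypothetical invoices recorded on the shared ledger.
ledger = []
append_block(ledger, [{"invoice": "INV-001", "customer": "C-17", "amount": 250}])
append_block(ledger, [{"invoice": "INV-002", "customer": "C-42", "amount": 90}])
valid_before = is_valid(ledger)

# Someone quietly edits an old record in their copy of the ledger.
ledger[0]["transactions"][0]["amount"] = 9999
valid_after = is_valid(ledger)

print(valid_before, valid_after)
```

Because each hash depends on everything before it, a participant holding an honest copy can detect the edited record by recomputing the hashes.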

IntelliJ’s quest in Blockchain.

Despite all the excitement about blockchain, we would like to push its capabilities further.

Efficiency, trustworthiness and independence are the keys to a mining operation that records transactions.

We are interested in this quest:

How can we make the operation think independently, and share a mutual way of thinking across all networks, as creatively and spontaneously as a genuine human?

Conclusion.

Finally, we are fully aware that blockchain will create winners and losers. Even so, we cannot afford dislocation and insecurity in our money management.

A centralized approach is thought to suit the thinking behind building fintech products, whilst a decentralized approach is freer, allowing the whole operation to be reinvented. Both approaches can generate prosperity for the country.

If you too are interested in innovating with blockchain technology, we should spend more time together thinking about it within a philosophy of compliance, because the disruption is real and is indeed on its way.

Imagine empowering every IoT network with blockchain. For that, time will be the ultimate unit of measurement.

Fin.

Transformation is not about improving, it is about re-thinking. -Malcolm Gladwell

EagleEye – Malaysia’s first drone company aims to alert you in real time if your neighborhood is in danger.

A transplanted human kidney costs MYR 5,000 to MYR 9,000. That was the recorded price tag in 2012, four years ago. About 60,000 transplants take place worldwide each year, of which 1 in 10 is done illegally.

Scary enough?

Daily Mail reported that:

“Wealthy patients are paying up to £128,500 (roughly $191,028 today) for a kidney to gangs, often in China, India and Pakistan, who harvest the organs from desperate people for as little as £3,200 (roughly $4,756 today)”

For your information, organ trading and trafficking has been a “very profitable and impatient industry” since the new millennium. It is still going on, at a staggering US$1 billion a year, and that number is for China alone. P.S. I wonder if they are hiring.

Organ harvesting drama.

 

Enough about kidneys. How about other crimes such as kidnappings, robberies, rapes and murders?

Crimes are taking place everywhere, and citizens’ security is still a gamble. We do not have military-level security to watch over a huge perimeter and take care of everyone’s safety.

The Solution – EagleEye

An intelligent drone that flies for only one reason: to save lives.

 

Cypher-UAV


#1 How?

EagleEye patrols your neighborhood by flying autonomously day and night, 24/7. Using a high-definition camera, EagleEye detects suspicious activity and analyses it using artificial intelligence.

EagleEye captures visual evidence such as video and images in real time, using a computer vision library called OpenCV.

Whenever EagleEye detects suspicious activity, it instantly triggers an alert to authorities such as the police and security agencies for immediate action. Snapshots are taken for human verification, making sure that crimes are handled almost instantly.

#2 EagleEye is intelligent enough to analyse crime events

 

Cypher

We built in artificial intelligence so that it can learn about crime and differentiate between serious and non-serious activities. After all, we do not want to send an officer just because two guys gave each other a friendly pat.

Its intelligence does not only detect events that have occurred; it also uses captured data to predict upcoming criminal activity. For example, busy but wide roads have a high potential for thefts involving bikes.

EagleEye is also ready to provide 100% accurate feeds to the parties involved.

#3 EagleEye craves for top safety

 

Imagine a security guard “flying” 10 metres away from your house who never gets drowsy, never needs a lunch break and never even takes five. He keeps roaming around until he needs “sleep”. While he is off to sleep, another security guard comes and continues scouting. In other words, it is 24/7 security with no pause.

Nahhh.. I have CCTVs?

Good for you, sir. However, CCTV is static. 

Meanwhile, EagleEye flies autonomously, detects suspicious activity intelligently, alerts the relevant authorities and, at the same time, helps the police carry out safety and security operations in a well-planned and thorough way.

#4 Shut up and take my money!!

Hang on there, buddy. EagleEye is waiting for a new regulation on the use of drones in Malaysia. It is anticipated that the Department of Civil Aviation (DCA) will renew the Aeronautical Information Circular (AIC) 4/2008 this year, 2016.

Let’s pray together that it happens ASAP.

Conclusion

Drones will very soon be a reliable part of daily life.

We are not here to take anyone’s job. But drones will be the ultimate solution for ensuring the highest level of security around us. The implementation will be in stages, so we still need people to cooperate with.

And EagleEye will never be a dictator’s army. It has the communication ability to hear what you want it to hear. For instance, say you see a guy being brutally attacked at a corner. You can tell the drone to go and check it out instead of risking your own safety. The drone will take care of it following its procedures.

Some words from the IntelliJian: “We are too excited to get this project into the field. We cannot wait to help more people with our AI solutions.”

Notice

To achieve better neighborhood security, we are inviting skillful engineers and innovative designers to work with us, so that all Malaysian citizens can live more happily for longer. Please do contact us.

Many thanks.

Images credited to KONAMI: https://us.konami.com/mgs/

TraitHire – Eliminating injustice in the conventional job hiring system with artificial intelligence

The true challenge in job recruitment is that recruiters have no adequate tools to accurately measure skill sets and qualifications. The inaccuracy is due to human bias, which leads to injustice.

A story from a victim of the conventional hiring system.

This idea was inspired by a friend of ours who was rejected by a job finder system simply because he did not have “3 years of experience”.

In reality, he was a really talented programmer, and he applied for jobs at big companies like IBM and Quintiq. Sadly, the conventional hiring system filtered him out, and he was not given a chance to prove his quality at all. Without many choices, he had to pursue his career with a smaller company that provided fewer benefits.

On the other hand, the big company had to struggle with another programmer it hired. That person could barely do basic problem solving, simply because he was not actually as good as his resume suggested. Well, he had what the recruiter demanded: 3 years of experience.

With “impressive” communication skills, he secured the job despite his poor technical skills. Because the interview’s conclusion was shaped by human bias, his skills were not evaluated carefully and his promising resume remained ‘outstanding’.

The Solution – TraitHire

#1 TraitHire is the missing ingredient in today’s conventional systems 

Recruiters already have a method for screening candidates’ skills: assigning them technical assessments (for technical jobs) and interpersonal assessments, so that recruiters make fewer hiring mistakes. But these assessments are time-consuming and not very effective.

So we want to offer recruiters an accurate list of candidates, so they can hire the right skills and qualifications faster. We do not think employers should spend time testing job applicants every time they hire. Let the system do it automatically, while employers sit back and relax, since TraitHire performs more accurate and effective filtering.

The next big thing is that our artificial intelligence will outsmart future recruiting by matching job candidates’ personalities with the job description or the client’s requirements.

#2 Inaccuracy: a problem worth fighting

Conventional job finder sites are inaccurate, demotivating, unreliable and unjust.

We want to solve this quickly because there are lots of talented programmers out there who need to be treated and rewarded fairly, while employers need “someone” to advise them on hiring decisions.

We CAN fake things just to look good, especially on a resume. Or candidates can simply hire someone else to build their CV and, voila, a really REALLY good-looking document. Sadly, it is only good on paper.

TraitHire uses real human personality, taken from job candidates’ brain wave signals using a special device. This is something we can never fake. The personalities are, of course, accurate and reliable.

#3 Personality plays a significant role in workplace behaviors

One study noted that the influence of personality does not begin the day a person starts work. Rather, how a person prepares for his or her entrance into the job market, and the kind of impression that person makes during the recruiting process, also appear to be a function of the applicant’s personality.

Almost all companies use personality assessments to assess job candidates’ personalities. This shows that industrial organizations are not neglecting personality as an indicator when making hiring decisions.

Even after assessing personality traits, different organizations may use different methods to assess interpersonal skills such as the ability to work in a team, communication, service orientation and the ability to inspire. These organizations should know that interpersonal skills and personality traits are positively correlated.

Interesting news: we have found that particular cognitive abilities or personality traits influence interpersonal processes and interpersonal skills when task-level demands are accounted for.

Meanwhile, Klein’s meta-analysis found that extroversion and agreeableness are predictive of interpersonal processes and interpersonal skills.

So, based on this evidence…

TraitHire’s aims are as follows:

  1. To save recruiters the time and laborious work of finding candidates whose personality fits the role, without relying on inaccurate and time-consuming personality assessments.
  2. To let recruiters hire for true skills and qualifications regardless of years of experience.
  3. To help recruiters avoid human bias.
  4. To make sure recruiters do not lose sight of the right candidates or lose them to competitors.
  5. To reduce or remove the need for recruiters to watch multiple videos to shortlist the right candidates.
  6. To free job candidates from building ridiculous resumes, so they can focus solely on building skills.
  7. To give job candidates a satisfying working environment by placing them in jobs they are truly committed to, with fair rewards.

Conclusion

Looking ahead, recruiters that still rely on the conventional system to find, evaluate and hire employees are predicted to face an increasingly tough and inaccurate hiring process.

We believe that intelligence is needed to solve these problems faster: job distribution, sourcing, candidate engagement, scheduling and selection.

After this, recruiters can focus on recruiting the real skills and qualifications that job candidates actually possess.

We want to go the extra mile, not just in the IT industry but in all industries. Join us.

Thank you.

Image credit.

Machine Learning for startup development

Machine Learning is not a new thing in the computing landscape. This branch of Artificial Intelligence is growing in popularity, along with growing awareness among many parties of the need to manage digital data and to automate manual work done by humans.

We already feel the implementation and use of Machine Learning, even if we do not realize it. In simple terms, Machine Learning is a family of computer algorithms that learn from data to recognize patterns and build a model based on historical data. That model is then used to classify or predict new data, which helps us make or support decisions.

An Analogy for the Machine Learning Concept

Suppose we want to hold an event where many startups present their products and their potential in the Malaysian market. The committee has successfully gathered 5 startups: GrabCar, Traveloka, Sallyfashion, Tiket and Foodpanda.

To make the event run smoothly, the committee decides to split the presentation sessions by startup category. Suppose your team is out of the office doing field work, so the event team has to work out each startup’s category on its own. Since the event team has run the same event many times and has met many different startups, it has several cues for assigning a startup to its category.

Startup Name

A food startup will usually have a name containing words related to food or restaurants. Likewise, a travel startup’s name will usually contain words related to travel, a transportation startup’s name will usually relate to transportation, and a fashion startup’s name will usually contain words related to fashion and clothes.

Startup Logo

A food startup will use a logo showing food or kitchen equipment. A travel startup’s logo will depict travel. A transportation startup will have a logo related to roads. A fashion startup will have a logo related to fashion and clothes.

Based on those two cues, the event team can classify the 5 startups by product category:

Based on startup name

Food product category: Foodpanda (contains the word “food”). Travel product category: Traveloka (contains the word “travel”), Tiket (contains the word “tiket”, meaning ticket). Transportation product category: GrabCar (contains the word “car”). Fashion product category: Sallyfashion (contains the word “fashion”).

Based on Startup Logo

The classification works the same way as for the startup name above.
In technical Machine Learning terms, the startup name and logo are features, and the counts of the values each feature takes are called a frequency distribution. That is the learning process of a machine.

Sometimes an object does not carry useful information in a certain feature. In the example above, the Sallyfashion logo is just a word, although the previous feature (the name) clearly identifies it as a fashion startup. This can be overcome by supplying more features and more detailed frequency distributions. That is how a Machine Learning algorithm is arranged so that a computer can learn.
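The name-based part of the analogy can be written down as a few lines of Python. Treat it as a sketch under our own assumptions: the keyword lists in `CATEGORY_KEYWORDS` are hypothetical and hand-written, standing in for what the event team has picked up from past events. A real Machine Learning model would estimate such word frequencies from labeled examples instead of having them typed in.

```python
# Hypothetical keyword lists: what the event team "knows" from past events.
CATEGORY_KEYWORDS = {
    "food": ["food", "resto", "eat"],
    "travel": ["travel", "tiket", "trip"],
    "transportation": ["car", "ride", "taxi"],
    "fashion": ["fashion", "cloth", "wear"],
}


def keyword_counts(name):
    """Feature extraction: how often each category's keywords occur in the name."""
    lowered = name.lower()
    return {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }


def classify(name):
    """Pick the category with the highest keyword count, or 'unknown' if none match."""
    counts = keyword_counts(name)
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


startups = ["GrabCar", "Traveloka", "Sallyfashion", "Tiket", "Foodpanda"]
labels = {name: classify(name) for name in startups}
for name, category in labels.items():
    print(name, "->", category)
```

A name with no matching keyword falls through to "unknown", which is exactly the case where a second feature such as the logo is needed.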

Developing Machine Learning based solutions for tech startups

Before going into detail about what a tech startup can do with Machine Learning, we should look at the challenges of implementing the technology:

There are many challenges to implementing Machine Learning technology in Malaysia. These range from low labour costs, which make it difficult to argue for budget efficiency, and a poor understanding of how the technology is used, to the fear of workers being replaced.

However, today’s Machine Learning is excellent, and it makes it possible to build a system that learns by itself from large-scale, complex data with minimal human intervention. An implementation is considered successful if the result of the automated process comes close to the quality of human work at an affordable price. This kind of use is believed to be a counterargument to the issues above.

I am optimistic that Machine Learning technology will be used in Indonesia by many companies that specialize in technology development.

One simple rule serves as a reference: even though a technology concept may seem complicated, its implementation is not always for big and complicated problems.

The simplest implementation of Machine Learning is the identification of spam or junk mail. The technique learns from data that has been labeled (spam or not spam) by extracting features, which are then used as input parameters to a classification algorithm. For automation, a model is built that captures the learning result and the algorithm used. That model is then used to classify or predict new data.
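The spam pipeline described above (labeled data, feature extraction, model, prediction) can be sketched with nothing but the Python standard library. The classifier used here is Naive Bayes with Laplace smoothing, which is one common choice and our own pick for illustration. The four training messages are made up, and "ham" is simply the usual label for non-spam.

```python
import math
from collections import Counter


def tokenize(text):
    """Feature extraction: split a message into lowercase words."""
    return text.lower().split()


def train(labeled_messages):
    """Build the model: word counts per label plus message counts per label."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    message_counts = Counter()
    for text, label in labeled_messages:
        message_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, message_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-probability for the message."""
    word_counts, message_counts, vocabulary = model
    total_messages = sum(message_counts.values())
    scores = {}
    for label in word_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(message_counts[label] / total_messages)
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Laplace smoothing keeps unseen word/label pairs from zeroing the score.
            smoothed = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical labeled data: "ham" means not spam.
training_data = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]
model = train(training_data)

first_label = predict(model, "claim your free prize")
second_label = predict(model, "agenda for the team meeting")
print(first_label, second_label)
```

With only four training messages this is a toy, but the shape is the same at scale: more labeled data in, a better model out.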

Another example commonly found in public online services is content recommendation. It ranges from recommending articles related to the one being read on an online media site, to other products related to the product being viewed on an e-commerce site, to videos related to the one being watched on a video site.
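One very simple way to find "related" content is to compare the tags of the item being viewed with the tags of every other item. The sketch below uses Jaccard similarity (shared tags divided by all tags) for that. It is only an illustration: the catalog, its tags and the function names are hypothetical, and production recommenders usually also learn from user behaviour.

```python
def jaccard(tags_a, tags_b):
    """Share of tags two items have in common (0.0 = nothing, 1.0 = identical)."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(current_id, catalog, top_n=2):
    """Rank the other items by tag overlap with the item being viewed."""
    current_tags = catalog[current_id]
    scored = [
        (jaccard(current_tags, tags), item_id)
        for item_id, tags in catalog.items()
        if item_id != current_id
    ]
    # Highest similarity first; ties are broken by item id for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item_id for score, item_id in scored[:top_n] if score > 0]


# Hypothetical catalog of articles and their tags.
catalog = {
    "article-1": ["drone", "security", "malaysia"],
    "article-2": ["drone", "regulation", "malaysia"],
    "article-3": ["blockchain", "finance"],
    "article-4": ["security", "camera"],
}

recommendations = recommend("article-1", catalog)
print(recommendations)
```

The same function works for products or videos, as long as each item comes with a set of descriptive tags.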

Machine Learning has already been used in specific industries in Malaysia that are not directly public-facing, for example identifying attack patterns (from hackers, rootkits, viruses and so on) aimed at a certain network and blocking them automatically, automatic bidding for advertisements (autobid), identifying users’ characteristics based on their online activity, predicting the events or figures that will become the center of the news, and generating text content automatically.

The products mentioned above are the growth area for tech startups working with Machine Learning. This is because the cost can be paid according to need (on demand), thanks to the convenience of cloud computing. This even applies to Machine Learning back-end technology: several cloud computing providers offer it ready to use.

Market demand for technology solution based on Machine Learning platform

The author sees that Malaysia currently needs experts in Machine Learning technology development. This may be connected to the industry’s focus on marketing rather than technology investment. As a result, the industry tends to choose ready-to-use implementations (using PaaS and SaaS) developed by overseas parties. As awareness grows of the efficiency that more customized technology can deliver, it is certain that this kind of technology will be prioritized.

So this is a good chance to quickly prepare ourselves by learning. Ikhwan recommends several learning references:

  1. “Introduction to Artificial Intelligence” by Sebastian Thrun and Peter Norvig. Sebastian Thrun is known as the maker of Google’s self-driving car, and Peter Norvig is one of the pioneers of artificial intelligence who became Director of Research at Google.
  2. “Machine Learning” by Andrew Ng, a Stanford professor who later became Chief Scientist at Baidu Research.
  3. “Neural Networks for Machine Learning” by Geoffrey Hinton, who is known for his research on neural networks. He works as a Distinguished Researcher at Google and is also a Distinguished Emeritus Professor at the University of Toronto.

Besides the three courses mentioned above, there are many other sources that can be found with a search engine. This kind of automation will become a new way of delivering services, including in Malaysia, as its benefits can be enjoyed by many.