Machine Learning for startup development

Machine Learning is not new in the computing landscape. This branch of Artificial Intelligence is growing in popularity as more organizations become aware of the value of managing digital data and of automating work that people used to do manually.

Machine Learning is already part of everyday life, although we often do not notice it. In simple terms, Machine Learning is a family of computer algorithms that learn from data in order to recognize patterns and build a model from historical data. That model is then used to classify or predict new data, which helps us make or support decisions.

An Analogy for the Machine Learning Concept

Suppose we want to hold an event in which several startups present their products and their potential in the Malaysian market. The committee has signed up 5 startups: GrabCar, Traveloka, Sallyfashion, Tiket and Foodpanda.

To make the event run smoothly, the committee decides to split the presentation sessions by startup category. Suppose the rest of your team is out of the office doing field work, so the event team has to work out the category of each startup on its own. Because the event team has run similar events many times and has met many different startups, it has a few cues it can use to assign a startup to a category.

Startup Name


The name of a food startup will usually contain words related to food or restaurants. Likewise, the name of a travel startup will usually contain words related to travel, the name of a transportation startup will usually relate to transportation, and the name of a fashion startup will usually contain words related to fashion and clothes.

Startup Logo

A food startup will usually use a logo that shows food or eating utensils. A travel startup will have a logo that suggests travel. A transportation startup will have a logo related to the road. A fashion startup will have a logo related to fashion and clothes.

Based on those two cues, the event team can classify the 5 startups by product category:

Based on startup name

Food product category: Foodpanda (contains the word "food"). Travel product category: Traveloka (contains the word "travel"), Tiket (contains the word "tiket", Indonesian for "ticket"). Transportation product category: GrabCar (contains the word "car"). Fashion product category: Sallyfashion (contains the word "fashion").

Based on Startup Logo

The reasoning is much the same as for the startup name above.
In Machine Learning terms, the name and the logo are features, and how often each feature value has been seen in each category forms a frequency distribution. Building up those distributions from past examples is the learning process of a machine.

Sometimes an object does not have a usable value for a certain feature. In the example above, the Sallyfashion logo is just the name written out as a word. However, the previous feature, the name, already identifies it clearly as a fashion startup. Gaps like this can be worked around by supplying more features and more detailed frequency distributions. That is how a Machine Learning algorithm is arranged so that a computer can learn.
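
The name-based half of the analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cue words and the list of past startups are made up for the example, and the function names are hypothetical rather than taken from any library. The sketch counts how often each cue word was seen in each category, then lets those counts vote on a new name.

```python
from collections import Counter, defaultdict

# Cue words the event team looks for in a name (assumed for this sketch).
CUE_WORDS = ["food", "resto", "travel", "tiket", "trip",
             "car", "ride", "fashion", "wear"]

# Hypothetical startups from earlier events, with the category they were given.
PAST_STARTUPS = [
    ("FoodCourtGo", "food"),
    ("RestoKita", "food"),
    ("TravelMate", "travel"),
    ("TiketMurah", "travel"),
    ("TripPlanner", "travel"),
    ("CarShare", "transportation"),
    ("RideNow", "transportation"),
    ("FashionHub", "fashion"),
    ("UrbanWear", "fashion"),
]

def extract_features(name):
    """Return the cue words that appear inside the startup name."""
    lowered = name.lower()
    return [word for word in CUE_WORDS if word in lowered]

def learn(examples):
    """Count how often each cue word was seen in each category."""
    distribution = defaultdict(Counter)
    for name, category in examples:
        for word in extract_features(name):
            distribution[word][category] += 1
    return distribution

def classify(name, distribution):
    """Pick the category with the most votes, or None if no cue word matches."""
    votes = Counter()
    for word in extract_features(name):
        votes.update(distribution[word])
    return votes.most_common(1)[0][0] if votes else None

distribution = learn(PAST_STARTUPS)
for startup in ["GrabCar", "Traveloka", "Sallyfashion", "Tiket", "Foodpanda"]:
    print(startup, "->", classify(startup, distribution))
```

A name with no cue word gets `None`, which mirrors the case where one feature alone is not enough and a second feature, such as the logo, is needed.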

Developing Machine Learning-based solutions for tech startups

Before discussing in detail what a tech startup can do with Machine Learning, consider the challenges of implementing the technology:

Implementing Machine Learning in Malaysia faces many challenges: low labor costs, which make it hard to argue for the technology on budget-efficiency grounds; limited understanding of how the technology is used; and the fear that workers will be replaced.

However, Machine Learning today is mature enough to make it possible to build a system that learns by itself from large, complex data sets with minimal human intervention. An implementation is considered successful if the automated process produces results close to the quality of human work at an affordable price. Used this way, the technology is believed to be able to counter the issues above.

I am optimistic that Machine Learning will be adopted in Indonesia by many companies that specialize in technology development.

One simple rule is worth keeping in mind: even when a technology concept seems complicated, it does not have to be applied only to big and complicated problems.

The simplest application of Machine Learning is identifying spam (junk mail). The technique learns from data that has already been labeled (spam or not) by extracting features, which are then used as input to a classification algorithm. For automation, a model is built that captures the learning result together with the algorithm used. That model is then used to classify or predict new data.
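
The steps described above (labeled data, feature extraction, a model, then prediction on new data) can be sketched as follows. The article does not name a specific algorithm, so this sketch assumes Naive Bayes with add-one smoothing, chosen only because it is a common and simple option. The training messages are invented for illustration and the function names are hypothetical.

```python
import math
from collections import Counter

# Tiny hand-made labeled data set (hypothetical messages, illustration only).
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("cheap loans win big money", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the project report", "not spam"),
    ("lunch on monday with the team", "not spam"),
]

def extract_features(text):
    """Feature extraction: split the message into lowercase words."""
    return text.lower().split()

def train(examples):
    """Build the model: word counts and message counts per label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(extract_features(text))
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    return {"word_counts": word_counts,
            "label_counts": label_counts,
            "vocabulary": vocabulary}

def predict(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    total_messages = sum(model["label_counts"].values())
    vocab_size = len(model["vocabulary"])
    scores = {}
    for label, counts in model["word_counts"].items():
        total_words = sum(counts.values())
        score = math.log(model["label_counts"][label] / total_messages)
        for word in extract_features(text):
            # Add-one smoothing so an unseen word does not zero out the score.
            score += math.log((counts[word] + 1) / (total_words + vocab_size))
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TRAINING_DATA)
print(predict(model, "claim your free prize"))
print(predict(model, "project meeting on monday"))
```

With six training messages this is a toy; the point is only that the model is nothing more than counts learned from labeled data, and that the same counts are reused to label messages the model has never seen.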

Another example commonly seen in public-facing online services is content recommendation. It ranges from recommending articles related to the one being read on an online media site, to suggesting products related to the one being viewed on an e-commerce site, to suggesting videos related to the one being watched on a video site.
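
One simple way to find "related" content is to compare word counts with cosine similarity. The sketch below assumes that approach; the article does not say which technique real sites use, and production systems typically add steps such as removing very common words or using behavioral data. The article titles and the `recommend` function are hypothetical.

```python
import math
from collections import Counter

# Hypothetical article titles standing in for a media site's catalogue.
ARTICLES = {
    "a1": "machine learning for startup development",
    "a2": "cloud computing services for startup companies",
    "a3": "penetration test tips for security teams",
    "a4": "machine learning models for spam detection",
}

def to_vector(text):
    """Represent a text as a bag of lowercase word counts."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def recommend(current_id, articles, top_n=2):
    """Return the ids of the top_n articles most similar to the current one."""
    current = to_vector(articles[current_id])
    scored = [
        (cosine_similarity(current, to_vector(text)), other_id)
        for other_id, text in articles.items()
        if other_id != current_id
    ]
    scored.sort(reverse=True)
    return [other_id for _, other_id in scored[:top_n]]

print(recommend("a1", ARTICLES))
```

For a reader on article `a1`, the other Machine Learning article ranks first because it shares the most words with the current title.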

Machine Learning is also already used in Malaysia in specific industries that are not directly visible to the public, for example: identifying attack patterns (from hackers, rootkits, viruses, and so on) aimed at a network and blocking them automatically; automatic ad bidding (autobid); profiling users based on their online activity; predicting which events or figures will become the center of the news; and generating text content automatically.

The products mentioned above are the areas where tech startups can develop Machine Learning technology. Thanks to cloud computing, costs can be paid as needed (on demand), even for the Machine Learning back end. Several cloud computing providers already offer such services ready to use.

Market demand for Machine Learning-based technology solutions

The author sees that Malaysia currently needs experts in Machine Learning development. This may be connected to an industry that focuses more on marketing than on technology investment. As a result, the industry tends to choose ready-to-use implementations (PaaS and SaaS) developed by overseas parties. As awareness grows of the efficiency that more customized technology can deliver, this kind of technology is sure to be prioritized.

So this is a good opportunity to start learning quickly. Ikhwan recommends several learning references:
“Introduction to Artificial Intelligence” by Sebastian Thrun and Peter Norvig. Sebastian Thrun is known as the maker of the self-driving car at Google, and Peter Norvig is one of the artificial intelligence pioneers who later became Director of Research at Google.
“Machine Learning” by Andrew Ng, a Stanford professor who later became Chief Scientist at Baidu Research.
“Neural Networks for Machine Learning” by Geoffrey Hinton, who is known for his research on neural networks. He works as a Distinguished Researcher at Google and is also a Distinguished Emeritus Professor at the University of Toronto.

Besides the three courses mentioned above, there are many other sources that can be found with a search engine. Automation of this kind will become a new way of delivering services, including in Malaysia, where its benefits can be enjoyed by many.

The tester certification is easy to get

When you’re gonna do a penetration test (pen-test), it is very important to make sure that your organization has qualified experts to do it. And there’s more.

The quality of your tester defines the test results. The more expert your tester, the better your chances of getting an in-depth view of your security systems.

But wait…

A pen-test has huge potential to create new trouble. It differs from other testing methods, which let the tester work with your IT team. This is why a pen-test has latent problems. Let me explain.

While attacking a system, a pen-tester has to be careful not to do any fatal damage. Why? Because the attack could violate your organization’s policy, mess up your operating system, modify ownership structures, etc. Thus, a pen-tester who is careless (wait… a careless tester?), lacks expertise, or disregards ethics tends to cause huge problems for you.

If your IT staff detect a black-box (zero knowledge) pen-test, they will usually be shocked, because they weren’t notified first. That’s why a pen-tester has to make sure that your IT staff will not overreact: the test is authorized by management but hidden from the tech team.

It would be ridiculous if one of your staff reported the case to the police. Everyone would be embarrassed, because the test was done with your permission. Unless an incident happens outside the agreed scope of the test. Or the tester does something immoral (stealing your data or deliberately tearing down your IT system).

MAQ (Most Asked Question): Should we hire hackers to do a pen-test?

Well, if “hacker” refers to people who do online vandalism, the answer is probably NOT! If you want to test your door-locking system, why would you hire a painter?

NAQ (Next Asked Question): Is the testing cost expensive?

It depends.

It depends on the pen-tester’s expertise level and on the duration and scope of the test. Duration affects the result. For the best result, you need intensive testing over a long enough period. “With price, comes face.” (Kidding. Correct me if I’m wrong. I’m just free-translating a C++ Java proverb.) There’s no free lunch, dude.

Really?

Of course there is. Just visit a hackers’ community and challenge them. Most of the time they will not give away their “lunch”. But when you get a volunteer, your lunch will be free. My advice is: create controversial issues. Such as bringing up ethnic, religious, racial and intergroup issues. Or, convince them that the organization is sponsored by zion. Challenge accepted. Thank me later.

And you know what?

A penetration test is more than just a technical attack. You have to consider the business side, such as making sure how the result will affect business decisions. A pen-test has to be in line with your organization’s security strategy.
If you’re interested in a pen-test, you have to test the pen-test. Seriously.

A confession that he’s a hacker is not enough. Maybe he tells you that he was sent to jail because of his hacking activity. Maybe he tells you that he’s an ethical hacker. His “cool-bad-guy-hacker” story doesn’t guarantee his skill. A real hacker will never be “ethical”. But he won’t get caught either. A perfect combination of bravado, curiosity, and intelligence.

Ethics doesn’t substitute for skills. Like the Y Combinator founder said.

Only script kiddies and wannabe hackers get caught. That’s what you get when you have a big ego with tiny skills.

Well, at the end of this article I’m gonna tell you a secret:

Never trust a guy who calls himself an ‘ethical hacker’. Especially if he shows you his certifications. Why? Just think of him like an Indonesian who got his driving license in a fucking easy way. You got the license, but not the skill.