Machine learning

What does it do?

Machine learning is a branch of artificial intelligence (AI) built largely on applied statistics. A machine learning system learns from data itself and continuously improves its accuracy without being explicitly programmed for each task. In data science, an algorithm is a sequence of statistical processing steps. In machine learning, algorithms are 'trained' to find patterns and features in large amounts of data, especially business data, in order to make decisions and predictions based on those patterns. As more data is processed, the algorithms improve, and the decisions and predictions become more accurate.

Building a machine learning application (or model) divides into four basic steps, performed by data scientists working closely with the business professionals for whom the model is being developed. Step 1 is to select and prepare the training dataset: the data the model will consume in order to learn to solve the problem it is designed for. In some cases the training data needs to be labelled to support classification and feature recognition; the model can also automatically label further data that has not been manually classified. The data is then divided into two subsets: a training subset (used to train the model) and an evaluation subset (used to test and refine it).
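As a minimal sketch of Step 1, the split into training and evaluation subsets might look like the following, assuming a labelled tabular dataset and scikit-learn; the toy data and the 80/20 split ratio are illustrative assumptions, not part of the original text:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-in labelled dataset: 100 samples, 4 features, binary labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 4))
y = (X[:, 0] > 0).astype(int)

# Hold out 20% of the data for evaluating (and later tuning) the model.
X_train, X_eval, y_train, y_eval = train_test_split(
    X, y, test_size=0.2, random_state=0
)
print(X_train.shape, X_eval.shape)  # (80, 4) (20, 4)
```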

Step 2 is to select the algorithm to run on the training dataset. Six families of algorithms are common. Three are used with labelled data: regression algorithms, decision trees, and instance-based algorithms. The other three are used with unlabelled data: clustering algorithms, association algorithms, and neural networks.
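To make the labelled/unlabelled distinction concrete, here is a hedged sketch fitting one algorithm from each group with scikit-learn: a decision tree on labelled data and k-means clustering on the same points with the labels withheld (the toy data is an assumption for illustration):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 2))

# Labelled data: a decision tree learns from feature/label pairs.
y = (X[:, 0] + X[:, 1] > 0).astype(int)
tree = DecisionTreeClassifier(max_depth=3).fit(X, y)
print(tree.predict([[1.0, 1.0]]))       # predicted class for a new sample

# Unlabelled data: k-means groups the same points without any labels.
clusters = KMeans(n_clusters=2, n_init=10).fit_predict(X)
print(clusters[:10])                    # cluster assignment per sample
```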

Step 3 is to train the algorithm to create the model. Training is an iterative process: run the algorithm on the training data, compare its output with the expected result, adjust the weights and biases inside the algorithm, and repeat. The sufficiently accurate algorithm that emerges from this process is the machine learning model.
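The loop below is a minimal sketch of that iteration for a one-variable linear model in plain NumPy; the learning rate and step count are arbitrary assumptions:

```python
import numpy as np

# Toy data generated from y = 3x + 1 plus noise.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = 3.0 * x + 1.0 + 0.1 * rng.standard_normal(200)

w, b = 0.0, 0.0        # weight and bias, both initially untrained
lr = 0.1               # learning rate (assumed)

for step in range(200):
    pred = w * x + b                  # run the model on the training data
    error = pred - y                  # compare output with expected result
    w -= lr * np.mean(error * x)      # adjust the weight...
    b -= lr * np.mean(error)          # ...and the bias, then repeat
print(round(w, 2), round(b, 2))       # approaches 3.0 and 1.0
```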

Step 4 is to use and improve the model. The final step is to run the model on new data and, over time, feed that data back in so that accuracy and usefulness improve.
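One hedged way to picture Step 4 is a model that supports incremental updates; scikit-learn's SGDRegressor is used below purely for illustration, and the data is again a stand-in:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])
X = rng.standard_normal((200, 3))
y = X @ true_w + 0.1 * rng.standard_normal(200)

model = SGDRegressor(random_state=0)
for _ in range(20):                  # initial training passes
    model.partial_fit(X, y)

# Later, fresh production data arrives; the same model keeps learning from it.
X_new = rng.standard_normal((50, 3))
y_new = X_new @ true_w + 0.1 * rng.standard_normal(50)
model.partial_fit(X_new, y_new)

print(np.round(model.coef_, 2))      # close to the true weights [2. -1. 0.5]
```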

What can be done now?

Machine learning is now relatively mature; it is strongly relevant to real-world applications and is already in everyday use. Machine learning can be applied to:

  • Speech and language processing: Natural Language Processing (NLP) drives digital assistants such as Apple Siri, Amazon Alexa and Google Assistant, as well as GPS voice control and speech recognition (speech-to-text) software. It allows computers to process text and speech data and 'understand' human language in something like the way humans do.
  • Recommendations: Deep learning models drive the "people also like" and "just for you" recommendations of Amazon, Netflix, Spotify and other services in retail, entertainment, travel, job search and news.
  • Online advertising: Machine learning and deep learning models can evaluate subtle differences in the content of a web page and in the opinions or attitudes of its visitors, and tailor the ads pushed to each user's interests.
  • Chatbots: Chatbots can use a combination of pattern recognition, natural language processing and deep neural networks to interpret input text and provide appropriate responses.
  • Fraud detection: Machine learning regression and classification models can flag stolen credit cards and successfully detect the criminal use of stolen or misappropriated financial data.
  • Cyber security: Machine learning can extract intelligence from incident reports, alerts, blog posts, etc. to identify potential threats to complement the work of security analysts.
  • Medical image analysis: The explosive growth in the type and volume of digital medical imaging data increases the opportunity for human error in reading that data. Convolutional neural networks (CNNs), recurrent neural networks (RNNs) and other deep learning models can extract features and information from medical images to help support accurate diagnosis (a minimal CNN sketch follows this list).
  • Self-driving cars: Both machine learning and deep learning algorithms play a role in enabling self-driving cars, for example by identifying objects in the car's surroundings and predicting how they will change or move.
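As referenced in the medical imaging item above, here is a minimal sketch of a CNN feature extractor of the kind such systems use, written in PyTorch; all layer sizes, the 64x64 grayscale input, and the two-class output are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TinyCNN(nn.Module):
    """Toy CNN: two conv/pool stages extract features, a linear layer classifies."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),  # grayscale -> 16 feature maps
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 64x64 -> 32x32
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 32x32 -> 16x16
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(1))

model = TinyCNN()
scan = torch.randn(1, 1, 64, 64)   # stand-in for one 64x64 grayscale image
logits = model(scan)
print(logits.shape)                # torch.Size([1, 2])
```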

What is likely to be able to be done soon?

Miniaturization is a way forward for machine learning applications in the coming years: with machine learning running on tiny, low-power chips, deep learning techniques can be made remarkably energy-efficient. Neural networks spend most of their time multiplying large matrices together, and because the same numbers are reused in different combinations, the CPU mostly multiplies values that are already in cache and spends far less time fetching new values from memory. This matters because fetching data from memory can easily consume thousands of times more energy than performing an arithmetic operation. The relatively low memory requirements (a few tens or hundreds of kilobytes) also mean that low-power SRAM or flash suffices for storage. This makes deep learning well suited to microcontrollers, especially when using 8-bit integer arithmetic rather than floating point, since microcontrollers often already have well-adapted DSP instructions.
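The following toy NumPy sketch shows the kind of 8-bit arithmetic the paragraph describes: float weights are quantized to signed 8-bit integers, multiplied in integer form (as DSP instructions would do), then rescaled. The single-scale-factor scheme here is a simplified assumption:

```python
import numpy as np

def quantize(x, bits=8):
    """Map float values onto signed 8-bit integers with one scale factor."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    return np.round(x / scale).astype(np.int8), scale

weights = np.random.randn(4, 8).astype(np.float32)
inputs = np.random.randn(8).astype(np.float32)

qw, w_scale = quantize(weights)
qx, x_scale = quantize(inputs)

# Multiply small integers, accumulating in int32, then rescale to float.
acc = qw.astype(np.int32) @ qx.astype(np.int32)
approx = acc * (w_scale * x_scale)

print(np.max(np.abs(approx - weights @ inputs)))  # small quantization error
```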

What is the potential impact of this development?

Energy supply is a major problem, forcing complex wiring and expensive, wasteful power consumption on appliances. Applying machine learning principles in microprocessor design could save power across many kinds of electronics. The ultimate goal of almost every smart product is a device that can be deployed anywhere and needs no maintenance such as battery replacement. The biggest obstacle to that goal is how much power most electronic systems use. The microcontroller itself may draw only a milliamp or less, but peripherals require much more. A coin cell battery stores roughly 2,500 joules of energy, so even a device that draws about one milliwatt will last only around a month. Processors and sensors can push consumption down to the microwatt level (for example, Qualcomm's Glance vision chip), but displays and radios consume far more; even low-power Wi-Fi and Bluetooth draw tens of milliamps. Moving data also costs energy, and there seems to be a pattern: the amount of power required for a given operation is proportional to the distance over which the data is sent.
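The coin-cell figure is easy to check with back-of-envelope arithmetic; the numbers below are simply the assumed values from the paragraph:

```python
# Back-of-envelope battery-life check using the figures above.
battery_joules = 2500.0          # energy stored in a typical coin cell
draw_watts = 0.001               # a device drawing about one milliwatt
lifetime_days = battery_joules / draw_watts / 86400
print(round(lifetime_days, 1))   # ~28.9 days, i.e. roughly one month
```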

Which people will be most affected and how?

Engineers working on satellite imagery currently face the problem that, although high-definition video can be captured with what is essentially a mobile phone camera, the satellite has only a small amount of memory to store the results, and every few hours the data must be downloaded to a base station on Earth over limited bandwidth. The same problem appears in almost every scenario where we use sensors; even home cameras are limited by the bandwidth of Wi-Fi and broadband connections.

We need something that runs on a microcontroller, uses very little power, relies on compute rather than radio, and can turn all the sensor data we currently waste into something useful. The technology to fill this gap is machine learning, and in particular deep learning.

Deep learning at the sensor could deliver substantial energy savings, which in turn could reduce electricity consumption, hold down the manufacturing cost of electronic devices, make scarce satellite bandwidth go further, and improve productivity in industry and agriculture.

Will this create, replace or make redundant any current jobs or technologies?

Workers will be among those most affected by this development, as parts of their job descriptions are replaced. In one factory, for example, a master craftsman called Hans walks along the rows of machines every morning, places his hand on each and listens carefully, then tells the foreman which machine needs to be taken offline for repair, relying entirely on many years of experience and keen intuition. Many factories have such master craftsmen, but they are retiring, and such skills are difficult to pass on. If we could instead attach to each machine a battery-powered device with a microphone and a built-in machine learning model (in effect a networked version of the master), let the model learn the normal sound of operation, and have it automatically report anomalies to the factory, this experience-based inspection could be automated at very low cost. That delivers the efficiency and lower labour cost factory managers want, but it puts workers' jobs at risk.
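A hedged sketch of such a "networked master craftsman" follows: machine sound is summarized into two simple spectral features and an anomaly detector flags clips far from the normal cluster. The feature choice, synthetic audio, and detector are illustrative assumptions, not a production design:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def spectral_features(audio, rate=16000):
    """Crude per-clip features: overall loudness and power-weighted spectral centroid."""
    power = np.abs(np.fft.rfft(audio)) ** 2
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / rate)
    centroid = (freqs * power).sum() / power.sum()
    return [np.sqrt(np.mean(audio ** 2)), centroid]

# Stand-in recordings: a 'healthy' 120 Hz machine hum vs. a clip with a rattle.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000
healthy = [np.sin(2 * np.pi * 120 * t) + 0.05 * rng.standard_normal(16000)
           for _ in range(50)]
rattle = np.sin(2 * np.pi * 120 * t) + 0.8 * np.sin(2 * np.pi * 3000 * t)

# Learn what "normal" sounds like, then check a suspicious clip.
detector = IsolationForest(random_state=0).fit([spectral_features(a) for a in healthy])
print(detector.predict([spectral_features(rattle)]))  # [-1] means "report this machine"
```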

In your daily life, how will this affect you?

The future development of miniature, low-power chips will also affect our daily lives. The most obvious change is that computers and mobile phones will consume far less power and need charging far less often: perhaps two or three days, a week, or even a month on a single charge. That would lighten backpacks and spare commuters and students the scramble for a power socket.

For me, studying IT means using my computer for long stretches, but it drains its battery quickly. Every trip to the library means carrying a charger and a heavy power bank, which weighs down my backpack, and when both the power bank and the computer are dead I have to hunt for an outlet, which not every seat in the library provides. If the computer drew power very slowly, my backpack would be lighter and I would not have to pick seats by their sockets, which would improve my efficiency and concentration to some extent. The same goes for office workers who travel: fewer charging accessories in the bag, lighter luggage, and no awkward search for a socket away from the office.

In addition, home cameras face bandwidth limits on their Wi-Fi and broadband connections. A friend of mine, for example, used more data through his ISP last December than in the previous eleven months combined. When he analyzed it in a little more detail, he found the cause was the Christmas lights twinkling in his house: with so many video frames changing significantly, the compression ratio of the camera's video stream dropped dramatically. If this technology had been mature, his camera could have processed more on the device and sent far less data, and he would not have faced that unexpected spike in usage.