Why industrial companies should hire data scientists
The transition to Industry 4.0 is going to take more than an investment in innovative technologies like the Industrial Internet of Things (IIoT), augmented reality (AR), and machine learning (ML). We’re going to need plenty of highly skilled talent to pull it all together and make it work. You may already have a great team of control engineers, but we can’t expect any one person to have expertise across so many domains.
When it comes to tasks like big data analytics, training a machine learning model, or creating an in-depth strategy for developing a data pipeline, there’s no substitute for a professional data scientist. Declared the sexiest job of the 21st century by the Harvard Business Review, data scientists are in high demand across the economy because their work sits at the intersection of the incredibly valuable and the extremely difficult.
By bringing their unique skillsets to the table, data scientists create real value for companies by giving them a way to leverage the unprecedented volumes of data we’re now generating. This ranges from insights into high-level business decisions all the way down to the minutiae of optimizing machines on the plant floor.
Simply put, data scientists convert data into knowledge, and knowledge into power. It’s not hard to see their value proposition in a world that’s increasingly defined by smart devices. As our machines become more intelligent, largely because of the work of data scientists, we need even more clever people to take full advantage of them.
Types of Analytics
To get a better perspective of a data science job, let’s go through some of the most common analytics that we see in industrial contexts.
First, we have predictive analytics, defined by IBM as “the use of advanced analytic techniques that leverage historical data to uncover real-time insights and to predict future events.” The basic idea here is to look at previous trends over time to make an educated guess about what’s to come. An easy example is American retail sales between Thanksgiving and Christmas. Businesses anticipate this yearly boom and plan accordingly.
In industries like manufacturing, we don’t see as much of this low-hanging fruit. Instead, we’re seeing an uptick in predictive maintenance, which forecasts machine failure so that we can perform preventative upkeep and minimize downtime. Sensor data from a robotic arm, for example, may indicate rising temperature, increased vibration, or a change in some other variable. The data scientist’s job is figuring out which variables are important, what thresholds to set, and when to recommend maintenance.
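To make the idea concrete, here is a minimal sketch of that last step. The sensor readings, smoothing window, and alert threshold are all hypothetical; a real deployment would learn these from historical failure data rather than hard-code them.

```python
# Sketch: flag maintenance when smoothed vibration crosses a threshold.
# All numbers below are hypothetical, for illustration only.
from collections import deque

VIBRATION_THRESHOLD = 0.8  # mm/s, hypothetical alert level
WINDOW = 5                 # smooth over the last 5 readings

def needs_maintenance(readings, window=WINDOW, threshold=VIBRATION_THRESHOLD):
    """Return the index at which the smoothed vibration first
    exceeds the threshold, or None if it never does."""
    recent = deque(maxlen=window)
    for i, value in enumerate(readings):
        recent.append(value)
        if len(recent) == window and sum(recent) / window > threshold:
            return i
    return None

# A slowly degrading robotic arm: vibration creeps upward over time.
readings = [0.42, 0.45, 0.44, 0.52, 0.61, 0.72, 0.85, 0.91, 0.95, 1.02]
print(needs_maintenance(readings))  # -> 8 (alert before outright failure)
```

Smoothing over a window keeps a single noisy spike from triggering a false alarm, which is exactly the kind of judgment call (window size, threshold level) that a data scientist tunes against historical data.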
Another common analytic technique is anomaly detection, which Towards Data Science’s Susan Li defines as “the process of identifying unexpected items or events in data sets, which differ from the norm.” Essentially, it is digging through the data to look for points of interest and observations that don’t quite fit into the “normal” range.
We can use this technique in both negative and positive contexts. A machine’s anomalous behavior may indicate part failure, incompatibility at an integration point, or operator error. Analyzing it can put us on the trail we need to follow to address the issue. On the other hand, an anomalous increase in single-day sales can point to a successful marketing campaign, an industry reaction to a piece of legislation, or a favorable change in market conditions.
The last type that we want to highlight is prescriptive analytics, which takes predictive analytics one step further by then recommending a way forward. In logistics, for instance, prescriptive analytics can streamline operations and optimize routes. We also commonly see prescriptive analytics in targeted advertising, where companies can prescribe relevant products to customers based on observed behavior patterns.
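The “one step further” can be as simple as mapping a prediction to an action. The probabilities and actions below are invented for illustration; in practice the thresholds would be set jointly with operations and maintenance teams.

```python
# Sketch: a toy prescriptive layer on top of a predictive model.
# Thresholds and recommended actions are hypothetical.
def recommend(failure_probability):
    """Turn a predicted probability of failure into a recommended action."""
    if failure_probability >= 0.8:
        return "schedule immediate maintenance"
    if failure_probability >= 0.5:
        return "inspect at next shift change"
    return "continue normal operation"

print(recommend(0.92))  # -> schedule immediate maintenance
print(recommend(0.10))  # -> continue normal operation
```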
Although it may be possible to manually perform these analytics, it’s not practical at scale, especially when we’re trying to deal with petabytes of unstructured data. That’s why the main component of any data scientist’s work is automating these analytics with machine learning, a type of artificial intelligence (AI).
ML models take data as input, run it through complex algorithms to learn on their own, and produce the output that we ask for. For instance, we can train a model on sensor data from a conveyor belt, ask it for predictive analytics to inform maintenance timings, and it will search through the data for correlations between the machinery’s state and the times when it malfunctions. In this way, ML can uncover patterns that remain invisible to the human eye.
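At its smallest, the “learning” described above is just fitting parameters to data. Here is a deliberately tiny example: an ordinary least-squares line relating a conveyor motor’s temperature to a wear score, then used to predict wear at an unseen temperature. All figures are hypothetical, and real models are far more complex, but the input-learn-predict loop is the same.

```python
# Sketch: the simplest possible "model" - a least-squares line fit.
# Temperatures and wear scores are hypothetical.
def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return a, b

temps = [60, 65, 70, 75, 80]       # motor temperature readings (deg C)
wear  = [1.0, 1.5, 2.0, 2.5, 3.0]  # observed wear score at each reading

a, b = fit_line(temps, wear)
print(round(a * 85 + b, 1))  # predicted wear at 85 deg C -> 3.5
```

The model “discovers” the temperature-wear correlation from the data alone; with thousands of sensors and nonlinear interactions, that discovery is no longer possible by eye, which is the point of the paragraph above.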
We can further break this down into two categories: in the cloud and on the edge. Since cloud computing gives us easy access to cheaper and more plentiful compute resources, data scientists often build their models in the cloud, regardless of where the model will ultimately run. Models that operate at large scale or that don’t require real-time control can likewise be deployed in the cloud.
However, if we want to save bandwidth by reducing the amount of data we’re sending up to the cloud, if we need to minimize latency for real-time functionality, or if we’re concerned about data privacy, data scientists can deploy their ML models on the edge. By running the model on an embedded device that’s closer to the data’s source, industries can improve automation via intelligent autonomy, cut networking costs, and add a layer of security for sensitive data.
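The bandwidth-saving idea is easy to sketch: rather than streaming every reading upstream, the edge device forwards only readings that fall outside a normal operating band. The band limits and the sample stream below are hypothetical.

```python
# Sketch: edge-side filtering so only out-of-band readings go to the cloud.
# Band limits and readings are hypothetical.
NORMAL_BAND = (20.0, 30.0)  # acceptable temperature range, deg C

def filter_for_upload(readings, band=NORMAL_BAND):
    """Keep only out-of-band readings; everything else stays on-device."""
    low, high = band
    return [r for r in readings if r < low or r > high]

stream = [22.1, 23.4, 24.0, 31.2, 25.5, 19.8, 26.0]
print(filter_for_upload(stream))  # -> [31.2, 19.8]
```

Seven readings in, two readings out: the device still catches the interesting events while sending a fraction of the traffic, which is the bandwidth and latency argument made above.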
Whether their ML runs in the cloud or on the edge, and whether they’re developing models for predictive analytics, anomaly detection, or prescriptive analytics, data scientists bring value to industries across the board. They do so by combining technical skills in tools like Python and TensorFlow with the critical thinking needed to develop an analytical framework.
Data scientists do much more than the difficult task of programming AI. They figure out the right questions to ask, they put data into a business context, and they perform any number of pre-processing jobs, such as sorting, cleaning, and augmenting the data.
This is the takeaway: if you haven’t already, then you should hire a data scientist.
Published by David Hoysan
Originally published at https://www.linkedin.com.