Machine Learning Approach for Cloud Data Analytics in IoT. Группа авторов
Читать онлайн книгу.research was taken one step ahead by authors in [11] who have used different ML techniques to predict the sales. In [11], it is observed that the normal regression techniques when integrated with boosting techniques have observed better results in comparison to mere regression algorithms. Using the same principle, authors in [12] also used ML approaches to predict future sales using the historical sales data. Authors discussed various approaches for the sales prediction and finally concluded that gradient boost algorithm is the best fit model for this scenario as it achieves highest accuracy and efficiency. The authors in [13] also implemented a stacking approach for regression ensemble to further improve prediction for sales. Authors in [14] proposed a model using ML techniques to optimize pricing on a daily basis. All these predictive models can be employed to predict demand and sales of products in future. Authors in [15] presented a regression model using regression trees for each department to predict future demand. The proposed model is authenticated in terms of its efficiency using least squares regression, principal components regression, and other similar regressions. Similarly, authors in [15] used historical data and Rue La La’s expertise for building size curves for each product p which represents the percentage of product demand for each size of p. Here, authors also attempted a price optimization problem with an object to set a prime for each product so as to maximize the revenue.
Authors in [16] proposed a framework to perform requirement analysis in the retail industry. The proposed framework consists of three modeling views: business view, analytics design view, and data preparation view. These views collectively perform data preparation activities. The authors in [17] employed descriptive analytics in relation to data mining for decision-making. Here, it is worth mentioning that predictive data analytics employs deterministic optimization techniques such as the decision tree method.
3.3 Predictive Data Analytics in Retail
Each retail industry aims to devise attractive and efficient business strategies to lure the largest portion of customers. For the past few years, retail industries had been using historical data to frame business strategies [18]. Focusing on mere historical transaction data fails to give promising results in this rapidly evolving and competing business world involving huge ocean of data [19]. This inability is addressed using predictive data analytics, an efficient approach to use big data to predict the activity, behavior, and future trends for any enterprise. Further, predictive data analytics is required owing to exponential rise in data and cut-throat competition. Predictive data analytics also helps to obtain a thorough understanding of customers, budget, and stock. As a result, predictive data analytics has gained wide acceptance and attracted several researchers and academicians. Predictive data analytics fails to achieve the desired results using simple regression type methods as it is not suitable in this multidimensional environment. Hence, it employs ML to enhance its capability [20]. The following are the most prevailing models for predictive data analytics [14]:
Classification Model
Clustering Model
Outliers Model
Time Series Model
The readers may refer to [14] for the explanation of these models. All these models use common predictive algorithms. The various predictive algorithms can be broadly categorized into two groups, viz., ML and deep learning. ML primarily works for tabular data which may be linear or nonlinear. Basically, deep learning is also a subset of ML but it has better optimization when dealing with audio, text, and images. ML-based predictive modeling uses various algorithms. Some common algorithms are discussed below in brief [21].
Random Forest: It is the most popular classification and regression algorithm of ML capable of handling huge volumes of data. Random forest implements bagging where a subset of training data is used to train the network. Training process may be repeated with another subset in parallel thus achieving a strong learner.
Generalized Linear Model (GLM): This model narrows down the list of variables and thus performs better than the general linear model. As a result of narrowing down the variables, it gets trained quickly. The limitation of this model is that it requires relatively huge training data sets.
Gradient Boosted Model (GBM): it generates a model that uses decision trees for classification. In this approach, each tree rectifies errors present in previously trained tree. As it builds one tree at a time, it takes longer but gives better generalizations. Hence, it is used in ML-based ranking in Yahoo, among others.
K-Means: It is a popular and fast algorithm to classify data points in various groups so that all points in the same group are highly similar. The aim of this classification is that intragroup similarity is maximized and intergroup similarity is minimized.
Owing to abovementioned algorithms, ML has been widely accepted and recognized as an efficient choice for handling huge volumes of data in the retail industry. It enables sophisticated algorithms for customers’ understanding and thus provides customer-oriented shopping experience. The subsequent subsection discusses the employment of ML for predictive data analytics in the retail industry.
3.3.1 ML for Predictive Data Analytics
As mentioned earlier, ML has been accepted as an efficient and effective choice for predictive data analytics in the retail industry. ML algorithms aid in identification of valued customers for a retail industry. These valued customers need to be retained by devising exciting and attractive strategies for success of any business entity. ML also enjoys the facility of customizing personalized customer’s view based on his likings and history [22]. Ability to provide user-oriented customer view further escalates its popularity in the retail industry. Similarly, it can also be utilized for predicting required stock so as to minimize involved risks and uncertainties [23]. The basic model of ML for predictive data analytics in the retail industry performs several functions. A sample description for same is as follows:
1 It initially gathers the data from diverse sources related to products for training purposes.
2 Thereafter, an algorithm is chosen to analyze the features of training data. Algorithms also precisely predict the product price.
3 It is followed by prediction of the right price in comparison to real price of the product.
4 ML algorithm continuously adjusts the prediction mechanism in order to minimize the gap between predicted price and actual price.
5 This pre-training is followed by prediction of price of numerous products and a feedback loop is also considered to further enhance the accuracy of the model.
6 To further refine the model, new product data is added to the system.
The abovementioned steps are the basic steps to employ predictive data analytics in the retail industry. Few such examples of its applications are as follows [16]:
ML for Demand Prediction: ML uses high computations power to handle highly volatile data to predict the demand in future. For the same prediction, ML uses external and internal sources (structured, semi-structured, and unstructured data) of information so as to make informed decisions. This data may involve historical data, social media data, etc. Here, ML applies complicated mathematical algorithms to uncover hidden patterns in complicated and large datasets and thus provide reliable and accurate forecasts [24].
ML for Predictive Sales Analytics: Another common application of predictive data analytics is to understand the driving motive behind customer’s purchase and their behavior under particular circumstances. Similarly, data from different sources is aggregated. The aggregated data is cleansed to determine the best forecasting algorithm for the current scenario. It then builds a predictive model to identify relationships among various factors. This is followed by monitoring the model to measure its accuracy with an objective to maximize its prediction accuracy.
ML for Customer’s Customization: Using ML, recommendation engines are developed which give a customized view to each individual customer based on his likes and requirements. Provision of customized view ensures retention of customers to the same platform. Amazon has got the best recommendation engine which has