Data Mining and Forecasting Services
Using data mining, companies and organizations can increase the profitability of their businesses by uncovering opportunities and detecting potential risks.
Our data mining and analysis consulting services can help you extract valuable information out of your data by utilizing forecasting modeling (regression and time series analysis). We can analyze your data and provide you with forecasting reports that will suit your need.
Forecasting Models and Project Life Cycle
Forecasting is a component of data mining. It is the process of estimation in unknown situations and is commonly used in discussion of time-series data. Regression models can best be used with time series data to detect trends and seasonalities (even though the models are also useful for cross section data). They can help answer questions such as “What will our sales in the next quarter be?” and “How confident are we in the prediction?” Regression models are also very good for interpolating and extrapolating data in both linear and nonlinear approaches. Our Excel consulting services can provide you with forecast reports by testing your data through various models and implementing the best model that is determined.
We have a team of business analysts, statistical modelers, and IT professionals that utilize tools such as Forecast Pro, SPSS, Statistica, Access, and Excel to perform the analysis.
Our regression models include, but are not limited to:
- Linear and nonlinear regression
- Multiple regression
- Exponential smoothing with additive seasonality
- Simple exponential smoothing with multiplicative seasonality
- Halt Winters exponential smoothing
- Halt Winters simple exponential smoothing with additive seasonality
- Halt Winters simple exponential smoothing with multiplicative seasonality
- Damped exponential smoothing
- Simple moving average analysis
- Centered moving average analysis
- ARIMA (Autoregressive Integrated Moving Average)
Here are two examples of forecast plot:
Fig 1. Third (3rd) Order polynomial model
Fig 2. Seasonality (quarterly) model
Linear regression is used to model the value of a dependent scale variable based on its linear relationship to one or more predictors. It estimates the coefficients of the linear equation, involving one or more independent variables that best predict the value of the dependent variable. For example, you can try to predict a salesperson's total yearly sales (the dependent variable) from independent variables such as age, education, and years of experience.
- An automotive industry group keeps track of the sales for a variety of personal motor vehicles. In an effort to be able to identify over- and underperforming models, you want to establish a relationship between vehicle sales and vehicle characteristics. We can use linear regression to identify models that are not selling well.
- Is the number of games won by a basketball team in a season related to the average number of points the team scores per game? A scatter plot indicates that these variables are linearly related. The number of games won and the average number of points scored by the opponent are also linearly related. These variables have a negative
relationship. As the number of games won increases, the average number of points scored by the opponent decreases. With linear regression, you can model the relationship of these variables. A good model can be used to predict how many games teams will win.
- The Nambe Mills company has a line of metal tableware products that require a polishing step in the manufacturing process. To help plan the production schedule, the polishing times for 59 products were recorded, along with the product type and the relative sizes of these products, measured in terms of their diameters. We can use linear regression to determine whether the polishing time can be predicted by product size.
Nonlinear regression is a method of finding a nonlinear model of the relationship between the dependent variable and a set of independent variables. Unlike traditional linear regression, which is restricted to estimating linear models, nonlinear regression can estimate models with arbitrary relationships between independent and dependent variables. This is accomplished using iterative estimation algorithms. Note that this procedure is not necessary for simple polynomial models of the form Y = A + BX^2. By defining W = X^2, we get a simple linear model, Y = A + BW, which can be estimated using traditional methods such as the Linear Regression procedure.
- Can population be predicted based on time? A scatter plot shows that there seems to be a strong relationship between population and time, but the relationship is nonlinear, so it requires the special estimation methods of the Nonlinear Regression procedure. By setting up an appropriate equation, such as a logistic population growth model, we can get a good estimate of the model, allowing us to make predictions about population for times that were not actually measured.
- An internet service provider (ISP) is determining the effects of a virus on its networks. As part of this effort, they have tracked the (approximate) percentage of infected e-mail traffic on its networks over time, from the moment of discovery until the threat was contained. We can use Nonlinear Regression to model the rise and decline of the infection.
This procedure produces fit/forecast values and residuals for one or more time series, using an algorithm that smoothes out irregular components of time series data. A variety of models differing in trend (none, linear, or exponential) and seasonality (none, additive, or multiplicative) are available.
- Inventory-intensive businesses often employ statistical techniques for projecting future inventory. The Exponential Smoothing procedure can be used both to develop a model of the inventory time series and to produce fast forecasts based on that model.
ARIMA (Box-Jenkins) Example
This procedure estimates non-seasonal and seasonal univariate ARIMA (Autoregressive Integrated Moving Average) models (also known as "Box-Jenkins" models) with or without fixed regressor variables. The procedure produces maximum-likelihood estimates and can process time series with missing observations.
- You are in charge of quality control at a manufacturing plant and need to know if and when random fluctuations in product quality exceed their usual acceptable levels. You've tried modeling product quality scores with an exponential smoothing model but found--presumably because of the highly erratic nature of the data--that the model does little more than predict the overall mean and hence is of little use. ARIMA models are well suited for describing complex time series. After building an appropriate ARIMA model, you can plot the product quality scores along with the upper and lower confidence intervals produced by the model. Scores that fall outside of the confidence intervals may indicate a true decline in product quality.
- A catalog company, interested in developing a forecasting model, has collected data on monthly sales of men's clothing along with several series that might be used to explain some of the variation in sales. Possible predictors include the number of catalogs mailed and the number of pages in the catalog, the number of phone lines open for ordering, the amount spent on print advertising, and the number of customer service representatives.
Are any of the predictors useful for forecasting? Is a model with predictors really better than one without? Use the ARIMA procedure to create forecasting models with and without predictors, and see if there is a significant difference in predictive ability.
- The retail grocery market in a medium sized metropolitan area is dominated by two supermarket chains: Norton's and EdMart. Norton's was recently purchased by a large national grocery chain that then introduced its own brand of products, most of which sell for substantially less than the name brand products offered at EdMart. For a number of years, EdMart has maintained about a 5% edge in market share over Norton's, primarily due to its superior customer service. During their first two months of ownership, the new parent company of Norton's launched an aggressive campaign advertising their own product line. The result was a rapid and dramatic increase in market share. Was the increase in market share solely at the expense of EdMart's share, or is some of the increase due to losses by the small mom-and-pop groceries that make up the rest of the local market?
Seasonal Decomposition Example
The Seasonal Decomposition procedure decomposes a series into a seasonal component, a combined trend and cycle component, and an "error" component. The procedure is an implementation of the Census Method I, otherwise known as the ratio-to-moving-average method.
- A scientist is interested in analyzing monthly measurements of the ozone level at a particular weather station. The goal is to determine if there is any trend in the data. In order to uncover any real trend, the scientist first needs to account for the variation in readings due to seasonal effects. The Seasonal Decomposition procedure can be used to remove any systematic seasonal variations. The trend analysis is then performed on a seasonally adjusted series.
- A catalog company is interested in modeling the upward trend of sales of its men's clothing line on a set of predictor variables such as the number of catalogs mailed and the number of phone lines open for ordering. To this end, the company collected monthly sales of men's clothing for a 10-year period. To perform a trend analysis (for example, with an autoregression procedure) it's necessary to remove any seasonal variations present in the data. This is easily accomplished with the Seasonal Decomposition procedure.
* Source: wikipedia.org
Copyright © Excel Business Solutions. All Rights Reserved.