Chapter 8
PROJECTIONS I
SIMPLE REGRESSION
by: josavere
Introduction
With the Digital Revolution, predictive analytics seeks to estimate future results from past data. The models use different methodologies with a very similar general objective: some are classification models (the result is binary, a yes or no coded as 1 or 0), and others are regression models, which predict a numeric value for an unknown event in the past, present or future.
The Digital Revolution provides Big Data, with an abundance of structured data, such as data tables, and unstructured data, such as texts, images or videos. This opens new possibilities for prediction and changes how models are designed: they are now built to be flexible and heterogeneous, with a proven ability to predict well on data different from those used to estimate them, and the final predictor combines different models, procedures and data types.
Decision trees, neural networks, support vector machines, Bayesian analysis, logistic regression, linear regression, time series and data mining, k-nearest neighbors, ensemble models, gradient boosting and incremental response models replace, with many advantages, the models traditionally used in statistics, by introducing multiple parameters extracted from Big Data.
Big Data analytics is the technology used to analyze a huge amount of structured and unstructured data that is collected, organized and interpreted by software, transforming it into useful information for decision-making and for generating ideas about market trends. In addition, it contributes to ideas for new products and services, customer attraction, audience understanding, security and other benefits that support strategic decisions.
Predictive models are very useful for calculating, with a high degree of approximation, future values that aid decision-making, complementing the good judgment of executives, and for presenting solid arguments to bankers and investors, especially to so-called angel investors, who bet on emerging businesses that present good projections.
1. APPLICATION TO SALES. USE OF THE COMPUTER
The sales budget is the starting point for a company's planning and for elaborating a plan that generates value. Its preparation is the responsibility of the marketing and sales executive, but the financial division must also participate openly, so that there is a solid base for simulation analysis leading up to the definition of a Plan to Generate Value (josavere). The financial executive participates as an advisor, delivering elements of analysis to management.
Sales depend on a great number of variables, among them:
a. Potential market
b. Competition level
c. Positioning level
d. General economic situation
e. Disposable income
f. Buyers' attitude
g. Substitute products
h. Prices
i. Advertising investment
2. PREPARATION
3. INDICATORS
Each organization must find the right indicators to project its sales, predicting results with the mathematical model and complementing them with the executives' judgment. The indicator is the independent variable: a known value that is used to predict results through the regression equation.
Example:
| Kind of product | Indicator |
|---|---|
| 1. School books | number of enrolled students |
| 2. University books | number of higher-education students |
| 3. Cars | family income |
| 4. Food | population size and family income |
| 5. Gas, tires, batteries | vehicles in circulation |
| 6. Baby products | birth rate |
| 7. Petroleum equipment | wells to be drilled (short- and long-term plan) |
| 8. Steel | industrial production |
The regression line is the one that best fits the available historical data; using statistical inference, we project future results. This is equivalent to assuming that, if everything continues as it has until now, the result can be calculated with a mathematical formula. If X represents time (independent variable) and Y represents sales (dependent variable), the simple regression model has the form:
| Y = A + BX |
| Symbol | Meaning |
|---|---|
| A | intercept |
| B | slope |
| Y | estimated sales |
| X | years |
In multiple linear regression, the dependent variable Y changes with several independent variables that act together; the mathematical expression is:
| Y = A + b1x1 + b2x2 + b3x3 + ... + bnxn |
The correlation coefficient (r) measures the strength of the linear association between the company's sales and the independent variable; it ranges from -1 to 1.
The determination coefficient (R2) indicates how well the model fits: the closer R2 is to 1, the larger the share of the variation in sales that is explained by the terms in the model. It is calculated as the square of the correlation coefficient.
Observations are the number of periods taken as the base for a projection.
4. BUDGETARY MODELS
X is the independent variable (years) and Y is the dependent variable (real sales in units).
| X | Y | XY | X2 | Y2 |
|---|---|---|---|---|
| 1 | 110 | 110 | 1 | 12100 |
| 2 | 123 | 246 | 4 | 15129 |
| 3 | 141 | 423 | 9 | 19881 |
| 4 | 156 | 624 | 16 | 24336 |
| 5 | 164 | 820 | 25 | 26896 |
| 6 | 175 | 1050 | 36 | 30625 |
| 7 | 186 | 1302 | 49 | 34596 |
| 8 | 200 | 1600 | 64 | 40000 |
| 9 | 234 | 2106 | 81 | 54756 |
| 10 | 254 | 2540 | 100 | 64516 |
| 11 | 274 | 3014 | 121 | 75076 |
| 12 | 290 | 3480 | 144 | 84100 |
| ∑ = 78 | ∑ = 2307 | ∑ = 17315 | ∑ = 650 | ∑ = 482011 |
a. Formula to calculate the slope (B)
b. Formula to calculate the intercept (A)
c. Formula to calculate the correlation coefficient
d. Formula to calculate the determination coefficient
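The formulas for items a to d are not reproduced in the text. As a hedged reconstruction, the standard least-squares expressions that match the table columns (XY, X2, Y2) and the symbols A and B used here are written below; n, the number of observations (12 in this example), is a symbol introduced for this sketch.

```latex
% a. Slope
B = \frac{n\sum XY - \sum X \sum Y}{n\sum X^{2} - \left(\sum X\right)^{2}}

% b. Intercept
A = \frac{\sum Y - B\sum X}{n}

% c. Correlation coefficient
r = \frac{n\sum XY - \sum X \sum Y}
         {\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]
                \left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}

% d. Determination coefficient
r^{2} = (r)^{2}
```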
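With numbers (n = 12 observations and the column totals from the table), it would be:
Slope (B) = (12 × 17315 - 78 × 2307) / (12 × 650 - 78²) = 27834 / 1716 = 16.2203
Intercept (A) = (2307 - 16.2203 × 78) / 12 = 86.8182
Correlation coefficient (r) = 0.9887
Determination coefficient (r2) = (0.9887)2 = 0.9775
With this, the equation is: Y = 86.82 + 16.22 * X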
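The same calculation can be sketched in Python as a cross-check of the column totals and coefficients. This is a minimal illustration using the standard least-squares formulas; the variable names are chosen for this example and do not come from the original text.

```python
# Sales data from the table: X = year, Y = real sales in units.
years = list(range(1, 13))
sales = [110, 123, 141, 156, 164, 175, 186, 200, 234, 254, 274, 290]

# Column totals, matching the last row of the table.
n = len(years)
sum_x = sum(years)
sum_y = sum(sales)
sum_xy = sum(x * y for x, y in zip(years, sales))
sum_x2 = sum(x * x for x in years)
sum_y2 = sum(y * y for y in sales)

# Least-squares slope (B) and intercept (A).
slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
intercept = (sum_y - slope * sum_x) / n

# Correlation coefficient (r) and determination coefficient (r squared).
r = (n * sum_xy - sum_x * sum_y) / (
    ((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)) ** 0.5
)
r_squared = r ** 2

print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"r = {r:.4f}, r squared = {r_squared:.4f}")
```

Working from the raw data rather than from typed-in totals means a data-entry mistake shows up as a mismatch with the table sums.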
EXCEL APPLICATION FOR THE CORRELATION ANALYSIS
Steps to follow:
a. Open Excel and enter the original data, as shown in the picture:
b. Select the "functions (fx)" icon and choose the statistical functions (upper circle in the picture).
c. Click "accept", then indicate the ranges of the data when prompted, as shown in the next picture:
d. Click "finish" and you will see the result: 0.98866917
e. Proceed in the same way to calculate the intercept and the slope.
f. With our example, the values are:
| Measure | Value |
|---|---|
| Correlation coefficient | 0.98866917 |
| Intercept | 86.8181818 |
| Slope | 16.2202797 |
| Determination coefficient | 0.97746673 |
Equation of regression: Y = 86.82 + 16.22 * X
Note: as you can see, Excel gives the same results as the traditional formulas, but faster and more reliably, provided that the data entered are correct.
g. Let's see how good the fit is:
Matrix to Test the Model
| X | Y | Y (estimated) | (Ye - Y)/Y, % |
|---|---|---|---|
| 1 | 110 | 103.04 | -6.33 |
| 2 | 123 | 119.26 | -3.04 |
| 3 | 141 | 135.48 | -3.92 |
| 4 | 156 | 151.70 | -2.76 |
| 5 | 164 | 167.92 | 2.39 |
| 6 | 175 | 184.14 | 5.22 |
| 7 | 186 | 200.36 | 7.72 |
| 8 | 200 | 216.58 | 8.29 |
| 9 | 234 | 232.80 | -0.51 |
| 10 | 254 | 249.02 | -1.96 |
| 11 | 274 | 265.24 | -3.20 |
| 12 | 290 | 281.46 | -2.94 |
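The test matrix can be reproduced with a short Python sketch. It assumes the rounded coefficients of the regression equation (86.82 and 16.22); the constant and variable names are illustrative only.

```python
# Real sales for years 1 to 12, from the data table.
sales = [110, 123, 141, 156, 164, 175, 186, 200, 234, 254, 274, 290]

# Rounded coefficients of the regression equation Y = A + B * X.
INTERCEPT = 86.82
SLOPE = 16.22

rows = []
for year, actual in enumerate(sales, start=1):
    estimated = INTERCEPT + SLOPE * year
    # Relative deviation of the estimate from the real value, in percent.
    error_pct = (estimated - actual) / actual * 100
    rows.append((year, actual, round(estimated, 2), round(error_pct, 2)))

for year, actual, estimated, error_pct in rows:
    print(f"{year:>2} {actual:>4} {estimated:>7.2f} {error_pct:>6.2f}")

# Year with the largest absolute deviation.
worst = max(rows, key=lambda row: abs(row[3]))
print("largest deviation in year", worst[0])
```

Negative percentages mean the line underestimates real sales for that year; positive ones mean it overestimates them.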
h. Projections: based on the same data, calculate the estimated sales for years 13 and 14.
Y13 = 86.82 + 16.22 * (13) = 297.68
Y14 = 86.82 + 16.22 * (14) = 313.90
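As a small illustration, the projection step can be wrapped in a function. The name `project_sales` and its default arguments are assumptions made for this sketch; the defaults are the rounded coefficients of the regression equation.

```python
def project_sales(year, intercept=86.82, slope=16.22):
    """Return estimated sales for a given year using Y = A + B * X."""
    return intercept + slope * year


# Projections for the two years after the 12 observed periods.
for year in (13, 14):
    print(f"Y{year} = {project_sales(year):.2f}")
```

Any projection beyond the observed range assumes that the historical trend continues unchanged, as noted for the regression line.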
i. Graphic illustration:
B. MODEL OF MODIFIED AVERAGES
Steps to follow:
a. Enter the data
| Years (X) | Sales (Y) |
|---|---|
| 1 | 110 |
| 2 | 123 |
| 3 | 141 |
| 4 | 156 |
| 5 | 164 |
| 6 | 175 |
| 7 | 186 |
| 8 | 200 |
| 9 | 234 |
| 10 | 254 |
| 11 | 274 |
| 12 | 290 |
b. Calculate the average sales of each pair of consecutive years, as indicated next:
| Years | Average |
|---|---|
| 1 - 2 | 116.50 |
| 2 - 3 | 132.00 |
| 3 - 4 | 148.50 |
| 4 - 5 | 160.00 |
| 5 - 6 | 169.50 |
| 6 - 7 | 180.50 |
| 7 - 8 | 193.00 |
| 8 - 9 | 217.00 |
| 9 - 10 | 244.00 |
| 10 - 11 | 264.00 |
| 11 - 12 | 282.00 |
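The two-year averages can be computed by pairing each year with the next one. A minimal Python sketch, with variable names chosen for illustration:

```python
# Real sales for years 1 to 12.
sales = [110, 123, 141, 156, 164, 175, 186, 200, 234, 254, 274, 290]

# Average of each pair of consecutive years: (1, 2), (2, 3), ..., (11, 12).
pair_averages = [(a + b) / 2 for a, b in zip(sales, sales[1:])]
labels = [f"{i} - {i + 1}" for i in range(1, len(sales))]

for label, average in zip(labels, pair_averages):
    print(f"{label:>7}  {average:.2f}")
```

Twelve years of data yield eleven overlapping pairs, which is why the table has one row fewer than the original series.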
c. Calculate the sum of the averages and the participation percentage of each one of them:
| Years | Average | % Participation |
|---|---|---|
| 1 - 2 | 116.50 | 5.50 |
| 2 - 3 | 132.00 | 6.30 |
| 3 - 4 | 148.50 | 7.00 |
| 4 - 5 | 160.00 | 7.60 |
| 5 - 6 | 169.50 | 8.00 |
| 6 - 7 | 180.50 | 8.60 |
| 7 - 8 | 193.00 | 9.20 |
| 8 - 9 | 217.00 | 10.30 |
| 9 - 10 | 244.00 | 11.60 |
| 10 - 11 | 264.00 | 12.50 |
| 11 - 12 | 282.00 | 13.40 |
| Sum | 2107.00 | 100.00 |
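The participation percentages are each average divided by the sum of all averages. A short Python sketch of that step, with illustrative variable names and rounding to one decimal as in the table:

```python
# Two-year averages from the previous step.
pair_averages = [116.5, 132.0, 148.5, 160.0, 169.5, 180.5,
                 193.0, 217.0, 244.0, 264.0, 282.0]

total = sum(pair_averages)

# Share of each average in the total, in percent, rounded to one decimal.
shares = [round(average / total * 100, 1) for average in pair_averages]

for average, share in zip(pair_averages, shares):
    print(f"{average:>7.2f}  {share:>5.1f}")
print(f"{total:>7.2f}  {sum(shares):>5.1f}")
```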
d. Calculate the variation of each year by dividing the value of year n by that of year (n-1), and apply it to the year to be projected.
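The text does not spell out how the variation is applied, so the following Python sketch shows only one possible reading: the year-over-year ratios are taken on the yearly sales, and their simple mean is applied to the last observed year. Both choices are assumptions made for this illustration, not the author's stated procedure.

```python
# Real sales for years 1 to 12.
sales = [110, 123, 141, 156, 164, 175, 186, 200, 234, 254, 274, 290]

# Ratio of each year to the previous one: year n divided by year (n - 1).
ratios = [current / previous for previous, current in zip(sales, sales[1:])]

# ASSUMPTION: use the simple mean of the ratios as the growth factor.
mean_ratio = sum(ratios) / len(ratios)

# ASSUMPTION: apply the factor to the last observed year to project year 13.
projection_year_13 = sales[-1] * mean_ratio

for year, ratio in enumerate(ratios, start=2):
    print(f"year {year}: {(ratio - 1) * 100:.2f}% variation")
print(f"mean growth factor: {mean_ratio:.4f}")
print(f"projected sales, year 13: {projection_year_13:.2f}")
```

Other readings are possible, for example taking the ratios on the two-year averages or weighting recent years more heavily; the result should be compared with the regression projection before it is used.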
AI Opinion: The article "PROJECTIONS I - SIMPLE REGRESSION" written by José Saúl Velásquez Restrepo provides an overview of the importance of projections and simple regression in predictive analysis and their application in the field of sales. Below are some thoughts on the content:
Focus on the importance of data analytics: The article highlights the relevance of data analytics and the Digital Revolution in business decision making. This reflects the increasing importance of data collection and analysis in the digital age to predict trends and make strategic decisions.
Relevance of projection in sales: The article highlights the importance of sales projections and how these projections can be fundamental to the planning and success of a company. Sales are a crucial indicator for business performance, and accurate projections can be valuable for decision making.
Use of mathematical models: The article mentions the application of mathematical models, particularly simple regression, to predict sales. These models use indicators and independent variables to make projections. Using tools such as Excel to perform correlation and regression analysis is common practice in the field of statistics and data analysis.
Practical Example: The article provides a practical example of how to use simple regression to predict sales. This helps illustrate how theoretical concepts apply in real situations and can be useful for those who want to learn more about the topic.
Emphasis on accuracy and usefulness: The article emphasizes the importance of accuracy in projections and how these projections can be useful for decision making. The use of tools such as Excel is presented as an effective way to achieve this precision.
Overall, the article provides a useful introduction to simple regression and its application in sales projections. It highlights the importance of data analytics in business decision-making and offers practical guidance on how to use mathematical models to achieve accurate projections.











