Chapter 6 Linear Regression Using Excel 2010 Linear regression is a method of estimating the portion of a cost that is variable and the portion that is fixed. This method models the relationship between an activity and the total cost by fitting a linear equation to the data. Unlike the high-low method which uses only two data points, linear regression uses all data points in constructing the cost equation, making it much superior to the high-low method.
A linear regression generates information for a cost equation in the same form as the other methods of estimating costs: Y = VCx + TFC, where 'x' is the independent variable (the activity) and 'Y' is the total cost (dependent variable). While there are several software programs that generate linear regressions, using Excel ® is relatively easy and is a business tool frequently used by managers.
It is also a business tool that is installed on the majority of home and business computers. As such, you will learn how to run (generate) a regression using Excel ®. 1 Interpreting the Regression Output While a number of statistical items are generated in the regression output, your primary interest is the components of the cost function found in the last section of the summary output. Linear regression output for a home moving company that packs and moves residents to new homes appears below. Intercept is the y-intercept which is the estimate of total fixed costs for each period. X Variable 1 represents the estimated variable cost per unit, or the slope of the cost equation. The cost equation for the moving company is: Y = 3,815.69x + 828,814 We read this formula as 'Total cost equals variable cost of $3,815.69 times the number of residences moved plus fixed costs of $828,814'.
NumXL supports an interesting functionality – stepwise regression in Excel – to help you select this optimal set. Let’s demonstrate how you would use it. The output tables are generated. The stepwise regression in Excel generates one additional table next to the coefficients table. Let’s take a closer.
Always express unit costs (i.e., the unit variable cost) with two decimals, and total costs (fixed costs) with no decimals. Is the Data 'Good'? A is often prepared prior to running a regression to pre-assess the relationship between two variables.
A weak or nonexistent relationship between the activity and the total cost indicates that the linear regression output will not provide a useful cost equation. Assessing the quality of the cost equation (regardless of the cost estimation used) is beyond the scope of managerial accounting. As such, you will focus solely on generating a cost equation and how to use it to estimate future costs. Data Analysis ToolPak To use the linear regression tool in Excel, the Data Analysis Toolpak must be installed. To verify if it is installed, click Data from the Excel ® menu.
If you see the Data Analysis command in the Analysis group (far right), the Data Analysis Toolpak is already installed. If it is not installed, follow the directions in the Appendix at the end of this chapter to install it. Walk Through Problem Wilson Company provided the following information concerning the number of monthly service calls provided and the total cost incurred for its pest control operations for the each month during 2016.
Year Number of Service Calls Total Cost January 1,040 $33,600 February 1,200 36,300 March 1,260 37,800 April 1,100 35,500 May 1,220 36,600 June 1,010 32,900 July 1,190 36,200 August 1,050 33,400 September 1,210 37,700 October 1,250 37,400 November 1,060 33,800 December 1,280 38,100 Run a linear regression using the regression tool in Excel ®. Write the cost equation in standard form. Determine the estimated cost of providing 1,140 service calls for January, 2017. Solution Step 1: Open the Excel ® program. Copy and paste the data for Wilson Company into columns A, B, and C beginning in row 1 of a blank worksheet. Step 2: Because the regression output occupies 9 columns and 13 rows, to avoid placing the regression output on top of your data, select a location in the worksheet in which there are enough empty rows and columns to accommodate the output. Place your mouse pointer in cell A15 since that is an acceptable location. Step 3: Select the Data ribbon menu, then Data Analysis command on the Analysis tab.
A popup box will appear. Scroll down and select Regression. Step 4: The Regression wizard will be displayed. In the Input Y Range field, select all the values in the total cost column of the data table-cells C2 through C13.
Do not select the total. Step 5: In the Input X Range field, select all the activity values-cells B2 to B13 in the same manner as Step 4. Step 6: In the Output Range field, select cell A15 so that the regression output will begin in that cell Step 7: Verify you have no checks in all remaining boxes. Your wizard should be identical to the graphic below: Step 8: Click OK to 'run' the regression. Verify your output appears as follows, noting that the column widths may differ based on your worksheet settings. Step 9: Write the cost formula based on the regression output as: Y = 19.08x + 13,720 The total cost is $13,720 plus $19.08 times the number of service calls.
Always express unit costs (i.e., the variable cost) with two decimals, and total costs (fixed costs) with no decimals. Step 10: The total cost expected if 1,140 service calls are made is: Y = ( x 1,140) + 13,720.2593 = $35,473 Appendix - How to Install the Data Analysis Tool Pak Open Excel 2010. Choose the Data tab from the menu ribbon in Excel. If there is no Data Analysis item on the ribbon (look to the far right), follow the steps below to install. Click the File menu option, and then click Options.
Click Add-Ins, and then in the Manage box (near the bottom), select Excel Add-ins. In the Add-Ins available section, select the Analysis ToolPak check box, and then click OK. If Analysis ToolPak is not listed in the Add-Ins available box, click Browse to locate it. If you get prompted that the Analysis ToolPak is not currently installed on your computer, click Yes to install it.
You will be immediately returned to your original worksheet. Click the Data tab from the menu ribbon and you will see the Data Analysis option on the Analysis tab at the far right. 1 This chapter explains how to run a regression using Excel 2010. Directions for other versions of Excel may differ.
Note: Want to learn even more about advanced Excel techniques? In the three-part video series I'll show you how to easily solve engineering challenges in Excel. Linear Regression with Excel Charts When you need to get a quick and dirty linear equation fit to a set of data, the best way is to simply create an XY-chart (or “Scatter Chart”) and throw in a quick trendline. Add the equation to the trendline and you have everything you need. You can go from raw data to having the slope and intercept of a best-fit line in 6 clicks (in Excel 2016). Let’s say we have the data set below, and we want to quickly determine the slope and y-intercept of a best-fit line through it.
![How How](/uploads/1/2/5/3/125384068/855543723.jpg)
We’d follow these 6 steps (in Excel 2016):. Select x- and y- data. Open Insert Tab. Select Scatter Chart. Right-Click Data Series. Select Add Trendline.
Check Display Equation on Chart Now we know that the data set shown above has a slope of 165.4 and a y-intercept of -79.85. Linear Regression in Excel with the LINEST function The method above is a quick way to fit a curve to a series of data, but it has a significant downfall. The equation displayed on the chart cannot be used anywhere else. It’s essentially “dumb” text.
If you want to use that equation anywhere in your spreadsheet, you have to manually enter it. However, if you change the data set used to obtain the equation, that equation you manually entered will not update, leaving your spreadsheet with an erroneous equation. What we need for these situations is a function that can perform the same kind of regression analysis done by the charting utility and output the coefficients to cells where we can use them in an equation. The does this perfectly.
Note: Want to learn even more about advanced Excel techniques? In the three-part video series I'll show you how to easily solve engineering challenges in Excel. The results are exactly the same as those provided by the trendline method. This was obviously more work than using a trendline, but the real advantage here is that the slope and y-intercept values have been output to a cell.
That means we can use them dynamically in a calculation somewhere else in the spreadsheet. Regression Using the Excel Solver This last method is more complex than both of the previous methods. Fortunately, it will probably be unnecessary to ever use this method for basic single-variable linear regression. However, I’ve included it here because it provides some understanding into the way that the previous linear regression methods work. It will also introduce you to the possibilities for more complicated curve fitting using Excel. Enter “guess-values” for the slope and intercept of the equation.
Calculate new y-values based on those guess values. Calculate the error between the calculated y-values and the y-data. Use the Solver to find values of the slope and intercept that minimize the total error Let’s start again with the x- and y- data we had before. Next, enter some guess values for m and b into some cells on the worksheet. Now create a new column of calculated y-values based on the m and b guess values and the known x-data.
Next, create an error column, calculating the difference between the y-data and calculated y-values. Finally, create a new formula, calculating the sum of squares of the error column.
We will use the Solver to minimize this value – the sum of the squared errors. The reason why we use “sum of squares” instead of just “sum” is because we do not want an error of -100 in one cell to cancel out an error of 100 in another cell. We want each value in the error column to be driven to its minimum absolute value. Now, let’s open up the Solver. If you have never used the Solver Add-In before, you must first enable it.
After the Add-In has been loaded, you can open the Solver from the Data tab: With the Solver open, the setup for this is pretty straightforward. We want to minimize the objective, cell H3, or the sum of the squared errors. To do so, we will change variable cells E3 and F3, the slope and y-intercept of our linear equation.
As a last step, uncheck the option to “Make Unconstrained Variables Non-Negative”. When properly set up, the solver dialog should look like this: When we click “Solve”, the Solver does its thing and finds that the values m = 165.36 and b = -79.85 define the best-fit line through the data. Exactly what was predicted by the chart trendline and LINEST. Of course, this is totally expected. After all, we have just done “manually” what the Trendline tool and LINEST do automatically. In the case of a linear single-variable regression like we have here, the Solver is probably complete overkill.
However, this is just the start. We can use this same concept to do more complex multi-variable or non-linear regression analysis. Using the Solver, you can fit whatever kind of equation you can dream up to any set of data. Wrap-Up There you have it – three ways to perform linear regression in Excel. Another thought occurs to me. The trendline formula in a chart has the shortcoming that it can be used in a worksheet only if the coefficients are manually copied into cells.
These typed values suffer from not being dynamic, as you pointed out. A second problem is that they only display a limited resolution, so typing them in locks in the rounding errors from the coefficients displayed in the formula. You can get around this by applying a scientific number format with many digits, but most people don’t think of this, and their calculated values may be way off. Great suggestion, Jon. I guess that makes four methods – further proving how there are usually many different means to one particular end with Excel. The reason I like LINEST is because I only have to remember one formula (remembering things seems to be getting harder for me these days) and because I can easily expand LINEST to return coefficients for a polynomial or other type of equation. However, I would agree that using LINEST to only return slope and intercept might be overkill given its capability and the relative complexity of dealing with an array formula.
Primary Sidebar.