Build a Neural Network with MS Excel
Step-by-step: Build a simple feedforward neural network in Microsoft Excel (newer Excel versions)
This guide produces a working, trainable 1-hidden-layer neural network (input → hidden → output) that you can run, inspect, and train with backpropagation using only Excel formulas and built-in tools (no add-ins).
Assumptions and defaults:
- Network: 2 inputs, 3 hidden neurons, 1 output (you can change sizes).
- Activation: sigmoid for hidden and output.
- Loss: mean squared error (MSE).
- Training method: batch gradient descent with learning rate you set.
- Excel: modern Excel with dynamic arrays and standard functions (older versions also work with small formula adjustments, such as entering array formulas with Ctrl+Shift+Enter).
Overview of sheets and layout
- Sheet "Params": topology and hyperparameters.
- Sheet "Data": training examples (X, y).
- Sheet "Weights": weights and biases.
- Sheet "Forward": forward pass computations (activations).
- Sheet "Grad": gradients and weight updates (backprop).
- Sheet "Train": training loop controls (iteration counter, loss history).
- Create the workbook and sheets
- Add sheets named: Params, Data, Weights, Forward, Grad, Train.
- Params sheet (single-cell hyperparams)
- A1: "Input size" → B1: 2
- A2: "Hidden size" → B2: 3
- A3: "Output size" → B3: 1
- A4: "Learning rate" → B4: 0.5
- A5: "Epochs / iterations" → B5: 1000
- A6: "Batch size" → B6: (set to number of rows in Data or leave blank for full-batch)
- Data sheet (training set)
- Row1 headers: X1, X2, Y
- Rows 2..n: put training examples. Example: XOR dataset
- Row2: 0, 0, 0
- Row3: 0, 1, 1
- Row4: 1, 0, 1
- Row5: 1, 1, 0
- Use an Excel Table (Insert → Table) named "TrainData" to make dynamic ranges easier; otherwise refer to ranges explicitly.
- Weights sheet (initialize weights & biases)
- Layout for readability; use ranges:
- W1: weights from input → hidden: create a 2×3 table (rows = inputs, cols = hidden units).
- Put header row "h1","h2","h3" and left column "x1","x2".
- Fill cells with small random numbers: =NORM.INV(RAND(),0,0.5) or =RAND()*0.2-0.1
- b1: hidden biases: a 1×3 row next to W1, initialize =0 or small random.
- W2: weights from hidden → output: a 3×1 column, initialize small random.
- b2: output bias: single cell.
- Example cell addresses (adjust if you place differently):
- W1 in Weights!B2:D3 (2 rows × 3 cols)
- b1 in Weights!B4:D4 (1 × 3)
- W2 in Weights!F2:F4 (3 × 1)
- b2 in Weights!F5
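Tip: RAND() is volatile and recalculates on every change, so once the initial values look reasonable, copy the weight cells and Paste Values over them. Otherwise the weights re-randomize on each recalculation and training cannot progress. Fixed numbers also make runs reproducible.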
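For cross-checking the workbook outside Excel, the same initialization can be sketched in plain Python. This is an illustration only: the function name `init_weights`, the uniform range of ±0.1 (mirroring =RAND()*0.2-0.1), and the seed value are assumptions, not part of the workbook.

```python
import random

def init_weights(n_in=2, n_hidden=3, scale=0.1, seed=0):
    """Return (W1, b1, W2, b2): weights uniform in [-scale, scale], biases zero.

    W1 is n_in x n_hidden (rows = inputs, cols = hidden units), like Weights!B2:D3.
    W2 is a list of n_hidden values, like Weights!F2:F4.
    """
    rng = random.Random(seed)  # fixed seed -> identical numbers on every run
    W1 = [[rng.uniform(-scale, scale) for _ in range(n_hidden)] for _ in range(n_in)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-scale, scale) for _ in range(n_hidden)]
    b2 = 0.0
    return W1, b1, W2, b2

W1, b1, W2, b2 = init_weights()
print(W1, W2)
```

The generated numbers can be typed into the weight cells as constants, which gives the same effect as Paste Values.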
- Forward pass: compute activations for each training example
- Use the Data sheet rows as inputs and vectorized formulas to compute network outputs.
- Create header row on Forward: X1, X2, Z1_h1, Z1_h2, Z1_h3, A1_h1, A1_h2, A1_h3, Z2, A2, Error, SqError
- For each training row (example in Data row i), compute:
- Z1_j = sum_k (X_k * W1_kj) + b1_j
- In Excel, if inputs are in Data!A2:B2, and W1 in Weights!B2:D3 and b1 in Weights!B4:D4, compute hidden pre-activations with:
- =MMULT(Data!A2:B2, Weights!B2:D3) + Weights!B4:D4
- Data!A2:B2 is already a 1×2 row, so no TRANSPOSE is needed: (1×2)·(2×3) gives a 1×3 result, which dynamic-array Excel spills into the three cells Z1_h1..Z1_h3. When filling the formula down for other rows, make the weight references absolute (Weights!$B$2:$D$3 and Weights!$B$4:$D$4).
- A1_j = sigmoid(Z1_j) where sigmoid(z)=1/(1+EXP(-z)).
- Example: =1/(1+EXP(-cell_with_Z1))
- Z2 = SUMPRODUCT(A1_range, W2_range) + b2
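- Example (dynamic-array Excel): =SUMPRODUCT(A1_spill_range, TRANSPOSE(Weights!F2:F4)) + Weights!F5. A1 is a 1×3 row and W2 a 3×1 column, and SUMPRODUCT needs both arguments to have the same shape, so one of them must be transposed. =MMULT(A1_spill_range, Weights!F2:F4) + Weights!F5 is an equivalent alternative.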
- A2 = sigmoid(Z2)
- Error = y - A2
- SqError = (Error)^2
- If you use tables and spill formulas you can compute forward pass for all rows at once using array formulas referencing entire Data ranges and MMULT.
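To verify the Forward sheet cell by cell, the forward pass above can be reproduced in a few lines of plain Python. This is a sketch, not part of the workbook: the function names are illustrative, and the weights are the example values from the "Minimal numeric example (XOR)" section.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, W1, b1, W2, b2):
    """One example through the 2-3-1 network.

    W1[k][j] is the weight from input k to hidden unit j; W2[j] from hidden j to output.
    Returns (z1, a1, z2, a2), matching the Z1_*, A1_*, Z2, A2 columns.
    """
    z1 = [sum(x[k] * W1[k][j] for k in range(len(x))) + b1[j] for j in range(len(b1))]
    a1 = [sigmoid(z) for z in z1]
    z2 = sum(a1[j] * W2[j] for j in range(len(W2))) + b2
    a2 = sigmoid(z2)
    return z1, a1, z2, a2

# Example weights from the XOR section of this guide
W1 = [[0.2, -0.3, 0.1], [0.4, 0.25, -0.2]]
b1 = [0.0, 0.0, 0.0]
W2 = [0.3, -0.1, 0.2]
b2 = 0.0

z1, a1, z2, a2 = forward([0, 0], W1, b1, W2, b2)
print(z1, a1, z2, a2)  # for input (0, 0): every a1 is 0.5, z2 is about 0.2, a2 is about 0.5498
```

If the first data row of the Forward sheet shows different values with these weights, the most likely cause is a transposed range or a relative reference that shifted when the formula was filled down.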
- Loss (MSE)
- On Forward or Train sheet compute MSE over all training examples:
- =AVERAGE(Forward!SqError_range)
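The AVERAGE over the SqError column corresponds to the following small Python helper. The name `mse` is illustrative; the check uses the four XOR targets with a constant prediction of 0.5, which is roughly what an untrained sigmoid network outputs.

```python
def mse(targets, predictions):
    """Mean of squared errors, the same quantity as AVERAGE over the SqError column."""
    errors = [y - p for y, p in zip(targets, predictions)]
    return sum(e * e for e in errors) / len(errors)

print(mse([0, 1, 1, 0], [0.5, 0.5, 0.5, 0.5]))  # 0.25
```

An MSE near 0.25 on XOR therefore means the network has not yet learned anything beyond predicting 0.5 for every row.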
- Backpropagation: compute gradients row-wise and average
- For each example compute gradients for each weight and bias; then average across batch (full-batch):
- For output layer:
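- dA2 = A2 - y (derivative of the per-example loss ½·(y - A2)² w.r.t. the activation; for plain squared error without the ½ factor the derivative is 2·(A2 - y), which has the same effect as doubling the learning rate)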
- dZ2 = dA2 * sigmoid'(Z2) where sigmoid'(z)=sigmoid(z)*(1-sigmoid(z)) → dZ2 = dA2 * A2 * (1-A2)
- dW2_j = A1_j * dZ2
- db2 = dZ2
- For hidden layer:
- dA1_j = dZ2 * W2_j (propagate error back)
- dZ1_j = dA1_j * A1_j * (1-A1_j)
- dW1_kj = X_k * dZ1_j
- db1_j = dZ1_j
- Implementation tips in Excel:
- In sheet "Grad", mirror Forward rows and compute dZ2, dW2, db2, dZ1s, dW1s per example using cell formulas and SUMPRODUCT.
- Example formulas (adjust ranges):
- dA2_cell = Forward!A2_output - Forward!Y
- dZ2_cell = dA2_cell * Forward!A2 * (1-Forward!A2)
- dW2_range_for_example = Forward!A1_range * dZ2_cell (produces 3 values)
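- dZ1_range = dZ2_cell * TRANSPOSE(Weights!F2:F4) * Forward!A1_range * (1-Forward!A1_range), computed element-wise. W2 is stored as a 3×1 column, so it is transposed to line up with the 1×3 A1 row; without the TRANSPOSE, dynamic-array Excel would broadcast the product to a 3×3 block.
- dW1_matrix = X_vector (2×1) * dZ1_row (1×3) → use outer product to get 2×3 matrix:
- If X_vector is stored as a 1×2 row (as in the Data sheet), use: =MMULT(TRANSPOSE(X_vector), dZ1_row) to produce a 2×3 array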
- Use array-aware functions or copy formulas per column if dynamic arrays unavailable.
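The per-example gradient formulas above can be checked against a plain-Python version. This sketch follows the document's convention dA2 = A2 - y, which corresponds to the half squared error ½·(A2 - y)²; the function names and the sample weights (taken from the XOR example in this guide) are used purely for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def half_sq_error(x, y, W1, b1, W2, b2):
    """Per-example loss 0.5 * (A2 - y)^2, whose derivative w.r.t. A2 is A2 - y."""
    a1 = [sigmoid(sum(x[k] * W1[k][j] for k in range(len(x))) + b1[j]) for j in range(len(b1))]
    a2 = sigmoid(sum(a1[j] * W2[j] for j in range(len(W2))) + b2)
    return 0.5 * (a2 - y) ** 2

def gradients(x, y, W1, b1, W2, b2):
    """Backprop for one example; returns (dW1, db1, dW2, db2) shaped like the weights."""
    # Forward pass
    a1 = [sigmoid(sum(x[k] * W1[k][j] for k in range(len(x))) + b1[j]) for j in range(len(b1))]
    a2 = sigmoid(sum(a1[j] * W2[j] for j in range(len(W2))) + b2)
    # Output layer: dZ2 = dA2 * A2 * (1 - A2)
    dZ2 = (a2 - y) * a2 * (1 - a2)
    dW2 = [a1[j] * dZ2 for j in range(len(W2))]
    db2 = dZ2
    # Hidden layer: dZ1_j = dZ2 * W2_j * A1_j * (1 - A1_j)
    dZ1 = [dZ2 * W2[j] * a1[j] * (1 - a1[j]) for j in range(len(W2))]
    # Outer product of the input vector and dZ1 gives the 2x3 gradient for W1
    dW1 = [[x[k] * dZ1[j] for j in range(len(dZ1))] for k in range(len(x))]
    db1 = list(dZ1)
    return dW1, db1, dW2, db2

W1 = [[0.2, -0.3, 0.1], [0.4, 0.25, -0.2]]
b1 = [0.0, 0.0, 0.0]
W2 = [0.3, -0.1, 0.2]
b2 = 0.0
dW1, db1, dW2, db2 = gradients([1, 1], 0, W1, b1, W2, b2)
print(dW1, db1, dW2, db2)
```

A useful sanity check in either tool is a finite difference: nudge one weight by a tiny amount, recompute the loss, and compare the change in loss divided by the nudge with the corresponding gradient cell.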
- Aggregate gradients and update weights (batch)
- Average gradients across all examples (use AVERAGE across rows for each weight cell).
- Update rule: W_new = W_old - learning_rate * grad_W_avg
- In Weights sheet, create adjacent cells for Updated_W1, Updated_b1, Updated_W2, Updated_b2 with formulas:
- =W1_cell - Params!B4 * Grad!Avg_dW1_cell
- =b1_cell - Params!B4 * Grad!Avg_db1_cell
- To perform iterative training, you have two approaches:
A) Manual update via copy–paste
- After computing Updated_W* formulas, copy Updated_* cells and Paste Values back into W* cells to apply update, then recalc until epochs completed.
B) Use a macro (VBA) to loop and apply updates automatically (recommended for many epochs).
- Minimal VBA outline:
    Sub TrainNN()
        Dim epochs As Long: epochs = Sheets("Params").Range("B5").Value
        Dim lr As Double: lr = Sheets("Params").Range("B4").Value
        Dim i As Long
        For i = 1 To epochs
            Sheets("Forward").Calculate
            Sheets("Grad").Calculate
            ' read averages from Grad sheet, update Weights values
            ' example: W1(1,1) = W1(1,1) - lr * GradAvg_W1(1,1)
            ' update all weight cells similarly
        Next i
    End Sub
- Use Application.Calculate or CalculateFull if needed. Save before running macros.
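For comparison with the macro outline, here is a plain-Python sketch of the same full-batch loop: compute per-example gradients, average them, apply W_new = W_old - learning_rate * grad_W_avg, and record the MSE each epoch. The function name `train`, the variable names, and the use of the XOR example weights are illustrative assumptions; gradients use the document's dA2 = A2 - y convention.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, Y, W1, b1, W2, b2, lr=0.5, epochs=1000):
    """Full-batch gradient descent. Returns updated parameters and the MSE before each update."""
    n_in, n_hid, n = len(W1), len(W2), len(X)
    W1 = [row[:] for row in W1]  # work on copies so the inputs stay unchanged
    b1, W2 = b1[:], W2[:]
    history = []
    for _ in range(epochs):
        gW1 = [[0.0] * n_hid for _ in range(n_in)]
        gb1 = [0.0] * n_hid
        gW2 = [0.0] * n_hid
        gb2 = 0.0
        sq_err = 0.0
        for x, y in zip(X, Y):
            # Forward pass for one row
            a1 = [sigmoid(sum(x[k] * W1[k][j] for k in range(n_in)) + b1[j]) for j in range(n_hid)]
            a2 = sigmoid(sum(a1[j] * W2[j] for j in range(n_hid)) + b2)
            sq_err += (y - a2) ** 2
            # Backward pass, accumulating the batch average
            dZ2 = (a2 - y) * a2 * (1 - a2)
            gb2 += dZ2 / n
            for j in range(n_hid):
                dZ1 = dZ2 * W2[j] * a1[j] * (1 - a1[j])
                gW2[j] += a1[j] * dZ2 / n
                gb1[j] += dZ1 / n
                for k in range(n_in):
                    gW1[k][j] += x[k] * dZ1 / n
        history.append(sq_err / n)
        # Apply the averaged gradients only after the whole batch has been processed
        for j in range(n_hid):
            W2[j] -= lr * gW2[j]
            b1[j] -= lr * gb1[j]
            for k in range(n_in):
                W1[k][j] -= lr * gW1[k][j]
        b2 -= lr * gb2
    return W1, b1, W2, b2, history

X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [0, 1, 1, 0]
W1_0 = [[0.2, -0.3, 0.1], [0.4, 0.25, -0.2]]
b1_0 = [0.0, 0.0, 0.0]
W2_0 = [0.3, -0.1, 0.2]
W1, b1, W2, b2, history = train(X, Y, W1_0, b1_0, W2_0, 0.0, lr=0.5, epochs=1000)
print(history[0], history[-1])
```

The key detail mirrored from the spreadsheet is that every weight is updated from gradients computed with the old weights; in VBA this means reading all Updated_* values before writing any of them back.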
- Debugging and verification
- Start with a moderate learning rate (0.1–1.0 works for simple tasks such as XOR) and a few epochs to confirm that the loss is decreasing.
- Print loss each epoch in Train sheet: have a cell that records iteration number and MSE (use VBA to append).
- Test on simple problems (XOR requires nonlinearity and hidden layer — good test).
- If loss diverges: reduce learning rate, initialize weights smaller, or use normalization on inputs.
- Extensions and improvements
- Use tanh activation by replacing sigmoid formulas (tanh(z)=TANH(z), derivative = 1 - tanh(z)^2).
- Implement momentum: maintain velocity matrices and update weights with v = mu*v - lr*grad; W += v.
- Implement mini-batch: compute averages per batch subset.
- Add regularization: L2 by subtracting lr * lambda * W during update.
- Use Excel charts to plot loss vs iterations (Train sheet).
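The momentum and L2 items above can be combined into one update rule, sketched here in Python for a flat list of weights. The function name and the default values for `mu` and `lam` are assumptions for illustration; with mu = 0 the rule reduces to the plain update with lr * lambda * W subtracted.

```python
def momentum_step(w, v, grad, lr=0.5, mu=0.9, lam=0.0):
    """One update: v = mu*v - lr*(grad + lam*w); w = w + v.

    w, v and grad are equal-length lists; lam is the L2 coefficient (0 disables it).
    """
    new_v = [mu * vi - lr * (gi + lam * wi) for wi, vi, gi in zip(w, v, grad)]
    new_w = [wi + vi for wi, vi in zip(w, new_v)]
    return new_w, new_v

w, v = momentum_step([1.0], [0.0], [0.2], lr=0.5, mu=0.0, lam=0.0)
print(w, v)  # plain gradient step: w is about 0.9, v is about -0.1
```

In the workbook, the velocity values need their own block of cells next to each weight block, and they must be pasted back as values each iteration just like the weights.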
- Example workbook hints (practical)
- Keep consistent named ranges: Inputs = Data!A2:B5, Targets = Data!C2:C5, W1 = Weights!B2:D3, b1 = Weights!B4:D4, W2 = Weights!F2:F4, b2 = Weights!F5.
- Use helper columns to keep formulas clear (Z1_1, A1_1, …).
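- When using MMULT, ensure the dimensions match (columns of the first array = rows of the second) and that every cell in both ranges is numeric; MMULT returns #VALUE! if the shapes disagree or a cell is empty or contains text.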
- Minimal numeric example (XOR)
- Use the XOR rows given in the Data sheet step above.
- Initialize W1 small randoms like:
- W1 = [[0.2, -0.3, 0.1],[0.4, 0.25, -0.2]], b1 = [0,0,0]
- W2 = [0.3, -0.1, 0.2], b2 = 0
- Run a few hundred epochs with lr=0.5 — you should see MSE reduce and outputs approach targets.
- Saving and reproducibility
- After training, paste values of weight cells to freeze final model.
- Save workbook; include a sheet that documents final weights and a test table to validate predictions.
The Evolution of Neural Networks in Microsoft Excel
For years, building a neural network in Microsoft Excel was considered a "brute force" academic exercise: a way to visualize backpropagation using complex macros and thousands of manually linked cells. However, with the introduction of modern features like Dynamic Arrays, LAMBDA functions, and Python in Excel, the platform has transformed from a static grid into a Turing-complete environment capable of sophisticated machine learning.
The "New" Building Blocks
The modern approach to Excel-based AI leverages several key updates that eliminate the need for traditional VBA macros:
- LAMBDA and helper functions: Functions like MAP, REDUCE, and SCAN allow you to encapsulate the complex math of a neuron (weights, biases, and activation functions) into a single, reusable formula.
- Dynamic Arrays: Instead of copying formulas down thousands of rows, a single formula can now "spill" an entire layer of calculations across the grid, making the architecture of a Multi-Layer Perceptron (MLP) much easier to manage.
- Python in Excel: By enabling Python directly within a cell, users can now import Python libraries to handle the heavy matrix multiplication required for deep learning without leaving the spreadsheet.
Building the Architecture
Constructing a modern neural network in Excel follows a streamlined five-step process:
- Initialize Parameters: Generate initial weights and biases for each layer.
- Forward Propagation: Employ the MMULT function for matrix multiplication, combined with an activation function (like Sigmoid or ReLU).
- Calculate Loss: Use standard formulas to determine the error between the network's prediction and the actual training data.
- Backpropagation: While more complex, this involves calculating the gradient of the loss with respect to each weight. In modern Excel, this can be automated or visualized through iterative cell updates.
- Optimization: The Excel Solver add-in can act as your optimizer (in a role similar to SGD or Adam), automatically adjusting weights to minimize the error.
Why Use Excel for AI?
Part 4: The "New" Training Loop (Manual Iteration)
In Python, you loop 10,000 times. In Excel, you traditionally needed VBA. With the "new" Excel, we use iterative calculation on circular references (enabled manually under File → Options → Formulas → Enable iterative calculation) or a simple Data Table.
We will use the iterative method as it is the most "new Excel" way to simulate a loop.
The Three Lessons I Learned
1. Backprop is just addition and multiplication.
Excel has no autograd. Writing dLoss/dW = (Pred - True) * Input manually makes you realize that deep learning is simply weighted averages with memory.
2. Local minima are visible. In Python, loss curves are abstract plots. In Excel, you watch the "Loss" cell bounce up and down as you tap F9. You can see the model get stuck. You can see it escape.
3. Excel is the ultimate low-code ML platform. For a business analyst who cannot install Python, a simple logistic regression (1-neuron network) in Excel is incredibly powerful. Adding a hidden layer is overkill, but it proves that the barrier to AI is no longer code—it is understanding.
Step 2.1: Hidden Layer Linear Sum (Z1)
In cell F6 (using dynamic array multiplication MMULT):
=MMULT(Input, W1) + B1
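Result: one value per hidden neuron for each input row. Multiplying an n×2 Input by a 2×h W1 gives an n×h array, so a single input row yields a 1×h row of sums. The MMULT function is the native matrix multiplier.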
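The shape rule behind =MMULT(Input, W1) + B1 can be illustrated with a plain-Python analogue. This is a sketch under assumptions: `mmult` and `hidden_sums` are illustrative names, the bias is treated as a row added to every output row, and the numbers are the 2×2 W1 and two b1 values from the layout table in this section.

```python
def mmult(A, B):
    """Plain-Python analogue of MMULT: (n x m) times (m x p) gives (n x p)."""
    if len(A[0]) != len(B):
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def hidden_sums(inputs, W1, b1):
    """Z1 = inputs x W1 + b1, with the bias row added to every row of the product."""
    return [[z + b for z, b in zip(row, b1)] for row in mmult(inputs, W1)]

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]  # the four XOR rows (4 x 2)
W1 = [[0.5, -0.6], [0.7, 0.2]]             # 2 inputs x 2 hidden units
b1 = [0.1, -0.2]
Z1 = hidden_sums(inputs, W1, b1)
print(len(Z1), len(Z1[0]))  # 4 2
print(Z1[3])                # row for input (1, 1): about [1.3, -0.6]
```

With all four XOR rows as Input the result is 4×2; with a single row it is 1×2.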
Step 4: The "Learning" (Backpropagation)
This is where the magic happens. Standard Excel doesn't "learn" automatically; we must calculate the gradients (how much to change the weights) using formulas.
- Calculate Error: Target - Prediction.
- Calculate Derivatives: Use the chain rule to find out how much each weight contributed to the error.
- Output Gradient: Prediction * (1 - Prediction) * Error
- Hidden Gradient: Complex chain rule application involving output weights and hidden activation derivatives.
Error at hidden layer
delta_hidden = MMULT(delta_output, TRANSPOSE(W2)) * HiddenActivation * (1 - HiddenActivation)
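The delta_hidden formula can be written out element by element in Python to make the shapes explicit. This is a sketch with illustrative names and made-up numbers. Note that with Error defined as Target - Prediction, these deltas have the opposite sign to a loss gradient, so the matching weight update adds learning_rate × delta × activation rather than subtracting it.

```python
def delta_hidden(delta_output, W2, hidden):
    """Element-wise form of MMULT(delta_output, TRANSPOSE(W2)) * H * (1 - H).

    delta_output: one value per training row (n x 1)
    W2:           one weight per hidden unit (h x 1)
    hidden:       hidden activations, one row per training row (n x h)
    """
    return [[d * W2[j] * h[j] * (1 - h[j]) for j in range(len(W2))]
            for d, h in zip(delta_output, hidden)]

# Hypothetical numbers: one training row, two hidden units
print(delta_hidden([0.1], [0.4, -0.3], [[0.5, 0.5]]))  # about [[0.01, -0.0075]]
```

Each row of the result lines up with one training row, and each column with one hidden unit, which is the same n×h shape as the hidden activations.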
Layout visually:
|   | A | B | C | D | E | F | G | H | I | J | K | L | M |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 |   | A | B | Y |   | W1 |   |   | b1 |   | W2 |   | b2 |
| 2 |   |   |   |   |   | col1 | col2 |   |   |   |   |   |   |
| 3 |   | 0 | 0 | 0 |   | 0.5 | -0.6 |   | 0.1 |   | 0.4 |   | 0.2 |
| 4 |   | 0 | 1 | 1 |   | 0.7 | 0.2 |   | -0.2 |   | -0.3 |   |   |
| 5 |   | 1 | 0 | 1 |   |   |   |   |   |   |   |   |   |
| 6 |   | 1 | 1 | 0 |   |   |   |   |   |   |   |   |   |
(Initial weights are small random numbers – you can type your own.)
Part 3: The Magic Trick – Backpropagation (The Learning)
This is where the "new" Excel shines. Backpropagation requires calculating the derivative of the error with respect to every weight. We do this using matrix calculus.