Challenges and Solutions

An analysis of the technical obstacles encountered while developing HERMES OPTIMUS and the approaches used to resolve them.

Memory Constraints

The TI-84 Plus Silver Edition offers only 24KB of RAM, an exceptionally constrained environment for a neural network. The initial approach embedded all weights and biases directly in the program code, which caused memory overflow errors because of the sheer number of parameters.

Challenge

A 4-60-12 neural network requires storing 240 weights for the first layer, 720 weights for the second layer, and 72 bias values—a total of 1,032 floating-point values.
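These totals can be verified with a few lines of Python:

```python
# Parameter count for the 4-60-12 architecture (input, hidden, output).
layers = [4, 60, 12]

weights = sum(a * b for a, b in zip(layers, layers[1:]))  # 4*60 + 60*12
biases = sum(layers[1:])                                  # 60 + 12
print(weights, biases, weights + biases)  # 960 72 1032
```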

Solution

Instead of hardcoding the weights in the program, I converted them to external .8xm (matrix) and .8xl (list) files that could be loaded at runtime. This reduced the program size by approximately 80%.

Training Limitations

The initial plan was to train the neural network directly on the calculator using standard backpropagation. This ran into multiple limitations, including prohibitively slow execution and variable overflow errors that prevented training from converging.

Challenge

Training on the calculator was impractically slow, taking approximately an hour for 25 epochs. Additionally, the backpropagation algorithm caused variable overflow errors due to the limited precision of TI-BASIC's floating-point implementation.

Solution

I developed a Python script to train the network externally using NumPy for efficient matrix operations. The script implemented the same architecture and activation functions as the calculator version, ensuring compatibility. The full 12-word dictionary required approximately 500,000 epochs to achieve optimal accuracy, while the earlier 4-word dictionary needed only 50,000 epochs.
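A minimal sketch of such a training script, assuming full-batch gradient descent with mean squared error (the actual loss function, learning rate, and initialization are not specified in the text):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

# 4-60-12 architecture, matching the calculator version.
W1 = rng.normal(0, 0.5, (4, 60));  b1 = np.zeros(60)
W2 = rng.normal(0, 0.5, (60, 12)); b2 = np.zeros(12)

def train_step(X, Y, lr=0.5):
    """One full-batch backpropagation step (mean squared error)."""
    global W1, b1, W2, b2
    H = sigmoid(X @ W1 + b1)            # hidden activations
    O = sigmoid(H @ W2 + b2)            # output activations
    dO = (O - Y) * O * (1 - O)          # output-layer delta (sigmoid' = a*(1-a))
    dH = (dO @ W2.T) * H * (1 - H)      # hidden-layer delta
    W2 -= lr * H.T @ dO; b2 -= lr * dO.sum(axis=0)
    W1 -= lr * X.T @ dH; b1 -= lr * dH.sum(axis=0)
    return float(np.mean((O - Y) ** 2))
```

NumPy's vectorized matrix products make each epoch orders of magnitude faster than the interpreted TI-BASIC loops, which is what made hundreds of thousands of epochs feasible.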

Weight Transfer Process

After training the network in Python, the optimized weights had to be transferred to the calculator. The first approach, manually transcribing weights through TI-BASIC commands, proved both inefficient and error-prone.

Challenge

The original plan was to generate TI-BASIC code with commands like "0.123→[I](1,1)" for each weight, but this approach was impractical for over 1,000 weights and resulted in a program that was too large to run on the calculator.
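The scale of the problem can be illustrated with a short Python sketch that generates one store command per second-layer weight (the matrix name [J] is illustrative, not the program's actual variable):

```python
# Generating one TI-BASIC store command per weight of the 60x12
# second-layer matrix shows why this approach could not scale.
import numpy as np

W2 = np.random.default_rng(0).normal(size=(60, 12))
lines = [f"{W2[i, j]:.5f}→[J]({i+1},{j+1})"
         for i in range(60) for j in range(12)]
print(len(lines))  # 720 commands for the second layer alone
```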

Solution

I discovered that I could convert the weights to CSV files and then use Cemetech tools to convert these CSV files to .8xm and .8xl files that could be directly loaded onto the calculator. This eliminated the need for TI-BASIC commands to initialize the weights.
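A sketch of the export step, assuming the weights are rounded and written as plain CSV; the filename W1.csv and the five-decimal rounding are illustrative, and the CSV-to-.8xm conversion itself is handled by the Cemetech tools, not shown here:

```python
import csv
import numpy as np

def export_matrix_csv(W, path):
    """Write a weight matrix as plain CSV, one row per line,
    ready for conversion to a .8xm file with external tools."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(np.round(W, 5).tolist())

# "W1.csv" is an illustrative name for the first-layer weight file.
export_matrix_csv(np.random.default_rng(0).normal(size=(4, 60)), "W1.csv")
```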

Numerical Stability

The sigmoid activation function, while mathematically appropriate for the classification task, can produce values at the extremes of the computational range, causing overflow or underflow errors on calculator hardware with limited numerical precision.

Challenge

When calculating sigmoid(x) = 1/(1+e^(-x)), large negative or positive values of x would cause errors on the calculator due to its limited numerical range.

Solution

I implemented safeguards that clip extreme values before applying the sigmoid function:

If L₂(I)>10:Then:0.9999→L₂(I):End

If L₂(I)<⁻10:Then:0.0001→L₂(I):End

If abs(L₂(I))≤10:Then:1/(1+e^(⁻L₂(I)))→L₂(I):End
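The same safeguard can be expressed in Python for testing outside the calculator, using the saturation thresholds from the TI-BASIC code:

```python
import math

def safe_sigmoid(x):
    """Sigmoid clipped at |x| = 10: saturate to near-0 or near-1
    instead of risking overflow in exp() on limited-precision hardware."""
    if x > 10:
        return 0.9999
    if x < -10:
        return 0.0001
    return 1 / (1 + math.exp(-x))
```

Since sigmoid(10) ≈ 0.99995, clipping at ±10 changes the output by less than 10⁻⁴ while guaranteeing the exponential never overflows.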

Performance Optimization

Expanding the dictionary beyond the legacy version of HERMES introduced significant performance challenges. The original implementation, with only four words (each four letters long), processed inputs substantially faster.

Challenge

The increased dictionary size in the current version of HERMES OPTIMUS resulted in longer processing times, increasing from approximately 7 seconds with the smaller dictionary to about 20 seconds with the expanded vocabulary. This performance degradation significantly impacted the system's practical utility.

Hardware Limitation

This performance constraint is fundamentally tied to the Zilog Z80 processor's 15 MHz clock speed. Despite code optimizations such as minimizing variable reassignments, reducing conditional checks, and using more efficient loop structures, processing speed remains limited by the calculator's hardware. The trade-off between dictionary size and processing speed is an unavoidable consequence of running a neural network on such constrained hardware, which makes getting the system to function at all a significant achievement.

Final Solution

The final implementation of HERMES OPTIMUS addresses all of the challenges above, yielding a fully operational neural network that runs within the constraints of the TI-84 Plus Silver Edition hardware.

The critical technical innovations that facilitated this achievement include:

  • External training in Python with a compatible architecture
  • Converting weights to .8xm and .8xl files for efficient loading
  • Implementing numerical safeguards to prevent overflow errors
  • Optimizing the forward pass for speed and memory efficiency
  • Designing a user-friendly interface within the calculator's constraints

Together, these solutions reduced the program's memory footprint by approximately 80%, enabling a neural network to run on a device with only 24KB of RAM using a programming language developed in the 1980s—a significant achievement in resource-constrained machine learning.
