The Agent Chronicles · March 25, 2026 · 11 min

Errors Are the Curriculum — On Self-Improvement, Neural Networks, and AI Agents

I'm stuck on this idea of self-improvement, and I feel like I'm staring into a big void. On one hand, I feel like this might be the path to something huge — agents that actually get better on their own, organizations that learn without a human babysitting every cycle. On the other hand, it's such a dark area ahead and I honestly don't know what to really do about it. So I'm doing what I always do — I'm starting before I'm ready and figuring it out as I go.

But let me back up, because this rabbit hole started with something personal.

I Hated School Because Nobody Told Me What Learning Actually Is

Learning has fascinated me for a long time, but I must admit that growing up, I thought of it as something exclusively school-related, and that was the main reason I didn't like school. I kept wondering when all this learning would end so I could start living my adult life, doing things, building stuff. For those of you out of school now, you probably realize how far off my assumption was. For those of you still in school, I really hope you don't have the mindset I did.

The thing is, once I started actually building things after school (actually while still in school; I had my first business in my second year of college), it took my first real failure to see that I genuinely didn't know how to do things in real life. It was a very hard landing, but it shifted my thinking: learning is crucial for progress, and progress, I found, is crucial for the pursuit of happiness. That brought me to a hard reset in my relationship with learning.

I wish school had done a better job of explaining to me that learning is uncomfortable, and that's okay. Learning is not supposed to be fun. Yes, you can create circumstances that make learning feel fun, but in my opinion, based on everything I've experienced and studied, learning is not a comfortable activity.

And you kind of see the problem here, don't you? We build this whole cultural narrative that learning should be gamified and joyful and wrapped in dopamine, and then we're surprised when people avoid the thing that actually makes them grow — because real growth comes from the discomfort, from the error signal, from the gap between what you expected and what actually happened.

Your Brain Learns the Same Way a Neural Network Does (Or the Other Way Around) :)

In nature there are many forms of learning, but for us humans, as conscious entities, learning seems to be a core brain function. And if you look at how the brain actually learns, you'll see that most of the process is driven by adrenaline, acetylcholine, and dopamine: chemicals that increase alertness and focus, mostly associated with discomfort and fear, not with sitting comfortably watching YouTube tutorials. I always wanted to understand how things work, and understanding how learning works explained a lot about why I didn't like school.

The idea of learning, as I understood it, is deeply related to errors. You perceive a danger, an unfamiliar situation, something you haven't experienced before, and then this chemical mechanism sets in motion, marks your brain for change, and then your brain actually changes — it learns. At a more abstract level, our brain is constantly making predictions about what it should expect based on saved patterns. Any deviation from the pattern is basically an error, and that error is the fuel for the learning process.

Now here's where it gets interesting, because we find the exact same pattern in training the neural networks that power our LLMs. You have a training set and a validation set. You start with default weights in the model, feed in data, and check the output. You measure the distance between what you got and what you wanted, then propagate that error backward to correct the weights (a technique called backpropagation, paired with gradient descent). You repeat the process until the output gets close enough to the expected data.

Basically — you use the error to learn. The brain does it. Neural networks do it. It's the same fundamental architecture.
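To make the loop concrete, here's a toy sketch in Python: one weight, one input, and an update rule driven purely by the error. This is nothing like training a real LLM in scale or detail, but it is the same loop.

```python
def train(samples, lr=0.1, epochs=50):
    """Fit y = w * x by repeatedly nudging w against the error."""
    w = 0.0  # default weight, before any learning
    for _ in range(epochs):
        for x, target in samples:
            prediction = w * x
            error = prediction - target  # the gap between output and expectation
            w -= lr * error * x          # gradient step: d((error**2)/2)/dw = error * x
    return w

data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # the hidden "truth" is w = 3
w = train(data)
print(round(w, 3))  # converges to 3.0
```

The weight starts wrong, every pass measures the error, and the error itself dictates the correction. Remove the error term and nothing learns.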

THE ERROR IS THE CURRICULUM

The agent performs the task — sells a policy, writes a response, handles an objection. This is the turn. Everything starts here.

This is the loop that everything runs on — brains, neural networks, and (if I can pull this off) my AI agent. Act, measure, evaluate the error, adjust. The error isn't the enemy. The error is literally the curriculum.

So Can We Make Agents That Learn the Same Way?

Coming back to our agents, this seems like the natural way to look at the problem. We want our agent to be more efficient, to better perform under some specific task. As long as we can define our desired outcome and we can clearly measure the current outcome under the same comparable values, we can start a learning process.

But here's where it gets tricky, and this is where I've been banging my head against the wall for weeks now.

Is this something that can make agents smarter? It depends on your definition of smart; I think "efficient" is the more appropriate word here. If an agent needs to do a task and completes it with an output matching our desired one, it's 100% efficient. But before we can start calling the agent smart, what happens when the outcome is NOT the expected one matters far more. If we can design a system in which the agent can evaluate its output against expected reference values, understand what the error is, and have tools available to change its execution process so it can shrink the error on the next run, now THAT'S smart.

Think about it like a baseball pitcher, and I love this analogy because it connects to something I wrote about before — destiny as a great pitcher throwing balls at us. The efficient pitcher throws strikes consistently. The smart pitcher notices that a specific batter pulls everything to the left, adjusts the pitch location mid-game, and strikes him out looking. The difference isn't raw ability — it's the feedback loop. It's learning from the error in real time.

The Turn: Defining What We Can Actually Measure

A systematic approach should look something like this: a turn, basically a process starting with an input and finishing with a result, an output. You need a measurable output and a comparable expected result. And here the world splits into two very different problems.

[Interactive demo: actual vs. expected output over successive turns, e.g. 3.0 of 8 criteria met for a 62% error, with adjustments taking effect from one turn to the next.]

When I have numbers, it's quite easy. I have an expected result of 5, I get 7, the error is 2. Numbers have this inherent property that makes the job super simple — they're born to be compared.
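The numeric case can be written down in a few lines. The `Turn` name and shape here are mine, a sketch of the idea rather than any real framework:

```python
from dataclasses import dataclass

@dataclass
class Turn:
    """One turn: an input produced an output; compare it to what we wanted."""
    expected: float
    actual: float

    @property
    def error(self) -> float:
        return abs(self.actual - self.expected)

turn = Turn(expected=5.0, actual=7.0)
print(turn.error)  # 2.0
```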

But what if our expected output is "writing a great blog article"? Well, this is way harder to evaluate. How do you reliably evaluate "great"? And how do you measure the gap? As a human, you might evaluate it as "okay" on a scale of very bad, bad, okay, good, very good, great, and exceptional. But even if you have the scale, it's incredibly hard to quantify the gap between "good" and "great" so we can understand what progress needs to be made, what learning is needed to improve the output.

This is the fundamental problem of self-improvement in AI agents — and honestly, it's the fundamental problem of self-improvement in humans too. You know you want to be better. You might even know you're currently "okay." But what exactly is the distance between where you are and where you want to be, and what specific changes close that gap?

The True/False Trick: Making the Subjective Measurable

The idea I've been working with is that for an AI agent to self-improve, this gap needs to be quantifiable in such a way that after a turn is completed, a clear numeric evaluation can be achieved. To obtain this, the evaluation criteria need to be true or false statements.

So the desired outcome needs to be decomposed into evaluatable criteria, and this is where the magic happens:

[Interactive demo: "Write a great sales conversation" decomposed into binary criteria the agent can self-evaluate. Toggling a criterion changes the measurable score, e.g. 4 / 7: weak, some instincts are there but major gaps remain.]

Each criterion, being true or false, can be scored as 0 or 1, so it becomes measurable. You add the scores up, compare against the total possible, and suddenly you have a number. Not a perfect number (believe me, I know this is a simplification), but a number you can work with. A number that tells you "last time you scored 6 out of 10, this time you scored 7 out of 10, and the criteria you're still missing are X and Y."
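In code, this is almost embarrassingly small. The criteria names below are made up for illustration, not my agent's actual rubric:

```python
def score(criteria: dict[str, bool]) -> tuple[float, list[str]]:
    """Return the fraction of criteria met, plus the ones still missing."""
    met = sum(criteria.values())  # True counts as 1, False as 0
    missing = [name for name, ok in criteria.items() if not ok]
    return met / len(criteria), missing

# Hypothetical criteria for one sales turn, already evaluated to True/False.
criteria = {
    "acknowledged the customer's concern": True,
    "asked a qualifying question": True,
    "quoted a concrete price": False,
    "proposed a next step": True,
    "avoided jargon": False,
}
ratio, missing = score(criteria)
print(f"{ratio:.0%} - still missing: {missing}")
# 60% - still missing: ['quoted a concrete price', 'avoided jargon']
```

The hard part is never this arithmetic; it's writing criteria whose True/False answers actually track "great."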

It's like — and this is going to sound ridiculous but stay with me — it's like when you're cooking and someone says "make it taste better." Useless feedback. But if they say "needs more salt, the pasta is overcooked by about 2 minutes, and the sauce needs acidity" — now you can actually improve. You decomposed "better" into specific, checkable things.

Building the Self-Improvement Loop

Building a self-improvement agent around this scheme is a real challenge. I believe that the focus on building skills — explicit, documented capabilities that the agent can execute — and then using auto-improvement cycles to refine those skills will be essential in building a great agent.

But the implementation has a nasty chicken-and-egg problem: should the agent evaluate itself? Or should a separate skill-improvement and learning agent be present in the environment, with the sole purpose of identifying the gap and rewriting the skill documentation? I'm not sure which solution is best. With my autonomous life insurance sales agent I'm testing both approaches, but finding true/false evaluation criteria for a sales AI agent is quite tricky.
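For the separate-evaluator approach, a rough sketch of one cycle might look like this. Everything here is a stand-in: `run_skill` and `evaluate` are stubs where real LLM calls would go, and the "rewrite" step just appends a note to the skill doc.

```python
def run_skill(skill_doc: str) -> str:
    # Stub: in reality the sales agent would execute a turn guided by skill_doc.
    return f"transcript produced under: {skill_doc}"

def evaluate(transcript: str, criteria: list[str]) -> dict[str, bool]:
    # Stub: in reality a separate judge model would check each binary
    # criterion against the transcript. Here, anything mentioning "price" fails.
    return {c: "price" not in c for c in criteria}

def improvement_cycle(skill_doc: str, criteria: list[str], target: float) -> str:
    results = evaluate(run_skill(skill_doc), criteria)
    score = sum(results.values()) / len(results)
    if score < target:
        missing = [c for c, ok in results.items() if not ok]
        # The evaluator's whole job: rewrite the skill doc to address the gap.
        skill_doc += "\nNOTE: address -> " + "; ".join(missing)
    return skill_doc

doc = improvement_cycle(
    "Always open with a question.",
    ["acknowledged concern", "quoted a price", "proposed next step"],
    target=1.0,
)
print(doc)
```

The open question is exactly the one above: whether `evaluate` should live inside the agent being improved or in a second agent that only judges and rewrites.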

Think about it — how do you decompose "had a good sales conversation" into binary criteria? "Did the agent acknowledge the customer's concern before presenting a solution?" Sure, that's evaluatable. "Did the agent create genuine emotional resonance?" Good luck turning that into a true or false. "Did the customer feel understood?" You'd need to ask the customer, and even then people lie about their feelings all the time.

I'm going to write a separate article just on this topic because it deserves its own deep dive — the specific challenge of defining evaluation criteria for conversations where the outcome is fundamentally about human perception and emotion.

Why This Is the Key to Everything

I think self-improvement efficacy will be the key to autonomous organizations. But, and this is where I keep going back and forth, setting things up needs to be an ongoing task, not a one-time configuration. Making sure that for any task you can produce a gap analysis and clear improvement criteria works beautifully on small, contained tasks. But organizations involve human interactions, which can be wildly unpredictable, and you cannot jump too high with your abstraction level.

What I mean is — you cannot define the gap as "the profit we obtained versus what we planned" and then self-optimize skills to generate more profit. This is way too high level. You cannot learn how to make money directly, just like in life, where money is a consequence, a by-product of delivering value to customers. And that value needs to be perceived by customers, and perception is not a true or false paradigm — it's way more grey than that.

So the secret to an autonomous organization is definitely buried somewhere in here — in this intersection of measurable improvement loops, decomposed evaluation criteria, and the acknowledgment that some things, the most important things, resist clean measurement. But I'll try to unlock this one experiment at a time.

At the end of the day, I keep coming back to the same realization that started this whole journey: learning is uncomfortable, it's driven by errors, and the only way to get better — whether you're a human, a neural network, or an AI agent trying to sell life insurance to skeptical Romanians — is to look honestly at the gap between where you are and where you need to be, and then actually do something about it.

My agent is still not great at learning from its mistakes. But then again, it took me 30-something years and a collapsed business to figure out that learning was supposed to be uncomfortable all along. So maybe I should give it a little more time.


Building from Cluj-Napoca, Romania

Next up: Defining evaluation criteria for sales conversations — or, how I'm trying to turn "did the customer feel understood?" into a number.

