Your Own AI on the Cheap


Harnesses are all the rage in AI at the moment. A harness is the environment in which you interact with a large language model (LLM). Some examples you may have heard of are Claude Code, OpenAI Codex, OpenCode, and Claude Cowork, but there are many others. Some are focused on developers and writing code, and some (like Claude Cowork) are focused more on office tasks: making documents, posting online, and so on. There are others for designers, CAD, 3D modeling, and more.


你的代币怎么样? (How Are Your Tokens?)


I’ve been having fun playing with coding agents, and I’ve been experimenting with different ways to enhance them at inference time, mostly skills that call tools. Unfortunately, my experiments need to run within the context of a coding session, which gets very costly very quickly.
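The “skills calling tools” idea can be sketched as a tiny dispatch step. Everything here (the tool name, the `calculator` function, the request shape) is hypothetical, purely to illustrate the pattern of a harness routing a model’s structured request to a tool and returning the result:

```python
# Minimal sketch of a tool-dispatch step in a harness (all names hypothetical).
# The model emits a structured request like {"tool": "calculator", "input": "2 + 3"};
# the harness looks the tool up, runs it, and feeds the result back to the model.

def calculator(expression: str) -> str:
    """Toy tool: evaluate a simple arithmetic expression with builtins disabled."""
    return str(eval(expression, {"__builtins__": {}}, {}))

TOOLS = {"calculator": calculator}

def dispatch(request: dict) -> str:
    """Route a tool request to its implementation, or report an unknown tool."""
    tool = TOOLS.get(request["tool"])
    if tool is None:
        return f"unknown tool: {request['tool']}"
    return tool(request["input"])

print(dispatch({"tool": "calculator", "input": "2 + 3"}))  # -> 5
```

A real harness adds schemas, sandboxing, and a feedback loop into the model’s context, but the core is just this lookup-and-call step.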

To get to a state where I can iterate, I decided to bite the bullet and set up a local lab. However, like most people, I am GPU poor. The only hardware I have is an old gaming PC with an Nvidia GTX 1660 Ti with 6GB of VRAM. That isn’t enough to run much of anything, but I am no stranger to constraints.


Basis - A Lisp Neural Network


Claude Code is absolutely insane. As a builder of software and crazy ideas, I’m not sure how else to express how incredibly addictive this tool is.

However, it could very well be that I have fallen down one of the oft-described rabbit holes. Or maybe I have succumbed to AI psychosis. At this point I am not sure. Like the man who thought he had discovered a new form of mathematics because ChatGPT convinced him, perhaps I am falling into the same trap. Somebody please let me know if I’ve gone off the rails.


The Unreasonable Effectiveness of Claude Code


As I mentioned in my last post, I very much enjoy writing code “by hand”. It is the only thing I have consistently dedicated the last 30+ years of my life to. In fact, I have a hard time understanding what software engineering / building software products would even be if one didn’t have a solid grasp of, let alone enjoy, writing code.

However, the reality is that the point of building software isn’t to build software. The point is to make a product or a business. It’s to accelerate some real-world process or enable some kind of commerce. Engineers often get hung up on linters, syntax formatting, some made-up abstraction, or the big-O time complexity of some batch process that runs all night. While those things have a place, the end goal of software is to increase shareholder value and make the company money. It’s not my favourite thing to think about, but it’s the truth.


We'll All Vibe Code; It'll Be Anarchy


The past few years I’ve been busy learning the fundamentals of machine learning and building my own custom models. Small models, of course: spaCy models, some small CNN models, a few hand-rolled ones, and some neat little ML pipelines using pre-built speech-to-text and OpenCV bits and bobs. Most of the things I have been building are specific, proprietary, and sensitive, so I haven’t really tried “vibe coding”. I’ve used LLMs to summarize topics, do topic searches and what have you, but I hadn’t really tried letting an LLM write code “for me”.


Better At Research & The ARC Prize


Getting Better At Research

Last semester I took a course on how to properly write a research paper. The course was extremely useful, as I believe my academic writing has, in the past, been quite lacking (and I refuse to use LLMs to “fix” my papers). There were many, many insights I took from the course, but one of the best came in the last few weeks: “You should keep a daily journal of your research project.”


AI Hallucinations


Introduction

I very much enjoy building machine learning / deep learning / AI systems. I went back to school to understand as much as I could about the subject. And while I am nowhere near the level of Yann LeCun, Geoffrey Hinton, or Andrew Ng, having built a few systems, I do have a pretty decent high-level understanding of how these systems work. Well, insofar as it can be taught, I suppose; some of the behaviours are quite unexpected.


Emergent Behavior


Introduction

I’ve been mulling this over for a few days. I think emergent behaviour in deep neural networks isn’t surprising when you think about it. I also think trying to refine or control those behaviours might not be possible. Allow me to explain what I mean.

Like a Record Baby

I think most people at this point know that deep neural networks (like the LLMs ChatGPT, Gemini, Claude, etc.) are pattern-matching machines. With simpler machine learning models, we are trying to find much simpler patterns, the most basic being a straight line:
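That simplest pattern can be recovered with an ordinary least-squares line fit. A minimal sketch, using made-up points that lie exactly on a line:

```python
import numpy as np

# The simplest "pattern": fit y = m*x + b to data with least squares.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0  # points constructed to lie exactly on a line

# np.polyfit with degree 1 returns the coefficients [slope, intercept]
slope, intercept = np.polyfit(x, y, 1)
print(slope, intercept)  # slope ≈ 2.0, intercept ≈ 1.0
```

A deep network is doing the same kind of thing, just with vastly more parameters and far more tangled patterns than a single line.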


On AI Taking Engineering Jobs


Introduction

AI has the potential to eventually replace software engineers, but the current iteration of LLMs (large language models like GPT-4, Claude, etc.) will not be what does it - at least not as they are currently being used. I think engineers have an amazing opportunity to ride the wave of a new paradigm of things to build, but it will likely take some effort.

New Kids on The Block

When I was a kid, there was only assembly, BASIC, and C.


Fun with Binary Trees


This is another contrived post to test out my interactive coding scripts, and also to test making a d3 tree graph. While this post is probably not going to be enlightening, I hope it is somewhat entertaining.

One of my favorite data structures is the binary tree. It’s not the fastest at everything, but it does most things you’d need, and it does them reasonably quickly.

If you have a list of things that you need to sort, search, and insert new items into, a binary tree is not a bad choice. While its search is nowhere near as fast as, say, a hash table’s, and its sorting isn’t as fast as Timsort, its versatility and simplicity make it one of my go-to structures.
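Those three operations can be sketched in a minimal unbalanced binary search tree; in-order traversal is what gives you the sorted order for free:

```python
# Minimal unbalanced binary search tree: insert, search, in-order traversal.

class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """Insert value, returning the (possibly new) root; duplicates are ignored."""
    if root is None:
        return Node(value)
    if value < root.value:
        root.left = insert(root.left, value)
    elif value > root.value:
        root.right = insert(root.right, value)
    return root

def search(root, value):
    """Return True if value is in the tree, walking down one branch per step."""
    while root is not None:
        if value == root.value:
            return True
        root = root.left if value < root.value else root.right
    return False

def in_order(root):
    """Yield values in sorted order: left subtree, node, right subtree."""
    if root is not None:
        yield from in_order(root.left)
        yield root.value
        yield from in_order(root.right)

root = None
for v in [5, 2, 8, 1, 3]:
    root = insert(root, v)

print(list(in_order(root)))             # -> [1, 2, 3, 5, 8]
print(search(root, 3), search(root, 7)) # -> True False
```

This version makes no attempt to stay balanced, so a sorted insertion order degrades it to a linked list; self-balancing variants (AVL, red-black) fix that at the cost of the simplicity praised above.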