My short foray into AI

For a while this winter I gave a shot to applied AI engineering and learned a lot in the process. Here are some of my thoughts.

I got laid off

Late last year, at the start of winter, I got laid off from my job. I tried, of course, to find a new one, but it was a tough market.

Everyone wanted to hire senior engineers, but no one wanted to pay for them.

Or they had stupid expectations like “you have to know how the Go runtime works”… Such questions are interesting, but that’s not knowledge I’d expect a backend engineer to have; it would rarely help them build a crypto product.

I interviewed a few people who were in AI and looked at VC funding numbers. It seemed like a good time to try applied AI engineering.

Side note: why do VC funding numbers matter? They are the big driver of capital allocation in startups. Where the money goes, that’s where the jobs/opportunities are. So that’s where you want to be as an engineer.

What did I build & learn?

After interviewing a few people, watching the space, and reading a lot of articles, I built Minerva. It’s a self-hosted API that turns documents stored in S3 into embeddings and serves those embeddings over HTTP for RAG applications.

My friend Paul and I launched it here. We got some interest, but not enough to go full time on it. And slowly it died.

If you’re curious to learn more about how Minerva/RAG works, the README from Minerva is a good place to start.

Side note: I’ve also built a Chrome extension that lets you summarize any selected text on a webpage. It’s called The Gist of It and it’s also open source. And I’m currently working on a new version of it called “Page Chat” that will let you chat with any webpage.

If you just want the TL;DR of how RAG / Minerva works, here’s the gist:

RAG is just search, but instead of you using the results, the AI uses them. It’s useful when you want to make AI apps that use your private data.

Under the hood, you use a combination of keyword matching and vectors. “Vectors” means: you split the text into chunks, use an embedding model to turn each chunk into a vector (coordinates in a high-dimensional space), and then use cosine similarity to find the vectors closest to your query.
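
To make that concrete, here’s a minimal sketch of the vector half, assuming you already have the query and chunk vectors from whatever embedding model you use (the function names are mine, not Minerva’s):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot product divided by the product of the vector lengths;
    # 1.0 means the vectors point in the same direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query_vec: np.ndarray, chunk_vecs: list[np.ndarray], k: int = 3) -> list[int]:
    # Rank the stored chunk vectors by similarity to the query
    # and return the indices of the k closest ones.
    scores = [cosine_similarity(query_vec, v) for v in chunk_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
```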

Minerva does the heavy lifting: point it at your S3 docs and it’ll process them into smart chunks and serve them over HTTP when needed.
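
In practice, serving chunks “over HTTP” looks roughly like the sketch below. The endpoint name and response shape here are illustrative, not Minerva’s actual API; the README has the real details.

```python
import requests

MINERVA_URL = "http://localhost:8000"  # your self-hosted instance

# Ask the service for the chunks most relevant to a question,
# then hand those chunks to an LLM as context.
resp = requests.post(
    f"{MINERVA_URL}/search",  # hypothetical endpoint
    json={"query": "What is our refund policy?", "top_k": 3},
)
resp.raise_for_status()
for chunk in resp.json()["results"]:  # hypothetical response shape
    print(chunk["text"])
```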

The selling points for Minerva were:

  1. self-hosted and you have your data on your own servers (compliance, privacy, etc)
  2. fast and easy to set up
  3. has support for local models (no need to send your data to OpenAI)
  4. does semantic chunking (keeps context intact; see the sketch after this list)
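
“Semantic chunking” here means splitting text where the meaning shifts instead of at fixed character counts. Below is a rough sketch of the idea, not Minerva’s actual implementation: embed each sentence and start a new chunk whenever the next sentence stops looking similar to the previous one. The embed callable is assumed to wrap whatever embedding model you use.

```python
import numpy as np

def semantic_chunks(sentences: list[str], embed, threshold: float = 0.7) -> list[str]:
    # embed: assumed callable, str -> np.ndarray (your embedding model).
    if not sentences:
        return []
    chunks, current = [], [sentences[0]]
    prev_vec = embed(sentences[0])
    for sent in sentences[1:]:
        vec = embed(sent)
        sim = float(np.dot(prev_vec, vec) /
                    (np.linalg.norm(prev_vec) * np.linalg.norm(vec)))
        if sim < threshold:
            # Low similarity to the previous sentence = likely topic shift,
            # so close the current chunk and start a new one.
            chunks.append(" ".join(current))
            current = []
        current.append(sent)
        prev_vec = vec
    chunks.append(" ".join(current))
    return chunks
```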

Prompt engineering is not fake

Before going into AI, I was a bit skeptical about prompt engineering. It turns out you really can phrase your queries in ways that get more value out of the LLM. The reason I was skeptical is that, by sheer luck, I had already been following a lot of the best practices without knowing it.

But, for your benefit, here are some of the best practices (there’s a small sketch putting them together after the list):

  • assign a role to the LLM (e.g. “you are an expert in X”)
  • give it examples of what you want it to do (this is called N-shot prompting)
  • give the LLM the context it needs (e.g. “here are some docs you might need to know about”)
  • give the LLM only the context it needs (don’t give it too much or it will lose focus)
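
Here’s a minimal sketch of those practices combined into a single prompt; the wording and variable names are mine, not a canonical template.

```python
system = "You are an expert technical support engineer."  # assign a role

# N-shot prompting: a couple of examples of the answers you want.
examples = (
    "Q: The export button does nothing. A: Known bug in v2.3, fixed in v2.4.\n"
    "Q: How do I reset my password? A: Settings -> Security -> Reset password.\n"
)

# Only the context it needs, e.g. the few chunks retrieved via RAG.
relevant_chunks = ["Refunds are issued within 14 days of purchase."]
context = "\n".join(relevant_chunks)

user_question = "Can I get a refund after two weeks?"

prompt = (
    f"{system}\n\n"
    f"Here are examples of good answers:\n{examples}\n"
    f"Relevant documentation:\n{context}\n\n"
    f"Question: {user_question}\nAnswer:"
)
print(prompt)
```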

Automation is amazing, but…

The meme says it all; I don’t know if I need to explain much more.

For people on screen readers, I guess I should say it out loud:

Programmers when they build a program in 10 hours just to perform a task that would take them 10 minutes.

Do some quick math before you build something. Will it save you enough time to justify the cost?
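
Using the meme’s numbers: a 10-hour build that replaces a 10-minute task only breaks even after you would have done that task about 60 more times.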

Sure, with AI we can build MVPs and POCs much faster. AI coding assistants actually shine at small projects. But it still takes time.

It takes even more time, and the bar is higher, if you’re building market-facing products. Internal products can be rough around the edges.

Why am I (sort of) out of the AI game?

1. AI has its own hype cycle, just like crypto did in 2021.

After spending some time in this space, I can tell you it’s a lot like crypto was in 2021.

Initially I told myself that it wasn’t true, because VC funding has been consistently high over the past few years, not just over 1-2 years like in 2021’s crypto bubble.

But then you look at all the influencers on X and YouTube talking about AI. The same patterns exist.

Every damn week there’s a new “AI will replace us all” take. Every damn week there’s a “we’ll launch this and it’s going to change everything” announcement.

And then reality happens. No it didn’t replace shit. And no it didn’t change everything.

I’ve built a few products that use AI, as you know from above. I’ve been using AI coding assistants daily since before GitHub Copilot (remember Tabnine?). I’ve been using ChatGPT, Claude, DeepSeek, Perplexity, and a few other LLMs in my daily workflow since GPT-3.5. From asking for medical advice to summarizing articles to researching random topics, you name it.

It can make us more efficient / productive. It may even skew supply / demand for certain skills.

But, imho, that’s about it.

And anyone that actually uses AI daily knows this.

2. AI is very US-centric, not global and remote-friendly like crypto.

The jobs are good, and there are plenty of them. But everyone wants to hire in the US, or even specifically in SF.

So your best bet as a remote engineer is to start a consulting practice. But that’s a whole other story. And it’s imho hard if you don’t have a network in that field and proof of work.

It can be done, but it’s hard.

3. Much of the capital in crypto has flowed to Rust and Solana (I am assaulted by recruiters).

A funny thing happened: I was applying to jobs in AI and one of the CEOs was like:

I wanted to reach out as a founder friend just started building [redacted] and looking for strong web3/full-stack engineers. If you’d like an intro, feel free to email me.

“Just when I thought I was out, they pull me back in.”

I’m not complaining. Or I am, sort of.

Side note: A lot of crypto has devolved into gambling. And I’ve become a bit disillusioned with it.

Anyway, I went on to help them build a POC for their product. And once I could say I had worked with Rust/Solana, my inbox blew up.

Fun times.
