Research -> Reality. AI Engineers.

Since last year, we have seen a once-in-a-generation “shift right” in applied AI, fuelled by OSS, billion-parameter foundation models, and libraries and frameworks offering 10,000 ft abstractions.

May 18, 2024

It’s 2010. A small band of missionaries (product managers, data engineers, and data scientists) embarks on a daunting journey. They spend five years navigating a labyrinth of tools, technologies, SaaS, and OSS, working tirelessly to achieve their goals. Fast forward to today, and the landscape has changed dramatically. My nephew, armed with a powerful GPU-equipped Mac, can now achieve similar results in a single afternoon. With a solid grasp of making simple API calls and writing basic applications, he tackles these challenges effortlessly, right after coming home from his social science exam.

However, only 11-15% of companies have so far delivered any positive business outcome from GenAI, so the devil is very much in the details. There are 99 problems in successfully evaluating, applying, and productizing AI:

Models: From Mistral, OpenAI, Claude, Llama, Gemini, and Phi to Ironman, Black Panther, and Green Swan. (The last three are made up :| )
Tools: Auto-GPT, BabyAGI, MemGPT, AutoGen, RAGFlow, Evals, and ETLs.
Techniques: Deciding on RAG or no RAG, fine-tuning or not, efficient inference without massive investments.
Infrastructure: Vector databases like Pinecone, Weaviate, Faiss, and Chroma, along with orchestration and structured/unstructured ETL pipelines.
…… 100 other things coming soon

On top of this, the sheer volume of papers, models, and techniques published each day is growing exponentially with interest and funding, so much so that keeping on top of it all is almost a full-time job.

Glancing over the “new” kind of jobs popping up on my feed, I see that organisations of every size, vertical, and application all want the new, jazzy, hot-from-the-oven “AI engineer”.

This is how other subdisciplines of SWE emerged, for example “site reliability engineer”, “devops engineer”, “data engineer”, and “analytics engineer”.

The thousands of Software Engineers working on productionizing AI APIs and OSS models, whether on company time or on nights and weekends, in corporate Slacks or indie Discords, will professionalize and converge on a title - the AI Engineer. This will likely be the highest-demand engineering job of the decade.

One thing you will notice while doing your own research: there is not a single PhD in sight. When it comes to shipping AI products, you want engineers, not researchers.

When starting out in this field, of course I began with my own research, and what better place than HN, the SWE holy ground, to ask how to become an AI engineer? Here are some resources from my reading list:

1. NLP Course by Hugging Face

2. Building & Evaluating Advanced RAG

3. Scalable Data Pipelines with Kafka

4. Automated Testing for LLMOps

5. Efficiently Serving LLMs

6. Serverless LLM Apps on Amazon Bedrock

7. Mastering AI Product Management

8. Grokking Data Science

9. Build a Writing Assistant Tool with Next.js and OpenAI

10. Transformers for NLP

11. Machine Learning with Python Libraries

12. Neural Networks for Beginners to Advanced

13. Become a Deep Learning Professional

14. Deep Learning with PyTorch Fundamentals

15. Introduction to Deep Learning

Apart from this, there is the daily bread of PyTorch, the modern data stack, databases, yada yada.

But what I realised was that most people still consider AI Engineering a form of either Machine Learning or Data Engineering, so they recommend the same prerequisites. I guarantee you that none of the highly effective AI Engineers I have spoken to or know of have done the equivalent work of the Andrew Ng Coursera courses, nor do they know PyTorch, nor do they know the difference between a data lake and a data warehouse.

SWEs are purists, and I want to be excellent at what I do, so I couldn't look myself in the mirror when I realised that reading papers was really, really difficult. The slow realisation was that to make omelettes, I don't need to know which came first, the chicken or the egg, or what happened during India's famous Dandi March. Similarly, I believe that in the near future nobody will recommend starting in AI Engineering by reading Attention Is All You Need.

Papers, fundamentals, and history are important. They serve a different purpose and will make me a well-read man, but we are talking about productising and applying AI, and in this context they take a bit of a back seat.

Why is it easier (and more pertinent than ever) to educate ourselves about applied AI engineering?

The answer lies in the contrast between traditional and modern applied AI.

Reference - https://www.latent.space/p/ai-engineer

Basically, instead of requiring data scientists or machine learning engineers to undertake a laborious data collection exercise before training a single domain-specific model for production, a product manager or software engineer can prompt a large language model (LLM) to build and validate a product idea. This can be done before collecting specific data for fine-tuning.
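To make that concrete, here is a minimal sketch of the prompt-first workflow in Python. Everything in it is an assumption for illustration: the support-ticket task, the labels, the three hand-written examples, and the `fake_llm` stand-in, which only exists so the sketch runs offline. In a real prototype you would pass in a function that calls whichever hosted or local model you are evaluating.

```python
from typing import Callable, List, Tuple

# Hypothetical task and labels, made up for this sketch.
LABELS = ["billing", "bug", "feature_request"]


def build_prompt(ticket: str) -> str:
    """Describe the task to the model in plain English."""
    return (
        "Classify the support ticket into exactly one of these labels: "
        + ", ".join(LABELS)
        + ".\nReply with the label only.\n\n"
        + f"Ticket: {ticket}\nLabel:"
    )


def classify(ticket: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a label; treat any off-list reply as 'unknown'."""
    reply = complete(build_prompt(ticket)).strip().lower()
    return reply if reply in LABELS else "unknown"


def accuracy(examples: List[Tuple[str, str]], complete: Callable[[str], str]) -> float:
    """Share of hand-labelled examples the model gets right."""
    hits = sum(classify(text, complete) == label for text, label in examples)
    return hits / len(examples)


def fake_llm(prompt: str) -> str:
    """Offline stand-in for a real model call (assumption, keyword rules only)."""
    ticket = prompt.split("Ticket:", 1)[1].lower()
    if "refund" in ticket or "invoice" in ticket:
        return "billing"
    if "crash" in ticket or "error" in ticket:
        return "bug"
    return "feature_request"


# A handful of hand-written examples is enough to sanity-check the idea
# before any data collection or fine-tuning.
EXAMPLES = [
    ("I was charged twice, please refund me", "billing"),
    ("The app crashes when I open settings", "bug"),
    ("Could you add a dark mode?", "feature_request"),
]

if __name__ == "__main__":
    score = accuracy(EXAMPLES, fake_llm)
    print(f"accuracy on {len(EXAMPLES)} hand-written examples: {score:.2f}")
```

The model call is passed in as a plain function on purpose: you can swap providers without touching the prompt or the evaluation, and the tiny example set can later grow into the dataset you would fine-tune on, if the idea turns out to be worth it.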

Apart from this, a few obvious pointers are:

SOTA open and closed models, which were a distant dream for 99% of the organisations in the world.
Frameworks like LangChain and LlamaIndex, which have made it SO EASY that my milkman has created MilkGPT, which answers complex questions about everything around “milk”.
GPU abundance (or scarcity? the rich will never be satisfied, I guess).
Multi-language support - Python and JavaScript.
Prompt engineering - English is the new coding language.
etc etc

Your job is safe*

*T&C apply - You have to make friends with AI. C'mon, AI has developed empathy now as well, what more do you need?

I don’t have a secret recipe for making friends with AI, but I know a few of the cool places they hang out. DM me and I would be more than glad to help.

potentialmind's substack
