Research -> Reality. AI Engineers.
Since last year, we have been living through a once-in-a-generation “shift right” in applied AI, fuelled by OSS, billion-parameter foundation models, and libraries and frameworks offering 10,000 ft abstractions.
It’s 2010. A small band of missionaries (product managers, data engineers, and data scientists) embarks on a daunting journey. They spend five years navigating a labyrinth of tools, technologies, SaaS, and OSS, working tirelessly to achieve their goals. Fast forward to today, and the landscape has changed dramatically. My nephew, armed with a GPU-equipped Mac, can now achieve similar results in a single afternoon. Knowing little more than how to make simple API calls and write basic applications, he tackles these challenges effortlessly, right after coming home from his social science exam.
However, since only 11-15% of companies have so far delivered any positive business outcome from GenAI, the devil is very much in the details. There are 99 problems in successfully evaluating, applying and productizing AI:
Models: From Mistral, OpenAI, Claude, Llama, Gemini, and Phi to Ironman, Black Panther, and Green Swan (the last three are made up :| ).
Tools: Auto-GPT, BabyAGI, MemGPT, AutoGen, RAGFlow, evals, and ETLs.
Techniques: Deciding on RAG or no RAG, fine-tuning or not, and how to get efficient inference without massive investments.
Infrastructure: Vector stores like Pinecone, Weaviate, Faiss, and Chroma, along with orchestration and structured/unstructured ETL pipelines.
… and 100 other things coming soon.
On top of this, the sheer volume of papers, models, and techniques published each day keeps growing along with interest and funding, so much so that keeping on top of it all is almost a full-time job.
Glancing over the new types of jobs popping up on my feed, I see that organisations of every size, vertical, and application all want the new, jazzy, hot-from-the-oven “AI engineer”.
This is how other subdisciplines of SWE emerged, for example “site reliability engineer”, “devops engineer”, “data engineer” and “analytics engineer”.
The thousands of Software Engineers working on productionizing AI APIs and OSS models, whether on company time or on nights and weekends, in corporate Slacks or indie Discords, will professionalize and converge on a title - the AI Engineer. This will likely be the highest-demand engineering job of the decade.
One thing you will notice while doing your own research is that there is not a single PhD in sight. When it comes to shipping AI products, you want engineers, not researchers.
When starting out in this field, of course I began with my own research, and what better place to ask how to become an AI engineer than HN, the SWE holy ground? Here are some resources from my reading list:
2. Building & Evaluating Advanced RAG
3. Scalable Data Pipelines with Kafka
4. Automated Testing for LLMOps
6. Serverless LLM Apps on Amazon Bedrock
7. Mastering AI Product Management
9. Build a Writing Assistant Tool with Next.js and OpenAI
11. Machine Learning with Python Libraries
12. Neural Networks for Beginners to Advanced
13. Become a Deep Learning Professional
14. Deep Learning with PyTorch Fundamentals
15. Introduction to Deep Learning
Apart from this, there is the daily bread of PyTorch, the modern data stack, databases, yada yada.
But what I realised was that most people still consider AI Engineering a form of either Machine Learning or Data Engineering, so they recommend the same prerequisites. I guarantee you that none of the highly effective AI Engineers I have spoken to or know of have done the equivalent of the Andrew Ng Coursera courses, nor do they know PyTorch, nor do they know the difference between a data lake and a data warehouse.
SWEs are purists, and I want to be excellent at what I do, so I couldn’t look myself in the mirror when I realised that reading the papers was really, really difficult. The slow realisation was that just to make omelettes, I don’t need to know which came first, the chicken or the egg, or what happened during India’s famous Dandi March. Similarly, I believe that in the near future nobody will recommend starting in AI Engineering by reading Attention Is All You Need.
Papers, fundamentals, and history are important; they serve a different purpose and will make me a well-read man. But we are talking about productising and applying AI, and in that context they take a bit of a back seat.
Why is it easier (and more pertinent than ever) to educate ourselves about applied AI engineering?
The answer lies in the contrast between traditional and modern applied AI.
Basically, instead of requiring data scientists or machine learning engineers to undertake a laborious data collection exercise before training a single domain-specific model for production, a product manager or software engineer can prompt a large language model (LLM) to build and validate a product idea. This can be done before collecting specific data for fine-tuning.
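To make the prompt-first workflow concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the support-ticket idea, the label set, the `build_prompt`, `classify` and `accuracy` helpers, and especially `stub_llm`, which is a keyword-based stand-in for a real model call so that the example runs offline. The point is only the shape of the loop: write the task in plain English, run it over a handful of hand-written examples, and see whether the idea holds up before collecting any data for fine-tuning.

```python
from typing import Callable, List, Tuple

# Hypothetical product idea: routing support tickets into three queues.
LABELS = ["billing", "bug", "other"]


def build_prompt(ticket: str) -> str:
    """Plain-English instructions take the place of a trained classifier."""
    return (
        "Classify the support ticket into exactly one of these labels: "
        + ", ".join(LABELS)
        + ".\nReply with the label only.\n\n"
        + f"Ticket: {ticket}\nLabel:"
    )


def classify(ticket: str, llm: Callable[[str], str]) -> str:
    """Send the prompt to any text-in, text-out model and normalise the reply."""
    reply = llm(build_prompt(ticket)).strip().lower()
    # Anything outside the label set falls back to "other".
    return reply if reply in LABELS else "other"


def accuracy(examples: List[Tuple[str, str]], llm: Callable[[str], str]) -> float:
    """Fraction of hand-labelled examples the prompted model gets right."""
    hits = sum(classify(text, llm) == label for text, label in examples)
    return hits / len(examples)


def stub_llm(prompt: str) -> str:
    """Stand-in for a real model call (assumption: swap in your provider's client)."""
    ticket = prompt.rsplit("Ticket:", 1)[1].lower()
    if "invoice" in ticket or "charged" in ticket:
        return "billing"
    if "crash" in ticket or "error" in ticket:
        return " Bug "  # untidy output on purpose; classify() cleans it up
    return "I am not sure"


# A few hand-written examples are enough for a first sanity check.
EXAMPLES = [
    ("I was charged twice this month", "billing"),
    ("The app crashes when I upload a photo", "bug"),
    ("Do you have a student discount?", "other"),
]

if __name__ == "__main__":
    score = accuracy(EXAMPLES, stub_llm)
    print(f"accuracy on {len(EXAMPLES)} hand-written examples: {score:.2f}")
```

Because `classify` accepts any callable that maps a prompt string to a reply string, the stub can later be replaced by a hosted or local model without touching the rest of the loop.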
Apart from this, a few obvious pointers are:
SOTA open and closed models, which were a distant dream for 99% of the organisations in the world.
Frameworks like LangChain and LlamaIndex, which have made it SO EASY that my milkman has created MilkGPT, which answers complex questions about everything around “milk”.
GPU abundance (or scarcity? the rich will never be satisfied I guess)
Multi language support - Python and JavaScript.
etc etc
Prompt engineering - English is the new coding language.
Your job is safe*
*T&C apply - you have to make friends with AI. C’mon, AI has even developed empathy now, what more do you need?
I don’t have a secret recipe for making friends with AI, but I know a few of the cool places where they hang out. DM me and I would be more than glad to help.