Open Source Release: Ayup: Facing the Deployment Nightmare

The clock read 2:30 PM, and the office was a fluorescent-lit hellscape of impending doom. You, a junior IT operations engineer, fresh out of college and still wet behind the ears, were handed a ticking time bomb: the source code for an AI/ML project that was as crucial to the company's survival as water is to a desert wanderer. The architect of this digital monstrosity, a remote contractor with a penchant for disappearing acts, had been maintaining radio silence since dropping the code on Tuesday...

Understanding the Problem

You are given a Python project that creates an API endpoint for AI/ML inference. Your task is to run this on your infrastructure, whether it's a cloud VM or another setup. But where do you start?
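
To make this concrete, the hand-off might look something like the sketch below: a small FastAPI app whose interface and port exist only in code. The file name, route and environment variable are hypothetical; real projects vary, which is exactly the problem.

```python
# main.py — a hypothetical stand-in for the project you've been handed.
# Assumes FastAPI and uvicorn; nothing here is Ayup-specific.
import os

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Prompt(BaseModel):
    text: str

@app.post("/infer")
def infer(prompt: Prompt) -> dict:
    # A real project would run some AI/ML model here.
    return {"completion": prompt.text.upper()}

if __name__ == "__main__":
    import uvicorn
    # The bind address and port are documented nowhere but this line.
    uvicorn.run(app, host="0.0.0.0", port=int(os.environ.get("PORT", "9090")))
```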

If you have expertise in Python, Docker, Kubernetes, SageMaker, etc., you probably have a good idea of what to do. At the very least, you know where to look or what questions to ask your favorite LLM. Yet even with experience, you might still spend 30 minutes deciphering some cryptic error message triggered downstream of the root cause.

The senior staff were no help. They had skillfully found ways to not be involved and waved you off with dismissive hands.

If you don't have the experience, then this could get real ugly. While everything needed to run the project might amount to a single YAML file or a one-liner, getting to that point can require a non-trivial amount of contextual knowledge.

On top of that, there are a lot of choices to make. You could be faced with a choice between Docker, Kubernetes or a VM. Even if that decision has been made for you, and the answer was Kubernetes, there is still a vast and wild array of options for doing just about anything.

The Ayup Solution to Infrastructure Complexities

You were on your own, drowning in a sea of code and jargon you'd barely begun to understand. The deadline loomed like a guillotine set to drop at 4 PM. Panic gnawed at your insides, a rabid beast that threatened to consume you whole. But then, like a beacon of hope in a stormy night, you remembered a whisper you'd heard in the lunatic fringes of the DevOps community: Ayup.

And this is before considering the code itself: maybe the dependencies are not listed in the project, perhaps the interface it binds to or the ports it listens on are not documented, or it could require a volume to be mounted, and so on.

Perhaps a seasoned expert could glance at the code and immediately understand what's required, but for a newcomer, it's another rabbit hole. Even with an LLM chat, providing the correct context can be challenging.

Maybe the code doesn't run on a standard laptop; perhaps it needs CUDA, 200GB of RAM, Linux or some more exotic hardware. Then you need a way to push code or executables to a remote machine, debug it remotely and access the application running on it.
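
Before anything else, you might probe the machine to see whether it can run the project at all. A minimal sketch, assuming the project uses PyTorch (other stacks need other checks):

```python
# probe.py — quick environment sanity check; illustrative, not exhaustive.
import platform
import shutil

import torch  # assumption: the project is PyTorch-based

print(f"OS: {platform.system()} {platform.machine()}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    # Name of the first visible GPU, e.g. to check it is big enough.
    print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Free disk: {shutil.disk_usage('/').free / 2**30:.0f} GiB")
```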

This all adds up to confusion and decision fatigue.

Ayup was the stuff of legends, a tool that could automagically analyze, build, and deploy applications with the grace of a ballet dancer and the precision of a sniper. It was your only hope, a Hail Mary pass in the final seconds of the game.

Ayup's Vision for Seamless Deployment

Even if you do know all these technologies, it's easy to forget, when talking about them in the abstract, just how many little paper cuts they can give. It's not that the problems are insurmountable, or even individually challenging. It's that in aggregate they add up to a perceptible distraction, like a mildly flickering light or a whining computer fan.

Dealing with these issues is a learning opportunity, but if learning arbitrary details of Kubernetes operators is far from your primary focus, then they can be an unwelcome one.

To that end, we want an application which removes the unnecessary decisions, research and challenges; one that makes it easy to run code remotely and avoids introducing its own education burden.

A tool where you point it at the code and it figures out how to build and deploy it. If it can't figure out a particular step, it falls back to asking you, the user, for the relevant info, then goes through an interactive process to see if your answer worked.

This reduces distractions and focuses limited human attention on the places that matter.
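
As a rough illustration of that loop, here is a minimal sketch of "detect, else ask, then verify". It is not Ayup's implementation, and the detection rules are deliberately naive:

```python
# A sketch of the interaction model: detect, fall back to asking, verify.
import subprocess
from pathlib import Path

def detect_start_command(project: Path) -> str | None:
    """Cheap deterministic detection; returns None when unsure."""
    if (project / "manage.py").exists():
        return "python manage.py runserver"
    if (project / "main.py").exists():
        return "python main.py"
    return None

def try_run(cmd: str, project: Path) -> bool:
    """Start the command briefly; 'still alive after 5s' counts as working."""
    proc = subprocess.Popen(cmd, shell=True, cwd=project)
    try:
        proc.wait(timeout=5)
        return False  # exited quickly: probably crashed
    except subprocess.TimeoutExpired:
        proc.terminate()
        return True   # still running: good enough for a smoke test

def deploy(project: Path) -> str:
    cmd = detect_start_command(project)
    while True:
        if cmd is None:
            # Fall back to asking the user for the relevant info.
            cmd = input("How should this project be started? ")
        if try_run(cmd, project):
            return cmd  # this answer worked
        print(f"'{cmd}' failed; let's try something else.")
        cmd = None
```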

Leveraging Ayup for Success

With trembling hands, you invoked Ayup. The screen flickered to life, lines of code scrolling by like the Matrix on fast forward. Ayup worked its magic, dissecting the AI/ML monstrosity with surgical precision. You watched in awe as the tool built the application, each step a symphony of ones and zeros.

We also want the option to run this tool on our own hardware or in the cloud, and to easily manage both the client and the server. We want the freedom to choose where an application runs for two key reasons:

  1. It may require or benefit from particular hardware
  2. The hardware may need to be in a particular location

With the cloud being so dominant, it's hard to imagine how often these two things are true. However, once you step outside the SaaS world of homogenised compute and reliable internet connections, it becomes pretty obvious.

In fact, it is pretty obvious if you consider why people buy expensive Macs to do software development when they could just rent the compute from the cloud and go with the base model.

Ironically, Ayup helps with using a remote computer to build your software; in my case, it enables me to build and run a heavy AI/ML application while on my ancient MacBook Air. It's also possible to SSH into my workstation and use Neovim, or host VS Code remotely.

These all come with tradeoffs, especially latency, and the important thing to recognise is that having the freedom to run the software where you want opens up a whole host of possibilities.

Ayup Saves the Day

Time ticked away, each second a hammer blow to your frazzled nerves. But Ayup was relentless, a digital savant that knew no fear, no hesitation. At 3:58 PM, the deployment process completed with a triumphant beep. The API endpoint was live, the company project saved from the brink of disaster.

Ayup is an Open Source build and deployment tool, initially focused on AI/ML projects which provide an inference endpoint or web API. Today it's in the early stages of development, and we are trying to position it on a trajectory to meet that vision and overshoot it.

💡
Ayup is in the early stages of development, but you can try it out today on GitHub. If you think it's cool, give it a star or tell us what does or doesn't work for you.
You slumped back in your chair, a hero in the shadows. Just as you began to catch your breath, the office erupted in a chorus of notifications. The project manager's eyes widened in disbelief as they saw the API endpoint humming with life. Word spread like wildfire, and soon the senior staff emerged from their daydreams, their faces painted with a mix of shock and admiration.

Ayup's Revenue Strategy

So you may wonder: if Ayup is Open Source and easy to use, how are Prem, a venture-backed startup, going to make money from it?

Presently we have three lines of potential revenue:

  1. Control plane
  2. Cloud partnerships
  3. Integration with Prem's gen AI services

Let's start with the first: while it is easy to install Ayup on one node and link one client to it, having multiple Ayup servers with multiple clients linked to each can turn into a management ordeal.

Imagine that you have multiple staff and want to give them all access to a new node, or remove one staff member from all the nodes.

Further, let's say that one of your Ayup servers is behind NAT; then you have to find a way of forwarding ports to it or join it to a VPN. For this you need a relay node that is publicly reachable.

For these kinds of situations we will offer an optional control plane as a service. It won't see what work is being done by the Ayup servers; instead, it will facilitate setting up connections and managing access.

The control plane will also tie into 2 and 3. In the case of cloud partnerships, it will allow you to spin up Ayup servers on demand.

For 3, we have something bigger in the works, where Ayup in conjunction with the Prem Platform will be the delivery mechanism for generative AI, enabling a level of automation that is not feasible without fine-grained integration of LLMs adapted to this particular use.

Ayup's Technological Foundation

In the past, applications like Ayup relied purely on rigid deductions from properly defined configuration and hard-coded heuristics.

Ayup will use those things too. If we find that there is a common and unambiguous configuration file that we can mechanically detect and use, then we'll do it. There's no need to introduce an LLM into that situation.

On the other hand, let's say we have a project where we can mechanically deduce that some parts are written in Python, others in JavaScript, and some scripts in shell. There are multiple places that could be an entry point, and we haven't seen this situation enough to write an analysis function for it.

We can feed the LLM the context, filtered by our deterministic analysis, and get it to produce answers about where the entry point is and perhaps how to invoke it.
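
That exchange might look something like the sketch below, where `ask_llm` is a stand-in for whatever model call is used and the deterministic side rejects any answer it can't act on. This is illustrative code, not Ayup's:

```python
# Bounding an LLM's answer to what the rigid code can consume.
import json

def find_entrypoint(candidates: list[str], ask_llm) -> dict:
    """`candidates` comes from deterministic analysis of the project."""
    prompt = (
        "One of these files is the entry point of the project:\n"
        + "\n".join(candidates)
        + '\nReply with JSON only: {"entrypoint": "<file>", "command": "<shell>"}'
    )
    answer = json.loads(ask_llm(prompt))  # reject anything that isn't JSON
    if set(answer) != {"entrypoint", "command"}:
        raise ValueError(f"unexpected keys: {sorted(answer)}")
    if answer["entrypoint"] not in candidates:
        raise ValueError(f"entry point not among candidates: {answer!r}")
    return answer
```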

Possibly, once the entry point has been identified, the project can be reclassified as something mundane and well known, and a rigidly coded method can be activated to complete the job.

Instead of actually looking for the entry point, the LLM can write a heuristic for finding it. Then, once the heuristic proves reliable, the need to invoke the LLM at all is eliminated in the future.

This all requires tight integration of the LLM with the rigidly coded parts of the system. The input to the LLM must contain the most relevant information, and the output must be tightly bounded by what the rigidly coded part of the system can use.

This is a common challenge to many applications of LLMs and one that Prem is eager to solve and replicate across domains.

Containers, Kubernetes and Beyond

Today, Ayup leans heavily on containers. It builds and deploys containers using BuildKit, with containerd and nerdctl to run them.

Ayup itself can also be run in a container, so you can put it on a Kubernetes node and give it access to the containerd socket; the images it builds will then be instantly available to the node.

This isn't the default setup, though; we intend to bundle the necessary components with Ayup and run them in rootless mode. However, Ayup is compatible with the Docker/Kubernetes ecosystem and will most likely continue to be.

Compatibility and reuse sum up why we use containers, why Ayup is written in Go, and so on.

However, we did decide to break from some of the existing ecosystem in the interest of our usability and performance goals.

We could have wholesale adopted Kubernetes, one of the various CI/CD pipelines, thrown Buildpacks in there and so on. It's been done before, and the results are OK once you get set up, but it feels like these systems place a high burden on the user.

More to the point, adding another layer on top of these systems to remove that burden will conflict with the underlying components. It's like forcing a square peg into a round hole.

For example, the Kubernetes reconciliation loop is perhaps not the primitive you want to build on if you want things to run, pass or fail immediately. You essentially end up doing battle with Kubernetes so that you can tell the end user exactly what is happening.

Buildpacks are not necessarily what you want if you need to ask the user a question at some arbitrary point or roll back to a previous decision if something fails. This isn't to say that we can't support Buildpacks as an opaque module we can call out to, but they aren't going to be a core part of the system.

The same is true of Kubernetes: we can support eventual deployment to it, perhaps using our own operator, but it's not at the foundation of the system.

Even adopting BuildKit feels like admitting some assumptions about how software is built. However, in the interest of being able to show something in a timely manner, that's what we are using today.

Ayup's Next Steps

Presently, the system is separated into stages called analysis, build and run. This could prove to be a premature abstraction and may change in the future. Most likely the first change would be to merge them; then, as we add support for more program types, we would create abstractions from the emergent properties of the solution.
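
In outline, the staged design looks something like this. It's an illustration only; Ayup itself is written in Go and its internals may differ:

```python
# An illustration of the analysis/build/run staging — not Ayup's actual code.
from dataclasses import dataclass
from pathlib import Path
from typing import Callable

@dataclass
class Context:
    project: Path
    image: str | None = None  # filled in by the build stage

def analysis(ctx: Context) -> Context:
    # Inspect the project and decide how it should be built.
    return ctx

def build(ctx: Context) -> Context:
    ctx.image = "app:latest"  # placeholder for a real container build
    return ctx

def run(ctx: Context) -> Context:
    # Start the built image and expose the application.
    return ctx

PIPELINE: list[Callable[[Context], Context]] = [analysis, build, run]

def deploy(project: Path) -> Context:
    ctx = Context(project)
    for stage in PIPELINE:  # merging the stages would collapse this list
        ctx = stage(ctx)
    return ctx
```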

Indeed, the technology will be shaped by each new application, or application class, we support. If there is a type of application you'd be interested to see supported, feel free to open an issue on our GitHub or e-mail [email protected]. Although the initial focus is on AI/ML inference endpoints, we can always discuss things outside of that.