Preface
It has been another three years at Sourcegraph, making it my six-year anniversary, and I can’t stop saying that time flies. My life has changed a lot, my role at work has changed a lot, and the world has changed a lot. It’s a(nother) good time to sit back and, as I always like to say, think it through. Think about what, exactly? I think it would be something like how to balance work, life, and the future.
As usual, this post is entirely written by my bare hands; nothing is LLM-generated or fine-tuned, pure human intelligence/stupidity. Hope you will enjoy it!
Things that I am proud of
As for my achievements over the last three years, technically speaking, the main theme would be application security and (what I call) application infrastructure. If I could only name one thing that I am the most proud of, it would be Sourcegraph Accounts (accounts.sourcegraph.com; internal code name SAMS, which stands for Sourcegraph Account Management System).
What was it?
In a nutshell, I have introduced it to people as “a mini WorkOS inside Sourcegraph”, and that seems to click. Yes, technically, it was just an unsexy IdP built on top of industry-standard protocols like OAuth and OIDC. Building the system itself wasn’t hard at all, not to mention we outsourced the trickiest part to ory/fosite. The hardest part was to paint a future where all Sourcegraph products could have a Google-like cohesive account experience and a path to zero-trust service security, then spec out an iterative roadmap for implementation, and finally convince all the stakeholders by answering the question of “why now”.
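To make the OAuth/OIDC side concrete, here is a minimal sketch of the first leg of the authorization-code flow that any such IdP implements. This is a generic illustration, not SAMS code; the issuer URL and the /oauth/authorize path are assumptions for the example.

```python
from urllib.parse import urlencode

def build_authorize_url(issuer: str, client_id: str, redirect_uri: str, state: str) -> str:
    """Build a standard OIDC authorization-code request URL (RFC 6749 + OIDC Core).

    The "/oauth/authorize" path is an illustrative assumption, not a real SAMS endpoint.
    """
    params = {
        "response_type": "code",          # authorization-code flow
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile email",  # standard OIDC scopes
        "state": state,                    # CSRF protection, echoed back on the callback
    }
    return f"{issuer}/oauth/authorize?{urlencode(params)}"
```

The IdP then authenticates the user, redirects back with a one-time code, and the client exchanges that code for tokens at the token endpoint; the protocol mechanics are standardized, which is exactly why a library like fosite can carry most of the weight.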
“Why now”?
We were trying to launch Cody PLG (an AI coding assistant, which had been shut down at the time of this writing), and the inertia at Sourcegraph was to throw anything non-Enterprise onto Sourcegraph.com/search, which is just a public instance backed by a Sourcegraph Enterprise instance. The problem, very obvious to me and some others, was that Sourcegraph (as an application) was built with no security measures in many aspects, because it was designed as a single-tenant offering deployed to secure and obscure environments. Opening Sourcegraph.com to self-service Cody PLG was, IMO, a huge security risk, because there were simply too many holes to cover due to conflicting design patterns and too much historical tech debt to carry along. Its account system was not built to handle abuse prevention, management at scale, data security, and more; we needed a standalone system that focuses on only one thing: accounts for the broader internet.
This standalone system would offer the following advantages to the success of self-service Cody PLG:
- Increase the critical mass of the Cody PLG GA announcement.
- A unique opportunity to advertise our user accounts as Sourcegraph-brand accounts (or Sourcegraph Identity, as previously attempted) that are naturally useful for connecting anything Sourcegraph (editor extensions, Cody clients, and any future products, e.g. a license management console).
- Develop and operate in secure environments that protect our highest-stakes assets.
- Offer a robust SLA that does not interrupt our users when they make payments.
- Sourcegraph.com/search had no SLA.
- Easier and more straightforward SOC2 controls and audit reports.
- Sourcegraph.com/search had way too much unrelated stuff to make a SOC2 audit worth it.
- Provide better opportunities for integrating mature third-party providers for things like feature flag systems.
That is the digest of my 10-page Google doc pitch.
What was my role?
Obviously, I was the project tech lead, from pitching the idea, designing the whole system and its derived frameworks and integrations, and speccing out the roadmap, to implementation and, of course, a successful launch. Ask me anything.
What was the impact?
The work of the original scope (a centralized account management system) paid off at launch, as two services were able to outsource their accounts capability and focus on core product logic:
- Cody PLG, which sold Cody to individual users
- Cody Analytics (later renamed Sourcegraph Analytics), which displays usage analytics dashboards to enterprise customers
Beyond the original scope, notable things were made possible by the existence of a centralized account management system:
- A unique account ID to identify the same user across different services in our data pipeline. Replaced hacky and fragile user mapping between different services. Abuse management also became way easier because of this.
- Because we had a unified way to represent a user, we were then able to establish a unified way to represent user permissions. We implemented a company-wide IAM framework based on ReBAC/FGA (built on top of OpenFGA), replacing ad-hoc permission models with scalable, secure-by-default practices.
- Because we already had the OAuth mechanics in place, we standardized company-wide machine-to-machine (M2M) authentication and authorization, elevating infrastructure security by applying zero-trust principles. On the same train, we also implemented Token-as-a-Service for other services to use.
- Because we were able to uniquely identify a user, a notifications system was implemented, originally for automating account deletions for GDPR purposes, but later also used for time-bound access propagation, session synchronization, etc.
- Google-like session federation across different Sourcegraph products even with different domain names.
- Ported the IdP implementation and production operation experience back to Sourcegraph Enterprise, so that we could finally end the days when our users could only have static access tokens.
- Painted another bright future with a Security Token Service (STS); this one did not end up being implemented, beyond a 21-page RFC covering everything from system design to implementation details.
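To illustrate the ReBAC/FGA approach mentioned above, here is a tiny OpenFGA authorization model in its standard DSL. All type and relation names are hypothetical, not our actual schema; the point is that access derives from relationships rather than per-service flags.

```
model
  schema 1.1

type user

type organization
  relations
    define member: [user]
    define admin: [user]

type dashboard
  relations
    define owner: [organization]
    define viewer: [user] or member from owner
```

With a model like this, a check such as “is user:alice a viewer of dashboard:usage” passes either via a direct grant or via membership in the owning organization, which is what makes scalable, secure-by-default permissions possible.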
Grow the team and carry out multi-year visions
When the Core Services team was founded, it was only me and one other engineer, on loan from the Cloud Ops team (our DevOps team, responsible for operating our managed Cloud). Sourcegraph was only getting started doing SaaS business with hosted services, and lots of greenfield initiatives, along with very broad ownership of legacy services and practices, just happened to fall on our team. Two years later, our team went from 2 to 3, and now 8 ICs. We became the pioneers and experts on platform engineering within Sourcegraph: we set up, advocate, and provide docs and guidance on standardizing not only how to do DevOps, but also how to outsource frameworks to us and let product teams be laser-focused on core business logic. On-call playbooks are now consistent: learn once, master it for all. We decomposed and migrated all the legacy services to the new standard platform that every team uses. Completing a SOC2 audit has never been easier: time-bound access, staged rollouts, audit trails, observability, debugging, profiling, service dependencies, inter-service communication, email delivery, and everything else is in a single solution.
At the same time, we are not just a traditional platform engineering team; we ship customer-facing product features and services too! Just to name a few: the enterprise customer portal, the dashboard for Sourcegraph Analytics, workspace services for SMBs, billing, IAM, and abuse management.
It is an amazing journey when I look back at how much we have delivered and upleveled the company in just two years.
Being a thought leader on complex topics
One example of this was how I completely transformed how Sourcegraph thinks about abuse management, from ad-hoc “quick wins” that often step on toes within just a few weeks, to a systematic approach with a methodology. Again, I pitched it in RFC format, outlining why the abuse problem is inherently inevitable (by the nature of growing awareness and user base), why false positives are harmful (one-off efforts have a high false-positive rate that drives away legitimate users), and why combining signals from all sources to form a holistic decision is much more efficient and effective than making decisions individually in information silos. More importantly, I laid it out in a way that doesn’t ask for a large upfront investment to see the value: by having the methodology, we can agree on a framework for making decisions; by starting with low-hanging fruit, we can generate immediate value in the very short term; and by building iteratively (when ROIs are justified) towards long shots, each step building on top of the previous, the whole ultimately becomes a value multiplier in the long run. I broke it down into three pillars: entry-point gating, reactive checking, and continuous monitoring. Stakeholders including Finance, Operations, Product Management, Engineering, and Security were all part of the discussions and all aligned on it. Many of the proposed ideas have been implemented with clear value in return.
My personal growth
I think I have grown much more in the past three years than in the three years preceding them. I owned bigger initiatives that carried more business impact and risk, became a thought leader on IAM, billing, and abuse management, and an advocate for application security and application infrastructure, and transitioned from a one-man army to a force multiplier for the team and the company. Last but not least, I became a father.
My perspectives on AI agents
In my opinion, the commoditization of LLMs is the biggest productivity paradigm shift since Google search. It’s not just about programming, or coding, which is merely one specialized use case among infinitely many. It is about anything: any problem or task that can be seen as sequence processing now has added capability from LLMs. By commoditization, I mean that models which were only accessible to companies with a big enough research team and infrastructure are now only one API call away, with no time- and money-consuming upfront investment.
Ever since my daughter was born, my wife and I have been using ChatGPT very extensively: childcare questions; sending pictures for quick diagnostics (way faster than two business days from pediatricians and, obviously, at no extra charge with our Plus subscription); probing questions we didn’t even know how to ask but that it guided us through; how to plant and care for cherry blossoms; asking what bugs are crawling on my floor and how to treat them; comparing products and doing quick research for us; in-depth chats about system designs for my side projects; designing and iterating on project logos (way faster than the two-day feedback loop of a real designer, at no extra cost; for side projects only, obviously). The list goes on and on. Nowadays, unless I am explicitly looking for webpages, I don’t use Google search anymore.
Be mindful of hallucination, of course. That said, I am sincerely confused why people are so mad at an LLM when the model hallucinates while, at the same time, human beings can, either on purpose or unintentionally, lie or say dumb things in a confident manner.
AI agents are basically programs with added capability from LLMs, which either solve some existing problems more efficiently, or solve problems that were previously too tedious or practically impossible to even think about solving. I have been joking to my colleagues that I now do programming in natural language. However, an LLM being an LLM, it doesn’t truly understand things like our human brains do; both “Generative AI” and “Large Language Model” are still too much of an oversell from my point of view, so mentally I see LLMs as Statistical Guessing Machines. After all, all they do is guess the most likely output. That is to say, an LLM has its strengths as well as its limits; it’s not a silver bullet or a cure for everything like what you see on social media. It’s a tool.
LLM vendors are the new players in PaaS; yeah, I see LLM models as the new VMs. Which means, judging by how the pricing drama has been going, we are still in the very early stage of the product lifecycle for AI agents. When was the last time a SaaS product passed its CPU, RAM, disk, and network bandwidth costs directly along to its customers? How often do SaaS customers care about how many CPU seconds it takes to refresh a webpage, make an API request, or change something in the UI? The trend is going from seat-based pricing to usage-based pricing because today’s AI agents do not add enough value on top of all those LLM models; companies have no choice but to pass the direct cost of the LLM along to their customers in the name of transparency, but at the cost of an unsustainable business, or an uncompetitive revenue model. One possible theory is that the industry hasn’t figured out a way to charge customers by the outcome produced by those AI agents, in the same way that we are charged by CPU seconds because it is unclear how much value we get out of those CPU seconds: they could be idling, could be running useless programs, could be serving a bank transfer, who knows. At the end of the day, people pay for something because they get more value out of it, for a “profit”.
The more exciting future for AI agents is how we empower old features and workflows with their added capability to make them more powerful. In that sense, while devs, or programmers, or software engineers, are the loudest group on social media about their opinions on AI (coding) agents, and while we in the industry will continue to exist as a market segment, I hold my belief that the future big-spender market segments are actually non-devs. Take a simple example like website builders: if we calculate the absolute cost of a domain, a VPS (maybe not even a VPS, but actually web hosting with a simple control panel), and an in-browser file editor, it’s insane to think they can charge people anywhere from 30 dollars per month to hundreds, and even charge by number of visitors or page views. In contrast, I, as a worker in the tech industry, will continue trying to squeeze everything possible into my $6 DigitalOcean droplet and call it a day. My point being: people who do not understand how things work behind the scenes come looking for a solution that works for them, and are more willing to pay absurd markups than people who are capable of DIY.
What about the moat; is there a moat in the world of AI agents? Please don’t quote me, but I think access to user data continues to be the moat, as it has been since the introduction of online services. In the case of LLMs and AI agents, it’s the data that can be used for model training, and better than ever, it is provided automatically through the use of existing models, no longer publicly accessible to everyone but gated behind corporate network firewalls. Agents are tirelessly uploading, instead of users being selective and manually posting. Enough has been stolen from the internet already. As an obvious example, private clients’ banking data will only be accessible to those banks, and that is not something a generalized model provider could easily get access to, if ever, to train models for financial-specialist agents.
Workflow orchestration with AI agents (a combination of rule-based and agentic), IMO, is the next big trend, and more importantly, whoever wins the human-in-the-loop UI/UX will likely be substantially more successful than their competitors. Things like: when to stop and ask for human feedback, how to resume or pick up that feedback, how to collaborate with humans and use the strengths of both sides, and how to “infinitely” scale the agents to achieve 10x or even 100x output and beyond.
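As a sketch of what such human-in-the-loop orchestration could look like (all names and structure here are made up for illustration, not any particular product’s API): a workflow is an ordered list of steps, some rule-based, some agent-backed, and some flagged to pause so a human can accept or override the result before it flows onward.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Step:
    name: str
    run: Callable[[dict], str]     # rule-based logic or a call out to an agent
    needs_approval: bool = False   # pause for a human before continuing

def run_workflow(steps: list[Step], ask_human: Callable[[str, str], Optional[str]]) -> dict:
    """Execute steps in order; where flagged, stop and let a human revise the
    output before it feeds into later steps (human-in-the-loop)."""
    context: dict = {}
    for step in steps:
        result = step.run(context)
        if step.needs_approval:
            # The human may accept (return None) or override the result.
            revised = ask_human(step.name, result)
            result = revised if revised is not None else result
        context[step.name] = result
    return context
```

The interesting product questions live in `ask_human`: how the pause is surfaced, how feedback is captured, and how a long-running workflow resumes afterwards.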
The role of security and platform engineering
The role of both security and platform engineering isn’t changing much; if anything, it is getting more important.
For security, it is currently an exciting and evolving time, as the industry hasn’t figured this stuff out yet. There are completely new attack vectors, very similar to the early days when most web applications were vulnerable to SQL injection due to the lack of industry-wide solutions and best practices. Think of it as another cycle of “vulnerable PHP websites”, where exploits are all over the place for pentesters to discover and bad actors to abuse. Also see Your AI coding agent is a spy.
For platform engineering, every future service is gonna have a bit of “AI” in it; it’s not that much different, just with some changing characteristics of services. We continue to provide support in the outer loop, as my great team lead Robert Lin likes to say.
My plan to stay on top of the trend
It’s just my wild dream, but I feel my blood pumping every time I think about it. I am experimenting with building an AI agentforce with workflow orchestration to ship more for me, way beyond the number of hours I can sit at my keyboard. Ultimately, I need “dev” help to keep maintaining and creating more side projects while I just keep brainstorming on my phone while babysitting. I see the light with AI agents.
This is nothing really new architecture-wise, because it’s the decades-old workflow engine technology, now with the ability to “shell out” to AI agents in addition to good old APIs to make more things possible.
I think it’s good for me for a couple of very good reasons:
- Getting my hands deep in the mud by designing, implementing, and operating a workflow engine. A very good learning experience.
- Having the perfect excuse to trial-and-error all possible AI agents on the market with actual use cases, not just demo-grade “wow” moments, or even to make some myself, to keep on top of the trend and everything happening around it.
- Chasing a dream to unlock 100x shipping throughput for my side projects.
- Maybe some at work, too. I really want an agent that monitors all of the Slack channels and prompts me about threads I might be interested in. Keyword matching sucks!
Unsolicited advice for new professionals
The beginning of my middle school history book goes like this: “what separates humans from other species is the ability to find, use, improve, and create tools.” People can argue over the specifics with no real value added to this conversation, but I think it is generally true. AI agents are a tool that sets a new ground-level expectation of productivity, just like all their predecessors. People who can’t adapt to the new productivity expectation will be replaced, or will work for a lower margin than ever before. What mattered previously continues to matter now, if not more: problem-solving skills, critical thinking, and the desire and ability to just figure things out. Use the tools that are available to you, and use them often to know them well.
Outro
Thank you for reading to the end, and I wish you a nice day (even if you just scrolled to the bottom and did not read at all)!