A Practical Guide to AI Programming: From Traditional Coding to AI Collaboration

Over the past six months, I have personally built and delivered several projects, covering everything from requirements clarification, implementation, testing, and deployment to production fixes and refactoring afterward. Most of the concrete coding work was handed over to AI.

These included both business projects and internal tools. What I went through was not as simple as "building a page". It was the full chain of requirement changes, API integration, testing, deployment, incidents, rollback, and continued iteration. After actually doing this work, one thing has become increasingly clear to me: the biggest change brought by AI programming is not that "code generation is faster", but that the focus of an engineer's work is shifting.

Many things that previously required personally writing line after line of code can now be handed to an Agent. At the same time, however, requirement decomposition, context organization, process constraints, and result verification have become more important than before. You are no longer just writing code. You are designing an engineering system that allows AI to work continuously and reliably.

This article is not trying to prove that "humans no longer need to write code", nor is it trying to describe AI as an omnipotent black box. It is more of a practical interim summary: once AI truly enters the everyday development workflow, how should engineers reorganize the way they work, where is investment worthwhile, and where are the easiest traps to fall into?

Redefining the Engineer's Role

At first, I thought the core of AI programming was whether the prompts were good enough. Later I realized that was only the surface.

What really affects output is often not a single prompt itself, but whether you have thought the work through clearly:

Whether the requirements have been broken down clearly enough
Whether the constraints and boundaries have been written clearly
Whether the project structure and rules are understandable to the Agent
Whether there is a complete verification loop before delivery

So the core work of engineers is shifting from "directly implementing features" to "designing task systems".

In the past, what we cared about most was: how to write this piece of code, how to call this API, how to fix this bug. Now the more common questions have become:

How should this task be broken down?
Which constraints must be established up front?
What context does the Agent need?
Which information should the Agent obtain automatically instead of being supplemented through chat?
What tools and processes are needed as safeguards?

This is also why many people are misled by their first impressions: when they first use AI, it feels amazing, but once the task becomes complex, things start to spin out of control. It is not that the model suddenly stops working. It is that engineering problems have not disappeared; they have merely changed from "implementation problems" into "organization problems".

In other words, engineers are increasingly becoming system designers. Code is still important, but it is no longer the only main battlefield. What you need to design is an entire environment that makes it hard for AI to go off track. How context is organized, where documentation lives, when processes are triggered, and how results are verified all determine whether AI can ultimately deliver reliably.

Using Agents to Review Agents

Once AI starts producing large amounts of code, a new bottleneck quickly appears: human review cannot keep up.

Previously, a team's daily code increment was relatively manageable, and humans could still barely handle reviewing diffs. Now, if an Agent can generate several files, dozens of functions, and modify tests and documentation along the way, the pressure on human review suddenly becomes much greater. It is not that humans cannot review; it is that human attention is a scarce resource and cannot be scaled infinitely.

So later I built an automated review process, where an Agent performs the first round of checks before submission, and then another Agent performs cross-review. Humans still retain final decision-making authority, but no longer have to handle all initial screening work.

The value of this is not just "saving time". More importantly, it turns review from something that depends on sheer attention into a process that can run at scale. Some issues that humans easily miss, such as boundary branches, exception paths, naming, and logical consistency, are actually easier to expose early in Agent review.

Of course, there is a very important boundary here: I never treat Agent review as a reason to "replace human review". It is more like a quality filter placed earlier in the process. Problems that can first be filtered out by machines should not all be left for humans to bear at the end.

If the code is written by an Agent, then review should also be handled by an Agent as much as possible in the first round. This is not replacing humans. It is reserving human attention for more critical judgments, such as whether the business logic is correct, whether the architectural direction is reasonable, and whether long-term technical debt has been introduced.
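
The two-round flow described above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: the names `Finding`, `self_check`, `cross_review`, and `run_review` are hypothetical, and the two reviewers are plain string checks standing in for real Agent calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Finding:
    reviewer: str
    severity: str  # "blocker" or "note"
    message: str


# A reviewer is any callable that takes a diff and returns findings.
# In a real setup each one would wrap an Agent call (assumption);
# here they are simple string checks so the sketch runs on its own.
Reviewer = Callable[[str], List[Finding]]


def self_check(diff: str) -> List[Finding]:
    """Hypothetical first round: the authoring Agent checks its own diff."""
    findings = []
    if "TODO" in diff:
        findings.append(Finding("self_check", "note", "unresolved TODO in diff"))
    return findings


def cross_review(diff: str) -> List[Finding]:
    """Hypothetical second round: a different Agent reviews the same diff."""
    findings = []
    if "except:" in diff:
        findings.append(
            Finding("cross_review", "blocker", "bare except swallows errors")
        )
    return findings


def run_review(diff: str, reviewers: List[Reviewer]) -> dict:
    """Run every reviewer in order and summarize the outcome for a human."""
    findings = [f for review in reviewers for f in review(diff)]
    blockers = [f for f in findings if f.severity == "blocker"]
    return {
        "findings": findings,
        "needs_fix": bool(blockers),
        # Humans keep the final decision; this flag only says the
        # machine-filterable problems have been cleared.
        "ready_for_human": not blockers,
    }
```

The point of the shape is that the human reviewer receives a diff plus a list of findings, not a raw diff alone.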

Making Applications Readable to Agents

Once the bottleneck of code generation decreases, another more realistic problem appears: verification.

Many teams think the hard part of AI programming is "writing". In reality, the truly difficult part is "confirming that what it wrote is correct". If the application's runtime results, logs, monitoring metrics, and error messages can only be inspected manually, then the whole chain will still get stuck at human QA.

So one thing I later paid special attention to was making applications readable to Agents.

"Readable" here does not mean rendering a page for the model to glance at. It means allowing the Agent to systematically read and understand feedback after the application runs, such as:

Whether the UI state has changed
Whether the console has errors
Whether API responses match expectations
Whether there are exception stacks in the logs
Whether key metrics show obvious abnormalities

Once this information can be reliably obtained by the Agent, verification can move from "a human staring at the page and clicking around" to "an Agent independently performing a round of self-testing and troubleshooting". This does not mean it can replace all testing, but it does mean that many repetitive verification tasks finally have a foundation for automation.
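
As a rough sketch of what "readable" could mean in practice, the snippet below gathers a few of the signals listed above into one structured report. Everything here is an assumption made for illustration: the `RunFeedback` structure, the input shapes (console lines as strings, API calls as dictionaries with `url`, `status`, and `expected_status`), and the Python-style `Traceback` marker used to spot exception stacks in logs.

```python
import json
import re
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class RunFeedback:
    console_errors: List[str] = field(default_factory=list)
    api_mismatches: List[str] = field(default_factory=list)
    log_exceptions: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        # The run counts as clean only if every signal is empty.
        return not (self.console_errors or self.api_mismatches or self.log_exceptions)


def collect_feedback(
    console_lines: List[str], api_calls: List[Dict], log_text: str
) -> RunFeedback:
    """Turn raw runtime output into signals an Agent can read directly."""
    fb = RunFeedback()
    fb.console_errors = [
        line for line in console_lines if line.lower().startswith("error")
    ]
    for call in api_calls:
        if call["status"] != call["expected_status"]:
            fb.api_mismatches.append(
                f'{call["url"]}: got {call["status"]}, expected {call["expected_status"]}'
            )
    # Assumes Python-style tracebacks; other stacks need another pattern.
    fb.log_exceptions = re.findall(r"^Traceback.*$", log_text, flags=re.MULTILINE)
    return fb


def to_agent_report(fb: RunFeedback) -> str:
    """Serialize the feedback as JSON so it can be handed back to the Agent."""
    return json.dumps({"ok": fb.ok, **asdict(fb)}, indent=2)
```

The design choice is to return facts (what failed, where) rather than a pass/fail flag alone, so the Agent has something to troubleshoot against.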

Many times, what really limits AI is not that it "cannot write code", but that it "cannot see the consequences after writing it". So making code writable by AI solves only half the problem. Only by making applications readable to AI do we have a chance to connect development, verification, and troubleshooting.

AGENTS.md Is a Directory, Not a Manual

Context management is the key to moving AI programming from "it works" to "it works well".

One very typical trap we fell into early on was trying to stuff all rules into one huge AGENTS.md. We soon found that once the project became even slightly complex, this approach would almost certainly fail.

The reasons are simple:

The context window is a scarce resource, and large files crowd out truly important information
If everything is written in, nothing has focus
Large, all-encompassing rule files rot extremely quickly and are very costly to maintain
A single giant file makes freshness checks, ownership management, and automated validation difficult

The problem is not merely that it is "too long". It gives the team the false impression that putting everything in one place will naturally make the Agent smarter. In reality, the opposite is true. As more and more information piles up, the Agent often ends up doing local pattern matching rather than understanding the system structure.

Later, we changed our approach: treating AGENTS.md as a directory, not a manual.

It does only three things:

Tells the Agent the general structure of the repository
Points out the locations of core entry points and key standards
Guides the Agent to read the content that should actually be maintained in docs/ and the code repository

The benefits are direct. First, AGENTS.md itself can remain concise and is far less likely to rot over time. Second, real knowledge can live where it belongs and be maintained by the people closest to the business. Third, the Agent receives a map, not a clump of rule text pasted together.

What you give the Agent should be a map, not a 1,000-page manual. Knowledge that remains effective over the long term should settle into standards documents, design documents, API documents, and the code itself, rather than relying on a single entry file to cover everything.
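
One way to keep a directory-style AGENTS.md honest is a small automated check. The sketch below is only an illustration under stated assumptions: the 120-line budget is an arbitrary number, and it assumes paths in AGENTS.md are written in backticks relative to the repository root (for example `docs/api.md`). The function name `check_agents_md` is hypothetical.

```python
import re
from pathlib import Path
from typing import List

MAX_LINES = 120  # hypothetical budget; pick whatever fits the project

# Matches backticked relative paths containing a slash, e.g. `docs/api.md`.
PATH_PATTERN = re.compile(r"`([\w./-]+/[\w./-]*)`")


def check_agents_md(repo_root: Path, max_lines: int = MAX_LINES) -> List[str]:
    """Return a list of problems; an empty list means the file looks healthy."""
    agents_file = repo_root / "AGENTS.md"
    if not agents_file.is_file():
        return ["AGENTS.md is missing"]

    problems = []
    text = agents_file.read_text(encoding="utf-8")

    # A directory should stay short; a growing file is an early warning.
    line_count = len(text.splitlines())
    if line_count > max_lines:
        problems.append(f"AGENTS.md has {line_count} lines (budget {max_lines})")

    # Every path the map points to should still exist in the repository.
    for ref in sorted(set(PATH_PATTERN.findall(text))):
        if not (repo_root / ref).exists():
            problems.append(f"dangling reference: {ref}")
    return problems
```

Run in CI, a check like this turns "the map is up to date" from a hope into something that fails loudly when a document is moved or deleted.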

Why AI Needs a Toolchain, Not Just Prompts

When many people first start using AI programming, the most natural approach is to "rely on prompts". This is completely fine for individual work, small tasks, and short workflows.

But as soon as tasks become complex or enter team collaboration, relying on prompts alone quickly hits a ceiling.

The reason is that prompts are essentially one-off, while engineering requires capabilities that are reusable, collaborative, and governable. Just because you made things clear in one conversation today does not mean another person, another repository, or another Agent can reproduce the same quality tomorrow.

I later became increasingly certain of one thing: if AI is to truly participate in team development, capabilities must be upgraded from "individual improvisation" to a "team toolchain".

A mature toolchain needs to solve at least the following problems:

Allow Agents to connect to external systems instead of only seeing the current conversation
Allow team constraints to be repeatedly enforced instead of relying on verbal reminders
Allow high-frequency processes to be triggered automatically instead of manually each time
Allow effective methodologies to be accumulated, reused, and distributed

For this reason, I later spent less and less attention on "how to write a longer prompt" and more on "how to organize capabilities". Prompts are still important, but they are more like on-the-spot commands. What truly determines the upper limit is whether there is reusable infrastructure underneath.

From Plugin to Marketplace: Turning Team Capabilities into Reusable Assets

To bring these capabilities into team collaboration, I am more accustomed to understanding the whole system from the perspective of a Plugin.

You can think of a Plugin as a packaging unit: it is not a single tool, but a team capability package that combines connectivity, constraints, automation, and methodology into something installable, upgradable, and reviewable.

From this perspective, a complete Plugin usually contains four layers:

MCP: solves the connection problem, allowing the Agent to read systems, documents, and environments
Rules: solves the constraint problem, allowing team standards to be enforced uniformly
Hooks: solves the trigger problem, allowing processes to happen automatically at the right time
Skill: solves the method problem, turning "how this should be done" into reusable workflows

Among these four layers, I think the most critical are MCP and Skill.

MCP determines whether the Agent can touch the real world. Without connections, the Agent can only remain in "chat-style development". No matter how decent what it writes looks, it can easily detach from the real context.

Skill determines whether the Agent knows how to proceed. With connections but no method, the Agent is still stuck in a state of "knowing how to use tools, but having an unstable process". It knows where the documents are, where the tasks are, and where the logs are, but it does not know in what order it should act after obtaining that information.

Rules and Hooks are also important. The former solidifies constraints, while the latter embeds these capabilities into actual workflows. Without constraints, team style drifts. Without triggers, many good processes eventually fall back into "knowing but not doing".
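
To make the four layers concrete, here is a purely illustrative data structure. It is not the manifest format of any real tool; the `Plugin` class, its field names, and the example values are all hypothetical and exist only to show how the layers sit side by side in one package.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Plugin:
    """Hypothetical capability package combining the four layers."""

    name: str
    version: str
    # MCP: connections, as server name -> launch command (assumed shape).
    mcp_servers: Dict[str, str] = field(default_factory=dict)
    # Rules: paths to rule files that encode team standards.
    rules: List[str] = field(default_factory=list)
    # Hooks: event name -> command to run when the event fires.
    hooks: Dict[str, str] = field(default_factory=dict)
    # Skills: paths to reusable workflow definitions.
    skills: List[str] = field(default_factory=list)

    def missing_layers(self) -> List[str]:
        """List the layers this package does not cover yet."""
        layers = {
            "mcp": self.mcp_servers,
            "rules": self.rules,
            "hooks": self.hooks,
            "skills": self.skills,
        }
        return [layer for layer, content in layers.items() if not content]


# Example: a package that has connections and methods, but no
# constraints or triggers yet.
backend = Plugin(
    name="team-backend",
    version="0.1.0",
    mcp_servers={"issues": "issue-tracker-mcp"},
    skills=["skills/plan-then-implement"],
)
```

Here `backend.missing_layers()` returns `["rules", "hooks"]`, which is exactly the "connections and method, but no constraints or triggers" situation described above.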

One step further up is the Marketplace.

If Plugin solves "packaging capabilities", then Marketplace solves "distributing capabilities". Its significance is not introducing yet another new concept, but allowing capabilities already accumulated by the team to be installed, reused, and governed at scale. Only at this point does the AI toolchain stop being a private toolbox for a few skilled users and truly enter the level of organizational capability.

Case Study: Using an Open-Source Skill Framework to Carry Engineering Methods

In addition to accumulating some project-specific capabilities myself, I have also long used open-source Skill frameworks to carry engineering methods. Superpowers is a typical example. It has received considerable attention in the GitHub community and has already accumulated a relatively complete Skill system.

The reason I like it is not that it has many Skills, but that it turns many "good engineering practices" from verbal advice into explicit workflows.

For example:

Thinking through requirements and design before starting work
Breaking the design into executable small tasks
Advancing implementation in an isolated environment to reduce the risk of polluting the main workspace
Performing verification after writing instead of verbally declaring "done"
Using multiple Agents to divide work in scenarios suitable for parallelization

These things were not impossible before. They were just highly dependent on individual discipline. Experienced engineers know they should work this way, but when things get busy or a team grows, processes can easily degrade. This is where Skill frameworks are valuable: they turn these behaviors from "suggestions" into "default actions".

The real value of this kind of framework is not "how many techniques it provides", but that it helps turn engineering habits that originally depended on team experience and discipline into a system that can be reused reliably.

It also reminded me of one thing: AI tools themselves are easily over-mythologized, but what often creates the real gap is not the model, but whether you have distilled your working methods. Models can become stronger and stronger, but if the methodology still depends entirely on improvisation, quality will ultimately drift.

A Complete AI Development Loop

If we connect the capabilities above, I now prefer to understand AI development as a complete loop rather than a single-point efficiency tool.

A relatively stable process usually looks like this:

First think through the problem and clarify the goal, constraints, and boundaries
Break the task into smaller execution units
Let the Agent complete the implementation in an isolated environment
Connect external context such as documentation, tasks, logs, or APIs
Use tests, review, logs, and runtime results for verification
Finally decide whether to merge, release, or continue iterating

It is more like a circulatory system than "throwing requirements at AI and waiting for it to spit out some code".
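
The loop can be sketched in a few lines. This is a simplified illustration under assumptions: `implement` and `verify` are hypothetical callables standing in for an Agent run and a verification step, and the three-round limit is arbitrary.

```python
from typing import Callable, List, Tuple


def run_loop(
    task: str,
    implement: Callable[[str, List[str]], str],
    verify: Callable[[str], List[str]],
    max_rounds: int = 3,
) -> Tuple[str, str, List[str]]:
    """Alternate implementation and verification until checks pass.

    `implement` receives the task plus the failures from the previous
    round; `verify` returns a list of failures (empty means clean).
    """
    feedback: List[str] = []
    result = ""
    for _ in range(max_rounds):
        result = implement(task, feedback)
        feedback = verify(result)
        if not feedback:
            # Verification passed; merge or release is still a human call.
            return "ready_for_human_decision", result, []
    # The loop did not converge; hand the remaining failures to a human.
    return "needs_human_help", result, feedback
```

Two details carry the idea: verification failures flow back into the next implementation round, and the loop ends in a human decision either way.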

The direct benefits of this loop are not only greater speed, but also greater stability:

During development, missing context and constraints are easier to discover
During integration, external information is easier to connect instead of being manually moved around
During self-testing, Agents can take on more repetitive verification work
During delivery, documentation, status, and change notes can be wrapped up together more easily

Truly sustainable efficiency comes from the loop, not from one impressive generation result.

More importantly, this loop reshapes the way engineers work in return. You will spend less and less time on mechanical implementation, and more and more time designing boundaries, organizing information, checking risks, and validating results. This does not mean the work has decreased. It means the work has changed from "manual coding labor" into "engineering system design".

Pitfall Guide for AI Programming

Finally, here are several traps I think are most worth warning about in advance.

1. If You Do Not Set Rules at the Start, Everything Later Becomes Debt

The initial patterns of a project will be quickly copied by AI. If the structure, naming, constraints, and directories of the first few thousand lines of code are unstable, later output will usually only amplify those problems. AI is an amplifier. Good patterns are amplified, and bad patterns are amplified as well.

2. Long Conversations Are Not Assets; They Are Often Hidden Liabilities

After roughly a dozen to twenty rounds, an Agent can easily start contradicting earlier decisions. It is not that it suddenly became stupid; it is that the context has already drifted. For large tasks, you should proactively start a new session and explicitly reference previously confirmed conclusions. A lot of rework does not happen because the model is bad, but because people mistake "continuous conversation" for "continuous consistency".
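
One lightweight way to carry confirmed conclusions into a fresh session is to keep them as data and render a short brief. The sketch below assumes a hypothetical helper, `build_session_brief`, and an arbitrary plain-text layout; no particular tool requires this format.

```python
from typing import List


def build_session_brief(
    task: str, decisions: List[str], open_questions: List[str]
) -> str:
    """Render a short opening message for a new Agent session."""
    lines = [f"Task: {task}", "", "Confirmed decisions (do not revisit):"]
    lines += [f"- {d}" for d in decisions] or ["- (none yet)"]
    lines += ["", "Open questions:"]
    lines += [f"- {q}" for q in open_questions] or ["- (none)"]
    return "\n".join(lines)


brief = build_session_brief(
    task="Add CSV export to the report page",
    decisions=["Store all timestamps in UTC", "Reuse the existing export service"],
    open_questions=["Should exports be paginated?"],
)
```

Pasting `brief` as the first message of the new session keeps the earlier decisions explicit instead of leaving them buried in a long transcript.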

3. "Completed" Does Not Mean Truly Completed

AI is very good at giving an answer that looks complete, but that does not mean the code has actually run successfully and been fully verified. Boundary cases, exception paths, timeout handling, and concurrency issues still require explicit verification. Once you start routinely believing "it says it is done", you will very quickly carry bugs into later stages.
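
A simple guard is to accept "done" only when verification commands actually exit successfully. The following is a minimal sketch; the function name `verify_done` and the 600-second default timeout are assumptions, and which commands to run (tests, linters, type checks) depends on the project.

```python
import subprocess
from typing import List, Sequence


def verify_done(
    commands: Sequence[Sequence[str]], timeout_s: int = 600
) -> List[str]:
    """Run each verification command; return a description of every failure.

    An empty list means every command exited with code 0.
    """
    failures = []
    for cmd in commands:
        label = " ".join(cmd)
        try:
            proc = subprocess.run(
                list(cmd), capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            failures.append(f"{label}: timed out after {timeout_s}s")
            continue
        except OSError as exc:
            # For example, the executable does not exist on this machine.
            failures.append(f"{label}: could not start ({exc})")
            continue
        if proc.returncode != 0:
            failures.append(f"{label}: exit code {proc.returncode}")
    return failures
```

For example, `verify_done([["pytest", "-q"]])` would report a failure if the test suite fails or if `pytest` is not installed, regardless of what the Agent claimed in chat.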

4. Critical Logic Must Be Reviewed in the Diff

Some issues may not be immediately visible in functional tests, but they will be very obvious in the diff. The more critical the change, the less you can afford to completely skip change review. Especially in areas such as data fields, rollback logic, permission checks, and time handling, AI can easily make a change that looks reasonable but is actually dangerous.

5. Do Not Make the Architecture Too Fancy

Complex multi-Agent orchestration, too many role definitions, and overly heavy meta-rule systems may look powerful, but their maintenance costs are often very high. Most of the time, simple, clear, composable capability design is more reliable. If something can be solved with one Skill, do not start by introducing a flashy central control system.

6. Just Because One Person Can Make It Work Does Not Mean a Team Can

Individual efficiency and team efficiency are not the same thing. The truly difficult part is getting everyone to share the same standards, tools, and processes, and keeping them updated as the project evolves. Problems that one person can cover with experience are often amplified in team collaboration.

7. Not Every Problem Needs an LLM

Use deterministic programs to solve deterministic problems. Reserve AI for understanding, generation, summarization, and planning, where it is truly needed, rather than routing every operation through an LLM. Many "AI-ified" solutions end up costly and ineffective precisely because processes that should have been hard-coded were forcibly pushed into model reasoning.
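
As a small illustration of "deterministic first", the sketch below parses a due date with the standard library and only falls back to a model when the input is not in ISO format. The function `parse_due_date` and the `llm_fallback` parameter are hypothetical; the fallback is any callable you supply, not a real library API.

```python
from datetime import date
from typing import Callable, Optional


def parse_due_date(
    text: str,
    llm_fallback: Optional[Callable[[str], Optional[date]]] = None,
) -> Optional[date]:
    """Parse a date deterministically; consult a model only if that fails."""
    try:
        # Deterministic path: free, fast, and always gives the same answer.
        return date.fromisoformat(text.strip())
    except ValueError:
        pass
    # Non-deterministic path: used only for free-form input, if provided.
    if llm_fallback is not None:
        return llm_fallback(text)
    return None
```

Input such as `"2024-05-01"` never reaches the model at all; only something like `"next Friday"` would, and only when a fallback has been wired in.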

Conclusion: AI Programming Is Not About Making You Write Code Faster

Looking back over the past six months, I have become increasingly certain of one thing: the most important change in AI programming is not making you "type code faster", but forcing you to put more energy back into the parts of engineering that truly matter.

That is, these three things:

Designing context
Designing constraints
Designing verification

These things were important in the era of handwritten code, and they only become more important after AI participates in development.

Technology is accelerating, but the essence of engineering has not changed. Requirements must be clear, boundaries must be explicit, structures must be clean, and verification must be solid. What AI amplifies is the capability of the entire system. So what you ultimately get back is not merely "more code", but an amplified version of the engineering method you designed.

If I had to summarize my current understanding of AI programming in one sentence, it would be this: it does not turn engineers into "faster programmers"; it pushes engineers toward becoming "stronger system designers". Whoever can organize context, processes, constraints, and verification better will have a better chance of truly using AI for productivity, rather than stopping at demos and short-term amazement.
