Contributing to llm-d
Contributing Guidelines
Thank you for your interest in contributing to llm-d. Community involvement is highly valued and crucial for the project's growth and success. The llm-d project accepts contributions via GitHub pull requests. This document outlines the process to help get your contribution accepted.
To ensure a clear direction and cohesive vision for the project, the project leads have the final decision on all contributions. However, these guidelines outline how you can contribute effectively to llm-d.
How You Can Contribute
There are several ways you can contribute to llm-d:
- Reporting Issues: Help us identify and fix bugs by reporting them clearly and concisely.
- Suggesting Features: Share your ideas for new features or improvements.
- Improving Documentation: Help make the project more accessible by enhancing the documentation.
- Submitting Code Contributions (with consideration): While the project leads maintain final say, code contributions that align with the project's vision are always welcome.
Code of Conduct
This project adheres to the llm-d Code of Conduct and Covenant. By participating, you are expected to uphold this code.
Community and Communication
- Developer Slack: Join our developer Slack workspace to connect with the core maintainers and other contributors, ask questions, and participate in discussions.
- Project Meetings: Project updates, ongoing work discussions, and Q&A are covered in our project meeting every other Wednesday at 12:30 PM ET. Please join by adding the shared calendar. You can also join our Google Group for access to shared diagrams and other content.
  - Access meeting recordings and meeting notes on the llm-d Public Google Drive
- SIGs: Join one of our Special Interest Groups (SIGs) to contribute to specific areas of the project and collaborate with domain experts.
  - Access SIG meeting recordings and project documentation on the Public SIG Documentation Google Drive
- Code: Hosted in the llm-d GitHub organization
- Issues: Project-scoped bugs or issues should be reported in llm-d/llm-d
- Mailing List: llm-d-contributors@googlegroups.com for document sharing and collaboration
- Social Media: Follow us on social media for the latest news, announcements, and updates:
  - X: https://x.com/_llm_d_
  - LinkedIn: https://linkedin.com/company/llm-d
  - Reddit: https://www.reddit.com/r/llm_d/
  - YouTube: @llm-d-project
Contributing Process
We follow a lazy consensus approach: changes proposed by people with responsibility for a problem, without disagreement from others, within a bounded time window of review by their peers, should be accepted.
Types of Contributions
1. Features with Public APIs or New Components
All features involving public APIs, behavior between core components, or new core repositories/subsystems must be accompanied by an approved project proposal.
Process:
- Create a pull request adding a markdown file under `./docs/proposals` with a descriptive name (e.g., `docs/proposals/disaggregated_serving.md`)
- Use the template at `./docs/proposals/PROPOSAL_TEMPLATE.md` with these sections:
  - Summary: A sentence or two suitable for any contributor or any user to understand the change proposed and the outcome
  - Motivation: Problem to be solved, including Goals/Non-Goals, and any necessary background
  - Proposal: Can include User Stories ("As a User I want to X"); should have enough detail that reviewers can understand exactly what you're proposing, but should not include things like API designs or implementation. What is the desired outcome and how do we measure success?
  - Design Details: Should contain enough information that the specifics of your change are understandable. This may include API specs (though not always required) or even code snippets. If there's any ambiguity about HOW your proposal will be implemented, this is the place to discuss it.
  - Alternatives: Provide alternative implementations/proposals and a short summary of why they were rejected
- Get review from impacted component maintainers
- Get approval from project maintainers
The proposal must be reviewed by the impacted component maintainers and approved by project maintainers. Proposal review should enforce overall principles and ensure consistency and coherence of the project. Approval of a proposal should reflect lazy consensus that the proposal is the right path, and the proposal should have high priority for review.
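As an illustration, a proposal file following the sections above might be skeletonized like this (the feature name and filler text are placeholders, not a real proposal):

```markdown
# Proposal: KV-Cache Telemetry (placeholder name)

## Summary
One or two sentences any contributor or user can understand.

## Motivation
The problem to be solved, with Goals / Non-Goals and necessary background.

## Proposal
User stories ("As a User I want to X"), the desired outcome, and how success is measured.

## Design Details
Enough specifics (API specs, code snippets where helpful) that HOW is unambiguous.

## Alternatives
Rejected approaches and a short summary of why.
```

Check `./docs/proposals/PROPOSAL_TEMPLATE.md` for the authoritative section list before submitting.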
2. Fixes, Issues, and Bugs
For changes that fix broken code or add small changes within a component:
- All bugs and commits must have a clear description of the bug, how to reproduce, and how the change is made
- Any other changes can be proposed in a pull request to a component or as an issue in llm-d/llm-d; a maintainer must approve the change (within the spirit of the component design and scope of change)
- A good way to draw attention to moderate-size changes is to create an RFC issue in GitHub, then engage in Slack
- Within components, use project proposals when the scope of change is large or the impact to users is high
Before You Write Code
AI tools have lowered the cost of opening a PR without lowering the cost of reviewing or maintaining one. The guidance below keeps contributions proportional to the project's capacity to support them.
Contributions that Benefit from Discussion First
These are not requirements, but contributions in the following categories tend to land more smoothly when the approach is agreed on first. Opening an issue or proposing in the appropriate SIG channel is usually faster than writing a PR and iterating.
- New features. Even features that do not rise to a project proposal (no new public API or component) benefit from a brief issue describing the problem and the proposed approach before implementation.
- New testing methodologies. Fuzzing, property-based testing, chaos testing, load testing, or other testing approaches that introduce a new class of ongoing maintenance (new CI jobs, curated inputs, triage, release-gating policy). See the worked example below.
- New external dependencies. Require maintainer sign-off.
- Renames or other API-affecting changes. See API Changes and Deprecation.
Every contribution creates ongoing cost: review time, CI time, flake triage, and future maintenance. A good problem statement captures that cost alongside the benefit, which is what an issue or proposal makes visible before code is written.
Worked Example: A New Testing Methodology
Adding a new testing framework (fuzzing, property-based, load, chaos) improves robustness on paper. In practice, the PR itself consumes significant review cycles to align on scope and ownership, and once merged it introduces a new CI job, inputs to curate, a new category of flake to triage, and an implicit policy call about whether findings block releases. An issue first absorbs the alignment work at a fraction of the cost.
Before opening a PR, please raise an issue or proposal covering:
- What robustness gap is this closing? (A reported bug, a history of failures in this area, a security concern.)
- What components are in scope?
- Who owns the inputs, CI job, and triage?
- What is the gating policy for findings from this methodology?
Once those are settled, the implementation PR is usually straightforward.
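A sketch of what such an issue could look like, mirroring the four questions above (the headings and examples are suggestions, not a required template):

```markdown
<!-- RFC: add fuzzing to the request parser (illustrative placeholder) -->
## Robustness gap
e.g., a history of crashes in request parsing (link the bug reports).

## Components in scope
e.g., a single component repository, not the whole org.

## Ownership
Who curates the inputs, owns the CI job, and triages findings.

## Gating policy
Do findings block releases, or only file issues?
```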
Contributions We May Decline
Examples of patterns maintainers may close or redirect:
- Speculative hardening. Guards or error handling for conditions that cannot occur given current code invariants. If the condition can actually occur, please open an issue with a reproducer instead.
- Defensive abstractions without a caller. New interfaces, factory indirection, or generic wrappers introduced for anticipated future use. Maintainers may ask you to defer these until there is a concrete caller.
AI-Assisted Contributions
AI-assisted contributions are welcome under the same standards as any other contribution. The human submitter is the author of record and must be able to defend the change on substance.
Feature Testing
The first key step in testing a feature or bugfix is to identify which layer of the stack you are testing. Here are some test cases:
Deployment related changes
- Swapping GIE helm chart version and `llm-d-inference-scheduler` image upgrades
  - Check `llm-d EPP` container logs
  - Check that your `InferencePool` exists (`kubectl get InferencePool.inference.networking.k8s.io`)
- Upgrading Infra helm chart or anything affecting Gateway infrastructure
  - Check the `gateway` object (`kubectl get gateway -o yaml`)
    - Check the `status` section; make sure it has an `address` and that there is a message saying `"Resource programmed, assigned to service(s) <gateway_service_address>"`
    - Check that the `parametersRef` for the `gateway` infrastructure exists (`kubectl get gateway wide-ep-inference-gateway -o yaml | yq .spec.infrastructure.parametersRef`, and then check to ensure that resource itself exists)
  - If using `istio`, also check that your `DestinationRule` exists
  - Check the `httpRoute` object's `status` section (`kubectl get httpRoute -o yaml | yq '.status.parents[]'`)
    - Ensure there is a message in the conditions stating `"Route was valid"`
    - Ensure there is a parent ref on the `httpRoute` pointing to the `gateway` (i.e., the `httpRoute` is attached properly to the `gateway`)
- Modelservice helm chart upgrades
  - Ensure `vLLM` pods come up (`prefill` and `decode`)
  - Ensure `podmonitors` are deployed if metrics are enabled
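The gateway checks above can be collected into one small script. This is a sketch only: it assumes `kubectl` and `yq` are installed and that the gateway is named `wide-ep-inference-gateway` as in the example above, so it is written to a file here rather than executed (running it requires a live cluster).

```shell
# Sketch: consolidates the gateway checks from this section into a script.
# Resource names follow the examples in this guide; adjust for your deployment.
cat > check-gateway.sh <<'EOS'
#!/usr/bin/env bash
set -euo pipefail
# Gateway status should carry an address and the "Resource programmed" message.
kubectl get gateway -o yaml | yq '.items[].status'
# The infrastructure parametersRef should point at a resource that exists.
kubectl get gateway wide-ep-inference-gateway -o yaml \
  | yq '.spec.infrastructure.parametersRef'
# Every httpRoute parent should report "Route was valid".
kubectl get httproute -o yaml | yq '.status.parents[]'
EOS
bash -n check-gateway.sh   # syntax check only; does not contact a cluster
```

If any check comes back empty, work upward: a missing `parametersRef` target usually explains a gateway that never gets an `address`.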
Container Image Build Changes and Upgrades
- Kernel upgrades and changes (`deepep`, `deepgemm`); ignore `flash-infer`
  - To test these, ensure you use the proper vLLM backend via the `--all2all-backend` vLLM CLI arg
    - For single-node testing without extra dependencies, use `allgather_reducescatter` (the default)
    - For testing the deepseek kernels, you can set the `prefill` backend to `deepep_high_throughput` and the `decode` backend to `deepep_low_latency`
      - This needs to be tested in either `pd-disaggregation` or, better yet, `wide-ep-lws`
- `UCX` + `NIXL` version bumps and changes
  - This can be tested in `pd-disaggregation` or `wide-ep-lws`
  - Currently we build `UCX` from source, and then build `NIXL` against our build of `UCX`
- `LMCache` version bumps and changes (coming soon)
  - Currently nothing uses the `LMCache` codepath directly; this will come as a subset of the KVCache offloading epic
- `vLLM` version bumps and changes
  - By default we build `vLLM` with precompiled binaries from the upstream vLLM wheels index
  - This can be tested in any example
- `EFA`
  - To test the libfabric plugin itself over NIXL, you can do the following inside a container image built with EFA support (does not require GPUs or EFA):
    ```bash
    export NIXL_LOG_LEVEL=debug
    python3 - <<'EOF'
    from nixl._api import nixl_agent, nixl_agent_config
    agent_config = nixl_agent_config(backends=["LIBFABRIC"])
    nixl_agent1 = nixl_agent("target", agent_config)
    EOF
    ```
  - To test actual inference over EFA in AWS with P5+ instances, ensure that `UCX_TLS` includes an option with high priority for accelerating over EFA via an env var:

    ```yaml
    - name: UCX_TLS
      value: "efa,sockcm,sm,self,cuda_copy,cuda_ipc"
    ```
  - Ensure that the containers request an instance of the EFA resource:

    ```yaml
    requests:
      vpc.amazonaws.com/efa: 1
    ```
Container Image Checklist
- `llm-d Router` guide
- `precise-kv-cache-aware` example
- `pd-disaggregation` example (also covers deepseek kernels)
- `wide-ep-lws` example (also covers deepseek kernels)
- a `guidellm` benchmark to do a load test for performance regressions (any example)
- run `pd-disaggregation` or `wide-ep-lws` with deepseek kernels (for `prefill` set `--all2all-backend` to `deepep_high_throughput` and for `decode` set `--all2all-backend` to `deepep_low_latency`)
Code Review Requirements
- All code changes must be submitted as pull requests (no direct pushes)
- All changes must be reviewed and approved by a maintainer other than the author
- All repositories must gate merges on compilation and passing tests
- All experimental features must be off by default and require explicit opt-in
Commit and Pull Request Style
- Pull requests should describe the problem succinctly
- Descriptions should accurately reflect what the diff does
- PR ownership: the submitting contributor is the author of record and should be able to explain the code, justify design choices, and respond to review on substance
- Scope discipline: keep changes sized to the stated problem; large, wide-ranging diffs may be asked to split or trim
- Rebase and squash before merging
- Use minimal commits and break large changes into distinct commits
- Commit messages should have:
- Short, descriptive titles
- Description of why the change was needed
- Enough detail for someone reviewing git history to understand the scope
- DCO Sign-off: All commits must include a valid DCO sign-off line (`Signed-off-by: Name <email@domain.com>`)
  - Add automatically with `git commit -s`
  - See PR_SIGNOFF.md for configuration details
  - Required for all contributions per the Developer Certificate of Origin
Code Organization and Ownership
Components and Maintainers
- Components are the primary unit of code organization (repo scope or directory/package/module within a repo)
- Maintainers own components and approve changes
- Contributors can become maintainers through sufficient evidence of contribution
- Code ownership is reflected in OWNERS files consistent with Kubernetes project conventions
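For reference, a minimal OWNERS file in the Kubernetes convention looks like this (the usernames are placeholders):

```yaml
# OWNERS: approvers can /approve changes, reviewers can /lgtm them.
approvers:
  - alice
reviewers:
  - alice
  - bob
```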
Core vs Incubating Components
- Core components: Supported by the project with strong lifecycle controls and forward compatibility
- Incubating components: Rapidly iterating, not yet ready for production use, allowing greater freedom for testing ideas
Experimental Features and Incubation
We encourage fast iteration and exploration with these constraints:
- Clear identification as experimental in code and documentation
- Default to off and require explicit enablement
- Best effort support only
- Removal if unmaintained with no one to move it forward
- No stigma to experimental or incubating status
Incubating Components Process
- Create repositories in the `llm-d-incubation` GitHub org with maintainers and clear goals
- Define a timeframe for experimentation
- Iterate and test with initial users
- For well-lit path components:
- Create project proposal covering integration
- Define graduation success criteria
- Add to well-lit path after approval
- For standalone components:
- Create project proposal with graduation criteria
- Component can be used with experimental label
- Graduation: Move to the core `llm-d` org and follow the core process
- If not graduating: Archive for 3+ months before removal
Experimental Features in Core Components
- Open pull request to existing core component
- Maintainer classifies as experimental, enforces "off-by-default" gating
- Provide tests for both on/off states
- When graduating, default to on and remove conditional logic after one release
Naming convention: Experimental flags must include `experimental` in the name (e.g., `--experimental-disaggregation-v2=true`)
API Changes and Deprecation
- No breaking changes: Once an API/protocol is in GA release (non-experimental), it cannot be removed or behavior changed
- Includes: All protocols, API endpoints, internal APIs, command line flags/arguments
- Exception: Bug fixes that don't impact a significant number of consumers (as the project matures, we will be stricter about such changes; Hyrum's Law is real)
- Versioning: All protocols and APIs should be versionable with clear forward and backward compatibility requirements. A new version may change behavior and fields.
- Documentation: All APIs must have documented specs describing expected behavior
Testing Requirements
We use three tiers of testing:
- Unit tests
  - Best for fast verification of parts of code, testing different arguments
  - Don't cover interactions between pieces of code
- Integration tests
  - Best for testing protocols and agreements between components and built artifacts
  - May not model interactions between components as they are deployed
- End-to-end (e2e) tests
  - Best for preventing end-to-end regressions and verifying overall correctness of the whole system, including benchmarking
  - Execution can be slow
Strong e2e coverage is required for deployed systems to prevent performance regression. Appropriate test coverage is an important part of code review.
Security
Maintain appropriate security mindset for production serving. The project will establish a project email address for responsible disclosure of security issues that will be reviewed by the project maintainers. Prior to the first GA release we will formalize a security component and process.
Project Structure
Core Organization (llm-d)
- Production-ready code on well-lit path
- Follows API Changes and Deprecation process
- All major changes require project proposals
Incubation Organization (llm-d-incubation)
- Experimental components not yet fully supported
- Bias towards accepting experimentation with clear goals
- Each repo must have README describing purpose and goal
- Graduated components move to the `llm-d` org
This content is automatically synced from CONTRIBUTING.md on the main branch of the llm-d/llm-d repository.