Boost YC-Killer: Tooling Setup (Ruff, Mypy, Pytest, Pre-commit)
Hey guys! Let's get our hands dirty setting up the tooling for our YC-Killer project. The goal is code that is clean, well-tested, and easy to maintain. We'll configure linting, type checking, and testing for each package in the repository so that problems are caught early in development. So buckle up: we're setting up Ruff, Mypy, Pytest, and Pre-commit to streamline our workflow.
Setting the Stage: Project Overview and Structure
First things first, let's recap the project's structure. We're working within the YC-Killer repository, which houses independent agents at the root level. This means each agent is self-contained and responsible for its own specific functionalities. We'll have two main root-level directories dedicated to this work:
- lean_proof_engine/: This is our reusable Lean 4 line-by-line verification engine. Think of it as the brains of our operation: shared, and without any physics-specific logic. It's designed to be a general-purpose tool.
- physics_copilot/: This is where the magic happens! The Physics Copilot agent depends on lean_proof_engine/ and holds all the physics-specific code. This is the part that will actually be doing the heavy lifting.
It's important to remember that we're keeping each agent isolated. No agents/ folder here, folks! Any utility shared across agents lives in its own root package, like lean_proof_engine/, behind a minimal, well-documented interface. This keeps things organized and makes the project easier to scale and manage as it grows.
Core Tooling: Ruff, Mypy, Pytest, and Pre-commit
Now, let's get into the heart of the matter: the tools themselves. These are the workhorses that will help us maintain high-quality code. For each one, here's what it does and why we chose it.
- Ruff: This is our linter. It catches stylistic errors and some classes of bugs, and it enforces one consistent style across the codebase. Because everyone follows the same standard, the code is easier to read and review.
- Mypy: As our type checker, Mypy verifies that our code is consistent with its type annotations. It catches many type-related errors before the code ever runs, which makes it a safety net against a whole class of runtime bugs.
- Pytest: For testing, we'll use Pytest. It makes tests easy to write and run, so we can confirm that the code behaves as expected. We will test the core functionality of each package.
- Pre-commit: Finally, Pre-commit is a framework for managing and running Git pre-commit hooks. These hooks automatically run checks (like linting and type checking) before each commit, so errors are caught before they reach the repository and every commit meets our quality standards.
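To make the roles of Mypy and Pytest a bit more concrete, here's a minimal sketch. The function name, its parameters, and the idea that it lives somewhere in physics_copilot/ are all assumptions made up for illustration; nothing here is taken from the actual codebase.

```python
"""Hypothetical example module; the names are assumptions for illustration."""


def kinetic_energy(mass_kg: float, speed_m_s: float) -> float:
    """Return the kinetic energy in joules for a mass moving at a given speed."""
    if mass_kg < 0:
        raise ValueError("mass_kg must be non-negative")
    return 0.5 * mass_kg * speed_m_s**2


# Pytest collects functions named test_* and treats bare asserts as checks.
def test_kinetic_energy_of_resting_body_is_zero() -> None:
    assert kinetic_energy(2.0, 0.0) == 0.0


def test_kinetic_energy_scales_with_speed_squared() -> None:
    assert kinetic_energy(2.0, 3.0) == 9.0


# Mypy would reject a call such as kinetic_energy("2", 3.0), because a str
# is passed where the annotation requires a float. Without a type checker,
# the mistake would only surface at runtime, inside the comparison.
```

Ruff would check the same file for style issues such as unused imports, so the three tools cover different kinds of mistakes in the same code.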
 
Configuration: Setting Up pyproject.toml and .pre-commit-config.yaml
Okay, time to roll up our sleeves and start configuring these tools. Let's start with pyproject.toml, the central configuration file for a Python project. It's where we'll specify the settings for Ruff, Mypy, and Pytest, so that dependencies and tool configuration live in one place.
Here’s a general idea of what we'll be doing:
- Ruff: We'll define the rules for linting, such as line length, import order, and other stylistic guidelines.
 - Mypy: We'll configure Mypy to check our types, specifying the paths to our source code and any necessary type checking settings.
 - Pytest: While we usually don't need a lot of configuration for Pytest, we can specify things like test discovery paths and any custom markers.
 
Next up, the .pre-commit-config.yaml file. This file tells Pre-commit which hooks to run: the linters and formatters we want to use, plus Mypy and any other checks. Because the hooks run automatically before each commit, code that fails a check is flagged before it is committed. Here’s how this will work:
- Install Pre-commit: We install the pre-commit package using pip install pre-commit.
- Configure Hooks: In the .pre-commit-config.yaml, we'll list the hooks we want to run. This includes the linters, formatters, and type checkers.
- Install Hooks: We install the hooks using pre-commit install. This adds the hooks to our Git hooks directory.
- Commit: Now, every time we commit, Pre-commit will run these checks automatically.
 
CI/CD: GH Action ci.yml and Matrix Configuration
Continuous Integration (CI) is super important for our project. It checks that every change integrates cleanly and flags anything that breaks the build or the tests. We’ll be using GitHub Actions to automate this process. We’ll create a ci.yml file, which defines our CI workflow.
The key feature here is using a matrix configuration. This allows us to run our tests across different Python versions and operating systems, so we know the code works on every combination we target. Here’s how it works:
- Matrix Configuration: We'll set up a matrix in our ci.yml file. This matrix will specify the Python versions and operating systems we want to test on.
- Workflow Steps: For each combination in the matrix, the workflow will:
  - Check out the code.
  - Set up Python.
  - Install dependencies.
  - Run linting and type checking using Ruff and Mypy.
  - Run tests using Pytest.
- CI Green: The CI workflow will pass if all tests and checks pass for all combinations in the matrix.
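To see what the matrix actually expands to, here's a small Python model of it. This is not the ci.yml itself; the Python versions, runner labels, and commands are placeholder assumptions, and the real values would come from the workflow file.

```python
from itertools import product

# Assumed matrix axes; the versions and runner labels are placeholders.
python_versions = ["3.11", "3.12"]
operating_systems = ["ubuntu-latest", "macos-latest"]

# Checks each job runs after checkout and setup (commands are assumptions).
steps = ["ruff check .", "mypy .", "pytest"]

# A matrix is the Cartesian product of its axes: one job per combination.
jobs = [
    {"os": os_name, "python": version}
    for os_name, version in product(operating_systems, python_versions)
]

for job in jobs:
    print(f"{job['os']} / Python {job['python']}: " + " && ".join(steps))


def ci_is_green(results: dict[tuple[str, str], bool]) -> bool:
    """CI is green only when every matrix combination passed."""
    return all(results[(job["os"], job["python"])] for job in jobs)
```

Two versions times two operating systems gives four jobs here, and a single failing combination is enough to turn the whole run red.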
 
This setup will automatically run all of our tests and checks every time we push code to the repository. This will help us catch any issues early on and ensure our project remains stable. We want a green CI (Continuous Integration) because it gives us confidence that everything is working as expected.
Testing and Acceptance Criteria
Testing is key! To ensure that all of our tools are working correctly, we need to create some test plans. We have a couple of important acceptance criteria:
- pre-commit run --all-files passes: This means that all our configured pre-commit hooks run successfully across all files in the repository. This is an important test to confirm the pre-commit configuration is working as expected.
- pytest green for both packages on CI: This indicates that all tests written for both our lean_proof_engine/ and physics_copilot/ packages pass in our CI environment.
Test Plan: Linting Failures and Removal of Sample Code
We need to test that our linting rules work correctly. We'll intentionally violate a rule locally. The test plan is:
- Local Linting Failure: We'll introduce a temporary violation of a linting rule locally. When we run our linter (e.g., ruff check), it should fail.
- Verification: This confirms that the linter is working as intended and catching the violations.
- Removal: We'll remove the sample code and confirm that the linter passes again.
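The plan above can be sketched as a small script. It assumes an unused import as the sample violation (Ruff's F401 rule) and writes it to a throwaway temp file, so nothing in the repository is touched. The helper name is hypothetical, and the script returns None when ruff isn't installed.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path
from typing import Optional

# Sample violation: an import that is never used (Ruff rule F401).
VIOLATION = "import os\n"
CLEAN = "x = 1\n"


def ruff_exit_code(source: str) -> Optional[int]:
    """Run `ruff check` on a temp file; return its exit code, or None if ruff is missing."""
    ruff = shutil.which("ruff")
    if ruff is None:
        return None
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "sample_violation.py"
        path.write_text(source)
        # --isolated ignores config files so only the selected rule applies.
        result = subprocess.run(
            [ruff, "check", "--isolated", "--no-cache", "--select", "F401", str(path)],
            capture_output=True,
            text=True,
        )
        return result.returncode


print("with violation:", ruff_exit_code(VIOLATION))
print("after removal:", ruff_exit_code(CLEAN))
```

A non-zero exit code for the first call and zero for the second is the behavior the test plan is looking for.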
 
Guardrails: LOC & Simplicity
To ensure our code remains maintainable and easy to understand, we'll adhere to several guardrails:
- Modules ≤ 200 LOC: We will keep our modules small and focused, ensuring they don't become overly complex.
 - Functions ≤ 40 LOC: We’ll keep our functions concise to promote readability and ease of debugging.
 - Cyclomatic Complexity < 10: This will keep our code complexity under control.
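The two line-count limits are simple enough to check with a short script. Here's a minimal sketch, assuming that "LOC" means physical lines (blank lines and comments included), which is a simplification. The function name is hypothetical. Cyclomatic complexity isn't computed here; a linter rule such as Ruff's C901 is one way to cover that.

```python
import ast

MAX_MODULE_LINES = 200
MAX_FUNCTION_LINES = 40


def guardrail_violations(source: str, filename: str = "<module>") -> list[str]:
    """Return a message for each module or function that exceeds its line limit."""
    problems = []

    # Module limit: count physical lines, including blanks and comments.
    line_count = len(source.splitlines())
    if line_count > MAX_MODULE_LINES:
        problems.append(f"{filename}: {line_count} lines (limit {MAX_MODULE_LINES})")

    # Function limit: measure each def from its first to its last line.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                problems.append(
                    f"{filename}:{node.lineno}: {node.name} spans {length} lines "
                    f"(limit {MAX_FUNCTION_LINES})"
                )
    return problems
```

A script like this could run as a local pre-commit hook or as an extra CI step, whichever fits the workflow better.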
 
We'll also embrace:
- Pure Functions and Tiny Adapters: We will favor pure functions and tiny adapters to enhance testability and reduce side effects.
 - No Deep Inheritance: We will avoid deep inheritance chains to simplify our code structure.
 - Stable, Explicit IO Schemas: We will clearly document all input/output schemas in docstrings.
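Here's one way these three ideas can fit together in practice. The StepResult schema and both function names are hypothetical, invented only to show the pattern: the logic lives in a pure function with its schema spelled out in the docstring, and a tiny adapter handles the JSON at the edges.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class StepResult:
    """Hypothetical schema for one checked line.

    line: 1-based line number.
    ok: True if the line was accepted.
    """

    line: int
    ok: bool


def summarize(results: list[StepResult]) -> dict[str, int]:
    """Pure function.

    Input: a list of StepResult.
    Output: {"passed": int, "failed": int}.
    """
    passed = sum(1 for result in results if result.ok)
    return {"passed": passed, "failed": len(results) - passed}


def summarize_json(payload: str) -> str:
    """Tiny adapter: JSON text in, JSON text out. All logic stays in summarize()."""
    results = [StepResult(**item) for item in json.loads(payload)]
    return json.dumps(summarize(results))
```

Because summarize() has no side effects, it can be tested with plain values, and the adapter stays small enough that there's little in it to break.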
 
Definition of Done
Finally, we want to know when we're done. Here's our Definition of Done:
- CI Green: Our CI workflow must be green.
 - Unit/Integration Tests Added: Tests are added to cover all the core functionality.
 - Docs Updated: Our documentation is up-to-date, reflecting any changes.
- Demo Snippet or Example under physics_copilot/examples/: We provide a demo snippet or example under physics_copilot/examples/ to illustrate how to use the implemented features.
Conclusion: Keeping Our Code Clean and Robust
Alright, guys, we’ve covered a lot of ground today! We walked through the initial setup and configuration of our tooling, and how it supports the quality, maintainability, and testability of the YC-Killer project. With Ruff, Mypy, Pytest, and Pre-commit in place, we can catch errors early and write more robust code. Remember to stick to our guardrails. Now, let’s get these tools set up and start building awesome things!