-
Notifications
You must be signed in to change notification settings - Fork 25
Make github-runner-manager runnable as standalone app without system user assumptions #693
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…ults - Add get_state_dir() utility with XDG Base Directory support - Add --state-dir and --reactive-log-dir CLI options with env var support - Move reconcile.id to use state directory instead of hardcoded home - Use sys.executable for reactive subprocesses instead of /usr/bin/python3 - Guard chown/chgrp operations to only run when running as root - Change metrics log default from file to stderr when METRICS_LOG_PATH not set - Update charm systemd service to pass explicit state-dir and reactive-log-dir Co-authored-by: yanksyoon <[email protected]>
- Add comprehensive tests for get_state_dir() XDG logic - Update reactive process_manager tests to work with new implementation - Tests cover explicit params, env vars, XDG fallbacks, and permission errors Co-authored-by: yanksyoon <[email protected]>
- Update mock_runner_scaler_fixture to accept new reactive_log_dir parameter - Fix reactive runner_manager tests to include reactive_log_dir in assertions - Add METRICS_LOG_PATH backward compatibility alias for tests - Update metrics test fixture to patch both TARGET and PATH Co-authored-by: yanksyoon <[email protected]>
- Remove unused tempfile import from utilities.py - Fix systemd ExecStart line continuation formatting Co-authored-by: yanksyoon <[email protected]>
yanksyoon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Initial review to make the application standalone.
github-runner-manager/src/github_runner_manager/reactive/process_manager.py
Outdated
Show resolved
Hide resolved
When not running as root, explicitly set the group to the current user's group for spawned reactive processes. This ensures proper group inheritance in unprivileged contexts while maintaining charm behavior when running as root. Co-authored-by: yanksyoon <[email protected]>
yanksyoon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot Simplify the code by adding defaults in the CLI input.
…tions Changed default from empty string to None for --state-dir and --reactive-log-dir to make the "not provided" semantics more explicit. Updated type hints and simplified the config variable assignment logic accordingly. Co-authored-by: yanksyoon <[email protected]>
Set python_path default to None for consistency with state_dir and reactive_log_dir. Remove unnecessary conditional logic since all three parameters now have explicit None defaults, making the code cleaner and more straightforward. Co-authored-by: yanksyoon <[email protected]>
Updated in commit 8b3a987. Simplified by setting explicit None defaults for all optional path parameters and removing unnecessary conditional checks. |
github-runner-manager/src/github_runner_manager/reactive/process_manager.py
Outdated
Show resolved
Hide resolved
yanksyoon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@copilot please start working on the comments
github-runner-manager/src/github_runner_manager/reactive/process_manager.py
Outdated
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/reactive/process_manager.py
Outdated
Show resolved
Hide resolved
Changes based on code review feedback: - Remove chown operations from _setup_logging_for_processes since XDG directories are already accessible to the current user - Remove root user differentiation in subprocess spawning since the application does not run as root - Make metrics log follow XDG paths with symlink in /var/log for grafana-agent - Add _create_or_update_symlink helper function to handle both reactive and metrics symlinks - Keep os import for kill_reactive_processes function - Remove unused shutil and constants imports Verification: Linux symlinks are readable by grafana-agent as confirmed by the logrotate configuration which uses the same paths (src/logrotate.py lines 62-78). Grafana-agent reads log files through standard file operations which work with symlinks transparently. Co-authored-by: yanksyoon <[email protected]>
github-runner-manager/src/github_runner_manager/reactive/process_manager.py
Outdated
Show resolved
Hide resolved
github-runner-manager/src/github_runner_manager/reactive/process_manager.py
Outdated
Show resolved
Hide resolved
TICS Quality Gate❌ FailedThe following errors have occurred during analysis:❌ [ERROR 5333] Project database 'github-runner-operator' already is connected to a started TICSQServer process that has not registered a stop time. To avoid data corruption, analysis is aborted. Details of the other TICSQServer process: started at 2025-12-22 09:55:19.000 by user ubuntu. .github/workflows/tics.yaml / TICS / TICS GitHub Action |
Make github-runner-manager runnable as a standalone app
This PR implements changes to make the github-runner-manager application runnable as a regular user without requiring sudo, system users, or hardcoded system paths. The changes separate app-level concerns (unprivileged, portable) from charm-level concerns (privileged, policy-driven).
Summary of Changes
Core Refactoring: Single Base Directory with XDG State Home
Replaced multiple path options with a single base directory approach following XDG Base Directory specification for state files:
--base-dir(replaces--state-dirand--reactive-log-dir)$XDG_STATE_HOME/github-runner-manageror~/.local/state/github-runner-managerstate/- for reconcile.id and other state fileslogs/reactive/- for reactive runner logslogs/metrics/- for metrics logsImplementation Details
1. New utility functions in
utilities.py:get_base_dir(base_dir)- Gets base directory with XDG_STATE_HOME fallbackget_state_dir(base_dir)- Returnsbase_dir/stateget_reactive_log_dir(base_dir)- Returnsbase_dir/logs/reactiveget_metrics_log_dir(base_dir)- Returnsbase_dir/logs/metrics2. CLI changes in
cli.py:--state-dirand--reactive-log-dirwith single--base-diroptionGITHUB_RUNNER_MANAGER_BASE_DIR3. Updated all modules to use
base_dirparameter:reconcile_service.pymanager/runner_scaler.pyreactive/runner_manager.pyreactive/process_manager.py4. Python interpreter fix for reactive subprocesses:
sys.executablefrom parent process (was: hardcoded/usr/bin/python3)5. Simplified permissions model:
6. Metrics logging:
base_dir/logs/metrics//var/log/github-runner-metrics.logfor grafana-agentMETRICS_LOG_PATHnot set7. Charm integration in
manager_service.py**:--base-dirparameter pointing to~runner-manager/.local/state/github-runner-managerMETRICS_LOG_PATHenvironment variable to/var/log/github-runner-metrics.log(symlink)/var/log/reactive_runner→base_dir/logs/reactive/var/log/github-runner-metrics.log→base_dir/logs/metrics/github-runner-metrics.log_create_or_update_symlink()handles symlink managementTesting
Acceptance Criteria Met
✅ Single base directory with organized subdirectories
✅ XDG-compliant defaults for state files (
$XDG_STATE_HOMEor~/.local/state)✅ Running manager via tox works without permission errors
✅ No sudo required in app-level tests
✅ No chown operations needed - directories are user-writable by default
✅ Application does not require or check for root privileges
✅ Charm interface updated to use
--base-dir✅ Symlinks created in /var/log for grafana-agent log scraping
✅ Metrics follow XDG paths with symlink for monitoring
✅ All tests passing
Backward Compatibility
--base-diroption simplifies path managementMETRICS_LOG_PATHstill worksBenefits
Original prompt
This section details on the original issue you should resolve
<issue_title>Make github-runner-manager runnable as a standalone app without charm/system user assumptions</issue_title>
<issue_description>### Enhancement Proposal
Summary
Running the manager directly (tox/pytest, CI) exposes several assumptions baked into the app about users, paths, and Python environment. These cause permission and import failures outside the charm. The app should run cleanly as the invoking user with sane defaults, while the charm applies privileged policies.
Observed problems
Writes reconcile.id to the invoking user’s home (e.g., /home/ubuntu/reconcile.id), failing when run under a different effective user.
Attempts to create/use /var/log/reactive_runner and chown it, failing without root.
Reactive subprocesses use system python and can’t import github_runner_manager despite --python-path; sudo loses tox virtualenv context.
Config file in pytest tempdir not readable when spawning under another user.
Requiring the runner-manager user in app-level tests adds brittle permission and group juggling.
Repro (CI/integration)
Start CLI with tox venv python and reactive mode, dispatch a workflow that spawns a reactive runner.
Errors seen:
PermissionError: [Errno 13] Permission denied: '/var/log/reactive_runner'
PermissionError: [Errno 1] Operation not permitted (chown/chmod/chown during setup)
ModuleNotFoundError: No module named 'github_runner_manager' from reactive subprocess
Error: Invalid value for '--config-file': '/tmp/.../config.yaml': Permission denied
PermissionError: [Errno 13] Permission denied: '/home/ubuntu/reconcile.id'
Root cause
App hardcodes system-level paths (/var/log/reactive_runner) and writes state (reconcile.id) to $HOME.
Assumes privilege to chown/chgrp and to run/react as a dedicated system user.
Reactive subprocess resolves python via PATH/system python rather than the parent interpreter.
Using sudo drops tox venv/PYTHONPATH context unless carefully preserved.
Goal
Make the application runnable as a regular user with no sudo or user switching, with defaults that “just work” in tests/CI and local dev. Keep charm-specific behavior (runner-manager user, /var/log) at the charm layer.
Proposal
State directory (locks, ids)
Add optional --state-dir and env GITHUB_RUNNER_MANAGER_STATE_DIR.
Default (when not set): follow XDG base dirs
XDG_RUNTIME_DIR (if set) → TMPDIR → XDG_STATE_HOME/github-runner-manager → ~/.local/state/github-runner-manager
Write reconcile.id (and any future lock/ledger files) under state dir.
Charm passes a managed state dir; tests rely on the default.
Logging defaults and overrides
Metrics: default to stderr; keep existing METRICS_LOG_PATH support. Charm sets /var/log/github-runner-metrics.log.
Reactive runner logs: default to a user-local directory under state dir (e.g., $STATE_DIR/reactive_runner). Add optional --reactive-log-dir (and env) for charm to set /var/log/reactive_runner.
App must not chown/chgrp unless running as root; otherwise create dirs with current user permissions only.
Python interpreter and import path
Reactive subprocesses should default to using sys.executable from the parent process (not /usr/bin/python3).
Keep existing --python-path as an optional sys.path augment; do not rely on sudo -E or PATH hacks.
This preserves a single source of truth (tox -e . editable install) and works in CI.
User and privilege model
Backward compatibility
Acceptance criteria
Tasks
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.