A simple and fast Go tool for file integrity monitoring. Detects unauthorized file modifications caused by OS command injection and other attacks by recording file hashes during deployment and verifying them periodically.
The name "Kekkai" comes from the Japanese word 結界 (kekkai), meaning "barrier" - a protective boundary that keeps unwanted things out, perfectly representing this tool's purpose of protecting your files from tampering.
Kekkai was designed to solve specific challenges in production server environments:
Traditional tools like tar
or file sync utilities (e.g., rsync
) include metadata like timestamps in their comparisons, causing false positives when only timestamps change. In environments with heavy NFS usage or dynamic log directories, existing tools become difficult to configure and maintain.
-
Content-Only Hashing
- Hashes only file contents, ignoring timestamps and metadata
- Detects actual content changes, not superficial modifications
-
Immutable Exclude Rules
- Exclude patterns are set during manifest generation only
- Cannot be modified during verification, preventing attackers from hiding changes
- Only exclude server-generated files (logs, cache, uploads, NFS mounts)
- Application dependencies (vendor, node_modules) are monitored as they're part of the deployment
-
Symlink Security
- Uses
os.Lstat
to properly detect symlinks without following them - Tracks symbolic links with their target paths (via
os.Readlink
) - Hashes the symlink target path itself, not the target's content
- Detects when symlinks are modified to point to different targets
- Detects when regular files are replaced with symlinks (and vice versa)
- Prevents attackers from hiding malicious changes through symlink manipulation
- Uses
-
Secure Hash Storage with S3
- Deploy servers have write-only access
- Application servers have read-only access
- Even if compromised, attackers cannot modify stored hashes
- Local file output available for testing
-
Tamper-Resistant Distribution
- Single Go binary with minimal dependencies
- Recommended to run with restricted permissions
- Configuration should be read from S3 or managed paths, not local files
- 🚀 Fast: Efficient hash calculation with parallel processing
- 🔒 Secure: Tamper-proof storage with S3 integration
- 📊 Monitoring Ready: Multiple output formats for various monitoring systems
- 🎯 Deterministic: Same file structure always produces the same hash
- ☁️ EC2 Ready: Authentication via IAM roles
# Build from source
git clone https://github.com/catatsuy/kekkai.git
cd kekkai
make
# Or directly with go build
go build -o ./bin/kekkai ./cmd/kekkai
# Run tests
make test
# Generate manifest
kekkai generate --target /var/www/app --output manifest.json
# Verify files
kekkai verify --manifest manifest.json --target /var/www/app
kekkai generate \
--target /var/www/app \
--exclude "*.log" \
--exclude "cache/**" \
--output manifest.json
Kekkai stores manifests in S3 for secure, centralized management. Each deployment updates the same manifest.json
file.
# For production deployment (must explicitly specify --base-path)
kekkai generate \
--target /var/www/app \
--s3-bucket my-manifests \
--app-name myapp \
--base-path production # Explicitly required for production
# For staging/development (uses default "development" if not specified)
kekkai generate \
--target /var/www/app \
--s3-bucket my-manifests \
--app-name myapp \
--base-path staging
# During verification (must match the base-path used during generation)
kekkai verify \
--s3-bucket my-manifests \
--app-name myapp \
--base-path production \
--target /var/www/app
Benefits:
- Lower S3 costs - Minimal S3 operations
- Clean structure - One manifest file per application
# Add to crontab for periodic checks
*/5 * * * * kekkai verify \
--s3-bucket my-manifests \
--app-name myapp \
--base-path production \
--target /var/www/app
# Use cache for faster verification (cache in temp directory)
*/5 * * * * kekkai verify \
--s3-bucket my-manifests \
--app-name myapp \
--base-path production \
--target /var/www/app \
--use-cache \
--verify-probability 0.1
# Use persistent cache in custom directory
*/5 * * * * kekkai verify \
--s3-bucket my-manifests \
--app-name myapp \
--base-path production \
--target /var/www/app \
--use-cache \
--cache-dir /var/cache/kekkai \
--verify-probability 0.1
Configure your monitoring system to alert based on your requirements (e.g., alert after consecutive failures).
These examples show common exclude patterns for various frameworks. Important: Only exclude files generated on the server (logs, cache, uploads). Application dependencies like vendor
or node_modules
MUST be monitored as they are part of the deployed application.
For production use, replace --output manifest.json
with S3 storage options (--s3-bucket
, --app-name
, --base-path
).
kekkai generate \
--target /var/www/app \
--exclude "storage/**" \
--exclude "bootstrap/cache/**" \
--exclude "*.log" \
--output manifest.json
kekkai generate \
--target /var/www/app \
--exclude "*.log" \
--exclude ".npm/**" \
--exclude "tmp/**" \
--output manifest.json
kekkai generate \
--target /var/www/app \
--exclude "log/**" \
--exclude "tmp/**" \
--exclude "public/assets/**" \
--output manifest.json
kekkai generate \
--target /var/www/app \
--exclude "**/__pycache__/**" \
--exclude "media/**" \
--exclude "staticfiles/**" \
--exclude "*.pyc" \
--output manifest.json
For deployment server (write access):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:PutObject"
],
"Resource": "arn:aws:s3:::my-manifests/*"
}
]
}
For production server (read-only):
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"s3:GetObject"
],
"Resource": "arn:aws:s3:::my-manifests/*"
}
]
}
Recommended: Enable S3 versioning to maintain history of manifest changes.
# Optional: Enable versioning for history tracking
aws s3api put-bucket-versioning \
--bucket my-manifests \
--versioning-configuration Status=Enabled
# Enable encryption
aws s3api put-bucket-encryption \
--bucket my-manifests \
--server-side-encryption-configuration '{
"Rules": [{
"ApplyServerSideEncryptionByDefault": {
"SSEAlgorithm": "AES256"
}
}]
}'
# Optional: Set lifecycle policy to delete old versions after N days
aws s3api put-bucket-lifecycle-configuration \
--bucket my-manifests \
--lifecycle-configuration '{
"Rules": [{
"Id": "DeleteOldVersions",
"Status": "Enabled",
"NoncurrentVersionExpiration": {
"NoncurrentDays": 30
}
}]
}'
#!/bin/bash
# deploy.sh
set -e
APP_NAME="myapp"
DEPLOY_DIR="/var/www/app"
S3_BUCKET="my-manifests"
# 1. Install dependencies locally
cd ./src
composer install --no-dev
# 2. Deploy application to server
rsync -av ./src/ ${DEPLOY_DIR}/
# 3. Generate manifest and save to S3
# Note: For production, explicitly specify --base-path production
kekkai generate \
--target ${DEPLOY_DIR} \
--exclude "storage/**" \
--exclude "bootstrap/cache/**" \
--s3-bucket ${S3_BUCKET} \
--app-name ${APP_NAME} \
--base-path production # MUST be explicit for production
echo "Deploy completed with integrity manifest"
echo "Manifest saved to: ${S3_BUCKET}/production/${APP_NAME}/manifest.json"
Generate a manifest file.
Options:
-target string Target directory (default ".")
-output string Output file, "-" for stdout (default "-")
-exclude string Exclude pattern (can be specified multiple times)
-s3-bucket string S3 bucket name
-s3-region string AWS region
-base-path string S3 base path (default "development")
-app-name string Application name (creates path: {base-path}/{app-name}/manifest.json)
-format string Output format: text, json (default "text")
-workers int Number of worker threads (0 = auto detect)
-rate-limit int Rate limit in bytes per second (0 = no limit)
-timeout int Timeout in seconds (default: 300)
Verify file integrity.
Options:
-manifest string Manifest file path
-s3-bucket string S3 bucket name
-s3-region string AWS region
-base-path string S3 base path (default "development")
-app-name string Application name (reads from: {base-path}/{app-name}/manifest.json)
-target string Target directory to verify (default ".")
-format string Output format: text, json (default "text")
-workers int Number of worker threads (0 = auto detect)
-rate-limit int Rate limit in bytes per second (0 = no limit)
-timeout int Timeout in seconds (default: 300)
-use-cache Enable local cache for verification (checks size, mtime, ctime)
-cache-dir string Directory for cache file (default: system temp directory)
-verify-probability float Probability of hash verification with cache hit (0.0-1.0, default: 0.1)
✓ Integrity check passed
Verified 1523 files
{
"success": true,
"timestamp": "2024-01-01T00:00:00Z",
"message": "All files verified successfully",
"details": {
"total_files": 1523,
"verified_files": 1523
}
}
Kekkai uses glob patterns for the --exclude
option to skip specific files and directories during manifest generation.
Pattern | Description | Example |
---|---|---|
*.ext |
Match files with specific extension | *.log matches app.log , error.log |
dir/* |
Match all files in a directory | logs/* matches logs/app.log |
dir/** |
Match all files recursively | cache/** matches cache/data.db , cache/sessions/abc.txt |
**/*.ext |
Match extension at any depth | **/*.pyc matches app.pyc , lib/utils.pyc |
**/dir/* |
Match directory at any depth | **/logs/* matches logs/app.log , app/logs/error.log |
path/to/file |
Exact path match | config/local.ini matches only that file |
- Relative Paths: All patterns match against relative paths from the target directory
- Forward Slashes: Always use
/
as path separator (even on Windows) - No Negation: Patterns cannot be negated (no
!pattern
support) - Order Independent: All patterns are evaluated, order doesn't matter
- Immutable: Exclude patterns cannot be changed during verification
# Laravel/Symfony
--exclude "storage/**" # User uploads
--exclude "var/cache/**" # Framework cache
--exclude "var/log/**" # Application logs
--exclude "public/uploads/**" # Uploaded files
# Python/Django
--exclude "**/__pycache__/**" # Python cache
--exclude "**/*.pyc" # Compiled Python
--exclude "media/**" # User uploads
--exclude "staticfiles/**" # Collected static files
# Node.js
--exclude "*.log" # Log files
--exclude "tmp/**" # Temporary files
--exclude ".npm/**" # NPM cache
# General
--exclude "*.tmp" # Temporary files
--exclude "*.bak" # Backup files
--exclude ".git/**" # Git repository (if needed)
- ❌
--exclude "vendor/**"
(PHP dependencies) - ❌
--exclude "node_modules/**"
(Node.js dependencies) - ❌
--exclude "venv/**"
(Python virtual environment)
These are part of your deployed application and must be monitored for tampering.
✅ Only exclude server-generated content:
- Log files
- Cache directories
- User uploads
- Temporary files
- NFS mounts
Patterns are evaluated in this order:
- Check for
**
recursive matching - Special case:
**/*
or**
matches everything - Suffix pattern:
dir/**
matches everything underdir/
- Prefix pattern:
**/*.ext
matches files with extension at any depth - Simple glob: Standard shell glob matching with
*
and?
Kekkai has comprehensive symlink security to prevent attackers from hiding malicious changes:
Kekkai handles symlinks differently depending on where they appear:
If --target
points to a symlink (e.g., /current
→ /releases/20240101
):
- Automatically resolved: Uses
filepath.EvalSymlinks
to follow the symlink - Operates on real path: All operations happen in the resolved directory
- Transparent to user: Works exactly as if you specified the real directory
Example:
# These produce identical results:
kekkai generate --target /var/www/current # Symlink to /var/www/releases/20240101
kekkai generate --target /var/www/releases/20240101 # Direct path
For symlinks found within the target directory:
- Not followed: Uses
os.Lstat
to detect them without following - Tracked as symlinks: Stored with
IsSymlink: true
flag - Target recorded: Link target path saved for verification
- Hash of target path: Creates hash from
"symlink:" + target_path
string
- Detection: Uses
os.Lstat
to identify symlinks without following them - Target Tracking: Reads the link target with
os.Readlink
- Hash Calculation: Creates hash from
"symlink:" + target_path
string - Verification: Checks both link type and target path during verification
- ✅ Symlink target changes (e.g.,
/usr/bin/php
→/tmp/malicious
) - ✅ File type changes (regular file → symlink or symlink → regular file)
- ✅ Broken symlinks (target doesn't exist)
- ✅ New symlinks added to the directory
- ✅ Deleted symlinks
# Original deployment
/app/config.php -> /etc/app/config.php # Hash: abc123...
# These changes will be detected:
/app/config.php -> /tmp/fake-config.php # Modified: different target
/app/config.php (regular file) # Modified: type change
/app/config.php (deleted) # Deleted: symlink removed
- Symlinks are never followed during hash calculation
- Only the link itself is hashed, not the target's content
- Prevents directory traversal attacks via symlink manipulation
- Cache system skips symlinks (only caches regular files)
A: Kekkai only hashes file contents, so timestamp or permission changes don't affect hashes. Check for line ending differences (CRLF/LF).
A: Verify that the EC2 instance has the correct IAM role configured. Also check that the region is specified correctly.
A: For large file sets, use --exclude
options to skip server-generated directories like logs, cache, and temporary files. You can also optimize performance with:
--use-cache
: Enable local cache that checks file metadata (size, mtime, ctime) to skip hash calculation--verify-probability N
: Set probability of hash verification even with cache hit (0.0-1.0, default: 0.1)--workers N
: Adjust the number of worker threads for your system--rate-limit N
: Limit I/O throughput (bytes per second) to reduce system load
Cache Mode: When using --use-cache
, kekkai maintains a local cache file (.kekkai-cache-{base-name}-{app-name}.json
) in the cache directory (defaults to system temp directory, or specify with --cache-dir
). Cache files are temporary by nature and will be recreated if missing. It checks file metadata including:
- File size
- Modification time (mtime)
- Change time (ctime) - cannot be easily forged
If all metadata matches, it uses the cached hash. The --verify-probability
option adds probabilistic verification:
0.0
: Always trust cache (fastest, least secure)0.1
: 10% chance to verify hash even with cache hit (default, good balance)1.0
: Always verify hash (most secure, no performance benefit)
The cache file itself is protected with a hash to detect tampering.
💡 Cache Behavior: By default, cache files are stored in the system temp directory (e.g., /tmp
on Linux/macOS) and may be automatically cleaned by the system. This is intentional - the cache is designed to be ephemeral and will be recreated as needed for performance optimization.
Note that application dependencies (vendor, node_modules) should still be verified as they are part of the deployed application.
A: Use --rate-limit
to throttle I/O bandwidth. For example, --rate-limit 10485760
limits to 10MB/s. This global rate limit is shared across all worker threads, preventing system overload while still allowing parallel processing.
Alternatively, you can use systemd to control resource usage at the OS level:
# Run with limited CPU and I/O priority (with cache support)
systemd-run --quiet --wait --pipe --collect \
-p Type=oneshot \
-p CPUQuota=25% -p CPUWeight=50 \
-p PrivateTmp=no \
-p User=nobody \
/bin/bash -lc 'nice -n 10 ionice -c2 -n7 /usr/local/bin/kekkai verify \
--s3-bucket my-manifests \
--app-name myapp \
--target /srv/app \
--use-cache \
--verify-probability 0.1 \
--rate-limit 10485760'
This approach provides more comprehensive resource control:
CPUQuota=25%
: Limits CPU usage to 25%CPUWeight=50
: Sets CPU scheduling weight (lower priority)PrivateTmp=no
: Allows cache persistence in/tmp
across runsUser=nobody
: Runs with minimal privileges for securitynice -n 10
: Lower process priorityionice -c2 -n7
: Best-effort I/O scheduling with lowest priority--use-cache
: Enables cache for faster verification--verify-probability 0.1
: 10% chance to verify hash even with cache hit--rate-limit 10485760
: Limits I/O to 10MB/s
Important: The PrivateTmp=no
setting is required when using --use-cache
to ensure cache files persist between systemd-run executions. Without this, systemd creates an isolated /tmp
directory for each run, preventing cache reuse. If you prefer stronger isolation, use --cache-dir
to specify a persistent directory outside of /tmp
:
# Alternative: Keep PrivateTmp=yes but use a custom cache directory
systemd-run --quiet --wait --pipe --collect \
-p Type=oneshot \
-p CPUQuota=25% -p CPUWeight=50 \
-p PrivateTmp=yes \
-p User=nobody \
/bin/bash -lc 'nice -n 10 ionice -c2 -n7 /usr/local/bin/kekkai verify \
--s3-bucket my-manifests \
--app-name myapp \
--target /srv/app \
--use-cache \
--cache-dir /var/cache/kekkai \
--verify-probability 0.1 \
--rate-limit 10485760'
Security Recommendation: Always run kekkai verification with minimal privileges using -p User=nobody
or a dedicated non-privileged user. This follows the principle of least privilege and reduces the security impact if the process is compromised.
Note: With Go 1.25+, CPUQuota
also automatically adjusts GOMAXPROCS
to match the quota, so kekkai will use fewer worker threads when CPU is limited, providing better resource utilization.
--use-cache
on NFS
Cache mode is disabled by default. Do not enable it on NFS-mounted directories as it does not work well and may run slower than without cache.
Kekkai's cache system uses ctime (change time) to detect file changes. Ctime is hard to fake and important for security.
However, NFS clients may not update ctime properly. This causes:
- Very low cache hit rate
- Cache becomes useless
- Runs as slow as normal verification
# Disable cache on NFS
kekkai verify \
--s3-bucket my-manifests \
--app-name myapp \
--target /nfs/app \
--workers 2 \
--rate-limit 5242880
Disabling cache does not reduce security. All files are still verified with SHA256 hashes.