
Adobe Experience Manager's Oak repository is like a bustling warehouse: over time, unused data (old checkpoints, orphaned nodes) piles up, slowing performance and eating disk space. While Adobe's official documentation explains the what and why of Offline Revision Cleanup (ORC), the how often leaves DevOps executing repetitive tasks.
With this in mind, I’ll share how a simple Bash script can transform this critical-but-tedious process into a one-command operation - safely, efficiently, and with zero manual checklists.
The script isn’t just a wrapper around oak-run.jar. It’s a safety-first automation tool designed for real-world enterprise environments.
Here’s what makes it stand out:
Pre-flight checks: verifies dependencies (java, lsof), validates paths, and ensures AEM is offline.
Confirmation prompts (with a --yes override): No accidental runs.
# Sample Log Output
[INFO] [2023-10-15 14:30:00] Starting offline revision cleanup process
[ERROR] [2023-10-15 14:30:05] Oak command failed: checkpoints segmentstore
Let’s dissect the critical components:
The script enforces strict input rules. Try passing an invalid port like 99999? It’ll block you. Forget the AEM directory? A friendly error appears.
# Validate port range
if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then
log_message "ERROR" "Invalid port number: $PORT"
exit 1
fi
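If you want to reuse the same rule elsewhere, the check can be wrapped in a small function. The following is only a sketch under my own assumptions: `is_valid_port` is a hypothetical helper name that does not appear in the script, and it is written so that it also runs in a plain POSIX shell.

```bash
# Hypothetical helper: returns 0 when $1 is an integer between 1 and 65535.
is_valid_port() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
    esac
    # Reject overly long digit strings before the numeric comparison
    [ "${#1}" -le 5 ] || return 1
    [ "$1" -ge 1 ] && [ "$1" -le 65535 ]
}

for candidate in 4502 99999 abc; do
    if is_valid_port "$candidate"; then
        echo "$candidate: valid"
    else
        echo "$candidate: invalid"
    fi
done
```

Returning a status instead of calling exit keeps the decision about what to do with a bad port in the caller's hands.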
The run_oak_command function handles all interactions with oak-run.jar, including error trapping. Notice the -Dtar.memoryMapped=true flag, a performance tweak for large repositories: by allowing memory-mapped file access, it reduces I/O overhead during compaction.
run_oak_command() {
# Run oak-run with memory-mapped TAR access; abort the script on failure
if ! java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$@"; then
log_message "ERROR" "Oak command failed: $*"
exit 1
fi
}
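The error-trapping idea is independent of oak-run, so it can be tried out with harmless commands first. Here is a minimal sketch, assuming a hypothetical wrapper named `run_checked` (not part of the script) that reports the failure and hands the exit code back to the caller instead of exiting.

```bash
# Hypothetical wrapper: run a command, log on failure, and return its exit code
run_checked() {
    local status=0
    "$@" || status=$?
    if [ "$status" -ne 0 ]; then
        echo "[ERROR] Command failed (exit $status): $*" >&2
    fi
    return "$status"
}

run_checked true && echo "first command succeeded"
run_checked sh -c 'exit 3' || echo "second command failed with status $?"
```

Whether a wrapper should exit or return is a design choice: exiting is the safest default for a destructive maintenance job, while returning makes the wrapper easier to combine with retries or notifications.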
1. Check the Prerequisites:
Confirmation at each step (unless --yes is set)
An oak-run.jar version matching your AEM/Oak deployment
A backup of crx-quickstart (yes, really!)
2. Execute the Script:
./aem-orc-cleanup.sh -d /opt/aem/author -o ./oak-run-1.42.0.jar --yes
Problem: Script fails with "AEM is still running"
Fix:
# Find the AEM PID among lingering Java processes
ps aux | grep '[j]ava' | awk '{print $2}'
# Ask the process to shut down first; a forced kill -9 skips AEM's normal
# shutdown, so keep it as a last resort
kill AEM_PID
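A gentler sequence is to send a normal termination signal and wait for the process to disappear before escalating. The sketch below is an illustration only: `stop_gracefully` is a hypothetical helper, it assumes the PID belongs to the current user, and the demo targets a throwaway `sleep` process rather than a real AEM instance.

```bash
# Hypothetical helper: send SIGTERM to PID $1 and wait up to $2 seconds (default 30).
# Returns 0 if the process is gone, 1 if it is still alive after the timeout.
stop_gracefully() {
    local pid=$1 timeout=${2:-30} waited=0
    kill "$pid" 2>/dev/null || return 0      # nothing left to signal
    while kill -0 "$pid" 2>/dev/null; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Demo against a background process standing in for AEM
sleep 300 &
demo_pid=$!
if stop_gracefully "$demo_pid" 10; then
    echo "process stopped cleanly"
else
    echo "process still running; escalate manually"
fi
```

Only when the helper reports that the process is still alive would it make sense to consider the forced kill.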
Problem: "Oak command failed" during compaction
Fix:
Re-run with --debug for verbose logging
Problem: Script fails with “Permission denied”
Fix:
Run chmod +x aem-orc-cleanup.sh and ensure the user has read/write access to the AEM directories.
While the base script works out of the box, its real value shines when tailored to your team's workflow. Here's how to transform it into a production-ready powerhouse:
Capture disk usage before and after compaction to quantify your ROI:
# Capture before/after metrics
DISK_USAGE_PRE=$(du -sh "$CQ_REPO_PATH" | awk '{print $1}')
# ... run compaction ...
DISK_USAGE_POST=$(du -sh "$CQ_REPO_PATH" | awk '{print $1}')
log_message "INFO" "Repository size: $DISK_USAGE_PRE → $DISK_USAGE_POST"
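Human-readable sizes from du -sh are easy to read but cannot be subtracted. If you want an actual "reclaimed" figure, one option is to measure in kilobytes with du -sk and do the arithmetic yourself. This is a sketch under my own assumptions: `dir_size_kb` is a hypothetical helper, and a temporary directory stands in for the repository path.

```bash
# Hypothetical helper: print the size of directory $1 in kilobytes
dir_size_kb() {
    du -sk "$1" | awk '{print $1}'
}

# Demo on a temporary directory standing in for the repository path
demo_dir=$(mktemp -d)
dd if=/dev/urandom of="$demo_dir/old-revisions.bin" bs=1024 count=2048 2>/dev/null
size_pre=$(dir_size_kb "$demo_dir")

rm "$demo_dir/old-revisions.bin"     # stands in for the compaction step
size_post=$(dir_size_kb "$demo_dir")

reclaimed_kb=$((size_pre - size_post))
echo "Reclaimed: ${reclaimed_kb} KB (${size_pre} KB -> ${size_post} KB)"
rm -rf "$demo_dir"
```

The exact numbers depend on the filesystem's block size, so treat the result as an estimate rather than a byte-accurate measurement.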
Replace passive logging with proactive notifications. For example, trigger a Slack alert on failure:
if [ $? -ne 0 ]; then
curl -X POST -H 'Content-type: application/json' \
--data "{\"text\":\"🚨 AEM ORC failed on $(hostname)\"}" \
https://hooks.slack.com/services/YOUR_WEBHOOK
fi
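One caveat: run_oak_command calls exit 1 on failure, so a status check placed after it never gets a chance to run. A possible way around that is an EXIT trap, which fires on every exit path. The sketch below assumes two hypothetical functions, `notify_failure` and `on_exit`; the notifier only prints a line, and you would swap in the curl call to your own webhook.

```bash
# Hypothetical notifier: replace the echo with the curl call to your webhook
notify_failure() {
    echo "ALERT: cleanup failed on $(hostname) with exit code $1"
}

# EXIT trap handler: runs on every exit path, including 'exit 1' inside a function
on_exit() {
    local status=$?
    if [ "$status" -ne 0 ]; then
        notify_failure "$status"
    fi
}

# Demo inside a subshell so the surrounding shell keeps running
(
    trap on_exit EXIT
    echo "pretend compaction starts"
    exit 1      # simulates run_oak_command aborting the script
)
echo "subshell finished with status $?"
```

In the real script, the trap line would sit near the top, before any oak-run command is executed.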
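Oak compaction can fail due to transient I/O issues. Add resilience with exponential backoff. Because run_oak_command calls exit 1 on failure, each attempt runs in a subshell so that a failed attempt can be retried instead of ending the script:
MAX_RETRIES=3
RETRY_DELAY=60 # seconds; doubled after each failed attempt
for ((i=1; i<=MAX_RETRIES; i++)); do
# Subshell: a failing run_oak_command exits only this attempt
if ( run_oak_command "compact" "$store" ); then
break
fi
if (( i == MAX_RETRIES )); then
log_message "ERROR" "Compaction failed after $MAX_RETRIES attempts"
exit 1
fi
sleep "$RETRY_DELAY"
RETRY_DELAY=$((RETRY_DELAY * 2))
done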
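The retry logic can also be pulled out into a generic function and tested with a harmless command before pointing it at oak-run. What follows is a sketch, not the script's own code: `retry_with_backoff` and `flaky` are hypothetical names, and the demo uses a one-second initial delay so it finishes quickly.

```bash
# Hypothetical helper: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY COMMAND [ARGS...]
# The delay doubles after every failed attempt.
retry_with_backoff() {
    local max_attempts=$1 delay=$2 attempt=1
    shift 2
    while true; do
        "$@" && return 0
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        delay=$((delay * 2))
        attempt=$((attempt + 1))
    done
}

# Demo: a stand-in command that fails twice, then succeeds
counter_file=$(mktemp)
echo 0 > "$counter_file"
flaky() {
    local n
    n=$(($(cat "$counter_file") + 1))
    echo "$n" > "$counter_file"
    [ "$n" -ge 3 ]
}

retry_with_backoff 5 1 flaky && echo "succeeded on attempt $(cat "$counter_file")"
rm -f "$counter_file"
```

The wrapped command has to return a non-zero status rather than exit the shell, which is why a function that calls exit would need to run in a subshell.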
What’s your killer customization? Please share it in the comments.
Let’s face it: tasks like Offline Revision Cleanup rarely make the highlight reel of a developer’s day. They’re the unsung, unglamorous chores that keep systems alive. But the truth is that how you handle these tasks defines the reliability of your AEM ecosystem.
This script isn’t just about avoiding OutOfDiskSpaceErrors or shaving seconds off query times. It’s about operational maturity. By automating ORC, you’re not just solving a problem, you’re institutionalizing consistency. You’re replacing tribal knowledge (“How did we do this last time?”) with version-controlled, auditable code.
For DevOps teams, the payoff is twofold:
What’s Next?
IMHO the future of AEM ops isn’t about working harder — it’s about scripting smarter. Now, go turn those maintenance windows into something more exciting. (Or at least, something that doesn’t keep you up at night.)
What’s your automation success story? Drop a comment below — I’d love to hear how you’re taming AEM’s complexity.
#!/bin/bash
# =============================================================================
# AEM Offline Revision Cleanup Script
# =============================================================================
# Purpose: Performs maintenance operations on Adobe Experience Manager's Oak
# repository, including checkpoint management and compaction.
#
# Source: https://experienceleague.adobe.com/en/docs/experience-manager-65/content/implementing/deploying/deploying/revision-cleanup#how-to-run-offline-revision-cleanup
#
# Version: 1.0.0
# Author: Giuseppe Baglio
# =============================================================================
# Color definitions for better readability
RED='\033[0;31m'
GREEN='\e[0;32m'
YELLOW='\e[0;33m'
BLUE='\e[0;34m'
NC='\033[0m' # No Color
# Default values for command line arguments
PORT=4502
CQ_HOME=""
OAK_RUN=""
SKIP_CONFIRMATION=false # Flag to skip user confirmation
DEBUG_MODE=false
# Function to display usage/help message
show_help() {
cat << EOF
Usage: $(basename "$0") [OPTIONS]
Performs offline revision cleanup for an AEM instance, including checkpoint management
and compaction of the Oak repository.
Options:
-h, --help Display this help message
-p, --port PORT Specify the AEM instance port (default: 4502)
-d, --aem-dir DIR Specify the AEM installation directory (required)
-o, --oak-run FILE Specify the path to oak-run.jar (required)
-y, --yes Skip confirmation prompts and proceed automatically
--debug Enable debug mode for verbose output
Example:
$(basename "$0") --port 4502 --aem-dir /path/to/aem --oak-run /path/to/oak-run.jar --yes
Requirements:
- Java 1.8 or higher
- lsof utility
- AEM must be completely shut down before running this script
- Adequate disk space for compaction
Note: Always maintain a backup before running this script.
EOF
exit 1
}
# Function to log messages with timestamp
log_message() {
local level=$1
shift
local message=$*
local timestamp
timestamp=$(date '+%Y-%m-%d %H:%M:%S')
case "$level" in
"INFO") printf "${GREEN}[INFO]${NC} ";;
"WARN") printf "${YELLOW}[WARN]${NC} ";;
"ERROR") printf "${RED}[ERROR]${NC} ";;
*) printf "[${level}] ";;
esac
printf "[%s] %s\n" "$timestamp" "$message"
}
# Function to handle debug output
debug_log() {
if [ "$DEBUG_MODE" = true ]; then
log_message "DEBUG" "$*"
fi
}
check_dependencies() {
local dependencies=("java" "lsof")
for dep in "${dependencies[@]}"; do
if ! command -v "$dep" > /dev/null 2>&1; then
log_message "ERROR" "'$dep' is not installed. Please install it and try again."
exit 1
fi
debug_log "Found required dependency: $dep"
done
}
# Function to parse command line arguments
parse_arguments() {
while [[ $# -gt 0 ]]; do
case $1 in
-h|--help)
show_help
;;
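-p|--port|-d|--aem-dir|-o|--oak-run)
# These options take a value; fail fast instead of looping forever on a failed 'shift 2'
if [[ $# -lt 2 ]]; then
log_message "ERROR" "Option $1 requires a value"
show_help
fi
case $1 in
-p|--port) PORT="$2" ;;
-d|--aem-dir) CQ_HOME="$2" ;;
-o|--oak-run) OAK_RUN="$2" ;;
esac
shift 2
;;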
-y|--yes)
SKIP_CONFIRMATION=true
shift
;;
--debug)
DEBUG_MODE=true
shift
;;
*)
log_message "ERROR" "Unknown option: $1"
show_help
;;
esac
done
# Validate required arguments
if [[ ! -d "$CQ_HOME" ]]; then
log_message "ERROR" "AEM directory (-d|--aem-dir) is required and must exist"
show_help
fi
# Validate oak-run.jar
if [[ ! -f "$OAK_RUN" ]]; then
log_message "ERROR" "oak-run.jar not found at '$OAK_RUN'"
exit 1
fi
# Validate port number
if [[ ! "$PORT" =~ ^[0-9]+$ ]] || [ "$PORT" -lt 1 ] || [ "$PORT" -gt 65535 ]; then
log_message "ERROR" "Invalid port number: $PORT (must be between 1 and 65535)"
exit 1
fi
}
# Function to run oak commands with proper error handling
run_oak_command() {
local command=$1
local store=$2
local args=${3:-""}
debug_log "Running oak command \"$command\" on $store with args \"$args\""
# $args is intentionally unquoted so that multiple arguments split into separate words
if ! java -Dtar.memoryMapped=true -jar "$OAK_RUN" "$command" "$CQ_REPO_PATH/$store" $args; then
log_message "ERROR" "Oak command failed: $command $store $args"
exit 1
fi
}
# Function to count processes listening on the AEM port (prints 0 when AEM is off)
check_aem_off() {
lsof -i tcp:"${PORT}" -sTCP:LISTEN | awk 'NR!=1 {print $2}' | sort -u | wc -l
}
# Function to prompt for user confirmation
wait_user_input() {
local prompt_message=${1:-"Do you wish to continue"}
local additional_info=$2
# If --yes flag was used, skip confirmation
if [ "$SKIP_CONFIRMATION" = true ]; then
log_message "INFO" "Skipping confirmation (--yes flag used)"
return 0
fi
# Display additional information if provided
if [ -n "$additional_info" ]; then
printf "\n%s\n" "$additional_info"
fi
# Keep asking until the answer is recognized; an invalid answer must not proceed
while true; do
printf "\n${YELLOW}%s [Y/n]${NC} " "$prompt_message"
read -r response
case $response in
"y"|"yes"|"Y"|"") # Empty response defaults to yes
return 0
;;
"n"|"N"|"no")
log_message "INFO" "Operation cancelled by user"
exit 0
;;
*)
log_message "ERROR" "Please answer 'y' for yes or 'n' for no (default: y)"
;;
esac
done
}
# Main script execution starts here
check_dependencies
parse_arguments "$@"
# Set repository paths
CQ_SEGMENTSTORE_HOME="crx-quickstart/repository"
CQ_REPO_PATH="$CQ_HOME/$CQ_SEGMENTSTORE_HOME"
# Validate repository path
if [ ! -d "$CQ_REPO_PATH" ]; then
log_message "ERROR" "Repository folder not found: $CQ_REPO_PATH"
exit 1
fi
# Check if AEM is running
if [ "$(check_aem_off)" -eq 0 ]; then
log_message "WARN" "[!] Always make sure you have a recent backup of the AEM instance."
log_message "INFO" "Starting offline revision cleanup process"
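# Process the segmentstore only: Adobe's documented procedure runs these oak-run
# commands against the segmentstore (the datastore is not a segment store)
for store in segmentstore; do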
log_message "INFO" "Processing $store..."
log_message "INFO" "Finding old checkpoints in $store"
wait_user_input
run_oak_command "checkpoints" "$store"
log_message "INFO" "Removing unreferenced checkpoints from $store"
wait_user_input
run_oak_command "checkpoints" "$store" "rm-unreferenced"
log_message "INFO" "Running compaction on $store"
wait_user_input
run_oak_command "compact" "$store"
log_message "INFO" "Completed processing $store"
done
log_message "INFO" "Revision cleanup completed successfully"
else
log_message "ERROR" "AEM is still running on port $PORT. Please shut it down first."
exit 1
fi