Service Errors

This guide covers common service errors encountered with EPMware Agent, including diagnostic procedures and resolution steps.

Common Service Errors

Connection Errors

Error: Connection Refused

Symptoms:

ERROR: Connection refused to epmware-server.com:443
java.net.ConnectException: Connection refused

Causes: - EPMware server is down - Firewall blocking connection - Incorrect server URL or port - Network connectivity issues

Resolution:

Verify server is accessible:

# Test connectivity
ping epmware-server.com

# Test port
telnet epmware-server.com 443

# Test with curl
curl -I https://epmware-server.com

Check firewall rules:

# Linux
sudo iptables -L | grep 443

# Windows
netsh advfirewall firewall show rule name=all | findstr 443

Verify agent configuration:
```
grep "ew.portal.url" agent.properties
```

Test from different network location:

# Use traceroute to identify network issues
traceroute epmware-server.com

Error: Connection Timeout

Symptoms:

ERROR: Connection timeout after 30000ms
java.net.SocketTimeoutException: Read timed out

Resolution:

Increase timeout in agent.properties:

agent.connection.timeout=60000
agent.read.timeout=60000

Check for proxy requirements:

# Set proxy if needed
export HTTP_PROXY=http://proxy.company.com:8080
export HTTPS_PROXY=http://proxy.company.com:8080

Optimize network route:

# Check network latency
ping -c 100 epmware-server.com | grep avg

Authentication Errors

Error: 401 Unauthorized

Symptoms:

ERROR: HTTP 401 Unauthorized
Authentication failed: Invalid token

Resolution:

Verify token is correct:

# Check token in config
grep "ew.portal.token" agent.properties

# Ensure no extra spaces or characters
cat agent.properties | od -c | grep token

Regenerate token:
Log into EPMware
Navigate to Security → Users
Generate new token for agent user
Update agent.properties
Check user permissions:
Verify agent user is active
Confirm user has required roles
Check no IP restrictions

Error: 403 Forbidden

Symptoms:

ERROR: HTTP 403 Forbidden
Access denied to resource

Resolution:

Verify user permissions in EPMware:
Agent user needs appropriate security class
Check application access rights
Verify server configuration access

Check SSL certificate issues:

# Test SSL connection
openssl s_client -connect epmware-server.com:443

Java/JVM Errors

Error: OutOfMemoryError

Symptoms:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

Resolution:

Increase heap size in ew_target_service.sh:

# Original
java -jar epmware-agent.jar --spring.config.name=agent

# Modified with increased memory
java -Xms512m -Xmx2g -jar epmware-agent.jar --spring.config.name=agent

Monitor memory usage:

# Check current usage
jstat -gc $(pgrep -f epmware-agent)

# Generate heap dump for analysis
jmap -dump:format=b,file=heap.bin $(pgrep -f epmware-agent)

Error: ClassNotFoundException

Symptoms:

java.lang.ClassNotFoundException: com.epmware.agent.Main

Resolution:

Verify JAR file integrity:

# Check JAR is not corrupted
jar tf epmware-agent.jar | grep Main

# Verify checksum if provided
sha256sum epmware-agent.jar

Check Java classpath:

# Ensure all required JARs are present
ls -la *.jar

# Verify Java version compatibility
java -version

File System Errors

Error: Permission Denied

Symptoms:

ERROR: Permission denied: /home/user/logs/agent.log
java.io.FileNotFoundException: Permission denied

Resolution:

Fix file permissions:

# Check current permissions
ls -la logs/

# Fix permissions
chmod 755 logs
chmod 644 logs/*.log

# Fix ownership
chown -R agent_user:agent_group ~/

Check SELinux (Linux):

# Check if SELinux is blocking
getenforce

# Temporarily disable to test
sudo setenforce 0

# Add permanent exception if needed
sudo semanage fcontext -a -t user_home_t "/home/agent(/.*)?"
sudo restorecon -Rv /home/agent

Error: No Space Left on Device

Symptoms:

ERROR: No space left on device
java.io.IOException: No space left on device

Resolution:

Check disk space:

# Check overall disk usage
df -h

# Find large files
du -sh /* | sort -hr | head -20

# Check agent logs size
du -sh logs/

Clean up logs:

# Archive old logs
tar -czf logs-$(date +%Y%m%d).tar.gz logs/*.log

# Remove old logs
find logs/ -name "*.log" -mtime +30 -delete

Implement log rotation:

# Add to crontab
0 0 * * * find /home/agent/logs -name "*.log" -mtime +7 -delete

Service Management Errors

Error: Service Fails to Start

Symptoms: - Scheduled task shows "Failed" - systemd service won't start - No process created

Resolution:

Check for existing process:

# Linux
ps -ef | grep epmware-agent

# Windows
Get-Process java | Where {$_.CommandLine -like "*epmware*"}

Review system logs:

# Linux systemd
journalctl -xe | grep epmware

# Windows Event Log
Get-EventLog -LogName Application -Source "EPMware*"

Verify prerequisites:

# Check Java
java -version

# Check required files
ls -la ew_target_service.sh epmware-agent.jar agent.properties

Error: Service Keeps Restarting

Symptoms: - Service restarts every few minutes - Multiple PIDs in short time - Logs show repeated startup attempts

Resolution:

Check restart configuration:

# systemd - Increase restart delay
RestartSec=60
StartLimitBurst=3
StartLimitIntervalSec=300

Investigate crash cause:

# Check for core dumps
find / -name "core.*" 2>/dev/null

# Review error logs
grep -E "ERROR|FATAL|Exception" logs/agent.log

Application-Specific Errors

Error: EPMLCM-13000 (Planning)

Symptoms:

EPMLCM-13000: Service currently not available

Resolution:

Verify Planning application provisioning:
Check user is provisioned to Planning application
Verify application is running
Test with EPM Automate or Workspace

Check Planning services:

# On Planning server
./startPlanning.sh status

Error: HFM Registry Not Found

Symptoms:

ERROR: Cannot find reg.properties file
HFM import/deployment fails

Resolution:

Copy reg.properties to correct location:

copy D:\Oracle\Middleware\user_projects\config\foundation\11.1.2.0\reg.properties ^
     D:\Oracle\Middleware\user_projects\epmsystem1\config\foundation\11.1.2.0\

Verify file permissions:

icacls D:\Oracle\Middleware\user_projects\epmsystem1\config\foundation\11.1.2.0\reg.properties

Error Diagnosis Tools

Log Analysis Script

Create a script to analyze agent logs for errors:

analyze-errors.sh:

#!/bin/bash

LOG_DIR="logs"
REPORT_FILE="error-report-$(date +%Y%m%d).txt"

echo "EPMware Agent Error Analysis Report" > $REPORT_FILE
echo "Generated: $(date)" >> $REPORT_FILE
echo "==========================================" >> $REPORT_FILE
echo >> $REPORT_FILE

# Count errors by type
echo "Error Summary:" >> $REPORT_FILE
echo "--------------" >> $REPORT_FILE
grep -h ERROR $LOG_DIR/*.log | cut -d' ' -f5- | sort | uniq -c | sort -rn | head -20 >> $REPORT_FILE

echo >> $REPORT_FILE
echo "Recent Errors (Last 24 hours):" >> $REPORT_FILE
echo "-------------------------------" >> $REPORT_FILE
find $LOG_DIR -name "*.log" -mtime -1 -exec grep ERROR {} \; | tail -20 >> $REPORT_FILE

echo >> $REPORT_FILE
echo "Connection Issues:" >> $REPORT_FILE
echo "------------------" >> $REPORT_FILE
grep -h "Connection\|Timeout\|refused" $LOG_DIR/*.log | tail -10 >> $REPORT_FILE

echo >> $REPORT_FILE
echo "Memory Issues:" >> $REPORT_FILE
echo "--------------" >> $REPORT_FILE
grep -h "OutOfMemory\|heap" $LOG_DIR/*.log | tail -10 >> $REPORT_FILE

echo "Report saved to: $REPORT_FILE"
cat $REPORT_FILE

Health Check Script

health-check.sh:

#!/bin/bash

echo "=== EPMware Agent Health Check ==="
echo

# Check 1: Process Running
echo -n "Agent Process: "
if pgrep -f epmware-agent > /dev/null; then
    echo "✓ Running (PID: $(pgrep -f epmware-agent))"
else
    echo "✗ Not Running"
fi

# Check 2: Recent Polls
echo -n "Recent Polls: "
LAST_POLL=$(tail -1 logs/agent-poll.log 2>/dev/null | awk '{print $1, $2}')
if [ -n "$LAST_POLL" ]; then
    echo "✓ Last poll: $LAST_POLL"
else
    echo "✗ No recent polls"
fi

# Check 3: Error Count
echo -n "Recent Errors: "
ERROR_COUNT=$(find logs -name "*.log" -mmin -60 -exec grep ERROR {} \; 2>/dev/null | wc -l)
if [ $ERROR_COUNT -eq 0 ]; then
    echo "✓ No errors in last hour"
else
    echo "⚠ $ERROR_COUNT errors in last hour"
fi

# Check 4: Disk Space
echo -n "Disk Space: "
DISK_USE=$(df -h . | tail -1 | awk '{print $5}' | sed 's/%//')
if [ $DISK_USE -lt 80 ]; then
    echo "✓ ${DISK_USE}% used"
else
    echo "⚠ ${DISK_USE}% used (cleanup recommended)"
fi

# Check 5: Memory Usage
echo -n "Memory Usage: "
if pgrep -f epmware-agent > /dev/null; then
    MEM=$(ps aux | grep epmware-agent | grep -v grep | awk '{print $4}')
    echo "✓ ${MEM}% of system memory"
fi

echo
echo "=== Health Check Complete ==="

Recovery Procedures

Emergency Restart Procedure

Kill all agent processes:
```
pkill -9 -f epmware-agent
```
Clean up temporary files:
```
rm -f /tmp/epmware-*
rm -f ~/agent.pid
```

Archive problematic logs:

mkdir -p logs/archive
mv logs/*.log logs/archive/

Start with debug mode:

export AGENT_DEBUG=true
./ew_target_service.sh

Rollback Procedure

If new agent version causes issues:

Stop current agent

Backup current configuration:

cp agent.properties agent.properties.backup

Restore previous version:

cp epmware-agent-old.jar epmware-agent.jar

Restart agent
Monitor for stability

Monitoring for Errors

Automated Error Detection

Set up automated monitoring for critical errors:

#!/bin/bash
# monitor-errors.sh - Run via cron every 5 minutes

CRITICAL_PATTERNS="OutOfMemory|Connection refused|Authentication failed|FATAL"
ALERT_EMAIL="admin@company.com"

# Check for critical errors
ERRORS=$(grep -E "$CRITICAL_PATTERNS" logs/agent.log | tail -5)

if [ -n "$ERRORS" ]; then
    echo "$ERRORS" | mail -s "EPMware Agent Critical Error on $(hostname)" $ALERT_EMAIL
fi

Best Practices

Implement comprehensive logging at appropriate levels
Set up proactive monitoring for common errors
Document error patterns specific to your environment
Create runbooks for common error scenarios
Test recovery procedures regularly
Maintain error knowledge base for your team
Review logs regularly even when no issues are reported

Service Errors

Common Service Errors

Connection Errors

Error: Connection Refused

Error: Connection Timeout

Authentication Errors

Error: 401 Unauthorized

Error: 403 Forbidden

Java/JVM Errors

Error: OutOfMemoryError

Error: ClassNotFoundException

File System Errors

Error: Permission Denied

Error: No Space Left on Device

Service Management Errors

Error: Service Fails to Start

Error: Service Keeps Restarting

Application-Specific Errors

Error: EPMLCM-13000 (Planning)

Error: HFM Registry Not Found

Error Diagnosis Tools

Log Analysis Script

Health Check Script

Recovery Procedures

Emergency Restart Procedure

Rollback Procedure

Monitoring for Errors

Automated Error Detection

Best Practices

Related Topics