# 🚨 Fix 502 Bad Gateway Error - Cloudflare
## Error Details
- **Domain**: chitfund.deepteklabs.com
- **Error**: 502 Bad Gateway
- **Time**: 2025-11-06 00:26:51 UTC
- **Status**: Browser ✅ | Cloudflare ✅ | Origin Server ❌
---
## 🎯 What This Means
- **Your browser** is working
- **Cloudflare** (CDN) is working
- **Your origin server** is NOT responding, or the PM2 processes are down
---
## 🔍 IMMEDIATE DIAGNOSIS
SSH into your server and run these checks:
```bash
ssh luckychit@192.168.8.148
# 1. Check if PM2 processes are running
pm2 status
# 2. Check if ports are listening (use `ss -tulpn` if netstat is not installed)
netstat -tulpn | grep -E '(3000|8080)'
# 3. Check if server is responding locally
curl http://localhost:3000/health
curl http://localhost:8080
# 4. Check PM2 logs for errors
pm2 logs --lines 50
```
---
## 🚀 QUICK FIXES (Try in Order)
### Fix 1: Restart PM2 Processes
```bash
pm2 restart all
pm2 status
```
### Fix 2: PM2 Processes Are Down - Start Them
```bash
pm2 start luckychit-api
pm2 start luckychit-frontend
pm2 status
```
### Fix 3: PM2 Lost Configuration - Recreate
```bash
cd /home/luckychit/apps/chitfund
# Start backend
cd backend
pm2 start src/server.js --name luckychit-api
# Start frontend
cd ../luckychit
pm2 serve build/web 8080 --name luckychit-frontend --spa
# Save configuration
pm2 save
pm2 status
```
### Fix 4: Server Rebooted - Restore PM2
```bash
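# Restore the process list written by the last `pm2 save`
pm2 resurrect
# To make this automatic on future reboots, generate the boot script
# (run the command it prints), then save the current process list
pm2 startup
pm2 save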
```
### Fix 5: Firewall Blocking Cloudflare
```bash
sudo ufw status
sudo ufw allow 3000/tcp
sudo ufw allow 8080/tcp
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw reload
```
---
## 🔧 DETAILED TROUBLESHOOTING
### Step 1: Check PM2 Status
```bash
pm2 status
```
**Expected Output:**
```
┌────────────────────┬────┬─────────┬──────┬────────┐
│ App name │ id │ status │ PID │ memory │
├────────────────────┼────┼─────────┼──────┼────────┤
│ luckychit-api │ 0 │ online │ 1234 │ 50 MB │
│ luckychit-frontend │ 1 │ online │ 5678 │ 30 MB │
└────────────────────┴────┴─────────┴──────┴────────┘
```
**If status is "stopped" or "errored":**
```bash
pm2 logs --lines 100 # Check for errors
pm2 restart all # Restart processes
```
**If processes don't exist:**
```bash
# Recreate them (see Fix 3 above)
```
---
### Step 2: Check if Ports are Listening
```bash
netstat -tulpn | grep -E '(3000|8080)'
```
**Expected Output:**
```
tcp6 0 0 :::3000 :::* LISTEN 1234/node
tcp6 0 0 :::8080 :::* LISTEN 5678/node
```
**If nothing shows up:**
- Processes are not running
- Start them with Fix 3
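
If `netstat` is not available, the same check can be done from bash itself. The following is a minimal sketch, assuming a bash build with `/dev/tcp` support and the coreutils `timeout` command; `port_open` and `report_ports` are hypothetical helper names, not part of PM2 or this project.

```bash
#!/usr/bin/env bash
# port_open HOST PORT: returns 0 if a TCP connection succeeds, non-zero otherwise.
# Hypothetical helper; relies on bash's built-in /dev/tcp redirection.
port_open() {
  local host="$1" port="$2"
  # timeout guards against hosts that silently drop packets
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# report_ports: print the state of the two ports this guide uses.
report_ports() {
  local port
  for port in 3000 8080; do
    if port_open localhost "$port"; then
      echo "port $port: listening"
    else
      echo "port $port: closed"
    fi
  done
}
```

Calling `report_ports` on the server prints one line per port. A "closed" result for either port points back to the PM2 processes rather than to Cloudflare.
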
---
### Step 3: Test Local Connectivity
```bash
# Test backend
curl http://localhost:3000/health
# Expected: {"status":"ok"...}
# Test frontend
curl http://localhost:8080
# Expected: HTML content
# Test from server IP
curl http://192.168.8.148:3000/health
curl http://192.168.8.148:8080
```
**If these work but domain doesn't:**
- Issue is with Cloudflare configuration
- See Cloudflare section below
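
The comparison between local and domain responses can be scripted. This is a rough sketch, assuming `curl` is installed; `http_code` and `locate_fault` are hypothetical names, and treating any 2xx or 3xx status as healthy is an assumption made for illustration.

```bash
#!/usr/bin/env bash
# http_code URL: print the HTTP status code, or 000 if the connection fails.
http_code() {
  curl -s -o /dev/null -m 5 -w '%{http_code}' "$1"
}

# locate_fault LOCAL_CODE DOMAIN_CODE: print where the failure most likely sits.
locate_fault() {
  local local_code="$1" domain_code="$2"
  if [[ "$local_code" != [23]* ]]; then
    echo "origin"                  # app does not answer even on the server itself
  elif [[ "$domain_code" != [23]* ]]; then
    echo "cloudflare-or-network"   # app is up locally, the path from Cloudflare is broken
  else
    echo "ok"
  fi
}
```

Example use on the server: `locate_fault "$(http_code http://localhost:8080)" "$(http_code https://chitfund.deepteklabs.com)"`.
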
---
### Step 4: Check PM2 Logs
```bash
pm2 logs luckychit-api --lines 50
pm2 logs luckychit-frontend --lines 50
```
**Look for errors like:**
- Database connection errors
- Port already in use
- Module not found
- Crash/exception messages
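
Scanning a long log by eye is error-prone, so the error types listed above can be matched with `grep`. This is a sketch only: `scan_log` is a hypothetical helper, and the patterns are common Node.js error strings that may not cover everything this backend can emit.

```bash
#!/usr/bin/env bash
# scan_log FILE: print one line per known failure signature found in FILE.
# The patterns are common Node.js error strings; extend them for your stack.
scan_log() {
  local file="$1"
  if grep -qE 'ECONNREFUSED|ETIMEDOUT' "$file"; then
    echo "connection error (database or another service unreachable)"
  fi
  if grep -q 'EADDRINUSE' "$file"; then
    echo "port already in use"
  fi
  if grep -qE 'Cannot find module|MODULE_NOT_FOUND' "$file"; then
    echo "module not found (missing npm install?)"
  fi
  if grep -qE 'uncaughtException|UnhandledPromiseRejection' "$file"; then
    echo "unhandled exception or promise rejection"
  fi
}
```

Assuming PM2's default log directory has not been changed, the backend's error log can be scanned with `scan_log ~/.pm2/logs/luckychit-api-error.log`.
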
---
### Step 5: Check Firewall
```bash
sudo ufw status numbered
```
**You should see:**
```
[1] 22/tcp ALLOW IN Anywhere
[2] 3000/tcp ALLOW IN Anywhere
[3] 8080/tcp ALLOW IN Anywhere
[4] 80/tcp ALLOW IN Anywhere
[5] 443/tcp ALLOW IN Anywhere
```
**If ports are missing:**
```bash
sudo ufw allow 3000/tcp
sudo ufw allow 8080/tcp
sudo ufw reload
```
---
### Step 6: Check Server Resources
```bash
# Check disk space
df -h
# If disk is full (100%), clear logs (pm2 flush empties PM2's log files)
# Check memory
free -h
# If memory is full, restart processes
# Check CPU
top
# Press 'q' to quit
```
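
The disk and memory checks can be reduced to numbers and compared against limits. The sketch below assumes a Linux host with `/proc/meminfo`; the function names are hypothetical, and the limits (90% disk, 200 MiB free memory) are illustrative values, not recommendations from this project.

```bash
#!/usr/bin/env bash
# disk_used_pct [MOUNT]: used percentage of a filesystem as a bare number.
disk_used_pct() {
  df -P "${1:-/}" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# mem_available_mb: MemAvailable from /proc/meminfo in MiB (Linux only).
mem_available_mb() {
  awk '/^MemAvailable:/ { print int($2 / 1024) }' /proc/meminfo
}

# check_thresholds DISK_PCT MEM_MB: print a warning per exceeded limit.
# The limits (90% disk, 200 MiB memory) are illustrative assumptions.
check_thresholds() {
  local disk="$1" mem="$2"
  if (( disk >= 90 )); then echo "disk ${disk}% used"; fi
  if (( mem < 200 )); then echo "only ${mem} MiB memory available"; fi
}
```

Run it as `check_thresholds "$(disk_used_pct /)" "$(mem_available_mb)"`; no output means both values are within the assumed limits.
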
---
## ☁️ CLOUDFLARE CONFIGURATION
### Check DNS Settings
In Cloudflare dashboard:
1. Go to DNS settings
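2. Verify the A record points to the server's **public** IP address. Note that **192.168.8.148** is a private (RFC 1918) LAN address: Cloudflare cannot reach it directly, so the origin must be exposed through port forwarding on the router or through a Cloudflare Tunnel
3. Check "Proxy status":
   - 🟠 **Proxied** (through Cloudflare) - Recommended
   - **DNS only** (direct) - Bypass Cloudflare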
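
Cloudflare can only proxy to an origin it can reach over the public internet, so it helps to confirm that the address entered in the A record is not a private one. A minimal sketch, assuming IPv4 only; `is_private_ipv4` is a hypothetical helper that tests the three RFC 1918 ranges.

```bash
#!/usr/bin/env bash
# is_private_ipv4 ADDR: returns 0 if ADDR lies in an RFC 1918 private range
# (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16), 1 if not, 2 if ADDR is malformed.
is_private_ipv4() {
  local a b c d
  IFS=. read -r a b c d <<< "$1"
  [[ "$a" =~ ^[0-9]+$ && "$b" =~ ^[0-9]+$ ]] || return 2
  # 10# forces base 10 so octets with leading zeros are not read as octal
  if (( 10#$a == 10 )); then return 0; fi
  if (( 10#$a == 172 && 10#$b >= 16 && 10#$b <= 31 )); then return 0; fi
  if (( 10#$a == 192 && 10#$b == 168 )); then return 0; fi
  return 1
}
```

For example, `is_private_ipv4 192.168.8.148 && echo "private address: needs a tunnel or port forwarding"` prints the warning for the address used throughout this guide.
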
### Check Origin Rules
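If using Cloudflare Tunnels or custom ports:
1. For custom ports, review the Origin Rules in the zone's **Rules** section; for Tunnels, check the tunnel's public hostname settings in the Zero Trust dashboard
2. Check the origin server settings
3. Make sure they point to the correct IP and port. Cloudflare's proxy only accepts traffic on a fixed list of ports: 8080 is on that list, 3000 is not, so the API on port 3000 needs an Origin Rule or a Tunnel to be reachable through a proxied hostname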
### SSL/TLS Settings
1. Go to **SSL/TLS** tab
2. Set the encryption mode:
   - **Flexible** if the origin has no SSL
   - **Full** if the origin has SSL

**For your setup (no nginx), use "Flexible"**

---
## 🆘 EMERGENCY RECOVERY
### Complete PM2 Reset
```bash
# Kill all PM2 processes
pm2 kill
# Navigate to project
cd /home/luckychit/apps/chitfund
# Start backend
cd backend
pm2 start src/server.js --name luckychit-api
# Verify backend
curl http://localhost:3000/health
# Start frontend
cd ../luckychit
pm2 serve build/web 8080 --name luckychit-frontend --spa
# Verify frontend
curl http://localhost:8080
# Save configuration
pm2 save
# Setup auto-start
pm2 startup systemd -u luckychit --hp /home/luckychit
# (Run the command it outputs)
pm2 save
# Check status
pm2 status
```
### Server Reboot (Last Resort)
```bash
# Save PM2 list first
pm2 save
# Reboot
sudo reboot
# After reboot, SSH back in
ssh luckychit@192.168.8.148
# Check if PM2 auto-started
pm2 status
# If not:
pm2 resurrect
```
---
## 🔍 COMMON CAUSES
### 1. Server Rebooted
**Symptom**: PM2 processes not running
**Fix**: `pm2 resurrect` or restart manually
### 2. Out of Memory
**Symptom**: Processes killed by system
**Fix**: `free -h` to check, restart processes
### 3. Database Down
**Symptom**: Backend crashes on startup
**Fix**: Check PostgreSQL: `sudo systemctl status postgresql`
### 4. Port Conflict
**Symptom**: "Port already in use" in logs
**Fix**: Kill process using port or change port
### 5. Firewall Rule Changed
**Symptom**: Can't connect to ports
**Fix**: Re-add firewall rules
### 6. Cloudflare Configuration Changed
**Symptom**: 502 only on domain, not IP
**Fix**: Check Cloudflare DNS and SSL settings
---
## 📋 POST-FIX CHECKLIST
After fixing, verify:
- [ ] `pm2 status` shows both processes **online**
- [ ] `curl http://localhost:3000/health` returns JSON
- [ ] `curl http://localhost:8080` returns HTML
- [ ] `curl http://192.168.8.148:3000/health` works
- [ ] `curl http://192.168.8.148:8080` works
- [ ] Domain `chitfund.deepteklabs.com` loads in browser
- [ ] No errors in `pm2 logs`
- [ ] Login works
- [ ] `pm2 save` executed (for persistence)
---
## 🎯 PREVENTION
### Setup Monitoring
```bash
# Check PM2 status regularly
pm2 status
# Auto-restart on crash is PM2's default behaviour, so no extra flag is needed
# (only --no-autorestart turns it off). Confirm the app is registered:
pm2 describe luckychit-api
# Setup uptime monitoring
# Consider using: UptimeRobot, Pingdom, or StatusCake
```
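
PM2 restarts a process that exits, but not one that is running and no longer answering requests. A small watchdog run from cron can cover that case. The following is a sketch under stated assumptions: the script, its function names, and the retry values are hypothetical, and it assumes `curl` and `pm2` are on the `PATH` of the user running it.

```bash
#!/usr/bin/env bash
# Hypothetical watchdog: restart the API if /health fails several times in a row.
HEALTH_URL="${HEALTH_URL:-http://localhost:3000/health}"
APP_NAME="${APP_NAME:-luckychit-api}"

# healthy: returns 0 if the endpoint answers without an HTTP error within 5 seconds.
healthy() {
  curl -fsS -m 5 -o /dev/null "$HEALTH_URL" 2>/dev/null
}

# watchdog [ATTEMPTS] [DELAY]: retry the check, restart via PM2 only if all attempts fail.
watchdog() {
  local attempts="${1:-3}" delay="${2:-5}" i
  for (( i = 1; i <= attempts; i++ )); do
    if healthy; then
      echo "healthy"
      return 0
    fi
    # Wait between attempts, but not after the last one
    if (( i < attempts )); then sleep "$delay"; fi
  done
  echo "unhealthy after ${attempts} attempts, restarting ${APP_NAME}"
  pm2 restart "$APP_NAME"
}
```

To use it, add a final line calling `watchdog` and schedule the script with cron, for example every five minutes. Retrying before restarting avoids bouncing the API on a single slow response. Cron runs with a minimal `PATH`, so the full path to `pm2` may be needed there.
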
### Setup Alerts
Consider PM2 Plus for monitoring:
```bash
pm2 plus
# Follow setup instructions
```
---
## 📞 DIAGNOSTIC SCRIPT
Save this as `diagnose-502.sh`:
```bash
#!/bin/bash
echo "🔍 Diagnosing 502 Error"
echo "======================="
echo ""
echo "1. PM2 Status:"
pm2 status
echo ""
echo "2. Listening Ports:"
netstat -tulpn | grep -E '(3000|8080)'
echo ""
echo "3. Test Backend (localhost):"
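# -f makes curl exit non-zero on HTTP errors, not only on connection failures
curl -sf http://localhost:3000/health || echo "❌ Backend not responding"
echo ""
echo "4. Test Frontend (localhost):"
# Capture first: in "curl | head || echo" the fallback tests head's exit status, not curl's
if html="$(curl -sf http://localhost:8080)"; then
  echo "$html" | head -n 5
else
  echo "❌ Frontend not responding"
fi
echo ""
echo "5. Test Backend (IP):"
curl -sf http://192.168.8.148:3000/health || echo "❌ Backend not responding on IP"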
echo ""
echo "6. Firewall Status:"
sudo ufw status | grep -E '(3000|8080)'
echo ""
echo "7. Disk Space:"
df -h | grep -E '(Filesystem|/$)'
echo ""
echo "8. Memory Usage:"
free -h
echo ""
echo "9. Recent PM2 Logs (Errors):"
pm2 logs --err --lines 20 --nostream
echo ""
echo "✅ Diagnosis complete!"
```
Run it:
```bash
chmod +x diagnose-502.sh
./diagnose-502.sh
```
---
## 🎉 SUMMARY
**502 Bad Gateway = Origin server not responding**

**Quick Fix:**
1. SSH into server
2. Run `pm2 status`
3. If down: `pm2 restart all`
4. If missing: Recreate processes (see Fix 3)
5. Test: `curl http://localhost:3000/health`
6. Save: `pm2 save`

**Most Common Cause**: Server rebooted and PM2 didn't auto-start

**Best Fix**: Run `pm2 resurrect` or restart manually

---
Need help? Run the diagnostic script and share the output!