The Curious Case of “Deny” Logs in Palo Alto SSL Decryption: A Troubleshooting Journey

Traffic flow: AWS > firewall > reverse proxy > Palo Alto > load balancer > backend server

If you’ve ever stared at firewall logs that make no sense—especially when everything seems to be working fine—you’re not alone. Recently, I found myself knee-deep in a baffling issue with our Palo Alto Networks (PAN) firewall: despite all traffic being set to “allow,” the logs were filled with “deny” entries. Even stranger? The traffic in question was just health checks, and production was running smoothly. Here’s the story of how I unraveled this mystery—and what I learned along the way.

The Setup: SSL Decryption with a Wildcard Cert

Our environment relies on a PAN firewall to handle SSL decryption using a wildcard certificate. Behind it, a reverse proxy routes traffic, and we use AWS Route 53 health checks to monitor our web services. These checks hit our endpoints roughly every 30 seconds, looking for a specific return string in the response to confirm everything's operational.
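
For context, here's roughly what each of those checks boils down to on the wire: an HTTPS GET that only counts as healthy if the response contains the expected string. A minimal sketch with placeholder values (the real URL and match string are obviously different):

```python
# Rough stand-in for a Route 53 "HTTPS with string matching" health check.
# The URL and match string are placeholders, not our real configuration.
import urllib.request

CHECK_URL = "https://health.example.com/healthz"   # placeholder endpoint
MATCH_STRING = "SERVICE OK"                        # placeholder return string
TIMEOUT = 4  # seconds; Route 53 uses short per-request timeouts

def health_check(url: str = CHECK_URL) -> bool:
    """True if the endpoint returns 2xx and the body contains the string."""
    try:
        with urllib.request.urlopen(url, timeout=TIMEOUT) as resp:
            # Route 53 only searches the first 5,120 bytes of the body.
            body = resp.read(5120).decode("utf-8", errors="replace")
            return 200 <= resp.status < 300 and MATCH_STRING in body
    except OSError:
        return False

if __name__ == "__main__":
    print("healthy" if health_check() else "unhealthy")
```

One wrinkle worth remembering: Route 53 runs each health check from multiple checker locations, so the proxy and the firewall actually see quite a few more requests than "one every 30 seconds" suggests.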

Everything was running smoothly until I noticed something odd in the traffic logs: a flood of “deny” entries tied to these health check requests. But here’s the twist—despite the “deny” label, the security policies were set to “allow,” and our services were humming along without a hitch. No outages, no user complaints—just these confusing log entries staring back at me.

The Investigation Begins: Deny, But Allowed?

Naturally, I dove in to figure out what was going on. First stop: the security policies. I confirmed everything was set to allow the traffic—no sneaky rules blocking anything. To be extra sure, I disabled all security-related policies temporarily. Guess what? The “deny” logs kept piling up. This wasn’t a policy issue.
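
Since the policies checked out, the next useful step is slicing the logs themselves: export the traffic log to CSV and tally how the "deny" entries break down by rule and session end reason. A minimal sketch, assuming a standard traffic-log export; the column names are assumptions and may differ slightly between PAN-OS versions:

```python
# Tally "deny" traffic-log rows by rule and session end reason.
# Assumes a CSV export of the PAN traffic log; the column names below are
# assumptions, so adjust them to match your export.
import csv
from collections import Counter

LOG_CSV = "traffic_export.csv"          # placeholder path
ACTION_COL = "Action"                   # assumed column names
RULE_COL = "Rule"
END_REASON_COL = "Session End Reason"

def summarize_denies(path: str = LOG_CSV) -> Counter:
    counts = Counter()
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row.get(ACTION_COL, "").strip().lower() == "deny":
                key = (row.get(RULE_COL, "?"), row.get(END_REASON_COL, "?"))
                counts[key] += 1
    return counts

if __name__ == "__main__":
    for (rule, reason), n in summarize_denies().most_common(10):
        print(f"{n:6d}  rule={rule!r}  end_reason={reason!r}")
```

If the denies all land on the implicit interzone-default rule, or the end reasons turn out to be something decrypt- or aging-related rather than a straightforward policy match, that narrows the search considerably.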

Next, I reached out to PAN support. After some back-and-forth and a few packet captures, they floated a theory: TCP port reuse. Since all of this traffic funnels through the reverse proxy, it reaches the firewall from a single source IP, which means the proxy's ephemeral source ports recycle far faster than they would from a spread of clients. Their idea was that if a new connection reuses a 5-tuple while the previous session is still lingering in the firewall's table (in a discard or TIME_WAIT state), the new packets get matched against the stale session and logged as denied, which is also what the timeout tweaks they suggested next were aimed at.

Intrigued, I analyzed the traffic myself. Here’s where it got interesting—I didn’t see any source port reuse within a few minutes. If port reuse was the issue, I’d expect to see the same source ports cycling rapidly, but that wasn’t happening. So, was PAN’s hypothesis off the mark?
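
To test that hypothesis against the capture, the check amounts to looking for the same (source IP, source port) pair opening a new connection again within a short window. Here's roughly that check as a sketch, assuming a capture file named healthchecks.pcap and a five-minute window; both are placeholders:

```python
# Flag (source IP, source port) pairs that start a new TCP connection again
# within REUSE_WINDOW seconds, i.e. the pattern the port-reuse theory predicts.
# Requires tshark on the PATH; "healthchecks.pcap" is a placeholder filename.
import subprocess

PCAP = "healthchecks.pcap"
REUSE_WINDOW = 300  # seconds ("within a few minutes")

def find_port_reuse(pcap: str = PCAP, window: int = REUSE_WINDOW):
    # Pull only bare SYNs (new connection attempts) with their timestamps,
    # so retransmissions and data packets don't inflate the results.
    fields = subprocess.run(
        ["tshark", "-r", pcap,
         "-Y", "tcp.flags.syn == 1 && tcp.flags.ack == 0",
         "-T", "fields",
         "-e", "frame.time_epoch", "-e", "ip.src", "-e", "tcp.srcport"],
        capture_output=True, text=True, check=True,
    ).stdout
    last_seen, hits = {}, []
    for line in fields.splitlines():
        parts = line.split("\t")
        if len(parts) != 3 or not parts[1]:
            continue  # skip malformed or non-IPv4 rows
        ts, src, sport = float(parts[0]), parts[1], parts[2]
        key = (src, sport)
        if key in last_seen and ts - last_seen[key] < window:
            hits.append((key, round(ts - last_seen[key], 2)))
        last_seen[key] = ts
    return hits

if __name__ == "__main__":
    for (src, sport), gap in find_port_reuse():
        print(f"{src}:{sport} reused after {gap}s")
    # An empty result is the interesting outcome here: no rapid reuse.
```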

Tweaking Timeouts: A Shot in the Dark

PAN support suggested tweaking some session timeout parameters to see if it would help. Specifically, they recommended:

  • Reducing the TCP discard time from 90 seconds to 5 seconds
  • Dropping the TCP time wait from 15 seconds to 1 second

I made the changes and held my breath. No dice—the “deny” logs kept coming. At this point, I was starting to suspect this might be a phantom issue—a logging quirk rather than an actual traffic problem.

A Clue: AWS Route 53 Health Checks

One detail kept nagging at me: all the problematic requests were from AWS Route 53 health checks. These checks are frequent and routed through the reverse proxy, so I wondered if their volume or behavior was throwing the firewall for a loop.

But here’s the weird part—despite the “deny” logs, production traffic was completely unaffected. The health checks were succeeding, and our services stayed online. Was this just a logging glitch, or was there something deeper going on?

Possible Explanations: What’s Really Happening?

With another remote session with PAN support on the calendar, I started brainstorming potential causes. Here are the theories I'm chewing on:

  • SSL Decryption Oddity: Wildcard certificates can sometimes cause issues with SSL decryption. Maybe the firewall is misinterpreting the SNI (Server Name Indication) or the certificate chain for these health checks; there's a quick way to see what the checker is actually handed in the sketch after this list.
  • Reverse Proxy Shenanigans: Could the reverse proxy be closing and reopening connections in a way that confuses the firewall’s session tracking, even without obvious port reuse?
  • Logging Bug: Maybe this is a firmware glitch where allowed traffic is mistakenly logged as “deny.” I’ve checked our PAN-OS version against known issues, but nothing matches yet.
  • Health Check Overload: The frequent Route 53 checks might be overwhelming the firewall’s session table, leading to misclassified logs. But if so, why isn’t production traffic impacted?
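
On that first theory, one thing that can be verified from outside is exactly which certificate a client is handed when it connects with the health check's hostname as the SNI, since that's what Route 53 (and the decrypting firewall in the middle) actually sees. A minimal sketch, with health.example.com standing in for the real hostname:

```python
# Open a TLS connection with an explicit SNI and dump the certificate the
# client is handed -- useful for spotting wildcard/SAN mismatches on the
# decryption path. health.example.com is a placeholder hostname.
import socket
import ssl

HOST = "health.example.com"   # placeholder: the health check's FQDN
PORT = 443

def show_served_cert(host: str = HOST, port: int = PORT) -> None:
    ctx = ssl.create_default_context()
    # Keep verification on so a broken chain or name mismatch fails loudly.
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
            print("subject:", cert.get("subject"))
            print("SANs:   ", cert.get("subjectAltName"))
            print("issuer: ", cert.get("issuer"))
            print("expires:", cert.get("notAfter"))

if __name__ == "__main__":
    show_served_cert()
```

If the SANs on the returned certificate don't cover the health-check hostname, the wildcard/SNI theory suddenly looks a lot more plausible.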

What’s Next: Back to the Drawing Board

This week, I’ve got another remote session lined up with PAN support. We’ll likely dig deeper into packet captures, scrutinize the health check traffic, and maybe adjust the Route 53 intervals to see if that shakes anything loose. I’m also thinking about bypassing the reverse proxy temporarily to rule it out as a factor.
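
On the reverse-proxy question, one low-effort comparison before changing any routing is to send the same probe along the normal path and straight at the backend, then compare the results. A rough sketch, with placeholder URLs and the assumption that the backend is directly reachable from a test host:

```python
# Send the same probe via the normal (proxied) path and directly to the
# backend, then compare status and latency. Both URLs are placeholders and
# the "direct" one assumes the backend is reachable for testing.
import time
import urllib.request

PATHS = {
    "via proxy": "https://health.example.com/healthz",
    "direct":    "https://backend.internal.example.com/healthz",
}

def probe(url: str):
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status, time.monotonic() - start
    except OSError as exc:
        return f"error: {exc.__class__.__name__}", time.monotonic() - start

if __name__ == "__main__":
    for label, url in PATHS.items():
        status, elapsed = probe(url)
        print(f"{label:10s} status={status} time={elapsed:.3f}s")
```

If only the proxied path keeps generating "deny" entries, that points the finger at how the proxy handles its connections rather than at the health checks themselves.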

For now, I’m monitoring the logs and hoping for a lightbulb moment. While it’s not affecting production, it’s still an unsolved puzzle—and I’m not one to let those slide.

Lessons Learned (So Far)

This isn’t my first rodeo with quirky tech issues, but it’s reminded me of a few key takeaways:

  • Question Vendor Theories: PAN’s port reuse idea didn’t match my data, so I kept digging. Always verify suggestions against what you’re seeing.
  • Lean on Packet Captures: They’ve been a lifesaver for ruling out causes and will likely be the key to cracking this.
  • Patience Pays Off: Not every issue has an obvious fix right away. Troubleshooting is a marathon, not a sprint.

Your Turn: Seen This Before?

I’d love to hear from you—have you ever run into “deny” logs for traffic that’s clearly allowed? How did you sort it out? Drop a comment below and share your own troubleshooting tales. And keep an eye out—I’ll update this post once I’ve got more answers.
