Sharon428931
WAF Protection Benchmark: SafeLine vs ModSecurity vs Coraza

Recently, I helped several clients evaluate security tools—and one recurring topic was WAFs (Web Application Firewalls).

WAFs are essential for blocking attacks like SQL injection, RCE, and XSS. But how do you know if a WAF actually works in real-world scenarios?

To answer that, I ran a comparative test of several open-source or free WAFs, using a consistent methodology and transparent test data.


How We Measured WAF Performance

The effectiveness of a WAF should be measured scientifically. We used four key metrics:

  • Detection Rate – The share of attacks that are caught (true positive rate).
  • False Positive Rate – How often legitimate traffic is incorrectly blocked.
  • Accuracy Rate – The share of all requests, malicious and legitimate, that are classified correctly.
  • Detection Latency – The time a WAF takes to process and respond to a request.

These were calculated using classic classification metrics:

Term | Meaning
TP   | Malicious requests correctly blocked
TN   | Legitimate requests correctly allowed
FP   | Legitimate requests incorrectly blocked
FN   | Malicious requests incorrectly allowed

Formulas used:

  • Detection Rate = TP / (TP + FN)
  • False Positive Rate = FP / (TP + FP) (strictly speaking, this is the share of blocked requests that were actually legitimate, sometimes called the false discovery rate; it is the formula all results below use)
  • Accuracy Rate = (TP + TN) / (TP + TN + FP + FN)

To characterize latency, we report the 90th (P90) and 99th (P99) percentiles rather than averages, since tail latency matters most for user-facing traffic.
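As a sanity check, the formulas and the percentile definition above can be sketched in a few lines of Python (a minimal illustration; the function names are my own):

```python
def waf_metrics(tp, tn, fp, fn):
    """Compute the three rates used in this benchmark.

    Note: "false positive rate" here follows the article's formula
    FP / (TP + FP), i.e. the share of *blocked* requests that were
    actually legitimate.
    """
    detection_rate = tp / (tp + fn)
    false_positive_rate = fp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return detection_rate, false_positive_rate, accuracy


def latency_percentile(latencies_ms, pct):
    """Nearest-rank percentile of a list of per-request latencies."""
    ordered = sorted(latencies_ms)
    idx = max(0, round(pct / 100 * len(ordered)) - 1)
    return ordered[idx]


# Example: plugging in SafeLine's counts from the results below
dr, fpr, acc = waf_metrics(tp=426, tn=33056, fp=38, fn=149)
print(f"{dr:.2%} {fpr:.2%} {acc:.2%}")  # 74.09% 8.19% 99.44%
```

Feeding each WAF's TP/TN/FP/FN counts through `waf_metrics` reproduces the percentages reported in the result sections.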


Sample Data

All tests were run using open tools and publicly available data:

  • Normal traffic (white samples): 60,707 HTTP requests from real forum browsing (~2.7GB)
  • Attack traffic (black samples): 600 curated payloads gathered over 5 hours, using:
    • DVWA + common attack scenarios
    • Payloads from PortSwigger
    • VulHub targets + classic PoCs
    • DVWA with increasing protection levels (med/high)

The ratio of normal to attack traffic was roughly 100:1, reflecting realistic internet traffic.


Test Setup

  • Web Server: Nginx, returns 200 OK to any request.
location / {
    return 200 'hello WAF!';
    default_type text/plain;
}
  • WAF Config: All WAFs tested using default configurations, with no custom tuning.
  • Testing Tool: Custom script that:
    • Parses Burp Suite exports
    • Deletes cookies and sets Host headers
    • Mixes normal and malicious traffic
    • Determines whether the WAF blocked a request by checking whether the backend's HTTP 200 came through
    • Outputs all metric calculations automatically
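The replay loop at the heart of that script can be sketched roughly as follows (a simplified illustration, not the actual script; the request-dict shape and helper names are my assumptions):

```python
import urllib.error
import urllib.request


def classify_response(status_code):
    """The backend always returns 200, so any other status means the WAF blocked."""
    return "allowed" if status_code == 200 else "blocked"


def replay(requests, target, host_header):
    """Replay parsed requests through the WAF and tally TP/TN/FP/FN.

    `requests` is a list of dicts with 'method', 'path', 'headers', 'body',
    and 'is_attack' -- the shape a Burp-export parser might produce.
    """
    tally = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for r in requests:
        # Strip cookies and pin the Host header, as the test methodology describes
        headers = {k: v for k, v in r["headers"].items() if k.lower() != "cookie"}
        headers["Host"] = host_header
        req = urllib.request.Request(target + r["path"], data=r.get("body"),
                                     headers=headers, method=r["method"])
        try:
            with urllib.request.urlopen(req, timeout=5) as resp:
                status = resp.status
        except urllib.error.HTTPError as e:
            status = e.code  # blocked requests typically come back as 403
        blocked = classify_response(status) == "blocked"
        if r["is_attack"]:
            tally["TP" if blocked else "FN"] += 1
        else:
            tally["FP" if blocked else "TN"] += 1
    return tally
```

The resulting tally feeds directly into the metric formulas from the previous section.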

Test Results

SafeLine WAF

TP: 426    TN: 33056    FP: 38    FN: 149
Detection Rate: 74.09%
False Positive Rate: 8.19%
Accuracy Rate: 99.44%
90% Latency: 0.73 ms
99% Latency: 0.89 ms

Coraza

TP: 404    TN: 27912    FP: 5182    FN: 171
Detection Rate: 70.26%
False Positive Rate: 92.77%
Accuracy Rate: 84.10%
90% Latency: 3.09 ms
99% Latency: 5.10 ms

ModSecurity

TP: 400    TN: 25713    FP: 7381    FN: 175
Detection Rate: 69.57%
False Positive Rate: 94.86%
Accuracy Rate: 77.56%
90% Latency: 1.36 ms
99% Latency: 1.71 ms

Baota WAF

TP: 224    TN: 32998    FP: 96    FN: 351
Detection Rate: 38.96%
False Positive Rate: 30.00%
Accuracy Rate: 98.67%
90% Latency: 0.53 ms
99% Latency: 0.66 ms

nginx-lua-waf

TP: 213    TN: 32619    FP: 475    FN: 362
Detection Rate: 37.04%
False Positive Rate: 69.04%
Accuracy Rate: 97.51%
90% Latency: 0.41 ms
99% Latency: 0.49 ms

SuperWAF

TP: 138    TN: 33048    FP: 46    FN: 437
Detection Rate: 24.00%
False Positive Rate: 25.00%
Accuracy Rate: 98.57%
90% Latency: 0.34 ms
99% Latency: 0.41 ms

Summary Table

WAF           | False Negatives | False Positives | Accuracy Rate | P90 Latency
SafeLine      | 149             | 38              | 99.44%        | 0.73 ms
Coraza        | 171             | 5182            | 84.10%        | 3.09 ms
ModSecurity   | 175             | 7381            | 77.56%        | 1.36 ms
Baota         | 351             | 96              | 98.67%        | 0.53 ms
nginx-lua-waf | 362             | 475             | 97.51%        | 0.41 ms
SuperWAF      | 437             | 46              | 98.57%        | 0.34 ms

Final Thoughts

  • SafeLine WAF delivered the best balance of high detection accuracy and low false positives.
  • ModSecurity and Coraza caught a comparable share of attacks, but with default rulesets their false positives were far too high for production use.
  • Simpler WAFs like Baota, nginx-lua-waf, and SuperWAF were fast and light, but missed a large portion of attacks.

Reminder: These tests reflect only one set of samples, tools, and environments. Real-world performance can vary significantly. Always test in your own environment before deploying.

Want to test these yourself? I’ll be open-sourcing the full dataset and testing scripts soon. Follow me to stay updated.


Join the SafeLine Community
