The Trustwave SpiderLabs Research team is committed to making ModSecurity the best open source WAF possible. To this end, we have deployed Buildbot platforms and revamped regression tests for our different ports to ensure code quality and reliability. But we want to take it even further. The question is, how else can we improve ModSecurity development and support?
To best answer that question, we need some basic insight into the ModSecurity user community:
- How many ModSecurity deployments are there?
- What versions of ModSecurity are being used?
- How many users are running the latest version?
- What web server platform is ModSecurity used with?
With this sort of information, we could more easily prioritize issue resolution based on the number of users of each version or feature and deliver new features and fixes more quickly. To gather this sort of insight, today we are introducing a new real-time status reporting mechanism.
Solution – Status Reporting
We are introducing a new DNS-based reporting mechanism to ModSecurity with the inclusion of the "SecStatusEngine" directive. When enabled, this directive will send the following data to the ModSecurity Project team:
- Anonymous unique id for host
- Versions of:
  - Web Server Software
Once the HTTP server is started, ModSecurity gathers this "status" data and encodes it using Base32. Dots are inserted into the encoded string at regular intervals, and a suffix is then appended: the subdomain "status" of the domain "modsecurity.org". Below is an example of the information that is transmitted. Figure 1.a contains the information in plain text, and Figure 1.b contains the very same information encoded. As you can see, it looks like a normal DNS name, except for its length.
Here is an example of what is reported in the local web server log files upon ModSecurity startup:
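To make the encoding step concrete, here is a minimal Python sketch of it. It is not the ModSecurity source code: the function name, the 32-character spacing between dots, and the choice to drop Base32 "=" padding are all assumptions made for illustration. Only the Base32 encoding, the dots, and the "status.modsecurity.org" suffix come from the description above.

```python
import base64

STATUS_SUFFIX = "status.modsecurity.org"  # suffix named in the post
LABEL_LEN = 32  # assumed spacing between dots; DNS allows at most 63 per label


def build_status_hostname(status: str, label_len: int = LABEL_LEN) -> str:
    """Base32-encode the status string and shape it like a DNS name."""
    encoded = base64.b32encode(status.encode("utf-8")).decode("ascii")
    # '=' padding is not a valid hostname character, so this sketch drops it.
    encoded = encoded.rstrip("=")
    # Insert a dot every `label_len` characters.
    labels = [encoded[i:i + label_len] for i in range(0, len(encoded), label_len)]
    return ".".join(labels) + "." + STATUS_SUFFIX


# Status string taken from the sample startup log in this post.
status = ("2.7.7,Apache/2.4.4 (Unix),1.4.6/1.4.6, 8.32 /8.32 2012-11-30,"
          "Lua 5.1/(null),2.7.8/(null),96ce9ba3c2fb71f7a8bb92a88d560d44dbe459b8")
hostname = build_status_hostname(status)
print(hostname)
```

The result is a long but syntactically ordinary hostname, which is why a plain DNS lookup is enough to carry the data.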
[Mon Jan 20 10:55:22.000876 2014] [:notice] [pid 18231:tid 140735189168512] ModSecurity for Apache/2.7.7 (http://www.modsecurity.org/) configured.
[Mon Jan 20 10:55:22.000937 2014] [:notice] [pid 18231:tid 140735189168512] ModSecurity: APR compiled version="1.4.6"; loaded version="1.4.6"
[Mon Jan 20 10:55:22.000944 2014] [:notice] [pid 18231:tid 140735189168512] ModSecurity: PCRE compiled version="8.32 "; loaded version="8.32 2012-11-30"
[Mon Jan 20 10:55:22.000948 2014] [:notice] [pid 18231:tid 140735189168512] ModSecurity: LUA compiled version="Lua 5.1"
[Mon Jan 20 10:55:22.000951 2014] [:notice] [pid 18231:tid 140735189168512] ModSecurity: LIBXML compiled version="2.7.8"
[Mon Jan 20 10:55:22.001020 2014] [:notice] [pid 18231:tid 140735189168512] ModSecurity: StatusEngine call: "2.7.7,Apache/2.4.4 (Unix),1.4.6/1.4.6, 8.32 /8.32 2012-11-30,Lua 5.1/(null),2.7.8/(null),96ce9ba3c2fb71f7a8bb92a88d560d44dbe459b8"
[Mon Jan 20 10:55:22.089012 2014] [:notice] [pid 18231:tid 140735189168512] ModSecurity: StatusEngine call successfully submitted.
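The "StatusEngine call" payload in the log above is a comma-separated string. The sketch below splits it into named fields. The field order is inferred from the surrounding log lines, and the field names are labels chosen for this example rather than identifiers from ModSecurity. It also assumes that no individual value contains a comma.

```python
from typing import Dict

# Order inferred from the sample log; names are hypothetical labels.
FIELDS = ["modsecurity", "web_server", "apr", "pcre", "lua", "libxml", "host_id"]


def parse_status_call(payload: str) -> Dict[str, str]:
    """Split a StatusEngine payload into a field-name -> value mapping."""
    parts = [part.strip() for part in payload.split(",")]
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    return dict(zip(FIELDS, parts))


payload = ("2.7.7,Apache/2.4.4 (Unix),1.4.6/1.4.6, 8.32 /8.32 2012-11-30,"
           "Lua 5.1/(null),2.7.8/(null),96ce9ba3c2fb71f7a8bb92a88d560d44dbe459b8")
info = parse_status_call(payload)
for name, value in info.items():
    print(f"{name}: {value}")
```

Library entries appear to carry a "compiled/loaded" pair, which matches the APR and PCRE lines in the log.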
ModSecurity then performs a normal name resolution (DNS lookup) for a subdomain under ".status.modsecurity.org". DNS was selected because most organizations deploy their web servers behind a firewall with restrictive egress filtering that does not allow outbound HTTPS calls or any other kind of connection to the Internet. Although outbound connections from DMZs are heavily restricted, DNS queries are often allowed. We use a specially configured tinydns (DNS server) that responds to all subdomains under the "status.modsecurity.org" domain. Every time a query is received, the server responds with an IP address and saves the request in a log file. This log file is parsed by a script that decodes the Base32 information and gathers other information, such as the date and time of the request. Once the information is in plaintext, it is saved in a database.
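The decoding step on the collector side could look roughly like the following Python sketch. It is not the project's actual log-parsing script. The function name is hypothetical, and the sketch assumes the query name carries nothing but dot-separated Base32 text in front of the suffix, with "=" padding removed by the sender.

```python
import base64

STATUS_SUFFIX = ".status.modsecurity.org"


def decode_status_query(qname: str) -> str:
    """Recover the plaintext status string from a logged DNS query name."""
    name = qname.rstrip(".")  # tolerate a fully qualified trailing dot
    if not name.lower().endswith(STATUS_SUFFIX):
        raise ValueError("not a status query: " + qname)
    # Drop the suffix and the dots that were inserted between labels.
    body = name[: -len(STATUS_SUFFIX)].replace(".", "")
    # Restore the '=' padding that Base32 decoding expects.
    body += "=" * (-len(body) % 8)
    return base64.b32decode(body, casefold=True).decode("utf-8")


# Build a sample query name the way a sender might (32-character labels assumed).
sample = "2.7.7,Apache/2.4.4 (Unix)"
encoded = base64.b32encode(sample.encode("utf-8")).decode("ascii").rstrip("=")
qname = ".".join(encoded[i:i + 32] for i in range(0, len(encoded), 32)) + STATUS_SUFFIX
print(decode_status_query(qname))
```

Passing `casefold=True` lets the decoder accept lowercase names, which matters because DNS names are case-insensitive and resolvers may change their case in transit.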
Once the data is saved in our database, it is exported to a public web page through a JSON API. Since we don't have any significant data yet, the API works with very limited information: it simply returns the results gathered within a given time window, which the API user can choose. Figure 2 illustrates an example of how the JSON looks. In this particular example, the information between epoch (Unix format) 0 and 1390252267 was requested; this specific set of data was collected from internal tests. The URL used to generate this example was: http://status.modsecurity.org/api/0/1390252267. For more information about the API, have a look at the API documentation: https://github.com/SpiderLabs/ModSecurity-status/wiki/API-Documentation
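As a rough illustration of the time-window query, the sketch below builds the URL in the "/api/<start>/<end>" form shown in the example above and, optionally, fetches the result. The function names are hypothetical, and nothing is assumed about the structure of the returned JSON beyond it being valid JSON. Consult the API documentation for the actual response format.

```python
import json
from urllib.request import urlopen

API_BASE = "http://status.modsecurity.org/api"  # base URL from the example above


def status_api_url(start_epoch: int, end_epoch: int) -> str:
    """Build the URL for reports received between two Unix timestamps."""
    if start_epoch < 0 or end_epoch < start_epoch:
        raise ValueError("expected 0 <= start_epoch <= end_epoch")
    return f"{API_BASE}/{start_epoch}/{end_epoch}"


def fetch_status(start_epoch: int, end_epoch: int, timeout: float = 10.0):
    """Fetch and parse the JSON document for the given time window."""
    with urlopen(status_api_url(start_epoch, end_epoch), timeout=timeout) as resp:
        return json.load(resp)


# Only the URL is built here; call fetch_status() to actually query the service.
print(status_api_url(0, 1390252267))
```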
The geolocation data presented is based on the IP address of the DNS server that sent the final query to our servers. This information is not obtained from the server that is actually running ModSecurity. Assuming that the DNS server is reasonably close to the ModSecurity host (in order to avoid latency), it is possible to retrieve approximate geolocation information without exposing the server or the user.
Another concern for the ModSecurity Project team was deciding exactly what data to report. For instance, it is necessary to uniquely identify each server; otherwise we could not distinguish between a server that was restarted and a server that has just started to use ModSecurity. To do that without retrieving any information about the user, a SHA-1 hash is generated. Its input is information extracted from the server, such as the server name and the MAC address of the network device. To protect the user's data and privacy, the name and the MAC address are never transmitted in plaintext.
Even with these precautions, users may still be uncomfortable sharing information. For this reason, the feature is disabled by default. To participate and "opt in", users have to set SecStatusEngine On within their ModSecurity configuration files. If you use the modsecurity.conf-recommended file in the modsec_status GitHub branch, this setting is already enabled.
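The anonymous host identifier described above can be sketched in a few lines of Python. This is an illustration only: the post says the SHA-1 input includes the server name and a MAC address, but how ModSecurity combines them is not stated, so the "|" separator, the function names, and the use of `uuid.getnode()` to find a MAC address are all assumptions.

```python
import hashlib
import socket
import uuid


def host_fingerprint(server_name: str, mac_address: str) -> str:
    """Return a SHA-1 hex digest that identifies a host without naming it."""
    # The '|' separator is an arbitrary choice for this sketch.
    material = f"{server_name}|{mac_address}".encode("utf-8")
    return hashlib.sha1(material).hexdigest()


def local_mac_address() -> str:
    """Format this machine's hardware address as aa:bb:cc:dd:ee:ff."""
    node = uuid.getnode()  # may be a random value if no MAC address is found
    return ":".join(f"{(node >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


fingerprint = host_fingerprint(socket.gethostname(), local_mac_address())
print(fingerprint)
```

Because the same inputs always produce the same digest, a restarted server reports the same identifier, while the raw server name and MAC address never leave the host.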
Status Reporting Map
You are also invited to have a look at our status map. This last piece of the status reporting feature shows a live "heatmap" of servers that are using ModSecurity and can be found at http://status.modsecurity.org. Since we are in beta, we have just a few registrations. Figure 3 illustrates what we expect to have in the near future (the heatmap was randomly generated).
Figure 3. Example of how the data will be displayed on the http://status.modsecurity.org website.
We need help from the community in testing this new feature. Since the code is in a beta state, it has not been merged into ModSecurity mainline, but it is available in our GitHub repository in the branch "modsec_status". Don't forget to set SecStatusEngine to On. Once it is enabled, you should notice the log entries from SecStatusEngine in your error.log.
The code for all the components used by SecStatusEngine (web, scripts, etc.) is open source and published under the SpiderLabs GitHub repo: https://github.com/SpiderLabs/ModSecurity-status. Patches for ModSecurity itself are available in the ModSecurity git repo at: https://github.com/SpiderLabs/ModSecurity.
The next steps in developing this feature will be to add charts and a richer API with filters and more data. This will allow us to retrieve more valuable information from the raw data. But first we need a significant amount of raw data, because it is very hard to plot any chart without it.
This project is open source. So if you are interested in any particular data from this database, feel free to get your hands dirty and submit a patch or provide suggestions. Both are very welcome!