Edit: The SecToolMarket industry-wide scanner benchmark has been updated to include Arachni Framework v1.1. You can consult it for an independent comparison of all webappsec scanners. Hint: if you don’t care about SWF and VBScript, Arachni tops the scoreboard.
Arachni Framework v1.1 & WebUI v0.5.7 are out!
This is the first big release after the very successful v1.0 overhaul (which added HTML5/DOM/JS/AJAX support) and includes a great many bug fixes, optimizations and refinements for these new features.
So, let’s go over the most important changes.
More sensible defaults
Some defaults have been updated to be more sensible and reduce the system’s overzealousness:
- Request timeout: Lowered from 50 to 10 seconds.
- Response maximum size: Set to 500KB.
- Browser cluster:
  - Job timeout: Lowered from 120 to 15 seconds.
  - DOM depth limit: Lowered from 10 to 5.
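If the tighter limits prove too strict for a slow or JS-heavy target, the old behavior can be restored via the relevant options. The flag names and units below are assumptions based on the v1.x CLI, so verify them against `arachni --help` before relying on them:

```shell
# Hypothetical invocation restoring the old, more patient limits for a
# slow, JS-heavy target. Flag names and units (HTTP timeout assumed to be
# in milliseconds, browser job timeout in seconds) are assumptions about
# the v1.x CLI -- verify with `arachni --help`.
arachni http://example.com \
  --http-request-timeout=50000 \
  --http-response-max-size=1000000 \
  --browser-cluster-job-timeout=120 \
  --scope-dom-depth-limit=10
```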
New options

- --audit-parameter-names — Injects payloads into parameter names.
- --audit-with-extra-parameter — Injects payloads into an extra parameter.
SSL support

- --http-ssl-verify-peer — Verify SSL peer.
- --http-ssl-verify-host — Verify SSL host.
- --http-ssl-certificate — SSL certificate to use.
- --http-ssl-certificate-type — SSL certificate type.
- --http-ssl-key — SSL private key to use.
- --http-ssl-key-type — SSL key type.
- --http-ssl-key-password — Password for the SSL private key.
- --http-ssl-ca — File holding one or more certificates with which to verify the peer.
- --http-ssl-ca-directory — Directory holding multiple certificate files with which to verify the peer.
- --http-ssl-version — SSL version to use.
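Put together, a scan against a server that requires a client certificate might look like the sketch below; the host, file names, certificate type and password are placeholders, not values from this release:

```shell
# Hypothetical scan of an HTTPS target requiring mutual TLS; every path,
# certificate type and password here is a placeholder.
arachni https://example.com \
  --http-ssl-verify-peer \
  --http-ssl-verify-host \
  --http-ssl-certificate=client.pem \
  --http-ssl-certificate-type=pem \
  --http-ssl-key=client.key \
  --http-ssl-key-type=pem \
  --http-ssl-key-password=changeme \
  --http-ssl-ca=ca-bundle.pem
```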
Kerberos HTTP authentication
After a long discussion with a very helpful user, a very nice last-minute feature was implemented: support for Kerberos HTTP authentication.
This had more to do with building the packages to include all the necessary dependencies than with significant changes in Arachni itself, and after a few days of frustration everything was good to go.
The authentication process is a bit different than usual; you can find the relevant documentation in the Wiki.
Custom 404 detection overhaul
It came to my attention that there were certain edge-cases not covered by the previous 404 detection heuristics. Thus, it was time to overhaul that part of the system in order to cover those cases and allow the system to easily accommodate new ones.
Unfortunately, those edge cases used to lead to discovery checks yielding lots of false positives under certain circumstances, and even though the meta-analysis stage did flag those issues as untrusted, it was better to just address the root cause and be done with it.
So, if you used to get false positives from checks that look for files and directories, this update will fix that.
If you come across a case where you still get a false positive from a check that relies on this subsystem, please get in touch; it should be easy to fix just by adding one more training scenario, so that the system can learn the new webapp behavior.
Performance optimizations

The focus during the development of Framework v1.0 was, of course, the architecture surrounding the browser analysis. It was a conscious decision to provide a clean, scalable design and avoid clever optimizations at first, introducing them carefully and in small batches later.
So, here’s the first batch of optimizations:
- A shared, inter-browser, inter-scan HTTP resource disk cache has been enabled. The first time a browser hits a cacheable resource it will save it to disk, and subsequent browser calls (from any browser worker), as well as subsequent scans, won’t have to request it again.
- The browsers are now smarter and can determine when forms and cookies are actually involved in DOM processing, and thus avoid unnecessary (and costly) DOM audits.
- This wasn’t that hard for forms, just paying closer attention to their associated DOM events.
- This was a bit tricky for cookies, as they’re not DOM elements, so the browsers are using their JS data-flow tracing capabilities to determine whether a page is using cookies at each given DOM state.
- A lot of active checks have had their payloads optimized, resulting in significantly fewer injections.
- Updated default options are less aggressive than in previous versions.
All in all, the above will save thousands of HTTP requests (lots of which would be very slow, coming from the browsers) and thousands of browser DOM audit operations (and thus more thousands of HTTP requests).
You can expect scans to be many times faster, depending on web application characteristics.
Let’s use the http://testhtml5.vulnweb.com demo site to compare the performance difference between v1.0.6 and v1.1.
(Table: request counts and scan durations for the Defaults, With platforms and Crawl only configurations.)
Even though specifying platforms results in a significant reduction of HTTP requests, the duration doesn’t change that much because that particular site has a browser-focused workload, which acts as a bottleneck.
With platforms configuration: arachni http://testhtml5.vulnweb.com --platforms=python,nginx,nosql,linux
Crawl only configuration: arachni http://testhtml5.vulnweb.com --checks -
Even though the performance optimizations account for much of the reduced scan duration, lowering the default DOM depth limit from 10 to 5 contributes significantly as well.
The new default value of 5 should provide adequate coverage without wasting time on resources that will probably yield no results. So it’s sensible as a default value but you can of course change it as needed, depending on your paranoia level.
To keep this comparison fair, note that using the older default value with v1.1 results in a scan of 32,836 requests and a duration of 00:12:09.
Configuration: arachni http://testhtml5.vulnweb.com --scope-dom-depth-limit=10
If you were to take the time to do some manual recon and provide a more optimized configuration you’d get 19,342 requests and a duration of 00:02:39.
Configuration: arachni http://testhtml5.vulnweb.com --scope-dom-depth-limit=3 --platforms=python,nginx,nosql,linux
And if you have a few CPU cores to spare and know that file and directory discovery checks aren’t applicable, you’d get 5,858 requests and a duration of 00:01:32.
Configuration: arachni http://testhtml5.vulnweb.com --scope-dom-depth-limit=3 --platforms=python,nginx,nosql,linux --browser-cluster-pool-size=20 --checks=*,-common_*,-backup_*,-backdoors
As you can see, it pays off to get a bit familiar with Arachni and the web application you’re targeting prior to running a scan as there are plenty of opportunities to optimize.
Not done yet
These optimizations are just the first wave; more are on the way, some of which will have a similar impact on scan duration.
Put simply, now that the v1.x series is stable, we’re back into speed demon mode. :)
XML and JSON element support
A natural follow-up to adding real browser analysis is extracting XML and JSON inputs from requests and auditing them like any other element. The Framework can now do this for you.
There’s not much to say here really, auditing these elements is enabled by default and everything is automated.
The proxy plugin has also been updated to extract XML and JSON input vectors from HTTP requests, which means you can use Arachni to perform service scans by first training it via the plugin.
Soon enough there’ll be specialized service crawlers; until then, training the system via the proxy plugin should cover you.
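As a sketch of that workflow: start a scan with the proxy plugin enabled, route your API client through the proxy so the XML/JSON vectors get extracted, then let the audit proceed. The proxy port used below is an assumption, so check it against the proxy plugin’s documentation:

```shell
# Hypothetical service-scan workflow using the proxy plugin. The target
# URL is a placeholder and the proxy port is an assumption -- consult the
# proxy plugin's documentation for its actual options.
arachni http://api.example.com --plugin=proxy

# In a second terminal, drive the service through the proxy so Arachni
# can extract the XML/JSON input vectors from the traffic:
curl --proxy http://127.0.0.1:8282 \
  -H 'Content-Type: application/json' \
  -d '{"user":"test","id":1}' \
  http://api.example.com/lookup
```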
Checks

- unvalidated_redirect_dom — Logs DOM-based unvalidated redirects.
- xxe — Logs XML External Entity vulnerabilities.
- trainer — Disabled parameter flip for the payload to avoid parameter pollution.
- os_cmd_injection — Only use straight payload injection instead of straight and append.
- code_injection — Only use straight payload injection instead of straight and append.
- xss — When auditing links don’t require a tainted response for browser analysis.
  - Updated payloads.
  - Only use straight payload injection instead of straight and append.
- xss_dom_script_context — Only use straight payload injection instead of straight and append.
- xss_tag — Updated payloads to handle cases when more data are appended to the landed value.
- xss_event — Added proof to the issue.
- insecure_cross_domain_policy_access — Checks crossdomain.xml files for allow-access-from wildcard policies.
- insecure_cross_domain_policy_headers — Checks crossdomain.xml files for wildcard allow-http-request-headers-from policies.
- insecure_client_access_policy — Checks clientaccesspolicy.xml files for wildcard domain policies.
- insecure_cors_policy — Logs wildcard Access-Control-Allow-Origin headers per host.
- x_frame_options — Logs missing X-Frame-Options headers per host.
- common_directories — Added new entries.
- http_put — Try to DELETE the PUT file.
- html_objects — Updated regexp to use non-capturing groups.
Plugins

- vector_collector — Collects information about all seen input vectors which are within the scan scope.
- headers_collector — Collects response headers based on specified criteria.
- exec — Calls external executables at different scan stages.
- Added domain option.
- Fixed extension for html reporter.
- Added support for afr report type.
- proxy — Added XML and JSON input vector extraction.
MS Windows support
Good progress has been made on supporting MS Windows by leveraging the JRuby interpreter in order to run the system on the JVM.
The good news is that both the Framework and the WebUI can now run on JRuby; the bad news is that it’s not 100% ready yet, as there’s one last bug that needs to be addressed. Weirdly enough, that bug only occurs when running on MS Windows and appears to be in one of Arachni’s dependencies, so figuring it out is a bit tricky.
Still, I’ve come really close so I’m not ready to give up on it yet.
As usual, you can find links to the packages and detailed changelogs at the download page.