Exciting new features and optimizations can now be found in the nightlies and I’d like to take some time to go over the biggest changes.
A new executable is now available, called arachni_reproduce, that lets you reproduce all issues in a report and then create a new report containing only the issues that still exist.
So, if you’ve got an Arachni report and are working to fix all the identified issues, you can just pass that report to arachni_reproduce and get immediate feedback as to how you’re doing instead of having to rerun a full scan.
For each run, arachni_reproduce will generate a new report that only includes unfixed issues, so, again, you won’t have to waste time testing issues that you’ve already fixed.
In addition to that, you can specify individual issues to be reproduced, based on their digest, if you only care about particular issues rather than the entire report.
Lastly, during the reproduction of each issue, extra HTTP request headers are set that contain information about which issue is being reproduced, allowing you to set up server-side debugging or instrumentation in order to make fixing them even easier:
- X-Arachni-Issue-Replay-Id: Unique token for requests pertaining to individual issues.
  - Differs for each run and can be used to group the requests for each issue together.
- X-Arachni-Issue-Seed: Initial payload used to identify the vulnerability in the given report.
- X-Arachni-Issue-Digest: Digest uniquely identifying each issue across scans and reports.
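To give an idea of what "server-side instrumentation" could look like, here is a sketch of a small Rack middleware that tags your application logs with these headers. The header names are the ones listed above (appearing in the Rack env with the usual `HTTP_` prefix); the middleware itself is a hypothetical example, not part of Arachni:

```ruby
# Hypothetical Rack middleware: logs the Arachni reproduction headers
# for each request, so server-side log lines can be correlated with
# the issue currently being replayed.
class ArachniReplayLogger
  HEADERS = {
    'HTTP_X_ARACHNI_ISSUE_REPLAY_ID' => 'replay_id',
    'HTTP_X_ARACHNI_ISSUE_SEED'      => 'seed',
    'HTTP_X_ARACHNI_ISSUE_DIGEST'    => 'digest'
  }.freeze

  def initialize(app, logger = $stderr)
    @app    = app
    @logger = logger
  end

  def call(env)
    # Collect whichever reproduction headers are present on this request.
    tags = HEADERS.map do |key, label|
      value = env[key]
      "#{label}=#{value}" if value
    end.compact

    @logger.puts("[arachni] #{tags.join(' ')}") unless tags.empty?
    @app.call(env)
  end
end
```

In a Rails app you would enable it with `config.middleware.use ArachniReplayLogger`; in plain Rack, with `use ArachniReplayLogger` in config.ru.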
Debugging Rack-based webapps (Ruby-on-Rails, Sinatra etc.)
Followers of this blog may remember an old post showcasing an IAST project I was working on some time ago, called the Arachni Introspector.
Based on the same principles, a subset of that functionality has now been introduced as a new project that lets you debug Ruby web applications like Ruby-on-Rails, Sinatra and anything else that is based on the Rack framework.
Eventually, I hope to combine the two and make the Introspector's functionality available, which would be pretty cool because, last time I checked, there are no other IAST systems for Ruby web applications.
New HTML parser
Arachni has so far relied on Nokogiri for HTML parsing; in my search for a faster and more lightweight alternative I came across an optimized XML/HTML parser called Ox, and the results really speak for themselves.
For a huge page — about 1MB of HTML code:
Nokogiri::HTML 8.323 (±12.0%) i/s - 42.000 in 5.082606s
Ox 49.934 (±14.0%) i/s - 245.000 in 5.007568s
For a large page — about 43KB of HTML code:
Nokogiri::HTML 349.181 (±18.9%) i/s - 1.664k in 5.032479s
Ox 1.682k (± 6.4%) i/s - 8.415k in 5.026887s
For a medium page — about 10KB of HTML code:
Nokogiri::HTML 1.315k (±18.2%) i/s - 6.324k in 5.068736s
Ox 8.283k (± 2.8%) i/s - 41.871k in 5.059968s
For a small page — about 364B of HTML code:
Nokogiri::HTML 15.095k (± 4.0%) i/s - 76.128k in 5.050787s
Ox 209.697k (± 2.1%) i/s - 1.055M in 5.032398s
i/s means iterations per second, and as you can see Ox is hugely faster, allowing Arachni to use fewer resources overall and to handle huge pages (very rare, but it happens) with relative ease.
In addition to the above, Arachni now builds its own document tree using SAX, allowing it to ignore irrelevant HTML elements and only store interesting ones like forms and links, while not wasting valuable resources on junk.
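The selective-storage idea is easy to demonstrate. Arachni does this through Ox's SAX interface; the toy sketch below uses a crude regex scanner instead of Ox purely to illustrate the principle: element events are handled one at a time, and only the interesting ones are kept.

```ruby
# Toy illustration of SAX-style selective parsing: instead of building
# a full document tree, the handler receives element events as they are
# encountered and stores only the elements it cares about. This is NOT
# Arachni's actual code (which uses Ox's SAX API); it only shows the idea.
require 'strscan'

class SelectiveHandler
  INTERESTING = %w[form a input].freeze

  attr_reader :elements

  def initialize
    @elements = []
  end

  # Called for every opening tag; uninteresting elements are discarded,
  # so they never occupy memory.
  def start_element(name, attributes)
    return unless INTERESTING.include?(name)
    @elements << { name: name, attributes: attributes }
  end
end

# Minimal event producer: scans the markup and fires start_element
# events, never holding more than one tag at a time.
def sax_scan(html, handler)
  scanner = StringScanner.new(html)
  while scanner.scan_until(/<([a-zA-Z]+)([^>]*)>/)
    name  = scanner[1].downcase
    attrs = scanner[2].scan(/([\w-]+)="([^"]*)"/).to_h
    handler.start_element(name, attrs)
  end
end

handler = SelectiveHandler.new
sax_scan('<div><form action="/login"><input name="user"></form>' \
         '<span>junk</span><a href="/home">Home</a></div>', handler)
handler.elements.each { |e| puts "#{e[:name]} #{e[:attributes]}" }
```

The `<div>` and `<span>` elements are dropped on the spot; only the form, input and link survive into the "tree".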
Lastly, I’d like to sincerely thank Peter Ohler (the Ox developer and maintainer) for his assistance and remarkable willingness to accommodate my need for some unusual features.
Continuing with the spirit of optimization, the analysis techniques have been updated to support reading and processing HTTP responses as streams, chunks and chunks of lines. This allows the system to process the data as soon as they become available, while making the storage of full HTTP responses mostly unnecessary.
For example, say there’s a page that returns a response body that’s fairly large, like 700KB.
When all the inputs of this page are audited, the system will have to store (by default) about 100 HTTP responses in memory at any given time; at 700KB each, that comes to at least 70MB of RAM, which is a considerable amount.
In addition to the above, the response bodies may need to be parsed into HTML document trees or have some other expensive operation performed upon them, thus making the overall amount of memory needed to perform the security checks explode — pretty rare but it happens.
Now, HTTP responses are buffered and processed in easily digestible chunks, resulting in only small parts of the entire responses being stored in memory at any given time.
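The "chunks of lines" mode can be sketched with a small line buffer: raw chunks arrive from the HTTP client as they are read off the socket, and the analyzer sees complete lines immediately, without the full body ever being held in memory. This is an illustrative model, not Arachni's actual internals:

```ruby
# Sketch of line-buffered streaming: feed in raw network chunks,
# get back complete lines as soon as they are available. Only the
# current partial line is ever kept in memory.
class LineBuffer
  def initialize(&on_line)
    @buffer  = +''
    @on_line = on_line
  end

  # Feed a raw chunk; emit every complete line it finishes.
  def <<(chunk)
    @buffer << chunk
    while (line = @buffer.slice!(/\A[^\n]*\n/))
      @on_line.call(line.chomp)
    end
  end

  # Flush whatever is left once the response is done.
  def finish
    @on_line.call(@buffer) unless @buffer.empty?
    @buffer = +''
  end
end

seen   = []
buffer = LineBuffer.new { |line| seen << line }

# Simulate chunks arriving from the network:
buffer << "<html>\n<body"
buffer << ">\n<form action=\"/x\">"
buffer << "</form>\n"
buffer.finish
```

Each security check can then run against the lines as they stream in, instead of waiting for (and storing) the entire 700KB body.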
The browser operations have been profiled to hell and back and received a large amount of optimizations in order to improve the handling of large rich web applications with thousands of elements and thousands of events.
Like with the HTTP responses in the section above, elements and their events are now extracted and processed in batches to keep memory utilization low. In addition, operations pertaining to their processing and filtering have been further delegated to the custom JS browser environment, since a large part of that work was previously done in Ruby.
This has had a dramatic effect on RAM and CPU usage when it comes to browser operations.
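The batching idea itself is simple and can be sketched in a few lines of Ruby (a toy model, not Arachni's browser code; the element counts and batch size are made up):

```ruby
# Toy model of batched event processing: element/event pairs are
# generated lazily and consumed in fixed-size batches, so peak memory
# stays bounded no matter how many pairs the page produces.
elements = (1..10_000).lazy.map { |i| "element-#{i}" }
events   = %w[click mouseover submit]

# 30,000 pairs in total, but never all materialized at once.
pairs = elements.flat_map { |el| events.map { |ev| [el, ev] } }

peak = 0
pairs.each_slice(500) do |batch|
  peak = [peak, batch.size].max
  # ... trigger each event of the batch inside the browser here ...
end

puts "largest batch held in memory: #{peak} pairs"
```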
To further ease Arachni’s integration into your SDLC, you can now use the new webhook_notify plugin (courtesy of Sean Handley):
Name: Webhook notify
Sends a webhook payload over HTTP/HTTPS at the end of the scan.
Valid payload variables to use in the payload:
* $SEED$ -- Unique seed used for the scan.
* $URL$ -- Targeted URL.
* $MAX_SEVERITY$ -- Maximum severity of the identified issues.
* $ISSUE_COUNT$ -- Amount of identified issues.
* $DURATION$ -- Scan duration.
[~] url - Webhook URL (fully qualified including scheme)
[~] Type: url
[~] Required?: true
[~] content_type - Content type of payload (XML or JSON).
[~] Type: multiple_choice
[~] Choices: json, xml
[~] Default: json
[~] Required?: true
[~] payload - Either XML or JSON payload. Must be well-formed and valid. You can interpolate variables with $VARIABLE_NAME$.
[~] Type: string
[~] Required?: true
Author: Sean Handley <email@example.com>, Tasos Laskos <firstname.lastname@example.org>
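For example, a minimal JSON payload using these variables might look like the following sketch (the field names are my own invention; only the $VARIABLE$ placeholders come from the plugin):

```json
{
  "scan_url":     "$URL$",
  "seed":         "$SEED$",
  "max_severity": "$MAX_SEVERITY$",
  "issue_count":  "$ISSUE_COUNT$",
  "duration":     "$DURATION$"
}
```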
All the above goodies can be found in the nightlies, so please give them a try and let me know if you run across any issues.
Do keep in mind though that the nightlies are not considered stable and will in fact probably be quite unstable indeed, which is why I need your feedback.