
Arachni::Browser for DOM/AJAX analysis

Hey folks,

As some of you may know, work has begun on v0.5 (branch/milestone) which, among many other cool things, will include proper support for DOM/AJAX analysis for both coverage/discovery and audit. I have made some progress on this and, in fact, I’ve got the hardest part all sorted out.

That part is the Browser (powered by WATIR and PhantomJS), which is responsible for analyzing each page, traversing the DOM tree and capturing AJAX requests in order to extract more inputs/resources from them.

Explaining this would probably bore you half to death so let me show you instead.

Given a page which does DOM manipulation via JS and AJAX like:
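(The original example markup isn’t reproduced here; the sketch below is an approximation based on the walkthrough that follows. Apart from the level1 div, the GET /level2 call and writeButton(), all IDs, URLs and helper names are assumptions.)

    <!DOCTYPE html>
    <html>
        <body>
            <!-- Populated by the GET /level2 AJAX call once the page loads. -->
            <div id="level1">Loading...</div>

            <!-- Populated later on by a deeper AJAX call (assumed name). -->
            <div id="level3"></div>

            <script>
                // Tiny AJAX helper: GET a URL and hand the response body to a callback.
                function get( url, callback ) {
                    var xhr = new XMLHttpRequest();
                    xhr.onload = function() { callback( xhr.responseText ); };
                    xhr.open( 'GET', url );
                    xhr.send();
                }

                // Called via the onmouseover of the link which GET /level2 returns
                // (something like: <a href="#" onmouseover="writeButton();">Level 2</a>);
                // it adds a button to the DOM.
                function writeButton() {
                    document.getElementById( 'level1' ).innerHTML +=
                        '<button onclick="writeLevel3();">Level 3</button>';
                }

                // Clicking that button fires yet another AJAX call and populates
                // the last level.
                function writeLevel3() {
                    get( '/level3', function( html ) {
                        document.getElementById( 'level3' ).innerHTML = html;
                    });
                }

                // Once the page has loaded, populate the level1 div with whatever
                // GET /level2 returns.
                get( '/level2', function( html ) {
                    document.getElementById( 'level1' ).innerHTML = html;
                });
            </script>
        </body>
    </html>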

First of all, once this page is loaded it will populate the level1 div with whatever the GET /level2 AJAX call returns, and that creates a problem: even if you use a browser to fire events and capture an HTML version of the DOM, you won’t be able to just load that code again and get the same page. And that’s the general problem with client-side dynamic content: it’s all about state.

Arachni::Browser works by traversing all available DOM states (DOM tree crawling, in essence) and taking a page snapshot after each event is triggered, capturing both data (for easy analysis later on, so that you won’t have to use the expensive browser again) and state (in the form of “transitions”, so that you’ll be able to replay them later on and get the same page).
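Conceptually, each snapshot boils down to something like this (a sketch of the idea only, not the framework’s actual classes or attribute names):

    # data:  the evaluated DOM captured as HTML, cheap to analyze offline.
    # state: the ordered "transitions" needed to replay this exact DOM state.
    snapshot = {
        body:        '<html>...the evaluated DOM...</html>',
        transitions: [
            { 'http://test.com/'       => :request },
            { 'http://test.com/level2' => :request },
            { :page                    => :load    }
        ]
    }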

The way this gets done currently (well, not exactly; to make the audit more efficient the framework schedules this differently, but the results are the same) is roughly the following.
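(A sketch only; Arachni::Browser is real, but the method and attribute names below, like #explore_and_flush and #dom, are assumptions based on the description above rather than the exact v0.5 API.)

    require 'arachni'

    browser = Arachni::Browser.new

    # Load the page under analysis.
    browser.load 'http://test.com/'

    # Crawl the DOM tree: trigger every available event on every element and
    # take a page snapshot (data + transitions) after each one.
    pages = browser.explore_and_flush

    pages.each do |page|
        puts page.dom.depth                  # how many steps it took to get here
        puts page.dom.transitions.inspect    # how to replay it
        puts page.body                       # the evaluated DOM as HTML
    end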

Let’s look at the first (just after the page is loaded), middle (after a few events have been triggered) and last (after we’ve reached the last event) transitions (there are a total of 9 but it’s too much crap to display here).

First

First page snapshot, right after it has been loaded.

DOM depth

We didn’t do anything to get to this state, other than load the page, so it has a depth of 1.

Transitions

We see the initial HTTP request (to get the page) and the AJAX request that immediately follows; once those have completed, the page load is complete.
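The listing conveys something along these lines (illustrative only, not the tool’s actual output format):

    # Nothing has been triggered yet, only loading took place.
    [
        { 'http://test.com/'       => :request },  # initial HTTP request for the page
        { 'http://test.com/level2' => :request },  # the AJAX call fired during the load
        { :page                    => :load    }   # ...after which the page load completed
    ]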

DOM body

To make analysis easier, the evaluated DOM is captured as HTML as well.
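For this first snapshot it looked roughly like the following (again an approximation, using the assumed markup from the sketch above):

    <div id="level1">
        <!-- Returned by the GET /level2 AJAX call; its onmouseover handler is
             what gets triggered next. -->
        <a href="#" onmouseover="writeButton();">Level 2</a>
    </div>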

The AJAX call populated the level1 div with the above HTML.

Middle

A link that resulted from the first AJAX call gets onmouseovered.

DOM depth

The DOM depth is 2, as we first loaded the page (1) and then triggered an event which became available after the load (2).

Transitions

We see the event and the affected element.
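Roughly (same caveats as before, this is illustrative only):

    # Everything from the first snapshot, plus the event and the affected element.
    [
        { 'http://test.com/'               => :request     },
        { 'http://test.com/level2'         => :request     },
        { :page                            => :load        },
        { '<a href="#" onmouseover="...">' => :onmouseover }
    ]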

DOM body

The triggered event called the writeButton() method which added a button to the DOM.
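Something along these lines (assumed markup again):

    <div id="level1">
        <a href="#" onmouseover="writeButton();">Level 2</a>
        <!-- Added by writeButton() as a result of the onmouseover above. -->
        <button onclick="writeLevel3();">Level 3</button>
    </div>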

Last

After a bunch of stuff happens, we get to one of the deepest states.

DOM depth

This page has a depth of 3 as you’ll see from the transitions.

Transitions
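(The original listing isn’t reproduced here; roughly, it builds on the previous one with one more event, which is what pushes the depth to 3.)

    # Load (1), the onmouseover that exposed the button (2) and the click (3),
    # which fired one last AJAX call.
    [
        { 'http://test.com/'               => :request     },
        { 'http://test.com/level2'         => :request     },
        { :page                            => :load        },
        { '<a href="#" onmouseover="...">' => :onmouseover },
        { '<button onclick="...">'         => :onclick     },
        { 'http://test.com/level3'         => :request     }
    ]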

DOM body

All levels have been populated after a given series of events, which were bound to take place since the DOM tree and its states get crawled completely.
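Roughly (assumed markup, as with the previous snippets):

    <div id="level1">
        <a href="#" onmouseover="writeButton();">Level 2</a>
        <button onclick="writeLevel3();">Level 3</button>
    </div>

    <div id="level3">
        <!-- Returned by the GET /level3 call that the button click fired. -->
        ...
    </div>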

And best of all, those cool page snapshots are just normal Framework pages, which means that they can be audited and analyzed just as easily. Even better, this sort of full browser analysis takes place in parallel to the normal (dumb) audit, so its latency will be completely concealed during the scan (given that there are some normal elements to be audited and that the site is not just JS; otherwise you will notice it).

Pretty cool right?


About Tasos Laskos

CEO of Sarosys LLC, founder and lead developer of Arachni.

One Response to "Arachni::Browser for DOM/AJAX analysis"

  • Louis Nadeau
    August 5, 2013 - 5:16 pm

    Pretty cool indeed!
