Extract all request URLs from a session

Hi all,

Is there an easy way to export all request URLs from a mitmproxy session?
I know I can copy and paste the URLs from the flow view, and I could probably do something with a file of saved flows.

Since copy and paste no longer scales, I’m looking into other options. While thinking about how to filter URLs out of a flows file, I also wondered whether this could be a standard option in mitmproxy.

Curious if I am missing something and/or what your opinions are.

The easiest way to do this is to use mitmproxy’s insanely flexible scripting abilities to process a saved dumpfile. First, save your flows to a file from mitmproxy (“w” shortcut key). Now put the following in a file called script.py:

def request(ctx, flow):
    print flow.request.url

Now run the script over the dump file using mitmdump:

mitmdump -q -s script.py -r mydumpfile

The -q flag here tells mitmdump to suppress its normal output. The script is run on each flow in turn, and just outputs the request URL. The result is that all the URLs are printed to screen, one per line.

Thank you very much for your excellent answer!

In this context: are there any plans for a mutt-like way of executing macros from within mitmproxy?
I’m thinking of a customizable shortcut that executes a macro/script which runs several internal and external actions (e.g. save the current flows, extract the URLs from the flows, delete all flows from the view).

So, at the moment the best we can do is external script invocation on a specific flow (| shortcut key), and on all flows “live” through the -s command-line argument.

We now have a marking system for flows, and it would be 100% natural to extend this to allow scripts to be run on all marked flows, all flows, etc. I’d also like to have a better way to manage scripts (maybe a .mitmproxy/scripts directory that’s presented by default when choosing scripts, rather than having to specify a specific file). All of these things together would be a macro-like system on steroids, and let you do almost anything. Please do let me know if you’d like to help push this along - we have a lot of other things competing for our mitmproxy time at the moment. :)

For anyone else reading this in 2019, the mitmproxy API has changed slightly, and the ctx parameter is no longer passed through. Just remove it from your method signature and you’ll be ok.

You may also need to add parentheses around the argument to print if you’re using Python 3, where print is a function: print(flow.request.url)
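
Putting those two changes together, script.py might look roughly like the sketch below. This assumes a newer mitmproxy release running on Python 3, where the request hook receives only the flow; I haven't checked it against every version, so treat it as a starting point rather than the official way.

```python
# script.py: a sketch for newer mitmproxy releases on Python 3.
# Assumption: the request hook is called with the flow only (no ctx).
def request(flow):
    # Print the full request URL, one per line, as in the original script.
    print(flow.request.url)
```

It should be run the same way as before: mitmdump -q -s script.py -r mydumpfile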