I want to export my mitmproxy dump files (captured with the -w switch) to JSON.
I wrote a script that does this, more or less:
from mitmproxy import flow
import sys
import jsonpickle

items = []
with open(sys.argv[1], "rb") as logfile:
    freader = flow.FlowReader(logfile)
    try:
        # Collect every flow stored in the dump file
        for f in freader.stream():
            items.append(f)
    except Exception as e:
        print("Flow file corrupted: {}".format(e))

# Serialize the raw Python objects (this is what produces the noisy output)
print(jsonpickle.encode(items))
This works, but (as expected) it dumps the internal Python objects. The output looks like this:
[[{"py/object":"mitmproxy.models.http.HTTPFlow","server_conn":{"py/object":"mitmproxy.models.connections.ServerConnection","server_certs":[],"via":null,"protocol":null,"timestamp_tcp_setup":1473286806.433876,"cert":{"py/object":"netlib.certutils.SSLCert","x509":{"py/object":"OpenSSL.crypto.X509","_x509":{"py/object":"_cffi_backend.CDataGCP"}}},"timestamp_ssl_setup":1473286806.68306,"_TCPClient__source_address":{"py/object":"netlib.tcp.Address","family":2,"address":{"py/tuple":["1...
I’d love to find a better way to dump the files, one that gets me closer to the raw HTTP request/response and further from the internal Python details. Perhaps export to a HAR file and then convert that to JSON? I’m not much of a Python programmer, so I’m unsure how to get started inspecting the objects and pulling out the important information.
My goal is to have:
- the request: headers and body
- the response: headers and body
- host and path
- maybe timing information?
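
To make the goal concrete, here is a rough sketch of the output shape I have in mind. It is only a guess at how this could work: the attribute names (host, path, method, headers, content, status_code, timestamp_start, timestamp_end) are my assumption about what the flow objects expose and may differ between mitmproxy versions, and flow_to_dict and decode_body are hypothetical helper names. A stand-in object is used instead of a real flow so the snippet runs without a dump file.

```python
import json
from types import SimpleNamespace


def decode_body(content):
    """Turn a bytes body into text; undecodable bytes are replaced, None becomes ''."""
    if content is None:
        return ""
    return content.decode("utf-8", errors="replace")


def flow_to_dict(f):
    """Reduce one HTTP flow to plain, JSON-friendly data (attribute names assumed)."""
    req, resp = f.request, f.response
    entry = {
        "host": req.host,
        "path": req.path,
        "method": req.method,
        "request": {
            # dict() keeps only one value per header name
            "headers": dict(req.headers.items()),
            "body": decode_body(req.content),
        },
        "response": None,
        "duration_seconds": None,
    }
    if resp is not None:  # a flow may have been captured without a response
        entry["response"] = {
            "status_code": resp.status_code,
            "headers": dict(resp.headers.items()),
            "body": decode_body(resp.content),
        }
        # Timing is only computed when both timestamps are available
        if req.timestamp_start is not None and resp.timestamp_end is not None:
            entry["duration_seconds"] = resp.timestamp_end - req.timestamp_start
    return entry


# Stand-in for a real flow, used only to demonstrate the output shape
demo_flow = SimpleNamespace(
    request=SimpleNamespace(
        host="example.com", path="/index.html", method="GET",
        headers={"Accept": "*/*"}, content=b"", timestamp_start=100.0,
    ),
    response=SimpleNamespace(
        status_code=200, headers={"Content-Type": "text/html"},
        content=b"<html></html>", timestamp_end=100.5,
    ),
)

print(json.dumps([flow_to_dict(demo_flow)], indent=2))
```

If the assumptions hold, the loop in my script would append flow_to_dict(f) instead of f, and a plain json.dumps would replace jsonpickle. Binary bodies would probably need base64 rather than text decoding.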
Any suggestions?