Compress content

Mradr1 · September 9, 2017, 10:28pm

Sorry to bother you, but I have tried a few different ways and cannot get this to work. I am trying to gzip-compress the response data, but the client receives the compressed bytes and does not decompress them when displaying the page. My understanding is that Chrome/IE should detect this and decompress the content to show the website, but all I see is the compressed version. How can I make this work?

My end goal is to reduce website bandwidth a bit by compressing the returned data, removing whitespace, converting some image files into more static file types, caching files, and so on. Are there any existing projects that already do this?

Return the request information

if "html" in file["content-type"]:
    gzip_buffer = io.BytesIO()
    with gzip.GzipFile(mode='wb', compresslevel=8, fileobj=gzip_buffer) as gzip_file:
        gzip_file.write(temp)
    temp = gzip_buffer.getvalue()
return http.HTTPResponse.make(200, temp, {"Content-Type": file["content-type"], 'Content-Encoding': 'gzip', "Cache-Control": "private", "Vary": 'Accept-Encoding'})
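For reference, here is a minimal sketch of the gzip step on its own, written against the standard library only so it can be checked without a proxy. The helper names `accepts_gzip` and `gzip_response` are hypothetical, the headers are a plain case-sensitive dict (not mitmproxy's header object), and the Accept-Encoding parsing is deliberately simplified. It only compresses when the client advertises gzip and the body is not already encoded, and it keeps Content-Length in step with the new body.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value lists gzip with a non-zero q.

    Simplified on purpose: wildcard entries ("*") are ignored.
    """
    for part in accept_encoding.split(","):
        token, _, params = part.strip().partition(";")
        if token.strip().lower() != "gzip":
            continue
        params = params.strip().lower()
        q = 1.0
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        return q > 0
    return False


def gzip_response(body: bytes, headers: dict, accept_encoding: str):
    """Return (body, headers), gzip-compressed only when it is safe and useful."""
    new_headers = dict(headers)  # never mutate the caller's dict
    if not accepts_gzip(accept_encoding) or "Content-Encoding" in headers:
        return body, new_headers
    compressed = gzip.compress(body, compresslevel=8)
    if len(compressed) >= len(body):
        # Tiny or already dense payloads can grow when gzipped.
        return body, new_headers
    new_headers["Content-Encoding"] = "gzip"
    new_headers["Content-Length"] = str(len(compressed))
    new_headers["Vary"] = "Accept-Encoding"
    return compressed, new_headers
```

A browser will only decompress the body if the Content-Encoding header actually reaches it and matches the bytes, so checking that `gzip.decompress` of the returned body gives back the original is a quick sanity test before wiring this into a proxy script.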

Phisik · June 5, 2018, 1:36pm

I wanted to build the same thing: a compressing proxy. I tried to follow this guide, but it only worked for images, and gzip took some time to get right. Here is my final script:

from mitmproxy import http
from PIL import Image
import gzip, io

def response(flow: http.HTTPFlow):
    
    if "Content-Type" in flow.response.headers:
        ct = flow.response.headers["Content-Type"]
        
        # compress raw text data 
        # note: the SVG media type is "image/svg+xml", so match it by prefix
        if ct.startswith("text/") or ct.startswith("application/") or ct.startswith("image/svg"):
            if "Content-Encoding" in flow.response.headers:
                # if already compressed with gzip/deflate/LZW/Brotli  etc. - skip it
                print("Skipping compressed response")
                return
            
            print("Processing text: " + flow.request.url)
            print("Compressing content...")
            content = gzip.compress(flow.response.content)
            size_difference = len(flow.response.content)-len(content)
            if size_difference < 0:
                print("Original content was smaller than compressed. Skipping conversion...")
                return
            print("Saved " + str(size_difference) + " bytes")
            
            headers = flow.response.headers
            status_code = flow.response.status_code
            
            # this works
            flow.response = http.HTTPResponse.make(
                 200, content,
            )
            
            # this does not work, can anyone say why?
            #flow.response.content = content
            
            flow.response.headers = headers
            flow.response.status_code = status_code
            # keep the original Content-Type (restored with the headers above) so CSS/JS/JSON are not relabelled as HTML
            flow.response.headers["Content-Encoding"] = "gzip" 
            flow.response.headers["Content-Length"] = str(len(content)) 
            flow.response.headers["Modified-By-Mitmproxy"] = "true" 
        # compress images
        elif ct[0:6] == ("image/") and len(flow.response.content) > 400:
            print("Processing image:" + flow.request.url)
            
            # convert to BW
            s = io.BytesIO(flow.response.content)
            img = Image.open(s).convert("L")
            
            width, height = img.size
            if max([width, height]) > 300:
                img = img.resize((round(width/2), round(height/2)))
            
            # save as jpeg
            s2 = io.BytesIO()
            img.save(s2, "jpeg", quality=25)
            
            size_difference = len(flow.response.content)-len(s2.getvalue())
            if size_difference < 0:
                print("Original image was smaller. Skipping conversion...")
                return
            
            print("Saved " + str(size_difference) + " bytes")
            
            flow.response.content = s2.getvalue()
            flow.response.headers["content-type"] = "image/jpeg"
            flow.response.headers["content-length"] = str(len(s2.getvalue()))
            flow.response.headers["Modified-By-Mitmproxy"] = "true"
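The script above treats every `application/` response as text and relies on the size check to skip payloads that do not shrink. One possible refinement, sketched here with only the standard library, is to decide by media type up front. The names `media_type`, `is_compressible` and `COMPRESSIBLE_TYPES` are hypothetical, and the list of types is an illustrative assumption rather than a complete one. Content-Type values often carry parameters such as `; charset=utf-8`, so the sketch strips those before comparing.

```python
# Illustrative, not exhaustive: media types that are usually worth gzipping.
COMPRESSIBLE_TYPES = {
    "application/json",
    "application/javascript",
    "application/xml",
    "image/svg+xml",
}


def media_type(content_type: str) -> str:
    """Strip parameters and normalise case: 'Text/HTML; charset=utf-8' -> 'text/html'."""
    return content_type.split(";", 1)[0].strip().lower()


def is_compressible(content_type: str) -> bool:
    """Return True for text-like media types that typically shrink under gzip."""
    mt = media_type(content_type)
    return (
        mt.startswith("text/")
        or mt in COMPRESSIBLE_TYPES
        or mt.endswith("+xml")
        or mt.endswith("+json")
    )
```

With a check like this, archives and PDFs (`application/zip`, `application/pdf`) are skipped without spending CPU on a compression attempt, while SVG is handled as text instead of being passed to the image branch.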
