Compress content


#1

Sorry to bother, but I have tried a few different approaches and cannot get this to work. I am trying to gzip-compress the response data, but the client receives the compressed data and does not decompress it when displaying the page. My understanding is that Chrome/IE should recognize the encoding and decompress the content to render the website, but all I see is the compressed version. How would I go about making this work?

My end goal is to reduce website bandwidth a bit by compressing the returned data, removing whitespace, converting some image files into more static file types, caching files, and so on. Are there any existing projects that already do something like this?

Here is the code that returns the response:

if "html" in file["content-type"]:
    gzip_buffer = io.BytesIO()
    with gzip.GzipFile(mode='wb', compresslevel=8, fileobj=gzip_buffer) as gzip_file:
        gzip_file.write(temp)
    temp = gzip_buffer.getvalue()
return http.HTTPResponse.make(200, temp, {"Content-Type": file["content-type"], 'Content-Encoding': 'gzip', "Cache-Control": "private", "Vary": 'Accept-Encoding'})
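For comparison, here is a minimal standard-library sketch of the same idea that only compresses when the client advertises gzip support and keeps `Content-Encoding` and `Content-Length` consistent with the body that is actually sent. The helper names `accepts_gzip` and `gzip_body` are hypothetical and not part of mitmproxy, and this is not a confirmed fix for the problem above, just a way to rule out a header/body mismatch.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if an Accept-Encoding header value permits gzip."""
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        if name.strip().lower() != "gzip":
            continue
        # A "q=0" parameter explicitly refuses the encoding.
        q = params.strip().lower().replace(" ", "")
        if q.startswith("q="):
            try:
                return float(q[2:]) > 0
            except ValueError:
                return False
        return True
    return False


def gzip_body(body: bytes, content_type: str, accept_encoding: str):
    """Return (body, headers); compress only HTML, and only if the client allows it."""
    headers = {"Content-Type": content_type, "Vary": "Accept-Encoding"}
    if "html" in content_type and accepts_gzip(accept_encoding):
        compressed = gzip.compress(body, compresslevel=8)
        # Only advertise gzip when the compressed body is what we send.
        if len(compressed) < len(body):
            body = compressed
            headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return body, headers
```

Decompressing the result with `gzip.decompress` should give back the original bytes; if it does not, the body was probably compressed twice or altered after compression.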

#2

I wanted to build the same thing: a compressing proxy. I tried to follow this guide, but it only worked for images, and gzip took some time to get working. Here is my final script:

from mitmproxy import http
from PIL import Image
import gzip, io

def response(flow: http.HTTPFlow):
    
    if "Content-Type" in flow.response.headers:
        ct = flow.response.headers["Content-Type"]
        
        # compress raw text data 
        # (SVG is served as "image/svg+xml", often followed by a charset parameter)
        if ct.startswith(("text/", "application/", "image/svg+xml")):
            if "Content-Encoding" in flow.response.headers:
                # if already compressed with gzip/deflate/LZW/Brotli  etc. - skip it
                print("Skipping compressed response")
                return
            
            print("Processing text: " + flow.request.url)
            print("Compressing content...")
            content = gzip.compress(flow.response.content)
            size_difference = len(flow.response.content)-len(content)
            if size_difference < 0:
                print("Original content was smaller than compressed. Skipping conversion...")
                return
            print("Saved " + str(size_difference) + " bytes")
            
            headers = flow.response.headers
            status_code = flow.response.status_code
            
            # this works
            flow.response = http.HTTPResponse.make(
                 200, content,
            )
            
            # this does not work, can anyone say why?
            #flow.response.content = content
            
            flow.response.headers = headers
            flow.response.status_code = status_code
            # keep the original Content-Type (restored with the headers above);
            # forcing "text/html" here would mislabel CSS, JS, JSON, etc.
            flow.response.headers["Content-Encoding"] = "gzip" 
            flow.response.headers["Content-Length"] = str(len(content)) 
            flow.response.headers["Modified-By-Mitmproxy"] = "true" 
        # compress images
        elif ct.startswith("image/") and len(flow.response.content) > 400:
            print("Processing image:" + flow.request.url)
            
            # convert to BW
            s = io.BytesIO(flow.response.content)
            img = Image.open(s).convert("L")
            
            width, height = img.size
            if max([width, height]) > 300:
                img = img.resize((round(width/2), round(height/2)))
            
            # save as jpeg
            s2 = io.BytesIO()
            img.save(s2, "jpeg", quality=25)
            
            size_difference = len(flow.response.content)-len(s2.getvalue())
            if size_difference < 0:
                print("Original image was smaller. Skipping conversion...")
                return
            
            print("Saved " + str(size_difference) + " bytes")
            
            flow.response.content = s2.getvalue()
            flow.response.headers["content-type"] = "image/jpeg"
            flow.response.headers["content-length"] = str(len(s2.getvalue()))
            flow.response.headers["Modified-By-Mitmproxy"] = "true"
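
The content-type check in the script compares raw header strings, but `Content-Type` values often carry parameters such as `; charset=utf-8`, and the `application/` prefix also matches formats that are already compressed (for example `application/zip`). A rough sketch of a stricter check follows; `media_type`, `is_compressible`, and the list of application types are assumptions for illustration, not part of mitmproxy, and the list is not exhaustive.

```python
# Illustrative, non-exhaustive set of text-like application types.
COMPRESSIBLE_APPLICATION_TYPES = {
    "application/json",
    "application/javascript",
    "application/xml",
    "application/xhtml+xml",
}


def media_type(content_type: str) -> str:
    """Strip parameters (e.g. '; charset=utf-8') and normalize case."""
    return content_type.split(";", 1)[0].strip().lower()


def is_compressible(content_type: str) -> bool:
    """Guess whether a response body is text-like and worth gzipping."""
    mt = media_type(content_type)
    if mt.startswith("text/"):
        return True
    if mt == "image/svg+xml":
        return True
    # Structured-syntax suffixes mark JSON- or XML-based formats.
    if mt.endswith("+json") or mt.endswith("+xml"):
        return True
    return mt in COMPRESSIBLE_APPLICATION_TYPES
```

In the script above, a call like `is_compressible(ct)` could stand in for the prefix comparison, so that archives and other binary `application/` types are left alone.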