python - How to read the filenames included in a gz file


I've tried to read a gz file:

    with open(os.path.join(storage_path, file), "rb") as gzipfile:
        with gzip.GzipFile(fileobj=gzipfile) as datafile:
            data = datafile.read()

It works, but I need the filenames and the size of every file included in the gz file. This code prints out the content of the file included in the archive.

How can I read the filenames included in the gz file?

The Python gzip module does not provide access to that information.

The source code skips over it without ever storing it:

    if flag & FNAME:
        # Read and discard a null-terminated string containing the filename
        while True:
            s = self.fileobj.read(1)
            if not s or s=='\000':
                break

The filename component is optional and not guaranteed to be present (the command-line gzip -c decompression option would use the original filename sans .gz in that case, I think). The uncompressed file size is not stored in the header; you can find it, modulo 2**32, in the last 4 bytes of the file instead.
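As a minimal sketch of the trailer lookup alone (Python 3, assuming a single-member gzip file on disk), the size field can be read like this. The helper name gzip_isize and the demo.gz file are made up for illustration, and the value wraps around for data of 4 GiB or more:

```python
import gzip
import os
import struct
import tempfile


def gzip_isize(path):
    """Return the 4-byte size field stored at the very end of a gzip file.

    This is the uncompressed length modulo 2**32 (hypothetical helper,
    not part of the gzip module).
    """
    with open(path, 'rb') as f:
        f.seek(-4, os.SEEK_END)          # last 4 bytes of the file
        (isize,) = struct.unpack('<I', f.read(4))
    return isize


# Demonstration with a throwaway archive.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.gz')
    with gzip.open(path, 'wb') as gz:
        gz.write(b'x' * 1000)
    print(gzip_isize(path))              # prints 1000
```

This only inspects the trailer, so nothing is decompressed.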

To read the filename from the header yourself, you'd need to recreate the file header reading code, and retain the filename bytes instead. The following function returns that, plus the decompressed size:

    import struct
    from gzip import FEXTRA, FNAME

    def read_gzip_info(gzipfile):
        gf = gzipfile.fileobj
        pos = gf.tell()

        # Read the uncompressed size from the last 4 bytes
        gf.seek(-4, 2)
        size = struct.unpack('<I', gf.read(4))[0]

        gf.seek(0)
        magic = gf.read(2)
        if magic != b'\037\213':
            raise IOError('Not a gzipped file')

        # Compression method, flags, mtime; skip extra flags and OS bytes
        method, flag, mtime = struct.unpack("<BBIxx", gf.read(8))

        if not flag & FNAME:
            # Not stored in the header, use the filename sans .gz
            gf.seek(pos)
            fname = gzipfile.name
            if fname.endswith('.gz'):
                fname = fname[:-3]
            return fname, size

        if flag & FEXTRA:
            # Read & discard the extra field, if present
            gf.read(struct.unpack("<H", gf.read(2))[0])

        # Read a null-terminated string containing the filename
        fname = []
        while True:
            s = gf.read(1)
            if not s or s == b'\000':
                break
            fname.append(s)

        gf.seek(pos)
        # Header filenames are stored as Latin-1 bytes
        return b''.join(fname).decode('latin-1'), size

Use the above function with an already-created gzip.GzipFile object:

    filename, size = read_gzip_info(gzipfileobj)
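To check this kind of header parsing without touching the disk, one option is to build a small archive in memory and read the name back out. The following is a Python 3 sketch under a few assumptions: header_filename is a hypothetical helper that works on raw bytes rather than on a GzipFile object, example.txt is a made-up name, and the flag values 4 and 8 are the FEXTRA and FNAME bits from the gzip format specification.

```python
import gzip
import io
import struct

FEXTRA, FNAME = 4, 8  # header flag bits defined by the gzip format


def header_filename(raw):
    """Return the stored filename of the first gzip member, or None."""
    if raw[:2] != b'\x1f\x8b':
        raise ValueError('not a gzip stream')
    flag = raw[3]
    offset = 10                      # fixed-size part of the header
    if flag & FEXTRA:
        # Skip the 2-byte length and the extra field itself
        (xlen,) = struct.unpack_from('<H', raw, offset)
        offset += 2 + xlen
    if not flag & FNAME:
        return None                  # the name is optional
    end = raw.index(b'\x00', offset)  # the name is null-terminated
    return raw[offset:end].decode('latin-1')


# Build an archive in memory; GzipFile stores the given name in the header.
buf = io.BytesIO()
with gzip.GzipFile(filename='example.txt', mode='wb', fileobj=buf) as gz:
    gz.write(b'hello world')

print(header_filename(buf.getvalue()))  # prints example.txt
```

Returning None when the name is absent leaves the fallback choice (for instance the archive name without .gz) to the caller.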
