python - How to read filenames included into a gz file -

March 15, 2012

I've tried to read a gz file:

    with open(os.path.join(storage_path, file), "rb") as gzipfile:
        with gzip.GzipFile(fileobj=gzipfile) as datafile:
            data = datafile.read()

It works, but I need the filenames and the size of every file included in the gz file. This code only prints out the content of the file included in the archive.

How can I read the filenames included in a gz file?

The Python gzip module does not provide access to that information.

The source code skips over it without ever storing it:

    if flag & FNAME:
        # Read and discard a null-terminated string containing the filename
        while True:
            s = self.fileobj.read(1)
            if not s or s=='\000':
                break

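The filename component is optional, however, and not guaranteed to be present (the command-line gzip -c decompression option would use the original filename sans .gz in that case, I think). The uncompressed file size is not stored in the header; you can find it in the last 4 bytes of the file instead (stored modulo 2**32, so it wraps around for data of 4 GiB or more).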
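Both points can be checked on in-memory data. What follows is a minimal sketch rather than anything from the gzip module itself: `make_gzip`, `has_fname` and `trailer_size` are hypothetical helper names, and it assumes Python 3, where indexing bytes yields integers.

```python
import gzip
import io
import struct


def make_gzip(payload, filename=""):
    """Compress payload in memory; a non-empty filename is stored in the header."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=filename, mode="wb", fileobj=buf) as gz:
        gz.write(payload)
    return buf.getvalue()


def has_fname(raw):
    # Byte 3 of the header is the flags field; gzip.FNAME marks a stored filename.
    return bool(raw[3] & gzip.FNAME)


def trailer_size(raw):
    # Last 4 bytes: uncompressed size, little-endian unsigned, modulo 2**32.
    return struct.unpack("<I", raw[-4:])[0]


payload = b"hello gzip" * 100
named = make_gzip(payload, filename="data.txt")
unnamed = make_gzip(payload)

print(has_fname(named), has_fname(unnamed))
print(trailer_size(named), len(payload))
```

The stream written without a filename has the FNAME bit cleared, which is why any reader needs a fallback for the name, while the size in the trailer is present either way.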

To read the filename from the header yourself, you'd need to recreate the file header reading code, and retain the filename bytes instead. The following function returns that, plus the decompressed size:

    import struct
    from gzip import FEXTRA, FNAME

    def read_gzip_info(gzipfile):
        gf = gzipfile.fileobj
        pos = gf.tell()

        # Read the uncompressed size from the last 4 bytes of the archive
        gf.seek(-4, 2)
        size = struct.unpack('<I', gf.read(4))[0]

        gf.seek(0)
        magic = gf.read(2)
        if magic != b'\037\213':
            raise IOError('Not a gzipped file')

        method, flag, mtime = struct.unpack("<BBIxx", gf.read(8))

        if not flag & FNAME:
            # Not stored in the header, use the filename sans .gz
            gf.seek(pos)
            fname = gzipfile.name
            if fname.endswith('.gz'):
                fname = fname[:-3]
            return fname, size

        if flag & FEXTRA:
            # Read & discard the extra field, if present
            # (unpack returns a tuple, so take its first element as the length)
            gf.read(struct.unpack("<H", gf.read(2))[0])

        # Read a null-terminated string containing the filename
        fname = []
        while True:
            s = gf.read(1)
            if not s or s == b'\000':
                break
            fname.append(s)

        # Restore the original file position before returning
        gf.seek(pos)
        return b''.join(fname).decode('latin-1'), size

Use the above function with an already-created gzip.GzipFile object:

filename, size = read_gzip_info(gzipfileobj)
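To try the idea end to end without an existing archive, here is a self-contained sketch that writes a small gzip file and reads the name and size back. It is a condensed variant, not the function above: `read_name_and_size` is a hypothetical name, it takes a path rather than a GzipFile object so it does not depend on the object's `fileobj` attribute, it returns None instead of guessing a name when none is stored, and it assumes Python 3.

```python
import gzip
import os
import struct
import tempfile


def read_name_and_size(path):
    """Return (stored filename or None, uncompressed size modulo 2**32)."""
    with open(path, "rb") as f:
        # Fixed 10-byte header: magic, method, flags, mtime, extra flags, OS.
        magic, method, flags, mtime = struct.unpack("<2sBBIxx", f.read(10))
        if magic != b"\x1f\x8b":
            raise OSError("not a gzip file")
        if flags & gzip.FEXTRA:
            # Skip the optional extra field: 2-byte length, then that many bytes.
            (extra_len,) = struct.unpack("<H", f.read(2))
            f.seek(extra_len, os.SEEK_CUR)
        name = None
        if flags & gzip.FNAME:
            # The filename is a NUL-terminated Latin-1 string.
            chars = bytearray()
            while True:
                c = f.read(1)
                if not c or c == b"\x00":
                    break
                chars += c
            name = chars.decode("latin-1")
        # The last 4 bytes of the file hold the uncompressed size.
        f.seek(-4, os.SEEK_END)
        (size,) = struct.unpack("<I", f.read(4))
    return name, size


with tempfile.TemporaryDirectory() as tmp:
    # The file on disk is deliberately named differently from the stored name.
    path = os.path.join(tmp, "renamed.gz")
    with open(path, "wb") as raw:
        with gzip.GzipFile(filename="original.txt", mode="wb", fileobj=raw) as gz:
            gz.write(b"x" * 1234)
    result = read_name_and_size(path)

print(result)
```

Because the archive on disk is called renamed.gz while the header stores original.txt, the result shows that the name really comes from the header and not from the path.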
