File and Directory Classes
File Objects
-
class mediafs.File(path, parent=None)
Object that represents a file in the filesystem.
-
abspath
The absolute path to the file or directory. Uses os.path.abspath(). Lazily evaluated and cached.
-
atime()
Last access time as reported by the underlying filesystem. Calls os.path.getatime() on the file or directory and returns the result as a datetime object.
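The conversion described above can be sketched with the standard library alone. This is a minimal illustration, not the mediafs implementation: the helper name access_time is hypothetical, and it assumes the timestamp is interpreted as local time, which the documentation does not state.

```python
import os
import tempfile
from datetime import datetime


def access_time(path):
    """Return the last access time of `path` as a datetime.

    Hypothetical helper: assumes local time, which mediafs may or may not use.
    """
    # os.path.getatime() returns seconds since the epoch as a float.
    return datetime.fromtimestamp(os.path.getatime(path))


# Demonstrate on a throwaway temporary file.
with tempfile.NamedTemporaryFile() as tmp:
    atime = access_time(tmp.name)
```

The same pattern applies to mtime(), with os.path.getmtime() in place of os.path.getatime().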
-
crc(refresh=False)
Calculate the CRC for this file. The result is cached, so subsequent calls do not recalculate it. If refresh is True, the result is recalculated.
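The cache-unless-refreshed behaviour can be sketched as follows. The class CachedChecksum is a hypothetical stand-in, and the use of zlib.crc32 and a 64 KiB chunk size are assumptions; the documentation does not say which CRC variant mediafs computes.

```python
import zlib


class CachedChecksum:
    """Hypothetical helper illustrating a cached CRC; not part of mediafs."""

    def __init__(self, path):
        self.path = path
        self._crc = None  # no value cached yet

    def crc(self, refresh=False):
        # Recompute only on the first call or when explicitly asked to.
        if self._crc is None or refresh:
            value = 0
            with open(self.path, "rb") as fh:
                # Read in chunks so large files are not loaded into memory at once.
                for chunk in iter(lambda: fh.read(65536), b""):
                    value = zlib.crc32(chunk, value)
            self._crc = value
        return self._crc
```

With this pattern, a file that changes on disk keeps returning the stale cached value until crc(refresh=True) is called.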
-
deserialize(attrs)
Takes a dict object and returns a new instance of this class with all attributes initialized to the values contained in the dict.
-
exists()
Does the file exist? Calls os.path.exists() on the file or directory and returns the result.
-
fasthash(refresh=False)
Calculate a hash for this file that works well on larger files but is optimized for speed. The result is cached, so subsequent calls do not recalculate it. If refresh is True, the result is recalculated.
-
get(key, default=None)
Helper method for getting values from the metadata dict. Primarily useful for shortening Directory.query() lambda functions.
Example:
directory.query(lambda f: 'author' in f.metadata and f.metadata['author'] == "The Clash")
can be shortened to:
directory.query(lambda f: f.get('author') == "The Clash")
The default argument is the value that will be returned if key is not a valid key in the metadata dict. This is useful if you are expecting a particular type and want to do some operation on that type. For example:
directory.query(lambda f: f.get('year', default=0) > 1990)
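The role of the default argument can be seen in a small self-contained sketch. The Item class below is a hypothetical stand-in for a mediafs object with a metadata dict; it assumes get() simply delegates to dict.get(), which the documentation implies but does not confirm.

```python
class Item:
    """Hypothetical stand-in for a mediafs object with a metadata dict."""

    def __init__(self, name, metadata=None):
        self.name = name
        self.metadata = metadata or {}

    def get(self, key, default=None):
        # Assumed behaviour: a thin wrapper around dict.get().
        return self.metadata.get(key, default)


items = [
    Item("a.mp3", {"author": "The Clash", "year": 1979}),
    Item("b.mp3", {"year": 1995}),
    Item("c.mp3"),  # no metadata at all
]

# default=0 keeps the comparison valid for items without a 'year' key.
recent = [i.name for i in items if i.get("year", default=0) > 1990]
by_clash = [i.name for i in items if i.get("author") == "The Clash"]
```

Without default=0, the third item would yield None, and comparing None with an integer raises TypeError on Python 3.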
-
hash()
For files, returns the hash of the file instead of its relative path, so that if a file is moved or renamed the metadata will remain associated with it. This also results in duplicate files having the same metadata (which is the intended behavior).
-
matches(other)
Returns True if this file or directory is the same as another file or directory. Compares by hash, so file1.matches(file2) == True if file1 and file2 have identical contents.
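Comparison by hash can be illustrated on raw bytes. This is a sketch under assumptions: the function names are hypothetical, and MD5 is used only because the class also exposes md5(); the documentation does not say which hash matches() actually compares.

```python
import hashlib


def content_hash(data):
    """Hex digest of the given bytes (MD5 chosen as an assumption)."""
    return hashlib.md5(data).hexdigest()


def contents_match(data_a, data_b):
    """True when both byte strings hash to the same digest."""
    return content_hash(data_a) == content_hash(data_b)


same = contents_match(b"london calling", b"london calling")
different = contents_match(b"london calling", b"rock the casbah")
```

Because only content is hashed, two files with different names or locations but identical bytes compare as equal.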
-
md5(refresh=False)
Calculate the MD5 sum for this file. The result is cached, so subsequent calls do not recalculate it. If refresh is True, the result is recalculated.
-
metadata
The metadata dict for this file or directory.
-
mtime()
Last modified time as reported by the underlying filesystem. Calls os.path.getmtime() on the file or directory and returns the result as a datetime object.
-
relpath
The file or directory path relative to the root directory.
-
rename(newName, syscall=True)
Renames the file or directory. Raises a FileExistsError exception if the new name already exists.
If the syscall argument is True, then os.rename() will be called on the underlying file or directory. Setting this to False is primarily useful for keeping things in sync if you know a rename occurred and want to avoid the overhead of a refresh() call.
-
root
A reference to the root directory object.
-
serialize()
Returns a dict object containing the attributes of this object. Used for serializing the directory tree to a file.
-
size
The size of the file or directory contents in bytes. Lazily evaluated and cached.
-
stat()
Calls os.stat() on the file or directory and returns the result.
-
Directory Objects
-
class mediafs.Directory(path, parent=None)
Object that represents a directory in the filesystem.
-
__getitem__(key)
Directory objects support a number of different indexing methods, all of which return either a single object or a list containing multiple objects. This is useful when you want to assign the results to a variable (as opposed to the searching methods filter(), search(), query(), and all(), which are generators).
Directories support the following syntaxes for indexing:
- An ellipsis object, which returns a list of all children, recursively.
  directory[...] (same as list(directory.all(recursive=True)))
- An integer, which is treated as an index and returns one item based on the directory ordering. Because the ordering is precalculated, this is O(1). Returns exactly one item.
  directory[2]
- A slice, which is treated as a range of indices based on the directory ordering.
  directory[1:3]
- An empty slice, which returns a list of items in the directory.
  directory[:] (same as list(directory.all(recursive=False)))
- A string key, which is treated as a file or directory name and uses a dict-based lookup for O(1) lookups. Returns exactly one item.
  directory["asdf.txt"]
- A string which contains either a * or a ?. This string is passed to the Python stdlib library fnmatch to support searches, and a list of files or directories that match the pattern is returned. See the documentation for the fnmatch library for more information.
  directory["*.txt"] (same as list(directory.filter("*.txt")))
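The dispatch on key type described above might look roughly like the following. MiniDirectory is a hypothetical, flat stand-in that holds plain name strings: it does no filesystem access and no recursion, so [...] and [:] return the same list here, and it matches patterns case-sensitively for simplicity.

```python
import fnmatch


class MiniDirectory:
    """Hypothetical flat stand-in for Directory; entries are plain name strings."""

    def __init__(self, names):
        self.order = sorted(names)                 # precalculated ordering
        self.contents = {n: n for n in self.order}  # dict for O(1) name lookup

    def __getitem__(self, key):
        if key is Ellipsis:
            # A real implementation would recurse into subdirectories here.
            return list(self.order)
        if isinstance(key, (int, slice)):
            # Integers return one item, slices return a list.
            return self.order[key]
        if isinstance(key, str):
            if "*" in key or "?" in key:
                # Wildcard strings are treated as fnmatch patterns.
                return [n for n in self.order if fnmatch.fnmatchcase(n, key)]
            return self.contents[key]
        raise TypeError("unsupported key type: %r" % type(key))


d = MiniDirectory(["b.txt", "a.txt", "c.md"])
first = d[0]
pair = d[1:3]
everything = d[:]
text_files = d["*.txt"]
one = d["c.md"]
```

The chain of type checks is also why a direct dict lookup can be slightly cheaper than indexing when only name lookups are needed.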
-
abspath
The absolute path to the file or directory. Uses os.path.abspath(). Lazily evaluated and cached.
-
all(recursive=False, reverse=False, dirs=True, files=True)
A generator that yields all files and subdirectories contained within this directory.
- If recursive is True, then it will also yield all items contained in those subdirectories.
- If reverse is True, then it will iterate in reverse order.
- The dirs argument indicates whether or not directories should be yielded.
- The files argument indicates whether or not files should be yielded.
-
atime()
Last access time as reported by the underlying filesystem. Calls os.path.getatime() on the file or directory and returns the result as a datetime object.
-
contents
The dict representing the contents of this directory. If this directory has not been refreshed yet, accessing this property will trigger a refresh(recursive=False) before returning the dict.
If you have code accessing a single specific file or directory object in an inner loop, a small optimization is to call directory.contents[filename] instead of directory[filename], because Directory.__getitem__ has to check several key types before doing the lookup.
-
classmethod deserialize(attrs)
Takes a dict object and returns a new instance of this class with all attributes initialized to the values contained in the dict.
-
exists()
Does the file exist? Calls os.path.exists() on the file or directory and returns the result.
-
filter(pattern, recursive=False, dirs=True, files=True, ignoreCase=True)
Uses the Python stdlib fnmatch library to search the filesystem.
If ignoreCase is True, then fnmatch.fnmatch() will be used, and filenames will be converted to lowercase before comparisons are made.
If ignoreCase is False, then fnmatch.fnmatchcase() will be used.
See https://docs.python.org/library/fnmatch.html for more information about the pattern syntax.
recursive, dirs, and files arguments are passed to Directory.all().
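The two matching modes can be sketched with fnmatch directly. The function name_matches is hypothetical, and lowercasing the pattern as well as the filename is an assumption; the documentation only says that filenames are lowercased.

```python
import fnmatch


def name_matches(name, pattern, ignore_case=True):
    """Hypothetical helper mirroring the ignoreCase switch described above."""
    if ignore_case:
        # Assumption: the pattern is lowercased too, so "*.TXT" still matches.
        return fnmatch.fnmatch(name.lower(), pattern.lower())
    # fnmatchcase() never normalizes case, on any platform.
    return fnmatch.fnmatchcase(name, pattern)


insensitive = name_matches("README.TXT", "*.txt")
sensitive = name_matches("README.TXT", "*.txt", ignore_case=False)
exact_case = name_matches("notes.txt", "*.txt", ignore_case=False)
```

Explicit lowercasing matters because fnmatch.fnmatch() on its own only folds case on platforms whose filesystems are case-insensitive.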
-
get(key, default=None)
Helper method for getting values from the metadata dict. Primarily useful for shortening Directory.query() lambda functions.
Example:
directory.query(lambda f: 'author' in f.metadata and f.metadata['author'] == "The Clash")
can be shortened to:
directory.query(lambda f: f.get('author') == "The Clash")
The default argument is the value that will be returned if key is not a valid key in the metadata dict. This is useful if you are expecting a particular type and want to do some operation on that type. For example:
directory.query(lambda f: f.get('year', default=0) > 1990)
-
hash()
Return a hash suitable for storing the metadata dict for this object. This should be unique among all files and directories in the RootDirectory object. For directories, it is best to use the relative path. For files, we can hash the file and use that, which means that moving or renaming the file won’t lose track of data.
-
matches(other)
Returns True if this file or directory is the same as another file or directory. Compares by hash, so file1.matches(file2) == True if file1 and file2 have identical contents.
-
metadata
The metadata dict for this file or directory.
-
mtime()
Last modified time as reported by the underlying filesystem. Calls os.path.getmtime() on the file or directory and returns the result as a datetime object.
-
order
A list representing the order of the items in this directory. Lazily evaluated and cached.
Accessing this property will trigger refresh(recursive=False) if a refresh has never been run on this directory.
-
query(query, recursive=False, dirs=True, files=True)
Uses a custom function to search the filesystem. That function is passed a single argument, an FSObject, and should return a boolean that determines whether the object matches.
recursive, dirs, and files arguments are passed to Directory.all().
Examples:
- All files that are named “file1.txt” or “file2.txt”, recursively:
  >>> directory.query(lambda f: f.name in ("file1.txt", "file2.txt"), recursive=True)
- All files larger than 1024 bytes:
  >>> directory.query(lambda f: f.size > 1024, dirs=False)
- All files and directories that start with E:
  >>> directory.query(lambda f: f.name.startswith("E"))
- All files modified within the last 7 days:
  >>> from datetime import datetime, timedelta
  >>> directory.query(lambda f: f.mtime() > (datetime.now() - timedelta(days=7)), dirs=False)
- All directories with more than 10 items:
  >>> directory.query(lambda d: len(d) > 10, recursive=True, files=False)
- All directories that contain a file called “asdf.txt”:
  >>> directory.query(lambda d: "asdf.txt" in d, recursive=True, files=False)
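How a predicate combines with the dirs and files switches can be sketched on plain data. Entry and query_entries below are hypothetical names, and the sketch works on an in-memory list rather than on the output of Directory.all().

```python
from collections import namedtuple

# Hypothetical record standing in for an FSObject.
Entry = namedtuple("Entry", ["name", "size", "is_dir"])


def query_entries(entries, predicate, dirs=True, files=True):
    """Yield the entries accepted by `predicate`, honouring dirs/files."""
    for entry in entries:
        if entry.is_dir and not dirs:
            continue  # directories excluded by the caller
        if not entry.is_dir and not files:
            continue  # files excluded by the caller
        if predicate(entry):
            yield entry


entries = [
    Entry("Essays", 0, True),
    Entry("big.bin", 4096, False),
    Entry("small.txt", 10, False),
]

large_files = list(query_entries(entries, lambda e: e.size > 1024, dirs=False))
starts_with_e = list(query_entries(entries, lambda e: e.name.startswith("E")))
```

Like the documented method, the sketch is a generator, so the results are wrapped in list() when a concrete list is needed.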
-
refresh(*files, **kwargs)
Rescans the filesystem and rebuilds the index for this directory. If any files are specified, then refresh() will only scan those files. Otherwise it will scan all files.
If recursive=True is passed in, then refresh() will also be called on all subdirectories.
-
relpath
The file or directory path relative to the root directory.
-
rename(newName, syscall=True)
Renames the file or directory. Raises a FileExistsError exception if the new name already exists.
If the syscall argument is True, then os.rename() will be called on the underlying file or directory. Setting this to False is primarily useful for keeping things in sync if you know a rename occurred and want to avoid the overhead of a refresh() call.
-
root
A reference to the root directory object.
-
search(regex, recursive=False, dirs=True, files=True, flags=2)
Uses a regex as a query string to search the filesystem. Uses case-insensitive matching by default. Passes the value of the flags argument directly through to re.compile(), so see the documentation for the re module for how that works.
The default value for flags is re.IGNORECASE.
recursive, dirs, and files arguments are passed to Directory.all().
Example:
directory.search(r'(.*)\.txt')
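The effect of the flags argument can be sketched with the re module alone. The function search_names is hypothetical and works on a list of names; it assumes pattern.search() semantics, since the documentation does not say whether mediafs uses search() or match().

```python
import re


def search_names(names, regex, flags=re.IGNORECASE):
    """Hypothetical helper: return the names accepted by a compiled regex."""
    # flags is handed straight to re.compile(), as the method above describes.
    pattern = re.compile(regex, flags)
    return [name for name in names if pattern.search(name)]


names = ["NOTES.TXT", "notes.txt", "image.png"]

default_hits = search_names(names, r"(.*)\.txt")
strict_hits = search_names(names, r"(.*)\.txt", flags=0)
```

Passing flags=0 turns off the case-insensitive default, and the integer 2 in the signature is the numeric value of re.IGNORECASE.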
-
serialize()
Returns a dict object containing the attributes of this object. Used for serializing the directory tree to a file.
-
size
For directories, recursively calculate the size of the contents of the directory. This value is lazily evaluated and cached.
-
stat()
Calls os.stat() on the file or directory and returns the result.
-