nxt.fs

File Scanning Engine.

Makes rich use of Sparse Distributed Representations (SDR) using Hash Digests for relating Data to its Relations/Properties/Meta-Data.

Members

Aliases

RequestedBinType
alias RequestedBinType = uint

It is not very likely that we need 64-bit precision for histogram Bucket/Bin Counts, so pick 32-bit for now.
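As a rough illustration of where a 32-bit bin type is used, the sketch below counts byte values into 256 bins. The function name `byteHistogram` is hypothetical and is not part of this module; only the alias itself is taken from the declaration above.

```d
import std.string : representation;

alias RequestedBinType = uint; // same alias as declared in this module

/// Hypothetical helper: count occurrences of each byte value in `data`.
RequestedBinType[256] byteHistogram(const(ubyte)[] data)
{
    RequestedBinType[256] bins; // static array, zero-initialized
    foreach (b; data)
        ++bins[b];
    return bins;
}

void main()
{
    auto bins = byteHistogram("abca".representation);
    assert(bins['a'] == 2);
    assert(bins['b'] == 1);
    assert(bins['z'] == 0);
}
```

A `uint` bin overflows only after more than about 4 billion occurrences of a single byte value in one file, which is the trade-off the note above accepts.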

ShCmd
alias ShCmd = string

Shell Command.

Classes

Dir
class Dir

Dir.

DirKind
class DirKind

Directory Kind.

FKind
class FKind

File Kind.

FKinds
class FKinds

Set of File Kinds with Internal Hashing.

File
class File

File.

FileTags
class FileTags

Maps Files to their tags.

GStats
class GStats

Global Scanner Statistics.

RegFile
class RegFile

Regular File.

Scanner
class Scanner(Term)

File System Scanner.

SpecFile
class SpecFile

Special File (Character or Block Device).

Symlink
class Symlink

Symlink.

Enums

BitStatus
enum BitStatus

Bit (Content) Status.

DirOp
enum DirOp

Directory Operation Type Code.

DirSorting
enum DirSorting

Directory Sorting Order.

FOp
enum FOp

File Operation Type Code.

FileContent
enum FileContent

File Content Type Code.

FileKindDetection
enum FileKindDetection

How File Kinds are detected.

OpArity
enum OpArity

Language Operator Arity.

OpAssoc
enum OpAssoc

Language Operator Associativity.

ScanContext
enum ScanContext

Key Scan (Search) Context.

SymlinkTargetStatus
enum SymlinkTargetStatus

Symlink Target Status.

isFile
eponymoustemplate isFile(T)

Traits

isFileIO
eponymoustemplate isFileIO(T)

Return true if T is a class representing File IO.

Functions

getDir
Dir getDir(NotNull!Dir rootDir, string dirPath)

(Cached) Lookup of Directory dirPath.

getFile
File getFile(NotNull!Dir rootDir, string filePath, bool isDir, bool tolerant)

(Cached) Lookup of File filePath.

loadRootDirTree
Dir loadRootDirTree(Viz viz, string cacheFile, GStats gstats)

Load File System Tree Cache from cacheFile.

matchContents
bool matchContents(FKind kind, Range range, RegFile regFile)

Match (Magic) Contents of kind with range.

matchExtension
bool matchExtension(FKind kind, string ext)

Match kind with file extension ext.

matchFullName
bool matchFullName(FKind kind, string full, size_t six)

Match kind with full filename full.

ofKind
KindHit ofKind(NotNull!RegFile regFile, NotNull!FKind kind, bool collectTypeHits, FKinds allFKinds)
ofKind1
KindHit ofKind1(NotNull!RegFile regFile, NotNull!FKind kind, bool collectTypeHits, FKinds allFKinds)

Helper for ofKind.

saveRootDirTree
const(ubyte[]) saveRootDirTree(Viz viz, Dir rootDir, string cacheFile)

Save File System Tree Cache under Directory rootDir.

treeSizeMemoized
Bytes64 treeSizeMemoized(NotNull!File file, Bytes64[File] cache)

Externally Memoized Calculation of Directory Tree Size. Open question: is it possible to make this any of @safe, pure, nothrow?
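The idea of externally memoized tree size can be sketched as follows. This is a minimal illustration, not the module's implementation: `Node` is a hypothetical stand-in for `File`/`Dir`, and `ulong` stands in for `Bytes64`. The cache is taken by `ref` here so that insertions into an initially empty associative array are visible to the caller.

```d
/// Hypothetical stand-in for the module's File/Dir classes.
class Node
{
    string name;
    ulong size;      // own size in bytes
    Node[] children; // empty for non-directories

    this(string name, ulong size, Node[] children = null)
    {
        this.name = name;
        this.size = size;
        this.children = children;
    }
}

/// Tree size of `node`, memoized in the caller-owned `cache`.
ulong treeSize(Node node, ref ulong[Node] cache)
{
    if (auto hit = node in cache)
        return *hit; // already computed for this node
    ulong total = node.size;
    foreach (child; node.children)
        total += treeSize(child, cache);
    cache[node] = total;
    return total;
}

void main()
{
    auto a = new Node("a.txt", 10);
    auto b = new Node("b.txt", 32);
    auto sub = new Node("sub", 0, [b]);
    auto root = new Node("root", 0, [a, sub]);

    ulong[Node] cache;
    assert(treeSize(root, cache) == 42);
    assert(cache.length == 4);        // every visited node is cached
    assert(treeSize(sub, cache) == 32); // served from the cache
}
```

Keeping the cache outside the nodes lets the caller decide its lifetime, for example dropping it after a rescan.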

Structs

CStat
struct CStat

Contents Statistics of a Regular File.

Delim
struct Delim

Pair of Delimiters. Used to describe for example comment and string delimiter syntax.

Op
struct Op

Language Operator.

OpAlias
struct OpAlias

Language Operator Alias.

Variables

ctrlC
uint ctrlC;

Exception Describing Process Signal.

See Also

http://stackoverflow.com/questions/12629749/how-does-grep-run-so-fast

http://www.regular-expressions.info/powergrep.html

http://ridiculousfish.com/blog/posts/old-age-and-treachery.html

http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/

TODO Make use of parallelism_ex: pmap

TODO Call filterUnderAnyOfPaths using std.algorithm.filter directly on AAs. Use byPair or use AA.get(key, defaultValue) http://forum.dlang.org/thread/mailman.75.1392335793.6445.digitalmars-d-learn@puremagic.com https://github.com/D-Programming-Language/druntime/pull/574

TODO Count logical lines.

TODO Lexers should be loosely coupled to FKinds instead of Files

TODO Generic Token[] and specific CToken[], CxxToken[]

TODO Don't scan for duplicates inside vc-dirs by default

TODO Assert that files along duplicates path don't include symlinks

TODO Implement FOp.deduplicate

TODO Prevent rescans of duplicates

TODO Define generalized_specialized_two_way_relationship(kindD, kindDi)

TODO Visualize hits using existingFileHitContext.asH!1 followed by a table: ROW_NR | hit string in <code lang=LANG></code>

TODO Parse and Sort GCC/Clang Compiler Messages on WARN_TYPE FILE:LINE:COL:MSGWARN_TYPE and use Collapsible HTML Widgets (http://api.jquerymobile.com/collapsible/) when presenting them

TODO Maybe make use of https://github.com/Abscissa/scriptlike

TODO Calculate Tree grams and bist

TODO Get stats of the link itself, not the target, in Symlink constructors

TODO RegFile with FileContent.text should be decodable to Unicode using either iso-latin1, utf-8, etc. Check std.uni for how to try and decode stuff.

TODO Search for subwords. For example gtk_widget should also match widget_gtk and GtkWidget etc.
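One possible approach to this subword matching is sketched below, assuming ASCII identifiers: split each identifier at separators and at lowercase-to-uppercase boundaries, lowercase the parts, and compare them as unordered collections. The names `subwords` and `sameSubwords` are hypothetical and do not exist in this module.

```d
import std.algorithm.sorting : sort;
import std.ascii : isAlphaNum, isUpper, toLower;

/// Hypothetical helper: split an identifier into lowercase subwords.
string[] subwords(string ident)
{
    string[] words;
    string cur;
    foreach (char c; ident)
    {
        if (!isAlphaNum(c)) // separator such as '_' or '-'
        {
            if (cur.length) { words ~= cur; cur = null; }
        }
        else
        {
            // An uppercase letter starts a new subword (camel case).
            if (isUpper(c) && cur.length) { words ~= cur; cur = null; }
            cur ~= toLower(c);
        }
    }
    if (cur.length)
        words ~= cur;
    return words;
}

/// True if `a` and `b` consist of the same subwords in any order.
bool sameSubwords(string a, string b)
{
    auto wa = subwords(a);
    auto wb = subwords(b);
    sort(wa);
    sort(wb);
    return wa == wb;
}

void main()
{
    assert(subwords("GtkWidget") == ["gtk", "widget"]);
    assert(sameSubwords("gtk_widget", "widget_gtk"));
    assert(sameSubwords("gtk_widget", "GtkWidget"));
    assert(!sameSubwords("gtk_widget", "gtk_window"));
}
```

This simple splitter breaks runs of capitals into single letters (for example `HTMLParser`), so a real implementation would need an acronym rule.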

TODO Support multi-line keys

TODO Use hash-lookup in txtFKinds.byExt for faster guessing of source file kind. Merge it with binary kind lookup. And check FileContent member of kind to instead determine if it should be scanned or not. Sub-Task: Case-Insensitive Matching of extensions if nothing else passes.

TODO Detect symlinks with duplicate targets and only follow one of them and group them together in visualization

TODO Add addTag, removeTag, etc. and an interface to fs.d for setting tags: --add-tag=comedy, --remove-tag=comedy

TODO If a file name ends with ~ or .backup, assume it is a backup file: strip the suffix from the end, match the name again, and set backupFlag in FileKind
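The suffix-stripping step of this TODO could look roughly like the sketch below. The function name `stripBackupSuffix` and the `out` flag are assumptions made for illustration; re-matching the stripped name against the known file kinds is left to the caller.

```d
import std.algorithm.searching : endsWith;

/// Hypothetical helper: strip a trailing "~" or ".backup" from `name`.
/// Sets `isBackup` when a suffix was removed.
string stripBackupSuffix(string name, out bool isBackup)
{
    foreach (suffix; ["~", ".backup"])
    {
        // Require a non-empty remainder so that "~" alone is left as-is.
        if (name.length > suffix.length && name.endsWith(suffix))
        {
            isBackup = true;
            return name[0 .. $ - suffix.length];
        }
    }
    isBackup = false;
    return name;
}

void main()
{
    bool flag;
    assert(stripBackupSuffix("main.d~", flag) == "main.d" && flag);
    assert(stripBackupSuffix("notes.txt.backup", flag) == "notes.txt" && flag);
    assert(stripBackupSuffix("main.d", flag) == "main.d" && !flag);
}
```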

TODO Acronym match can make use of normal histogram counts. Check denseness of binary histogram (bist) to determine if we should use a sparse or dense histogram.

TODO Activate and test support for ELF and Cxx11 subkinds

TODO Call either File.checkObseleted upon inotify. checkObseleted should remove stuff from hash tables

TODO Integrate logic in clearCStat to RegFile.makeObselete

TODO Upon Dir inotify call invalidate _depth, etc.

TODO Following command: fs.d --color -d ~/ware/emacs -s lispy -k shows "Skipped PNG file (png) at first extension try". Assure that this logic reuses cache and instead prints something like "Skipped PNG file using cached FKind".

TODO Cache each Dir separately to a file named after SHA1 of its path

TODO Add ASCII kind: Requires optional stream analyzer member of FKind in replacement for magicData. ASCIIFile

TODO Define NotAnyKind(binaryKinds) and cache it

TODO Create PkZipFile() in Dir.load() when FKind "pkZip Archive" is found. Use std.zip.ZipArchive(void[] from mmfile)

TODO Scan Subversion Dirs with http://pastebin.com/6ZzPvpBj

TODO Change order (binHit || allBHist8Miss) and benchmark

TODO Display modification/access times as: See: http://forum.dlang.org/thread/k7afq6$2832$1@digitalmars.com

TODO Use User Defined Attributes (UDA): http://forum.dlang.org/thread/k7afq6$2832$1@digitalmars.com

TODO Use msgPack @nonPacked when needed

TODO Limit lines to terminal width

TODO Create array of (OFFSET, LENGTH) and use this in FKind Pattern factory function. Then for each source file extract the slice at (OFFSET, LENGTH) and use it as input into hash-table from magic (if it's a Lit-pattern too)

TODO Verify that "f.tar.z" gets tuple extensions tuple("tar", "z")

TODO Verify that "libc.so.1.2.3" gets tuple extensions tuple("so", "1", "2", "3") and "so" extensions should then be tried

TODO Cache Symbols larger than three characters in a global hash from symbol to path
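The multi-part extension splitting that these verification TODOs refer to can be sketched with Phobos ranges. This is an assumption about the intended behavior, not the module's code: `extensionsOf` is a hypothetical name, and it returns a `string[]` rather than a tuple.

```d
import std.algorithm.iteration : splitter;
import std.array : array;
import std.path : baseName;
import std.range : dropOne;

/// Hypothetical helper: all dot-separated parts after the first one
/// in the base name of `path`.
string[] extensionsOf(string path)
{
    return path.baseName.splitter('.').dropOne.array;
}

void main()
{
    assert(extensionsOf("f.tar.z") == ["tar", "z"]);
    assert(extensionsOf("/lib/libc.so.1.2.3") == ["so", "1", "2", "3"]);
    assert(extensionsOf("README").length == 0);
}
```

Note that a hidden file such as `.bashrc` yields `["bashrc"]` with this sketch, so dot-files would need special handling.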

TODO Benchmark horspool.d and perhaps use instead of std.find

TODO Splitting into keys should not split arguments such as "a b"

TODO Perhaps use http://www.chartjs.org/ to visualize stuff

TODO Make use of @nonPacked in version(msgpack).

Meta