nxt.fse

File Scanning Engine.

Makes rich use of Sparse Distributed Representations (SDR) using Hash Digests for relating Data and its Relations/Properties/Meta-Data.

Members

Aliases

Bist
alias Bist = NGram!(ubyte, 1, ngram.Kind.binary, ngram.Storage.denseStatic, ngram.Symmetry.ordered, void, immutable(ubyte)[])
Undocumented in source.
Bytes64
alias Bytes64 = Bytes!ulong
Undocumented in source.
RequestedBinType
alias RequestedBinType = uint

It is not very likely that we are interested in histograms with 64-bit precision Bucket/Bin Counts, so pick 32-bit for now.
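
As a rough illustration of what a 32-bit bin type means for a byte histogram, here is a minimal sketch. It does not use the module's NGram template; the names BinType and byteHistogram are hypothetical, and saturating at the maximum count is an assumption about what a "saturated" histogram kind implies.

```d
import std.stdio : writeln;
import std.string : representation;

// Hypothetical stand-in mirroring RequestedBinType (32-bit bin counts).
alias BinType = uint;

/// Count occurrences of each byte value (a 1-gram histogram).
/// Counts stop at BinType.max instead of wrapping around (assumed behaviour).
BinType[256] byteHistogram(const(ubyte)[] data) @safe pure nothrow @nogc
{
    BinType[256] bins; // all bins start at zero
    foreach (b; data)
    {
        if (bins[b] != BinType.max) // saturate rather than overflow
            ++bins[b];
    }
    return bins;
}

void main()
{
    auto bins = byteHistogram("hello".representation);
    writeln(bins['l']); // 2
}
```

With 256 bins of 4 bytes each, a dense 1-gram histogram costs 1 KiB per file, half of what 64-bit bins would need.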

ShCmd
alias ShCmd = string

Shell Command.

XGram
alias XGram = NGram!(ubyte, NGramOrder, ngram.Kind.saturated, ngram.Storage.sparse, ngram.Symmetry.ordered, RequestedBinType, immutable(ubyte)[])
Undocumented in source.
signalHandler_t
alias signalHandler_t = void function(int)
Undocumented in source.

Classes

Dir
class Dir

Dir.

DirKind
class DirKind

Directory Kind.

FKind
class FKind

File Kind.

FKinds
class FKinds

Set of File Kinds with Internal Hashing.

File
class File

File.

FileTags
class FileTags

Maps Files to their tags.

GStats
class GStats

Global Scanner Statistics.

RegFile
class RegFile

Regular File.

Scanner
class Scanner(Term)

File System Scanner.

SignalCaughtException
class SignalCaughtException
Undocumented in source.
SpecFile
class SpecFile

Special File (Character or Block Device).

Symlink
class Symlink

Symlink.

Enums

BitStatus
enum BitStatus

Bit (Content) Status.

BuildType
enum BuildType
Undocumented in source.
DirOp
enum DirOp

Directory Operation Type Code.

DirSorting
enum DirSorting

Directory Sorting Order.

DuplicatesContext
enum DuplicatesContext
Undocumented in source.
FOp
enum FOp

File Operation Type Code.

FileContent
enum FileContent

File Content Type Code.

FileKindDetection
enum FileKindDetection

How File Kinds are detected.

KeyStrictness
enum KeyStrictness
Undocumented in source.
KindHit
enum KindHit
Undocumented in source.
OpArity
enum OpArity

Language Operator Arity.

OpAssoc
enum OpAssoc

Language Operator Associativity.

PathFormat
enum PathFormat
Undocumented in source.
ScanContext
enum ScanContext

Key Scan (Search) Context.

SymlinkFollowContext
enum SymlinkFollowContext
Undocumented in source.
SymlinkTargetStatus
enum SymlinkTargetStatus

Symlink Target Status.

Templates

isAnyFile
eponymous template isAnyFile(T)
Undocumented in source.
isDir
eponymous template isDir(T)
Undocumented in source.
isFile
eponymous template isFile(T)

Traits

isFileIO
eponymous template isFileIO(T)

Return true if T is a class representing File IO.

isRegFile
eponymous template isRegFile(T)
Undocumented in source.
isSpecialFile
eponymous template isSpecialFile(T)
Undocumented in source.
isSymlink
eponymous template isSymlink(T)
Undocumented in source.

Functions

displayedFilename
string displayedFilename(GStats gstats, AnyFile theFile)
Undocumented in source. Be warned that the author may not have intended to support it.
getDir
Dir getDir(NotNull!Dir rootDir, string dirPath, DirEntry dent, Symlink[] followedSymlinks)
Undocumented in source.
getDir
Dir getDir(NotNull!Dir rootDir, string dirPath)

(Cached) Lookup of Directory dirPath.

getDirs
Dir[] getDirs(NotNull!Dir rootDir, string[] topDirNames)
Undocumented in source. Be warned that the author may not have intended to support it.
getFile
File getFile(NotNull!Dir rootDir, string filePath, bool isDir, bool tolerant)

(Cached) Lookup of File filePath.

grain
void grain(Cereal cereal, SysTime systime)
Undocumented in source. Be warned that the author may not have intended to support it.
loadRootDirTree
Dir loadRootDirTree(Viz viz, string cacheFile, GStats gstats)

Load File System Tree Cache from cacheFile.

matchContents
bool matchContents(FKind kind, Range range, RegFile regFile)

Match (Magic) Contents of kind with range.

matchExtension
bool matchExtension(FKind kind, string ext)

Match kind with file extension ext.

matchFullName
bool matchFullName(FKind kind, string full, size_t six)

Match kind with full filename full.

matchName
bool matchName(FKind kind, string full, size_t six, string ext)
Undocumented in source. Be warned that the author may not have intended to support it.
ofAnyKindIn
Tuple!(KindHit, FKind, size_t) ofAnyKindIn(NotNull!RegFile regFile, FKinds kinds, bool collectTypeHits)
Undocumented in source. Be warned that the author may not have intended to support it.
ofKind
KindHit ofKind(NotNull!RegFile regFile, NotNull!FKind kind, bool collectTypeHits, FKinds allFKinds)
ofKind
KindHit ofKind(NotNull!RegFile regFile, string kindName, bool collectTypeHits, FKinds allFKinds)
Undocumented in source. Be warned that the author may not have intended to support it.
ofKind1
KindHit ofKind1(NotNull!RegFile regFile, NotNull!FKind kind, bool collectTypeHits, FKinds allFKinds)

Helper for ofKind.

pageSize
auto pageSize()
Undocumented in source. Be warned that the author may not have intended to support it.
saveRootDirTree
const(ubyte[]) saveRootDirTree(Viz viz, Dir rootDir, string cacheFile)

Save File System Tree Cache under Directory rootDir.

scanner
void scanner(string[] args)
Undocumented in source. Be warned that the author may not have intended to support it.
signal
signalHandler_t signal(int signal, signalHandler_t handler)
Undocumented in source but is binding to C. You might be able to learn more by searching the web for its name.
signalHandler
void signalHandler(int signo)
Undocumented in source. Be warned that the author may not have intended to support it.
treeSizeMemoized
Bytes64 treeSizeMemoized(NotNull!File file, Bytes64[File] cache)

Externally memoized calculation of the tree size of a Directory: results are stored in the cache passed in by the caller. Is it possible to make this any of @safe pure nothrow?
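
The idea of an externally memoized tree size can be sketched as follows. This is not the module's implementation: Node is a hypothetical stand-in for the File/Dir classes, sizes are plain ulong rather than Bytes64, and the cache is passed by ref here so that entries added to an initially empty associative array stay visible to the caller.

```d
import std.stdio : writeln;

/// Hypothetical stand-in for the module's File/Dir classes.
class Node
{
    ulong size;      // own size in bytes
    Node[] children; // empty for regular files

    this(ulong size, Node[] children = null)
    {
        this.size = size;
        this.children = children;
    }
}

/// Total size of the tree rooted at `node`.
/// Each subtree result is stored in the caller-owned `cache`.
ulong treeSize(Node node, ref ulong[Node] cache)
{
    if (auto hit = node in cache)
        return *hit; // already computed for this subtree
    ulong total = node.size;
    foreach (child; node.children)
        total += treeSize(child, cache);
    cache[node] = total;
    return total;
}

void main()
{
    auto a = new Node(10);
    auto b = new Node(20);
    auto dir = new Node(4, [a, b]);
    ulong[Node] cache;
    writeln(treeSize(dir, cache)); // 34
    writeln(cache.length);         // 3
}
```

Keeping the cache outside the tree means it can be dropped or rebuilt without touching the nodes, at the cost of one hash lookup per visited node.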

tryLookupKindIn
FKind tryLookupKindIn(RegFile regFile, FKind[SHA1Digest] kindsById)
Undocumented in source. Be warned that the author may not have intended to support it.

Manifest constants

NGramOrder
enum NGramOrder;
Undocumented in source.
cCommentDelims
enum cCommentDelims;
Undocumented in source.
dCommentDelims
enum dCommentDelims;
Undocumented in source.
defaultCommentDelims
enum defaultCommentDelims;
Undocumented in source.
defaultStringDelims
enum defaultStringDelims;
Undocumented in source.
pythonStringDelims
enum pythonStringDelims;
Undocumented in source.

Structs

CStat
struct CStat

Contents Statistics of a Regular File.

Delim
struct Delim

Pair of Delimiters. Used to describe, for example, comment and string delimiter syntax.

Op
struct Op

Language Operator.

OpAlias
struct OpAlias

Language Operator Alias.

Results
struct Results
Undocumented in source.

Variables

ctrlC
uint ctrlC;

Exception Describing Process Signal.

mmfile_size
enum ulong mmfile_size;
Undocumented in source.

See Also

http://stackoverflow.com/questions/12629749/how-does-grep-run-so-fast

http://www.regular-expressions.info/powergrep.html

http://ridiculousfish.com/blog/posts/old-age-and-treachery.html

http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/

TODO: Make use of parallelism_ex: pmap

TODO: Call filterUnderAnyOfPaths using std.algorithm.filter directly on AAs. Use byPair or use AA.get(key, defaultValue) http://forum.dlang.org/thread/mailman.75.1392335793.6445.digitalmars-d-learn@puremagic.com https://github.com/D-Programming-Language/druntime/pull/574

TODO: Count logical lines.

TODO: Lexers should be loosely coupled to FKinds instead of Files

TODO: Generic Token[] and specific CToken[], CxxToken[]

TODO: Don't scan for duplicates inside vc-dirs by default

TODO: Assert that files along duplicates path don't include symlinks

TODO: Implement FOp.deduplicate

TODO: Prevent rescans of duplicates

TODO: Define generalized_specialized_two_way_relationship(kindD, kindDi)

TODO: Visualize hits using existingFileHitContext.asH!1 followed by a table: ROW_NR | hit string in <code lang=LANG></code>

TODO: Parse and Sort GCC/Clang Compiler Messages on WARN_TYPE FILE:LINE:COL:MSGWARN_TYPE and use Collapsible HTML Widgets (http://api.jquerymobile.com/collapsible/) when presenting them

TODO: Maybe make use of https://github.com/Abscissa/scriptlike

TODO: Calculate Tree grams and bist

TODO: Get stats of the link itself, not the target, in Symlink constructors

TODO: RegFile with FileContent.text should be decodable to Unicode using either iso-latin1, utf-8, etc. Check std.uni for how to try and decode stuff.

TODO: Search for subwords. For example gtk_widget should also match widget_gtk and GtkWidget etc.

TODO: Support multi-line keys

TODO: Use hash-lookup in txtFKinds.byExt for faster guessing of source file kind. Merge it with binary kind lookup. And check FileContent member of kind to instead determine if it should be scanned or not. Sub-Task: Case-Insensitive Matching of extensions if nothing else passes.

TODO: Detect symlinks with duplicate targets and only follow one of them and group them together in visualization

TODO: Add addTag, removeTag, etc. and an interface to fs.d for setting tags: --add-tag=comedy, --remove-tag=comedy

TODO: If a file name ends with ~ or .backup, assume it is a backup file, strip that suffix, match the name again, and set backupFlag in FileKind

TODO: Acronym match can make use of normal histogram counts. Check denseness of binary histogram (bist) to determine if we should use a sparse or dense histogram.

TODO: Activate and test support for ELF and Cxx11 subkinds

TODO: Call either File.checkObseleted upon inotify. checkObseleted should remove stuff from hash tables

TODO: Integrate logic in clearCStat to RegFile.makeObselete

TODO: Upon Dir inotify call invalidate _depth, etc.

TODO: Following command: fs.d --color -d ~/ware/emacs -s lispy -k shows "Skipped PNG file (png) at first extension try". Assure that this logic reuses cache and instead prints something like "Skipped PNG file using cached FKind".

TODO: Cache each Dir separately to a file named after SHA1 of its path

TODO: Add ASCII kind: Requires optional stream analyzer member of FKind in replacement for magicData. ASCIIFile

TODO: Define NotAnyKind(binaryKinds) and cache it

TODO: Create PkZipFile() in Dir.load() when FKind "pkZip Archive" is found. Use std.zip.ZipArchive(void[] from mmfile)

TODO: Scan Subversion Dirs with http://pastebin.com/6ZzPvpBj

TODO: Change order (binHit || allBHist8Miss) and benchmark

TODO: Display modification/access times as: See: http://forum.dlang.org/thread/k7afq6$2832$1@digitalmars.com

TODO: Use User Defined Attributes (UDA): http://forum.dlang.org/thread/k7afq6$2832$1@digitalmars.com

TODO: Use msgPack @nonPacked when needed

TODO: Limit lines to terminal width

TODO: Create an array of (OFFSET, LENGTH) and use this in the FKind Pattern factory function. Then, for each source file, extract the slice at (OFFSET, LENGTH) and use it as input into a hash table from magic (if it is a Lit-pattern too)

TODO: Verify that "f.tar.z" gets the extensions tuple("tar", "z")

TODO: Verify that "libc.so.1.2.3" gets the extensions tuple("so", "1", "2", "3"), and that the "so" extension is then tried

TODO: Cache Symbols larger than three characters in a global hash from symbol to path

TODO: Benchmark horspool.d and perhaps use it instead of std.algorithm.find

TODO: Splitting into keys should not split arguments such as "a b"

TODO: Perhaps use http://www.chartjs.org/ to visualize stuff

TODO: Make use of @nonPacked in version (msgpack).
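
The multi-extension TODO in the list above ("f.tar.z" and "libc.so.1.2.3") can be sketched as below. This is a minimal illustration, not code from the module: the function name extensions is hypothetical, and it returns a string[] rather than a tuple. Note that under this simple rule a dot-file such as ".bashrc" would report "bashrc" as an extension, which the real matcher may want to treat differently.

```d
import std.algorithm.iteration : splitter;
import std.array : array;
import std.path : baseName;
import std.stdio : writeln;

/// Return every dot-separated extension of the file name in `path`,
/// in order and without the leading stem. Returns null if there is none.
string[] extensions(string path)
{
    auto parts = baseName(path).splitter('.').array;
    return parts.length > 1 ? parts[1 .. $] : null;
}

void main()
{
    writeln(extensions("f.tar.z"));       // ["tar", "z"]
    writeln(extensions("libc.so.1.2.3")); // ["so", "1", "2", "3"]
}
```

A kind lookup could then try the extensions from left to right, so that "so" is tested before the version components.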

Meta