shattr
A shatag clone in Ceylon.
Status
Basic features implemented:
- --lookup: Look up duplicates.
- --scrub: Recompute checksums to detect silent corruption.
- --tag: Compute new checksums for files that don't have one, or whose checksum is outdated.
Why
shatag -rl is slow because it needs to query the SQL database for every file.
shattr reads all SHA-256 checksums into memory instead.
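The idea can be sketched in a few lines of Python. This is only an illustration of the in-memory approach, not shattr's actual Ceylon code; the names load_hashes and is_known are hypothetical, and the case normalisation and blank-line handling are assumptions.

```python
from __future__ import annotations

from pathlib import Path


def load_hashes(hash_list: Path) -> set[str]:
    """Read one SHA-256 hex digest per line into an in-memory set."""
    with hash_list.open(encoding="utf-8") as lines:
        # Assumption: blank lines are ignored and case does not matter.
        return {line.strip().lower() for line in lines if line.strip()}


def is_known(digest: str, known: set[str]) -> bool:
    """Set membership test: no database round trip per file."""
    return digest.lower() in known
```

The hash list is read once, and every later lookup is a plain set membership test.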
Installation
With ceylon
If you have Ceylon installed, you can download the .car archive (< 4K) at Releases and put it into your Ceylon module repository.
If your ceylon is recent enough, you can package it into a jar file via ceylon --fat-jar.
Running with java -jar starts faster than ceylon run.
With java directly
If you have a Java Runtime (7+) installed, but not Ceylon, you can download the fat jar file (3.2M).
Compile manually
Clone this repository and run ceylon compile.
Tested with Ceylon 1.2+. May work with older versions.
Usage
Lookup
$SHATTR_COMMAND -l PATH_TO_HASHLIST
will print status of files under the current directory.
N empty file
D duplicated file
U unique file
? unknown file (without `sha256` xattr, no read permission, etc)
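As a rough sketch of how these four codes could be derived from a file's checksum and the hash list: the function classify is hypothetical, and detecting empty files by comparing against the SHA-256 of zero bytes is an assumption (shattr may check the file size instead).

```python
import hashlib
from typing import Optional, Set

# SHA-256 of zero bytes, computed rather than hard-coded.
EMPTY_SHA256 = hashlib.sha256(b"").hexdigest()


def classify(digest: Optional[str], known: Set[str]) -> str:
    """Map a file's checksum to one of the status codes N, D, U, ?."""
    if digest is None:
        # No sha256 xattr, no read permission, etc.
        return "?"
    if digest == EMPTY_SHA256:
        return "N"
    # A checksum already present in the hash list means a duplicate.
    return "D" if digest in known else "U"
```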
$SHATTR_COMMAND is one of:
- ceylon run io.github.weakish.shattr if using Ceylon;
- java -jar /path/to/io.github.weakish.shattr-0.2.0.jar if using java directly.
If PATH_TO_HASHLIST is not specified, shattr will use ~/.shatagdb-hash-list.txt.
Hash list format
PATH_TO_HASHLIST is a text file containing the SHA-256 hashes of all known files, one per line.
For example, if using shatag with an sqlite3 backend, PATH_TO_HASHLIST can be produced via:
sqlite3 -noheader -csv ~/.shatagdb "select hash from contents;" > hashlist.csv
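Before pointing shattr at a generated file, it may help to check that every line really is a SHA-256 hex digest. The following is a minimal sketch; malformed_lines is a hypothetical helper, and treating blank lines as harmless is an assumption about what shattr tolerates.

```python
import re
from typing import List

# A SHA-256 digest is 64 hexadecimal characters.
_SHA256_HEX = re.compile(r"[0-9a-fA-F]{64}")


def malformed_lines(text: str) -> List[int]:
    """Return the 1-based numbers of lines that are not a SHA-256 hex digest."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        entry = line.strip()
        # Assumption: blank lines are skipped rather than reported.
        if entry and not _SHA256_HEX.fullmatch(entry):
            bad.append(number)
    return bad
```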
Customize output
By default we use a git status style output.
You can change the output format with --format FORMAT.
FORMAT is one of git, inotifywait, and csv.
--format FORMAT should be specified before the hash list file.
--format inotifywait
EMPTY empty file
DUMPLICATED duplicated file
UNIQUE unique file
UNKNOWN file (without `sha256` xattr, no read permission, etc)
--format csv
Like --format inotifywait, but separated with a comma (,), with the path name quoted (double quotes inside the path are doubled).
EMPTY,"empty_file.txt"
UNIQUE,"A file containing spaces and ""double quotes"""
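The quoting rule visible in these two lines can be sketched as follows. format_csv is a hypothetical name, and this is an illustration in Python rather than shattr's own formatter.

```python
def format_csv(status: str, path: str) -> str:
    """Render one csv-style output line: STATUS,"path"."""
    # Double any embedded double quotes, then wrap the path in quotes.
    escaped = path.replace('"', '""')
    return f'{status},"{escaped}"'
```

Always quoting the path keeps names with commas or spaces in a single field.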
--format your_own
You need to write a formatting function typed String(Status, Path).
Then register it in the command line option parsing code in run().
scrub/tag
$SHATTR_COMMAND -s
$SHATTR_COMMAND -t
Will compute checksums for all files under the current directory (recursively).
Unlike shatag, -t will warn if a checksum changes.
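The core of a scrub is to recompute a file's SHA-256 and compare it with the stored value. Below is a minimal sketch of that check; sha256_of and scrub are hypothetical names, and the stored checksum is passed in as a plain string instead of being read from the file's xattr, to keep the example portable.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scrub(path: Path, stored: str) -> bool:
    """Return True if the file still matches its stored checksum."""
    return sha256_of(path) == stored.lower()
```

A False result for a file that has not been modified since it was tagged would point to silent corruption.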
Contribute
Send pull requests at https://github.com/weakish/shattr.
Coding style
Prefer if . then . else . to . then . else .
We feel A then B else C is confusing.
Readers may think A then B else C is A ? B : C in other languages, but they are not the same:
- A then B else C is actually (A then B) else C:
  - A then B evaluates to B if A is true, otherwise evaluates to null.
  - X else Y evaluates to X if X is not null, otherwise evaluates to Y.
- Thus the type of B is T given T satisfies Object, i.e. B is required to not be null.
I think if (A) then B else C is much cleaner.
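To make the difference concrete, here is a small Python model of the two operators. It is not Ceylon code: then and else_ are hypothetical stand-ins, None plays the role of null, and the model assumes the left operand of then is a Boolean.

```python
from typing import Optional, TypeVar

T = TypeVar("T")


def then(condition: bool, value: Optional[T]) -> Optional[T]:
    """Model of `A then B`: B if A holds, otherwise null."""
    return value if condition else None


def else_(value: Optional[T], fallback: T) -> T:
    """Model of `X else Y`: X unless it is null, otherwise Y."""
    return value if value is not None else fallback


# With a non-null B, the chain behaves like a ternary.
assert else_(then(True, "B"), "C") == "B"
assert else_(then(False, "B"), "C") == "C"

# If B could be null, the chain falls through to C although A is true,
# while a real ternary would yield null.
assert else_(then(True, None), "C") == "C"
ternary = None if True else "C"
assert ternary is None
```

The last two assertions show the case that the non-null requirement on B rules out.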
Only use i++ to increase i.
y=i++ and y=++i are really confusing to me.
So I prefer to only use i++ to increase i, e.g. in a while loop.
I think i++ should evaluate to void if a programming language allows ++.
The same applies to i-- and --i.
Prefer functions to classes
We prefer to declare classes only for new types (or type aliases).
Other
If you disagree with the above, file an issue.
Send pull requests to add new coding style rules.
Please do not add formatting style rules such as use two spaces and closing braces on their own line.
Formatting style is unlikely to affect the readability of code, and can be adjusted automatically via ceylon format.
License
0BSD.