
Commit 8057e5a

piotrj committed
initial release
1 parent 22d634b commit 8057e5a

11 files changed, +42 -78 lines changed

.github/workflows/run.yml

+1-1
Original file line number | Diff line number | Diff line change
@@ -134,7 +134,7 @@ jobs:
134134
tag_name: ${{ steps.version.outputs.version }}
135135
name: Release ${{ steps.version.outputs.version }}
136136
draft: false
137-
prerelease: true
137+
prerelease: false
138138
files: |
139139
librer.${{ steps.version.outputs.version }}.portable.linux.zip
140140
librer.${{ steps.version.outputs.version }}.portable.windows.zip

LICENSE

+1-1
@@ -1,6 +1,6 @@
11
MIT License
22

3-
Copyright (c) 2023 Piotr Jochymek
3+
Copyright (c) 2023-2024 Piotr Jochymek
44

55
Permission is hereby granted, free of charge, to any person obtaining a copy
66
of this software and associated documentation files (the "Software"), to deal

README.md

+5-10
@@ -2,14 +2,12 @@
22

33
A cross-platform GUI file cataloging program with extensive customization options to suit user preferences. Highly optimized for multi-core parallel search speed, data integrity, and repository portability.
44

5-
**The software and this document are under development and in a Release Candidate (3) state.**
6-
75
## Features:
86
The primary purpose of this software is to enable users to catalog their files, especially on removable media such as memory cards and portable drives. It allows the user to add metadata, referred to here as **custom data**, and later search the created records with multiple criteria. **Custom data** consists of textual information acquired through the execution of user-chosen commands or scripts. **Custom data** may include any text data customized to meet the user's requirements, restricted only by the available memory and the software accessible for data retrieval. The retrieved data is stored in a newly created database record and can be utilized for search or verification purposes. **librer** allows you to search files using regular expressions, glob expressions and **fuzzy matching** on both filenames and **custom data**. Created data records can be exported and imported to share data with others or for backup purposes.
97

108
## Screenshots:
119

12-
main window, new record creation dialog and running **custom data** extraction:
10+
#### Main window, new record creation dialog and running **custom data** extraction:
1311
![image info](./info/scanning.png)
1412

1513
#### Search dialog:
@@ -23,16 +21,16 @@ Portable executable packages created with [PyInstaller](https://pyinstaller.org/
2321

2422
https://github.com/PJDude/librer/releases
2523

24+
## [Tutorial](./info/tutorial.md) ##
2625

2726
## Guidelines for crafting custom data extractors
2827
A custom data extractor (CDE) is a command that can be invoked with a single parameter: the full path to the specific file from which data is extracted. The command should write the expected data to standard output (stdout) in any textual format. A CDE can be an executable file (e.g., 7z, zip, ffmpeg, md5sum etc.) or an executable shell script (extract.sh, extract.bat etc.). It should have a reasonably short execution time and produce a reasonably limited amount of output. The criteria that allow a particular **custom data extractor** to run on a file are a glob expression (matched against the file name) and a file size range.
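A minimal sketch of a custom data extractor that follows the guidelines above, written in Python rather than as a shell script. The script name `extract.py`, the `extract` helper and the limits `MAX_LINES` and `MAX_WIDTH` are illustrative assumptions, not part of librer:

```python
#!/usr/bin/env python3
"""Hypothetical custom data extractor: prints the first lines of a text file."""
import sys

MAX_LINES = 5    # assumption: keep the output small so records stay compact
MAX_WIDTH = 120  # assumption: truncate very long lines


def extract(path, max_lines=MAX_LINES, max_width=MAX_WIDTH):
    """Return at most max_lines lines of the file, each cut to max_width characters."""
    lines = []
    with open(path, 'r', encoding='utf-8', errors='replace') as handle:
        for line in handle:
            if len(lines) >= max_lines:
                break
            lines.append(line.rstrip('\n')[:max_width])
    return lines


def main(argv):
    """Expect exactly one parameter: the full path of the file to inspect."""
    if len(argv) != 2:
        print('usage: extract.py <full path to file>')
        return 2
    try:
        for line in extract(argv[1]):
            print(line)  # the extracted data goes to stdout
    except OSError as error:
        print(f'cannot read file: {error}')
        return 1
    return 0


# Run only when a parameter was actually passed on the command line.
if __name__ == '__main__' and len(sys.argv) > 1:
    sys.exit(main(sys.argv))
```

The script only reads the file and never modifies it, which keeps it within the "no destructive actions" advice from the usage tips.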
2928

30-
## [Tutorial](./info/tutorial.md) ##
3129

3230
## Usage tips:
33-
- don't put any destructive actions in your Custom Data Extractors scrips
31+
- do not put any destructive actions in your Custom Data Extractor scripts
3432
- try to keep as little custom data as possible, to speed up scanning and searching records and keep record files small
35-
- if general purpose and generally available tools produce too much and not needed text data, write a wrapper script (*.sh, *bat) executing specific tool and post-process retrieved data. Wrapper can also help when the tool expects more variable parameters or in an unusual order
33+
- if general-purpose, widely available tools produce too much unneeded text, write a wrapper script (*.sh, *.bat) that runs the specific tool and post-processes the retrieved data.
3634
- you don't have to use custom data if you don't need it; in that case only the file system is cataloged.
3735

3836
## Supported platforms:
@@ -43,7 +41,7 @@ Custom data extractor is a command that can be invoked with a single parameter -
4341
**librer** writes log files, configuration and record files in runtime. Default location for these files is **logs** and **data** subfolders of **librer** main directory.
4442

4543
## Technical information
46-
Record in librer is the result of a single scan operation and is shown as one of many top nodes in the main tree window. Contains a directory tree with collected custom data. It is stored as a single .dat file in librer database directory. Its internal format is optimized for security, fast initial access and maximum compression (just check :)) Every section is a python data structure serialized by [pickle](https://docs.python.org/3/library/pickle.html) and compressed separately by [Zstandard](https://pypi.org/project/zstandard/) algorithm. The record file, once saved, is never modified afterward. It can only be deleted upon request or exported. All record files are independent of each other.Fuzzy matching is implemented using the SequenceMatcher function provided by the [difflib](https://docs.python.org/3/library/difflib.html) package. Searching records is performed as a separate subprocess for each record. The number of parallel searches is limited by the CPU cores.
44+
A record in librer is the result of a single scan operation and is shown as one of many top nodes in the main tree window. It contains a directory tree with collected custom data and is stored as a single .dat file in the librer database directory. Its internal format is optimized for security, fast initial access and maximum compression (just check :)). Every section is a Python data structure serialized by [pickle](https://docs.python.org/3/library/pickle.html) and compressed separately with the [Zstandard](https://pypi.org/project/zstandard/) algorithm. The record file, once saved, is never modified afterward. It can only be deleted upon request or exported. All record files are independent of each other. Fuzzy matching is implemented using the SequenceMatcher class provided by the [difflib](https://docs.python.org/3/library/difflib.html) module. Each record is searched in a separate subprocess. The number of parallel searches is limited by the number of CPU cores.
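As an illustration of the fuzzy matching mentioned above, here is a minimal sketch built on `difflib.SequenceMatcher`. The helper name `fuzzy_match`, the case folding and the 0.8 threshold are assumptions made for this example; how librer scores and filters matches may differ:

```python
from difflib import SequenceMatcher


def fuzzy_match(pattern, text, threshold=0.8):
    """Return True when the similarity ratio of the two strings reaches the threshold."""
    ratio = SequenceMatcher(None, pattern.lower(), text.lower()).ratio()
    return ratio >= threshold


names = ['holiday_2023.jpg', 'hoilday_2023.jpg', 'invoice.pdf']
# The misspelled second name still matches; the unrelated third one does not.
matches = [name for name in names if fuzzy_match('holiday_2023.jpg', name)]
print(matches)
```

`ratio()` returns a value between 0 and 1, so the threshold trades recall for precision: a lower value tolerates more typos but also admits more unrelated names.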
4745

4846
###### Manual build (linux):
4947
```
@@ -73,8 +71,5 @@ python3 ./src/librer.py
7371
- calculate the **CRC** of scanned files and use it to search for duplicates among different records, verify current data with the saved file system image
7472
- comparing two records with each other. e.g. two scans of the same file system performed at different times
7573

76-
## Known issues
77-
For still unknown reason, Custom Data Extraction (Execution of tons of subprocesses) on the exact same hardware is much slower on Windows than on Linux.
78-
7974
## Licensing
8075
- **librer** is licensed under **[MIT license](./LICENSE)**

src/core.py

+11-18
@@ -2,7 +2,7 @@
22

33
####################################################################################
44
#
5-
# Copyright (c) 2023 Piotr Jochymek
5+
# Copyright (c) 2023-2024 Piotr Jochymek
66
#
77
# MIT License
88
#
@@ -40,15 +40,13 @@
4040
else:
4141
from os import getpgid, killpg
4242

43-
from os.path import abspath,normpath,basename,dirname
44-
from os.path import join as path_join
43+
from os.path import abspath,normpath,basename,dirname,join as path_join
4544

4645
from zipfile import ZipFile
4746
from platform import system as platform_system,release as platform_release,node as platform_node
4847

4948
from fnmatch import fnmatch
5049
from re import search as re_search
51-
from sys import getsizeof
5250
import sys
5351
from collections import defaultdict
5452
from pathlib import Path as pathlib_Path
@@ -125,12 +123,14 @@ def fnumber(num):
     return str(format(num,',d').replace(',',' '))
 
 def str_to_bytes(string):
-    units = {'kb': 1024,'mb': 1024*1024,'gb': 1024*1024*1024,'tb': 1024*1024*1024*1024, 'b':1}
     try:
-        string = string.replace(' ','')
-        for suffix,weight in units.items():
-            if string.lower().endswith(suffix):
+        string = string.replace(' ','').lower()
+        string_endswith = string.endswith
+        for suffix,altsuffix,weight in ( ('kb','k',1024),('mb','m',1024*1024),('gb','g',1024*1024*1024),('tb','t',1024*1024*1024*1024),('b','b',1) ):
+            if string_endswith(suffix):
                 return int(string[0:-len(suffix)]) * weight #no decimal point
+            elif string_endswith(altsuffix):
+                return int(string[0:-len(altsuffix)]) * weight #no decimal point
 
         return int(string)
     except:
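The suffix handling introduced in this hunk can be tried in isolation. The following is a standalone sketch, not the code from `src/core.py`: the name `parse_size` and the choice to return `None` for invalid input are assumptions, since the body of the `except` branch is not shown in the diff.

```python
# Suffix table mirroring the one in the hunk: long form, short form, multiplier.
SUFFIXES = (
    ('kb', 'k', 1024),
    ('mb', 'm', 1024 ** 2),
    ('gb', 'g', 1024 ** 3),
    ('tb', 't', 1024 ** 4),
    ('b', 'b', 1),
)


def parse_size(text):
    """Convert strings such as '10 kb' or '5m' to a number of bytes."""
    cleaned = text.replace(' ', '').lower()
    try:
        for suffix, altsuffix, weight in SUFFIXES:
            for candidate in (suffix, altsuffix):
                if cleaned.endswith(candidate):
                    # int() rejects decimals, so '1.5mb' ends up in the except branch.
                    return int(cleaned[:-len(candidate)]) * weight
        return int(cleaned)  # plain number without a unit
    except ValueError:
        return None  # assumption: the original's error result is not shown


print(parse_size('10 kb'), parse_size('5m'), parse_size('123'))
```

Keeping the plain `b` entry last matters: every long suffix also ends in `b`, so checking it first would swallow `kb`, `mb`, `gb` and `tb`.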
@@ -196,7 +196,6 @@ def get_command(executable,parameters,full_file_path,shell):
196196
#'ignore','replace','backslashreplace'
197197
def popen_win(command,shell):
198198
return Popen(command, stdout=PIPE, stderr=STDOUT,stdin=DEVNULL,shell=shell,text=True,universal_newlines=True,creationflags=CREATE_NO_WINDOW,close_fds=False,errors='ignore')
199-
universal
200199

201200
def popen_lin(command,shell):
202201
return Popen(command, stdout=PIPE, stderr=STDOUT,stdin=DEVNULL,shell=shell,text=True,universal_newlines=True,start_new_session=True,errors='ignore')
@@ -610,7 +609,7 @@ def threaded_cde(timeout_semi_list):
610609

611610
aborted_string = 'Custom data extraction was aborted.'
612611
for (scan_like_list,subpath,rule_nr,size) in self.customdata_pool.values():
613-
#decoding_error=False
612+
614613
self.killed=False
615614
self.abort_single_file_cde=False
616615

@@ -636,7 +635,7 @@ def threaded_cde(timeout_semi_list):
636635
subprocess = uni_popen(command,shell)
637636
timeout_semi_list[0]=(timeout_val,subprocess)
638637
except Exception as re:
639-
print('threaded_cde error:',re)
638+
#print('threaded_cde error:',re)
640639
subprocess = None
641640
timeout_semi_list[0]=(timeout_val,subprocess)
642641
returncode=201
@@ -651,12 +650,6 @@ def threaded_cde(timeout_semi_list):
651650
while True:
652651
line = subprocess_stdout_readline().rstrip()
653652

654-
#try:
655-
#except Exception as le:
656-
#print(command,le)
657-
# line = str(le)
658-
#decoding_error = True
659-
660653
output_list_append(line)
661654

662655
if not line and subprocess_poll() is not None:
@@ -681,7 +674,7 @@ def threaded_cde(timeout_semi_list):
681674
if not aborted:
682675
self_header.files_cde_quant += 1
683676
self_header.files_cde_size += size
684-
self_header.files_cde_size_extracted += getsizeof(output)
677+
self_header.files_cde_size_extracted += asizeof(output)
685678

686679
new_elem={}
687680
new_elem['cd_ok']= bool(returncode==0 and not self.killed and not aborted)

src/dialogs.py

+1-1
@@ -2,7 +2,7 @@
22

33
####################################################################################
44
#
5-
# Copyright (c) 2023 Piotr Jochymek
5+
# Copyright (c) 2023-2024 Piotr Jochymek
66
#
77
# MIT License
88
#

src/librer.py

+5-9
@@ -2,7 +2,7 @@
22

33
####################################################################################
44
#
5-
# Copyright (c) 2023 Piotr Jochymek
5+
# Copyright (c) 2023-2024 Piotr Jochymek
66
#
77
# MIT License
88
#
@@ -27,9 +27,8 @@
2727
####################################################################################
2828

2929
from os import sep,system,getcwd,name as os_name
30-
from os.path import abspath,normpath,dirname
31-
from os.path import join as path_join
32-
from os.path import isfile as path_isfile
30+
from os.path import abspath,normpath,dirname,join as path_join,isfile as path_isfile
31+
3332
from pathlib import Path
3433
from time import strftime,time,mktime
3534
from signal import signal,SIGINT
@@ -2730,22 +2729,19 @@ def tree_semi_focus(self):
27302729
tree.see(item)
27312730
tree.selection_set(item)
27322731

2733-
self.tree_sel_change(item,True)
2732+
self.tree_sel_change(item)
27342733
else:
27352734
self.sel_item = None
27362735

27372736
@catched
2738-
def tree_sel_change(self,item,force=False,change_status_line=True):
2737+
def tree_sel_change(self,item,change_status_line=True):
27392738
self.sel_item = item
27402739

27412740
if change_status_line :
27422741
self.status('')
27432742

27442743
self_tree_set_item=lambda x : self.tree_set(item,x)
27452744

2746-
#path=self_tree_set_item('path')
2747-
2748-
#self.sel_kind = self_tree_set_item('kind')
27492745
self.tree_select()
27502746

27512747
def menubar_unpost(self):

src/png.2.py.py

+1-1
@@ -2,7 +2,7 @@
22

33
####################################################################################
44
#
5-
# Copyright (c) 2022-2023 Piotr Jochymek
5+
# Copyright (c) 2022-2024 Piotr Jochymek
66
#
77
# MIT License
88
#

src/record.py

+14-34
@@ -2,7 +2,7 @@
22

33
####################################################################################
44
#
5-
# Copyright (c) 2022-2023 Piotr Jochymek
5+
# Copyright (c) 2023-2024 Piotr Jochymek
66
#
77
# MIT License
88
#
@@ -26,14 +26,14 @@
2626
#
2727
####################################################################################
2828

29-
import argparse
30-
import os
31-
#import signal
32-
#from sys import exit
33-
#from subprocess import DEVNULL
34-
import pathlib
3529
import sys
3630

31+
from os.path import dirname,join as path_join
32+
from os import name as os_name
33+
34+
from pathlib import Path as pathlib_Path
35+
from argparse import ArgumentParser,RawTextHelpFormatter
36+
3737
from time import sleep,perf_counter
3838

3939
from threading import Thread
@@ -42,39 +42,28 @@
4242
from fnmatch import translate
4343
from difflib import SequenceMatcher
4444

45-
#from pickle import load
46-
47-
#from multiprocessing import Process, Queue
48-
#from queue import Empty
49-
50-
#import base64
51-
#import codecs
52-
53-
import io
54-
5545
from json import dumps as json_dumps
46+
from collections import deque
5647

5748
from core import *
5849

5950
VERSION_FILE='version.txt'
6051

6152
def get_ver_timestamp():
6253
try:
63-
timestamp=pathlib.Path(os.path.join(os.path.dirname(__file__),VERSION_FILE)).read_text(encoding='ASCII').strip()
54+
timestamp=pathlib_Path(path_join(dirname(__file__),VERSION_FILE)).read_text(encoding='ASCII').strip()
6455
except Exception as e_ver:
6556
print(e_ver)
6657
timestamp=''
6758
return timestamp
6859

6960
def parse_args(ver):
70-
parser = argparse.ArgumentParser(
71-
formatter_class=argparse.RawTextHelpFormatter,
72-
prog = 'record.exe' if (os.name=='nt') else 'record',
73-
description = f"librer record version {ver}\nCopyright (c) 2023 Piotr Jochymek\n\nhttps://github.com/PJDude/librer",
61+
parser = ArgumentParser(
62+
formatter_class=RawTextHelpFormatter,
63+
prog = 'record.exe' if (os_name=='nt') else 'record',
64+
description = f"librer record version {ver}\nCopyright (c) 2023-2024 Piotr Jochymek\n\nhttps://github.com/PJDude/librer",
7465
)
7566

76-
#parser.add_argument('--foo', required=True)
77-
7867
parser.add_argument('command',type=str,help='command to execute',choices=('load','search','info'))
7968

8069
parser.add_argument('file',type=str,help='dat file')
@@ -143,13 +132,9 @@ def find_params_check(self,
143132

144133
return None
145134

146-
from collections import deque
147-
148-
#results_queue=Queue()
149135
results_queue=deque()
150136

151137
def printer():
152-
#results_queue_get = results_queue.get
153138
results_queue_get = results_queue.popleft
154139

155140
try:
@@ -161,7 +146,7 @@ def printer():
161146
print(json_dumps(result))
162147
else:
163148
sys.stdout.flush()
164-
sleep(0.01)
149+
sleep(0.001)
165150

166151
except Exception as pe:
167152
print_info(f'printer error:{pe}')
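The `printer` function in this hunk consumes results from a `deque` via `popleft`, which raises `IndexError` when the deque is empty, so the consumer has to sleep briefly and retry. A minimal sketch of that polling pattern follows. The `drain` name, the `emit` callback and the `None` stop sentinel are hypothetical; the stop condition of the real `printer` is not visible in the diff.

```python
from collections import deque
from json import dumps
from threading import Thread
from time import sleep


def drain(queue, emit, poll_interval=0.001):
    """Pop results until a None sentinel arrives; sleep briefly while the deque is empty."""
    while True:
        try:
            result = queue.popleft()
        except IndexError:
            sleep(poll_interval)  # nothing queued yet, poll again shortly
            continue
        if result is None:  # hypothetical stop marker
            break
        emit(dumps(result))


results = deque()
lines = []
worker = Thread(target=drain, args=(results, lines.append))
worker.start()

for item in (['a.txt', 10], ['b.txt', 20]):
    results.append(item)
results.append(None)  # tell the consumer to finish
worker.join()
print(lines)
```

A shorter poll interval, as in the change from `0.01` to `0.001`, reduces the delay before a newly queued result is printed, at the cost of more frequent wake-ups while the deque is empty.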
@@ -172,12 +157,7 @@ def printer():
172157
def print_info(*args):
173158
print('#',*args)
174159

175-
176160
if __name__ == "__main__":
177-
#buffer_size = 1024*1024*64
178-
#sys.stdout = io.TextIOWrapper(sys.stdout.detach(), write_through=True, line_buffering=False)
179-
#sys.stdout._CHUNK_SIZE = buffer_size
180-
181161
VER_TIMESTAMP = get_ver_timestamp()
182162

183163
args=parse_args(VER_TIMESTAMP)

src/version.pi.template.librer.txt

+1-1
@@ -33,7 +33,7 @@ VSVersionInfo(
3333
StringStruct(u'FileDescription', u'Librer'),
3434
StringStruct(u'FileVersion', u'0.0.0.0'),
3535
StringStruct(u'InternalName', u'Librer'),
36-
StringStruct(u'LegalCopyright', u'Piotr Jochymek 2023'),
36+
StringStruct(u'LegalCopyright', u'Piotr Jochymek 2023-2024'),
3737
StringStruct(u'OriginalFilename', u'librer.exe'),
3838
StringStruct(u'ProductName', u'Librer'),
3939
StringStruct(u'ProductVersion', u'VER_TO_REPLACE')])

src/version.pi.template.record.txt

+1-1
@@ -33,7 +33,7 @@ VSVersionInfo(
3333
StringStruct(u'FileDescription', u'Librer-Record'),
3434
StringStruct(u'FileVersion', u'0.0.0.0'),
3535
StringStruct(u'InternalName', u'record'),
36-
StringStruct(u'LegalCopyright', u'Piotr Jochymek 2023'),
36+
StringStruct(u'LegalCopyright', u'Piotr Jochymek 2023-2024'),
3737
StringStruct(u'OriginalFilename', u'record.exe'),
3838
StringStruct(u'ProductName', u'Librer-Record'),
3939
StringStruct(u'ProductVersion', u'VER_TO_REPLACE')])

src/version.py

+1-1
@@ -2,7 +2,7 @@
22

33
####################################################################################
44
#
5-
# Copyright (c) 2023 Piotr Jochymek
5+
# Copyright (c) 2023-2024 Piotr Jochymek
66
#
77
# MIT License
88
#
