ragno

A crawler for extracting information from web domains.

Usage

Copy the sample configuration file and edit it:

$ cp ragno-sample.edn ragno.edn

Invoke a library API function from the command-line:

$ clojure -X net.clojars.matteoredaelli.ragno/cli :urlfile \"urls.csv\" :config-file \"ragno.edn\"

Using Redis, Apache Kvrocks, or Snap KeyDB:

$ clojure -X net.clojars.matteoredaelli.redis/cli :config-file \"ragno.edn\"

$ redis-cli publish ragno "https://www.redaelli.org/"
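To publish many URLs rather than one, the redis-cli command above can be wrapped in a loop. The following is only a sketch: publish_urls is a hypothetical helper name, and it assumes an input file with one URL per line (the actual layout of urls.csv is not described in this README). The REDIS_CLI variable is also an assumption, added so the loop can be dry-run with echo.

```shell
#!/bin/sh
# Hypothetical helper: publish every URL in a file to a Redis channel.
# Assumes one URL per line; blank lines are skipped.
# Usage: publish_urls FILE [CHANNEL]   (CHANNEL defaults to "ragno")
publish_urls() {
    urlfile="$1"
    channel="${2:-ragno}"
    # The "|| [ -n ... ]" part handles a last line without a trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue
        # </dev/null keeps the command from consuming the loop's input.
        ${REDIS_CLI:-redis-cli} publish "$channel" "$url" < /dev/null
    done < "$urlfile"
}
```

A dry run such as REDIS_CLI=echo publish_urls urls.txt prints the arguments that would be passed to redis-cli without contacting a server.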

Run the project's tests (they'll fail until you edit them):

$ clojure -T:build test

Run the project's CI pipeline and build a JAR (this will fail until you edit the tests to pass):

$ clojure -T:build ci

This produces an updated pom.xml with synchronized dependencies in the META-INF directory under target/classes, and the JAR in target. To change the version (and SCM tag) in the generated pom.xml, update build.clj.

Install it locally (requires the ci task be run first):

$ clojure -T:build install

Deploy it to Clojars (requires the CLOJARS_USERNAME and CLOJARS_PASSWORD environment variables, and the ci task to have been run first):

$ clojure -T:build deploy

Your library will be deployed to net.clojars.matteoredaelli/ragno on clojars.org by default.
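Because the deploy step depends on two environment variables, a small pre-flight check can fail early when either is unset. This is a sketch only: check_clojars_env is a hypothetical helper, not part of this project or of the build tooling.

```shell
#!/bin/sh
# Hypothetical pre-flight check: report any missing Clojars credentials.
# Returns 0 when both variables are set and non-empty, 1 otherwise.
check_clojars_env() {
    missing=0
    for name in CLOJARS_USERNAME CLOJARS_PASSWORD; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example chain (assumes the clojure CLI is on PATH):
# check_clojars_env && clojure -T:build ci && clojure -T:build deploy
```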

TODO

[ ] error with https://sam-rogers.com

ROADMAP

Evaluating parallel batch processing

[ ] with https://github.com/nilenso/goose

[ ] with slurm

License

Distributed under the Eclipse Public License version 2.0.
