Cleanup files.
JanWielemaker committed Feb 28, 2024
1 parent cf2fcc9 commit 7645461
Showing 7 changed files with 215 additions and 1 deletion.
5 changes: 5 additions & 0 deletions .gitignore
@@ -1,3 +1,4 @@
Keep
node_modules
node_modules
swish-bower-components.zip
@@ -14,3 +15,7 @@ https
web/icons/noble
TAGS
yarn.lock
.yarn-senitel
web/js/swish-min-new.js
web/js/swish-min-new.js.map
swish-node-modules.zip
2 changes: 1 addition & 1 deletion Makefile
@@ -61,7 +61,7 @@ $(YARN_ARCHIVE)::
upload::
rm -f $(YARN_ARCHIVE)
zip -r $(YARN_ARCHIVE) web/node_modules
rsync $(YARN_ARCHIVE) ops:/home/swipl/web/download/swish/$(YARN_ARCHIVE)
rsync $(YARN_ARCHIVE) plweb@oehoe:srv/plweb/data/download/swish/$(YARN_ARCHIVE)


################
61 changes: 61 additions & 0 deletions doc/DS-Blog.md
@@ -0,0 +1,61 @@
# Introducing SWISH DataLab

The SWISH DataLab addresses one of the main bottlenecks of data science:
bringing data from different sources together, then cleaning and
selecting it. Most pipelines use a general-purpose programming language
such as Python to clean and ingest the data into a linked data store or
RDBMS, after which the relevant data is selected and applicable machine
learning is applied. In contrast, SWISH data management is based on
Prolog, a _relational_ and _logic_-based language. External data sources
such as RDBMS systems, Linked Data, CSV files, XML files, JSON, etc. are
made available through a mixture of _adaptors_, which expose the data in
Prolog's relational model without transferring it, and _ingestion_,
which loads the data into Prolog.

Subsequently, declarative rules are stated to define a clean and
coherent view on the data that is targeted towards analysing this data.
Due to the logic basis of Prolog this view is modular, concise and
declarative, making it easy to maintain. SWI-Prolog's _tabling_
extension provides the same termination properties as DataLog as well as
the same order independence of rules within the subset Prolog shares
with DataLog. Tabling also provides _caching_ of results. At the same
time, users have access to the more general Prolog language to code
transformations that are not supported by DataLog.

SWISH unites [SWI-Prolog](https://www.swi-prolog.org) and
[R](https://www.r-project.org/) behind a web-based IDE that
resembles [Jupyter](https://jupyter.org/) notebooks. This platform can
be deployed on your laptop as well as on a server. The platform allows
multiple data scientists to work on the same data simultaneously, while
rule sets can be reused and shared between users. This notably allows
technical people to provide more complicated data transformation steps
to domain experts. The platform can be configured to allow both
authenticated users and anonymous users with limited access rights.
Notebooks and programs are stored in a Git-like repository and fully
versioned. It is possible to create a snapshot of a query and all
relevant programs for reliable reproduction of results. Data views
defined in SWISH may be downloaded as CSV and can be accessed through a
web-based API.

Using Prolog for data integration, cleaning and modelling started life
as a valorisation project within [COMMIT/](https://www.commit-nl.nl/). A
web-enabled version of SWI-Prolog was pioneered by [Torbjörn
Lager](https://www.gu.se/english/about_the_university/staff/?languageId=100001&userId=xlagto).
The combination of Prolog and R was pioneered by Nicos Angelopoulos
at the NKI (Netherlands Cancer Institute) in the life sciences domain.
SWISH is in use at CWI to analyse user behaviour based on HTTP log data
from the Dutch national library (Koninklijke Bibliotheek). Samer
Abdallah (University College London) uses SWISH for analysing music. The
core of SWISH is under active development and heavily tested as a shared
Prolog teaching environment.

Useful links:

- [Download SWISH from GitHub](https://github.com/SWI-Prolog/swish)
- [SWISH and R for Docker](https://hub.docker.com/u/swipl)
- [SWISH for Prolog teaching](https://swish.swi-prolog.org)
- [SWISH DataLab: A Web Interface for Data Exploration and Analysis,
BNAIC 2016](https://www.springerprofessional.de/en/swish-datalab-a-web-interface-for-data-exploration-and-analysis/15059986)



104 changes: 104 additions & 0 deletions doc/REPL.md
@@ -0,0 +1,104 @@
# Clustered SWISH

## Syncing the gitty store

The gitty store is a directed graph of commits. Each commit is linked to
a _data object_. Both commits and data objects are hashed by content and
read-only. This implies they are easily replicated over the network. The
replication takes two forms:

- A node may _announce_ an object by sending the object's content as
  a series of chunks.
- A node may _request_ an entire object or a missing object
  chunk. Receiving nodes that have the object will broadcast the
  missing object or chunk.
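
As a rough illustration of _announce_ and _request_, the sketch below splits a content-hashed object into chunks and lets a receiver report which chunks it still lacks. The SHA-1 hash, the chunk size, the message tuple layout and the `Receiver` class are all assumptions made for this example; they are not taken from the gitty implementation.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: deliberately tiny so the example yields several chunks


def object_hash(data: bytes) -> str:
    """Identify an object by its content (SHA-1 is an assumption)."""
    return hashlib.sha1(data).hexdigest()


def announce(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield one (hash, index, total, chunk) message per chunk of the object."""
    digest = object_hash(data)
    chunks = [data[i:i + chunk_size]
              for i in range(0, len(data), chunk_size)] or [b""]
    for index, chunk in enumerate(chunks):
        yield digest, index, len(chunks), chunk


class Receiver:
    """Hypothetical receiving node that assembles announced objects."""

    def __init__(self):
        self.store = {}    # hash -> complete, verified object
        self.partial = {}  # hash -> {index: chunk} for incomplete objects
        self.totals = {}   # hash -> announced number of chunks

    def receive(self, digest, index, total, chunk):
        if digest in self.store or not 0 <= index < total:
            return  # objects are read-only, so duplicates can be ignored
        self.totals[digest] = total
        chunks = self.partial.setdefault(digest, {})
        chunks[index] = chunk
        if len(chunks) == total:
            data = b"".join(chunks[i] for i in range(total))
            del self.partial[digest]
            if object_hash(data) == digest:  # accept only matching content
                self.store[digest] = data

    def missing(self, digest):
        """Chunk indexes to request; None means the whole object is unknown."""
        if digest in self.store:
            return []
        if digest not in self.totals:
            return None
        have = self.partial.get(digest, {})
        return [i for i in range(self.totals[digest]) if i not in have]
```

Because objects are identified by their content hash, a receiver can verify an assembled object without trusting the sender, which is what makes replication straightforward.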

The real problem is updating the _head pointer_: the central database
that defines the latest version of a file with a given name. This notion
must be synchronised across the cluster, which is implemented as follows:

- A node asks the cluster for their current head.
  - If all nodes agree on the current head we are done, but some
    nodes may not have the indicated file.
    - If some nodes have no head, _announce_ the head
  - Else
    - Ask all nodes to produce a backward path of commits that
      includes all reported heads from the other nodes.
    - Work out the last common hash, possibly by majority vote.
    - Work out the changes since this common hash.
      - If nodes agree or have no info, fine
      - If nodes disagree, go with the majority.
    - Propose the new head to all nodes that agreed on the majority
      path. These nodes will _accept_ if nothing changed since their
      report, blocking further changes for a specified time.
    - If all accept, send a new head notion. Else restart from the
      beginning.
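
The "last common hash" and "go with the majority" steps can be sketched as follows. This is only one possible reading of the outline above: the input format (each node reporting its backward path as a list of commit hashes, newest first, with an empty list meaning "no info") and both function names are assumptions for illustration.

```python
from collections import Counter


def last_common_hash(paths):
    """Return the newest commit present in every non-empty reported path.

    paths maps a node name to its backward path of commit hashes,
    newest first. Nodes without info (empty path) are ignored.
    """
    reported = [path for path in paths.values() if path]
    if not reported:
        return None
    common = set(reported[0]).intersection(*map(set, reported[1:]))
    for commit in reported[0]:  # walk from newest to oldest
        if commit in common:
            return commit
    return None


def majority_head(paths):
    """Return (head, supporters) for the most frequently reported head."""
    heads = Counter(path[0] for path in paths.values() if path)
    if not heads:
        return None, []
    head, _count = heads.most_common(1)[0]
    supporters = sorted(node for node, path in paths.items()
                        if path and path[0] == head)
    return head, supporters
```

The supporters returned by `majority_head` correspond to the nodes that would receive the proposal for the new head; the accept and restart steps are not modelled here.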

The above deals with a live cluster. Nodes that have missed a
conversation or joined the network later may miss a file or the latest
version of a file.

## Remote syncing

Remote syncing is necessary for both new cluster members and for cluster
members that have been offline for some time.

- Find the node with most changes using a request.
- Ask this node to start the process.
- Each cluster member checks that it has the change. If not, it starts
  a negotiation using `gitty_remote_head/2`.

## Profile management and login

FIXME

Remote sync of library(persistency)?

- Realise a distributed ledger of changes.
- Apply these.


- Add serial to each event
- Broadcast them
- Adding an event
- Propos


## Email notifications

FIXME

## Chat subsystem

### Maintain a global overview of visitor count

Visitor change messages carry a `local_visitors` and a `visitors` field
and are relayed. Nodes receiving such a message use `local_visitors` to
update their count of visitors on the sending node. Nodes composing such
a message count the local visitors and add the known totals from the
other nodes.
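
A minimal sketch of this bookkeeping is shown below. The `node` field used to identify the sender and the `VisitorCounter` class are assumptions for the example; only `local_visitors` and `visitors` come from the description above.

```python
class VisitorCounter:
    """Hypothetical per-node view of the cluster-wide visitor count."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.local = 0     # visitors connected to this node
        self.remote = {}   # other node id -> its last reported local_visitors

    def compose(self):
        """Build a visitor change message for this node."""
        return {
            "node": self.node_id,  # assumption: sender is named in the message
            "local_visitors": self.local,
            "visitors": self.local + sum(self.remote.values()),
        }

    def receive(self, message):
        """Record the sender's local count; our own relayed messages are ignored."""
        if message["node"] != self.node_id:
            self.remote[message["node"]] = message["local_visitors"]
```

Each node stores only the last reported local count per peer, so a repeated or relayed message overwrites the previous value instead of being counted twice.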

### Subscribed files

A WSID joining a file, leaving a file or logging out is broadcast, and
each node maintains a view of the remote users by WSID.

FIXME: need to deal with joining nodes and missed updates.

### Profile changes

Profile changes, logins and logouts are sent to all nodes, and each node
sends them to the browsers that have the WSID watching some file.

### Chat syncing

- Find the last message of all nodes for DocID.
  - If Serial-ID matches, we are done
  - Else
    - Ask each node for the history as chat(Serial,ID,Time) triples.
    - Assess agreement (= no info or same)
      - If all agree, send a sync request for the serial range that
        is not known everywhere.
      - Else, send an agreement _serial_ and a list of Serial-ID
        pairs constructed from a chronologically ordered list of
        chat messages about which there is no agreement.
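
The agreement step might look like the sketch below. It is an interpretation rather than the actual implementation: it assumes serials are consecutive integers, that every message after the first disputed serial is renumbered in time order, and that the function name and return shapes are free to choose.

```python
def assess(histories):
    """Compare chat histories reported by the nodes for one document.

    histories maps a node name to a list of (serial, msg_id, time) triples.
    Returns ("sync", serials) when nodes agree but some lack messages, or
    ("renumber", agreed_serial, pairs) when nodes disagree on a serial.
    """
    by_serial = {}
    for node, history in histories.items():
        for serial, msg_id, time in history:
            by_serial.setdefault(serial, {})[node] = (msg_id, time)

    # A serial is disputed when two nodes attach different IDs to it.
    disputed = sorted(serial for serial, seen in by_serial.items()
                      if len({msg_id for msg_id, _ in seen.values()}) > 1)
    if not disputed:
        missing = sorted(serial for serial, seen in by_serial.items()
                         if len(seen) < len(histories))
        return ("sync", missing)

    agreed = disputed[0] - 1  # assumption: serials are consecutive integers
    pending = {}              # msg_id -> time for all messages after `agreed`
    for serial, seen in by_serial.items():
        if serial > agreed:
            for msg_id, time in seen.values():
                pending[msg_id] = time
    ordered = sorted(pending, key=lambda msg_id: (pending[msg_id], msg_id))
    pairs = [(agreed + 1 + i, msg_id) for i, msg_id in enumerate(ordered)]
    return ("renumber", agreed, pairs)
```

Ordering by time, with the ID as a tie breaker, gives every node the same renumbering without further negotiation.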
5 changes: 5 additions & 0 deletions doc/Redis.md
@@ -0,0 +1,5 @@
# Running SWISH using Redis

## Background

- https://docs.gitlab.com/ee/administration/redis/replication_and_failover_external.html
File renamed without changes.
39 changes: 39 additions & 0 deletions doc/impact.md
@@ -0,0 +1,39 @@
- Usage for swish.swi-prolog.org
  - Period: Oct 29 2017 - Nov 26 2017
  - Visitors: 41433
  - Unique visitors: 15375 (based on IP)
  - Queries: 738498
- Community:
  - Google (Feb 7, 2018)
    - "link:swish.swi-prolog.org": 9010 results
    - SWISH Prolog: 26,800 results
  - GitHub: 6 contributors, 226 stars, 55 forks
  - Docker:
    - swipl/swish: 121 pulls
    - swipl/rserve: 43 pulls (R docker for use with SWISH)
- Commercial use
  - Simularity (http://simularity.com/, satellite image analysis)
- Public sites running SWISH with extended versions of Prolog
  - http://cplint.ml.unife.it/
    Machine learning and R support
  - http://lpsdemo.interprolog.com/
    "LPS is a logic and computer language for representing the thoughts
    and for controlling the behaviour of an intelligent machine situated
    in a changing world."
- Publications
  - Torbjörn Lager, Jan Wielemaker:
    Pengines: Web Logic Programming Made Easy. TPLP 14
  - Jan Wielemaker, Torbjörn Lager, Fabrizio Riguzzi:
    SWISH: SWI-Prolog for Sharing. IULP 2015. Extended version submitted
    to TPLP (Theory and Practice of Logic Programming journal).
  - Veruska Zamborlini, Jan Wielemaker, Marcos Da Silveira, Cédric Pruski,
    Annette ten Teije, Frank van Harmelen: SWISH for Prototyping Clinical
    Guideline Interactions Theory. SWAT4LS 2016
  - Wouter Beek, Jan Wielemaker:
    SWISH: An Integrated Semantic Web Notebook. International
    Semantic Web Conference (Posters & Demos) 2016
  - Tessel Bogaard, Jan Wielemaker, Laura Hollink, Jacco van Ossenbruggen:
    SWISH DataLab: A Web Interface for Data Exploration and Analysis. BNAIC 2016
  - Marco Alberti, Elena Bellodi, Giuseppe Cota, Fabrizio Riguzzi,
    Riccardo Zese: cplint on SWISH: Probabilistic Logical Inference
    with a Web Browser. Intelligenza Artificiale
