Organize docs into directories
valencik committed Apr 6, 2024
1 parent 02a8239 commit ebf3ddf
Showing 8 changed files with 122 additions and 66 deletions.
1 change: 0 additions & 1 deletion README.md

This file was deleted.

12 changes: 12 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
Protosearch
===========

Protosearch is a prototype search library under active development (hence the "proto").
We're currently focusing on end-to-end functionality and not yet worrying much about API stability or performance.
Protosearch is pre-release software; do not use it in production.

[Check out the site to learn more.][site]



[site]: https://cozydev-pink.github.io/protosearch/
13 changes: 13 additions & 0 deletions docs/01-about-protosearch/01-features.md
@@ -0,0 +1,13 @@
Features
========

Protosearch is a prototype search library aimed at providing advanced querying features and supporting multiple platforms (JVM, JS, Native).

It supports full-text search features like keyword search, phrase search, multiple fields, boolean queries, and regular expressions.

It currently targets static index scenarios such as:

- Powering site documentation search
- In memory search over immutable collections


26 changes: 26 additions & 0 deletions docs/01-about-protosearch/02-design-goals.md
@@ -0,0 +1,26 @@
Design Goals
============

## Goals

- Provide building blocks for search on Typelevel sites
- Enable indexing on JVM, searching in browser JS
- Cross compile to JVM / JS / Native
- Support full Lucene query syntax
- Be safe, functional, and performant

## Non Goals

- Competing with or somehow surpassing Lucene
- Being a distributed search like Elasticsearch
- Heavy write workloads


## Lucene Inspired

It's worth calling out how heavily this library is inspired by [Lucene][lucene].
Lucene is an absolutely incredible piece of software.
It has been optimized and extended by a large community for well over 20 years.
If you are looking for very performant search, with a wide range of language support, flexibility and features, you won't find anything better than Lucene on the JVM.

[lucene]: https://lucene.apache.org/
1 change: 1 addition & 0 deletions docs/01-about-protosearch/directory.conf
@@ -0,0 +1 @@
laika.title = About Protosearch
54 changes: 54 additions & 0 deletions docs/02-tutorial/01-indexing.md
@@ -0,0 +1,54 @@
Indexing Tutorial
=================

Let's set up a collection of books to search over:

```scala mdoc:silent
case class Book(author: String, title: String)

val books: List[Book] = List(
  Book("Beatrix Potter", "The Tale of Peter Rabbit"),
  Book("Beatrix Potter", "The Tale of Two Bad Mice"),
  Book("Dr. Seuss", "One Fish, Two Fish, Red Fish, Blue Fish"),
  Book("Dr. Seuss", "Green Eggs and Ham"),
)
```

In order to index our domain type `Book`, we'll need a few things:
- An `Analyzer` to convert strings of text into tokens
- `Field`s to tell the index what kind of data we want to store
- A way to get the values for each of the fields for a given `Book`
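
To make the first requirement more concrete, here is a minimal sketch of what an analyzer does conceptually: it splits text into tokens and normalizes them. The `toyAnalyze` helper is hypothetical and written only for illustration. It is not Protosearch's `Analyzer`, and it ignores concerns such as stop words or stemming.

```scala
// Hypothetical stand-in for an analyzer; NOT the Protosearch implementation.
// Splits on runs of non-alphanumeric characters and lowercases each token.
def toyAnalyze(text: String): List[String] =
  text.split("[^\\p{Alnum}]+").toList.filter(_.nonEmpty).map(_.toLowerCase)

val tokens = toyAnalyze("The Tale of Peter Rabbit")
// tokens == List("the", "tale", "of", "peter", "rabbit")
```

The real `Analyzer` plays this role for every indexed field, which is why each `Field` below is constructed with one.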

We'll pass all these things to an `IndexBuilder`:

```scala mdoc:silent
import pink.cozydev.protosearch.{Field, IndexBuilder}
import pink.cozydev.protosearch.analysis.Analyzer

val analyzer = Analyzer.default.withLowerCasing
val indexBldr = IndexBuilder.of[Book](
  (Field("title", analyzer, stored=true, indexed=true, positions=true), _.title),
  (Field("author", analyzer, stored=true, indexed=true, positions=false), _.author),
)
```

Then we can index our `books` using the builder:

```scala mdoc:silent
val index = indexBldr.fromList(books)
```

Finally, we'll need a `search` function to try out queries.
We use a `queryAnalyzer` with the same default field here to make sure our queries get the same analysis as our documents did at indexing time.


```scala mdoc:silent
val qAnalyzer = index.queryAnalyzer

def search(q: String): List[Book] =
  index.search(q)
    .map(hits => hits.map(h => books(h.id)))
    .fold(_ => Nil, identity)
```
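
As an aside, the reason queries need the same analysis as the indexed documents can be shown with a toy inverted index in plain Scala. Everything in this sketch (`normalize`, `toyIndex`, and the lookups) is a hypothetical illustration and not part of the Protosearch API. It only assumes that index-time analysis lowercases tokens, as `withLowerCasing` suggests.

```scala
// Toy illustration only; none of these names come from Protosearch.
// Splits on whitespace and lowercases each token.
def normalize(text: String): List[String] =
  text.split("\\s+").toList.filter(_.nonEmpty).map(_.toLowerCase)

val titles = List("Green Eggs and Ham", "The Tale of Two Bad Mice")

// Map each lowercased token to the set of title positions that contain it.
val toyIndex: Map[String, Set[Int]] =
  titles.zipWithIndex
    .flatMap { case (title, id) => normalize(title).map(token => token -> id) }
    .groupBy(_._1)
    .map { case (token, pairs) => token -> pairs.map(_._2).toSet }

// The raw query term misses, because the index only holds lowercased tokens.
val rawHit = toyIndex.getOrElse("Eggs", Set.empty[Int])

// Running the query through the same normalization finds the title.
val analyzedHit =
  normalize("Eggs").flatMap(t => toyIndex.getOrElse(t, Set.empty[Int])).toSet
```

Here `rawHit` is empty while `analyzedHit` contains the position of "Green Eggs and Ham", which is the mismatch that shared analysis avoids.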

Now we can use our `search` function to explore some different query types!
46 changes: 9 additions & 37 deletions docs/queries.md → docs/02-tutorial/02-querying.md
@@ -1,52 +1,24 @@
# Queries
Querying
========

Protosearch supports queries using boolean logic and a variety of advanced term queries.

## Setup

Let's set up a collection of books to search over:
We'll quickly set up the same index from the [Indexing Tutorial]:

```scala mdoc:silent
case class Book(author: String, title: String)
import pink.cozydev.protosearch.{Field, IndexBuilder}
import pink.cozydev.protosearch.analysis.Analyzer

case class Book(author: String, title: String)
val books: List[Book] = List(
Book("Beatrix Potter", "The Tale of Peter Rabbit"),
Book("Beatrix Potter", "The Tale of Two Bad Mice"),
Book("Dr. Seuss", "One Fish, Two Fish, Red Fish, Blue Fish"),
Book("Dr. Seuss", "Green Eggs and Ham"),
)
```

In order to index our domain type `Book`, we'll need a few things:
- An `Analyzer` to convert strings of text into tokens.
- `Field`s to tell the index what kind of data we want to store
- A way to get the values for each of the fields for a given `Book`

We'll pass all these things to an `IndexBuilder`:

```scala mdoc:silent
import pink.cozydev.protosearch.{Field, IndexBuilder}
import pink.cozydev.protosearch.analysis.Analyzer
Book("Dr. Seuss", "Green Eggs and Ham"))

val analyzer = Analyzer.default.withLowerCasing
val indexBldr = IndexBuilder.of[Book](
val index = IndexBuilder.of[Book](
(Field("title", analyzer, stored=true, indexed=true, positions=true), _.title),
(Field("author", analyzer, stored=true, indexed=true, positions=false), _.author),
)
```

And then we can finally index our `books` using the builder:

```scala mdoc:silent
val index = indexBldr.fromList(books)
```

Finally we'll then need a `search` function to test out.
We use a `queryAnalyzer` with the same default field here to make sure our queries get the same analysis as our documents did at indexing time.


```scala mdoc:silent
val qAnalyzer = index.queryAnalyzer
).fromList(books)

def search(q: String): List[Book] =
index.search(q)
1 change: 1 addition & 0 deletions docs/02-tutorial/directory.conf
@@ -0,0 +1 @@
laika.title = Tutorial
34 changes: 6 additions & 28 deletions docs/index.md
@@ -1,30 +1,8 @@
# Protosearch
Protosearch
===========

Protosearch is pre-alpha software, do not use in production.
Protosearch is a prototype search library under active development (hence the "proto").
We're currently focusing on end-to-end functionality and not yet worrying much about API stability or performance.

Protosearch is a prototype of a [Lucene][lucene] style search library in pure scala.


## Goals

- Provide building blocks for search on Typelevel sites
- Enable indexing on JVM, searching in browser JS
- Cross compile to JVM / JS / Native
- Support full Lucene query syntax
- Be safe, functional, and performant

## Non Goals

- Competing with or somehow surpassing Lucene
- Being a distributed search like Elasticsearch
- Heavy write workloads


## Lucene Inspired

It's worth calling out how [Lucene][lucene] inspired this library is.
Lucene is an absolutely incredible piece of software.
It has been optimized and extended by a large community for well over 20 years.
If you are looking for very performant search, with a wide range of language support, flexibility and features, you won't find anything better than Lucene on the JVM.

[lucene]: https://lucene.apache.org/
Learn more about Protosearch by reading about our [Features] or [Design Goals].
Additionally, you can follow the [Indexing Tutorial] to get up and running.
