Commit
Showing 8 changed files with 122 additions and 66 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,12 @@ | ||
Protosearch | ||
=========== | ||
|
||
Protosearch is a prototype search library under active development (hence the "proto"). | ||
We're currently focussing on end-to-end functionality, and not yet worrying too much about API stability or performance. | ||
Protosearch is pre-release software; do not use it in production. | ||
|
||
[Check out the site to learn more.][site] | ||
|
||
|
||
|
||
[site]: https://cozydev-pink.github.io/protosearch/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
Features | ||
======== | ||
|
||
Protosearch is a prototype search library aimed at providing advanced querying features and supporting multiple platforms (JVM, JS, Native). | ||
|
||
It supports full-text search features like keyword search, phrase search, multiple fields, boolean queries, and regular expressions. | ||
|
||
It currently targets static index scenarios such as: | ||
|
||
- Powering site documentation search | ||
- In memory search over immutable collections | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,26 @@ | ||
Design Goals | ||
============ | ||
|
||
## Goals | ||
|
||
- Provide building blocks for search on Typelevel sites | ||
- Enable indexing on JVM, searching in browser JS | ||
- Cross compile to JVM / JS / Native | ||
- Support full Lucene query syntax | ||
- Be safe, functional, and performant | ||
|
||
## Non Goals | ||
|
||
- Competing with or somehow surpassing Lucene | ||
- Being a distributed search like Elasticsearch | ||
- Heavy write workloads | ||
|
||
|
||
## Lucene Inspired | ||
|
||
It's worth calling out how [Lucene][lucene] inspired this library is. | ||
Lucene is an absolutely incredible piece of software. | ||
It has been optimized and extended by a large community for well over 20 years. | ||
If you are looking for very performant search, with a wide range of language support, flexibility and features, you won't find anything better than Lucene on the JVM. | ||
|
||
[lucene]: https://lucene.apache.org/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
laika.title = About Protosearch |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
Indexing Tutorial | ||
================= | ||
|
||
Let's set up a collection of books to search over: | ||
|
||
```scala mdoc:silent | ||
case class Book(author: String, title: String) | ||
|
||
val books: List[Book] = List( | ||
Book("Beatrix Potter", "The Tale of Peter Rabbit"), | ||
Book("Beatrix Potter", "The Tale of Two Bad Mice"), | ||
Book("Dr. Seuss", "One Fish, Two Fish, Red Fish, Blue Fish"), | ||
Book("Dr. Seuss", "Green Eggs and Ham"), | ||
) | ||
``` | ||
|
||
In order to index our domain type `Book`, we'll need a few things: | ||
- An `Analyzer` to convert strings of text into tokens. | ||
- `Field`s to tell the index what kind of data we want to store. | ||
- A way to get the values for each of the fields for a given `Book`. | ||
|
||
We'll pass all these things to an `IndexBuilder`: | ||
|
||
```scala mdoc:silent | ||
import pink.cozydev.protosearch.{Field, IndexBuilder} | ||
import pink.cozydev.protosearch.analysis.Analyzer | ||
|
||
val analyzer = Analyzer.default.withLowerCasing | ||
val indexBldr = IndexBuilder.of[Book]( | ||
(Field("title", analyzer, stored=true, indexed=true, positions=true), _.title), | ||
(Field("author", analyzer, stored=true, indexed=true, positions=false), _.author), | ||
) | ||
``` | ||
|
||
And then we can finally index our `books` using the builder: | ||
|
||
```scala mdoc:silent | ||
val index = indexBldr.fromList(books) | ||
``` | ||
|
||
Finally, we'll need a `search` function to try our queries with. | ||
We use a `queryAnalyzer` with the same default field here to make sure our queries get the same analysis as our documents did at indexing time. | ||
|
||
|
||
```scala mdoc:silent | ||
val qAnalyzer = index.queryAnalyzer | ||
|
||
def search(q: String): List[Book] = | ||
index.search(q) | ||
.map(hits => hits.map(h => books(h.id))) | ||
.fold(_ => Nil, identity) | ||
``` | ||
|
||
Now we can use our `search` function to explore some different query types! |
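
To make the role of the analyzer and the index more concrete, here is a minimal, self-contained sketch of the underlying idea. It does not use Protosearch at all: `ToyIndex`, `analyze`, `titleIndex`, and `lookup` are hypothetical names invented for this illustration, and the real library's analysis and index structures may well differ. The sketch lowercases and splits titles into tokens, builds a map from each token to the positions of the books containing it, and runs the query through the same analysis before looking it up.

```scala
object ToyIndex {
  final case class Book(author: String, title: String)

  val books: List[Book] = List(
    Book("Beatrix Potter", "The Tale of Peter Rabbit"),
    Book("Beatrix Potter", "The Tale of Two Bad Mice"),
    Book("Dr. Seuss", "One Fish, Two Fish, Red Fish, Blue Fish"),
    Book("Dr. Seuss", "Green Eggs and Ham")
  )

  // Hypothetical analyzer (an assumption, not Protosearch's Analyzer):
  // lowercase the text, then split on anything that is not a letter or digit.
  def analyze(text: String): List[String] =
    text.toLowerCase.split("[^a-z0-9]+").filter(_.nonEmpty).toList

  // Inverted index over titles: token -> positions of matching books in `books`.
  // `distinct` keeps a book from being listed twice for a repeated token.
  val titleIndex: Map[String, List[Int]] =
    books.zipWithIndex
      .flatMap { case (book, id) => analyze(book.title).distinct.map(token => token -> id) }
      .groupMap(_._1)(_._2)

  // Single-keyword lookup. The query is analyzed the same way as the titles,
  // so "Fish" and "fish" find the same books.
  def lookup(keyword: String): List[Book] =
    analyze(keyword).headOption
      .flatMap(titleIndex.get)
      .getOrElse(Nil)
      .map(id => books(id))
}
```

Calling `ToyIndex.lookup("Fish")` finds the Dr. Seuss title even though the index only holds lowercase tokens, which is why queries and documents need the same analysis.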
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
laika.title = Tutorial |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,8 @@ | ||
# Protosearch | ||
Protosearch | ||
=========== | ||
|
||
Protosearch is pre-alpha software, do not use in production. | ||
Protosearch is a prototype search library under active development (hence the "proto"). | ||
We're currently focussing on end-to-end functionality, and not yet worrying too much about API stability or performance. | ||
|
||
Protosearch is a prototype of a [Lucene][lucene]-style search library in pure Scala. | ||
|
||
|
||
## Goals | ||
|
||
- Provide building blocks for search on Typelevel sites | ||
- Enable indexing on JVM, searching in browser JS | ||
- Cross compile to JVM / JS / Native | ||
- Support full Lucene query syntax | ||
- Be safe, functional, and performant | ||
|
||
## Non Goals | ||
|
||
- Competing with or somehow surpassing Lucene | ||
- Being a distributed search like Elasticsearch | ||
- Heavy write workloads | ||
|
||
|
||
## Lucene Inspired | ||
|
||
It's worth calling out how [Lucene][lucene] inspired this library is. | ||
Lucene is an absolutely incredible piece of software. | ||
It has been optimized and extended by a large community for well over 20 years. | ||
If you are looking for very performant search, with a wide range of language support, flexibility and features, you won't find anything better than Lucene on the JVM. | ||
|
||
[lucene]: https://lucene.apache.org/ | ||
Learn more about Protosearch by reading about our [Features] or [Design Goals]. | ||
Additionally, you can follow the [Indexing Tutorial] to get up and running. |