SOLR-15242: Consolidate README.md with solr/README.md #610

Merged
merged 33 commits into from
Jul 20, 2022
Changes from 11 commits
Commits
da173d3
strawperson version of potential new README.md at the top level, repl…
Feb 8, 2022
9cf87bd
we appear to still be on freenode, not librera
Feb 8, 2022
b67ad39
we have moved to libera.chat
Feb 10, 2022
2a2a32a
Update README_v2.md
epugh Feb 10, 2022
0911f87
updated to latest openjdk
Feb 10, 2022
dfa3089
use consistent descriptions for examples across README and help output
Feb 10, 2022
bd29528
remove fixme
Feb 10, 2022
6877151
experimenting with adding more detailed docs
Feb 10, 2022
e7b5af3
Merge branch 'main' into SOLR-15242
epugh Jul 10, 2022
8426d43
respond to comments
epugh Jul 10, 2022
cf224ad
tabs to spaces
epugh Jul 10, 2022
cf6e6e8
skelaton
epugh Jul 14, 2022
3024f20
bring in some links
epugh Jul 14, 2022
160b000
a take of moving content to the dev-docs
epugh Jul 14, 2022
b43af2e
finish firt draft content
epugh Jul 14, 2022
f299706
darn you markdown formatting
epugh Jul 14, 2022
5825539
port light weight tutorial into ref guide
epugh Jul 14, 2022
36e2fdc
link in tutorial
epugh Jul 14, 2022
28bdc78
better header
epugh Jul 14, 2022
a31f301
better titlte
epugh Jul 14, 2022
f9d6535
README set up for inclusion in our distributions
epugh Jul 19, 2022
950b5a3
typo
epugh Jul 19, 2022
657c5c2
finalizing the top level readme
epugh Jul 19, 2022
7bc921f
Revamp!
epugh Jul 19, 2022
27d85c4
finalized our top level README
epugh Jul 19, 2022
a465efe
Provide developer information on sub directories.
epugh Jul 19, 2022
4c35ef4
comments are hard
epugh Jul 19, 2022
e90cec1
Update README.adoc
epugh Jul 19, 2022
e79a432
Update README.adoc
epugh Jul 19, 2022
67c3f0a
fixing links
epugh Jul 19, 2022
ee66008
Fix links
HoustonPutman Jul 19, 2022
7c93dae
Package new solr binary readme
HoustonPutman Jul 19, 2022
c0120a2
Merge branch 'main' into SOLR-15242
HoustonPutman Jul 19, 2022
307 changes: 307 additions & 0 deletions README_v2.md
@@ -0,0 +1,307 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Welcome to the Apache Solr project!
-----------------------------------

Solr is the popular, blazing fast open source enterprise search platform
Contributor

> blazing fast open source enterprise search platform

[0] It feels weird to single out enterprise search here as being our main (only?) use case. We're also an ecommerce search engine, and an analytics engine, etc.

(I understand this language is probably carried over from one of the existing READMEs, and used in many places...no need to fix it here necessarily if you're not interested.)

Contributor Author

I completely agree with you @gerlowskija on this... It IS front and center on the website at https://solr.apache.org/ ;-(. Worth a separate ticket?

Solr is the popular, blazing fast open source search platform for your enterprise, ecommerce, and analytics needs

???

Contributor

Agreed, we should fix it everywhere. I changed the Official docker image description to remove it. But we should standardize around a common language, and set that everywhere.

Contributor Author

@HoustonPutman you want to open a ticket? If you get the consensus, I'd volunteer to do the updates ;-)

written in Java and using [Apache Lucene](https://lucene.apache.org/).

For a complete description of the Solr project, team composition, source
code repositories, and other details, please see the Solr web site at
https://solr.apache.org/

## Online Documentation

This README file only contains basic instructions. For comprehensive documentation,
visit the [Solr Reference Guide](https://solr.apache.org/guide/).

## Getting Started
FIXME maybe put the tutorial that Noble or Ishan wrote??? Then at the end put the examples?
Contributor

[Q] Alternatively - do we really want a full blown tutorial in our README? At best, it duplicates documentation we have elsewhere. At worst it makes it harder to find the information the reader actually came for.

Contributor

I agree that we shouldn't have a full-blown tutorial here. We should have links to our actual tutorials, and enhance them if need be.

If we do keep a tutorial here, let's put other stuff before it, like the docker and kubernetes sections. These are smaller and are really just advertising other documentation/sites that people should be checking out.


### Starting Solr

Start a Solr node in cluster mode (SolrCloud mode)

```
bin/solr -c
```

To start another Solr node and have it join the cluster alongside the first node,

```
bin/solr -c -z localhost:9983 -p 8984
```

An instance of the cluster coordination service, i.e. ZooKeeper, was started on port 9983 when the first node was started. To start ZooKeeper separately, please refer to XXXX.
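
As a small sketch of where port 9983 comes from: the embedded ZooKeeper instance listens on the Solr port plus 1000. The helper name `embedded_zk_port` below is hypothetical and only illustrates that arithmetic; it is not part of the Solr scripts.

```shell
# Hypothetical helper: derive the embedded ZooKeeper port from a Solr port.
# The embedded ZooKeeper listens on the Solr port plus 1000 (8983 -> 9983).
embedded_zk_port() {
  echo $(( $1 + 1000 ))
}

# Print the command that joins a second node (port 8984) to the first node's cluster.
echo "bin/solr -c -z localhost:$(embedded_zk_port 8983) -p 8984"
```

If the first node is started on a non-default port, the `-z` argument for later nodes would change accordingly.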

### Creating a collection

Just as a database system holds data in tables, Solr holds data in collections. A collection can be created as follows:

```
curl --request POST \
--url http://localhost:8983/api/collections \
Contributor

[+1] Thanks for the effort to use the v2 APIs!

Contributor

[-0] If we are trying to fixup the v2 APIs before marketing them, let's unfortunately continue to use the v1 APIs until they are ready.

(If we plan on keeping the tutorial section)

Contributor

> If we are trying to fixup the v2 APIs

I guess that makes sense, unfortunately. I'd love to get these APIs mostly locked in as soon as we can though, so that we can start pushing them. On that note: anyone reading this should go review the proposed changes to the v2 API here 😛

--header 'Content-Type: application/json' \
--data '{
"create": {
"name": "techproducts",
"numShards": 1,
"replicationFactor": 1
}
}'
```
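
For comparison, the same collection could probably be created through the v1 Collections API, which the review discussion suggests may be preferred until the v2 APIs settle. This is a sketch only: the helper name `v1_create_collection_url` is hypothetical, and the host and port are assumed to match the examples above.

```shell
# Hypothetical helper: build a v1 Collections API CREATE URL.
# $1 = collection name, $2 = numShards, $3 = replicationFactor
v1_create_collection_url() {
  printf 'http://localhost:8983/solr/admin/collections?action=CREATE&name=%s&numShards=%s&replicationFactor=%s\n' "$1" "$2" "$3"
}

# Print the URL. Against a running SolrCloud node it could be sent with:
#   curl "$(v1_create_collection_url techproducts 1 1)"
v1_create_collection_url techproducts 1 1
```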

### Defining a schema

Let us define some of the fields that our documents will contain.

```
curl --request POST \
--url http://localhost:8983/api/collections/techproducts/schema \
--header 'Content-Type: application/json' \
--data '{
"add-field": [
{"name": "name", "type": "text_general", "multiValued": false},
{"name": "cat", "type": "string", "multiValued": true},
{"name": "manu", "type": "string"},
{"name": "features", "type": "text_general", "multiValued": true},
{"name": "weight", "type": "pfloat"},
{"name": "price", "type": "pfloat"},
{"name": "popularity", "type": "pint"},
{"name": "inStock", "type": "boolean", "stored": true},
{"name": "store", "type": "location"}
]
}'
```

### Indexing documents

A single document can be indexed as:

```
curl --request POST \
--url 'http://localhost:8983/api/collections/techproducts/update' \
--header 'Content-Type: application/json' \
--data ' {
"id" : "978-0641723445",
"cat" : ["book","hardcover"],
"name" : "The Lightning Thief",
"author" : "Rick Riordan",
"series_t" : "Percy Jackson and the Olympians",
"sequence_i" : 1,
"genre_s" : "fantasy",
"inStock" : true,
"price" : 12.50,
"pages_i" : 384
}'
```

Multiple documents can be indexed in the same request:
```
curl --request POST \
--url 'http://localhost:8983/api/collections/techproducts/update' \
--header 'Content-Type: application/json' \
--data ' [
{
"id" : "978-0641723445",
"cat" : ["book","hardcover"],
"name" : "The Lightning Thief",
"author" : "Rick Riordan",
"series_t" : "Percy Jackson and the Olympians",
"sequence_i" : 1,
"genre_s" : "fantasy",
"inStock" : true,
"price" : 12.50,
"pages_i" : 384
}
,
{
"id" : "978-1423103349",
"cat" : ["book","paperback"],
"name" : "The Sea of Monsters",
"author" : "Rick Riordan",
"series_t" : "Percy Jackson and the Olympians",
"sequence_i" : 2,
"genre_s" : "fantasy",
"inStock" : true,
"price" : 6.49,
"pages_i" : 304
}
]'
```

A file containing the documents can be indexed as follows:
```
curl -H "Content-Type: application/json" \
-X POST \
-d @example/products.json \
--url 'http://localhost:8983/api/collections/techproducts/update?commit=true'
```

### Commit
After documents are indexed into a collection, they are not immediately available for searching. To make them searchable, a commit operation (called `refresh` in some other search engines, such as OpenSearch) is needed. Commits can be scheduled at periodic intervals using auto-commit, as follows.

```
curl -X POST -H 'Content-type: application/json' \
-d '{"set-property":{"updateHandler.autoCommit.maxTime":15000}}' \
http://localhost:8983/api/collections/techproducts/config
```
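
The `maxTime` value above is in milliseconds, so 15000 means a commit roughly every 15 seconds. A minimal sketch of building that payload from an interval in seconds follows; the helper name `autocommit_payload` is hypothetical and not part of Solr.

```shell
# Hypothetical helper: build the set-property payload for auto-commit.
# $1 = commit interval in seconds; the property is expressed in milliseconds.
autocommit_payload() {
  printf '{"set-property":{"updateHandler.autoCommit.maxTime":%d}}\n' $(( $1 * 1000 ))
}

# Print the payload for a 15 second interval. It could be posted with:
#   curl -X POST -H 'Content-type: application/json' \
#     -d "$(autocommit_payload 15)" \
#     http://localhost:8983/api/collections/techproducts/config
autocommit_payload 15
```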

### Basic search queries
FIXME
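
As a hedged illustration of what a basic query might look like, the sketch below builds a URL for the v1 `/select` request handler with a `q` parameter. It assumes the `techproducts` collection and the documents indexed in the earlier steps exist; the helper name `select_url` is hypothetical.

```shell
# Hypothetical helper: build a query URL for the v1 /select request handler.
# $1 = collection name, $2 = query in field:value form
select_url() {
  printf 'http://localhost:8983/solr/%s/select?q=%s\n' "$1" "$2"
}

# Print a query on the "name" field. Against a running node it could be sent with:
#   curl "$(select_url techproducts 'name:lightning')"
# Queries containing spaces or special characters need URL encoding, for example:
#   curl -G --data-urlencode 'q=name:"sea of monsters"' \
#     http://localhost:8983/solr/techproducts/select
select_url techproducts 'name:lightning'
```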

### Solr Examples

Solr includes a few examples to help you get started. To run a specific example, enter:

```
bin/solr -e <EXAMPLE> where <EXAMPLE> is one of:
cloud: SolrCloud example
techproducts: Comprehensive example illustrating many of Solr's core capabilities
schemaless: Schema-less example (schema is inferred from data during indexing)
films: Example of starting with _default configset and adding explicit fields dynamically
```

For instance, if you want to run the techproducts example, enter:

```
bin/solr -e techproducts
```
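
A wrapper script could check the example name before calling `bin/solr`. The sketch below hard-codes the four names listed above; the function name `is_known_example` is hypothetical and the list would need updating if the bundled examples change.

```shell
# Hypothetical helper: succeed only for the example names listed above.
is_known_example() {
  case "$1" in
    cloud|techproducts|schemaless|films) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the command to run if the requested example is known.
example="techproducts"
if is_known_example "$example"; then
  echo "bin/solr -e $example"
else
  echo "unknown example: $example" >&2
fi
```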

### Running Solr in Docker

You can run Solr in Docker via the [official image](https://hub.docker.com/_/solr).

To run Solr in a container and expose the Solr port, run:

`docker run -p 8983:8983 solr`

In order to start Solr in cloud mode, run the following.

`docker run -p 8983:8983 solr solr-fg -c`

For documentation on using the official docker builds, please refer to the [DockerHub page](https://hub.docker.com/_/solr).
Up to date documentation for running locally built images of this branch can be found in the [local reference guide](solr/solr-ref-guide/src/running-solr-in-docker.adoc).

There is also a gradle task for building custom Solr images from your local checkout.
Contributor

Perhaps add a link to dev-docs/docker.adoc - eeh, no, wait that does not exist, I meant help/docker.txt -- eh, sorry, wrong again, it should be solr/docker/gradle-help.txt (why on earth do we have dev-docs spread out this much?) so we don't need to say so much about building Docker locally in the main README.

Hmm, looking a few lines down, I now see that ./gradlew helpDocker actually points to that txt file, but that is not consistent either, would expect all help files to exist in help/.. Well, well.

Contributor Author

I'm going to take this as "it's okay" for now then on this comment? Quite agree on your frustration!

Contributor

Sorry for that placement, there was discussion at the time, but it is long forgotten. I'd be happy to move it to wherever it will make the most sense 🙂

These local images are built identically to the official image except for retrieving the Solr artifacts locally instead of from the official release.
This can be useful for testing out local changes as well as creating custom images for yourself or your organization.
The task will output the image name to use at the end of the build.

`./gradlew docker`

For more info on building an image, run:

`./gradlew helpDocker`

### Running Solr on Kubernetes

Solr has official support for running on Kubernetes, in the official Docker image.
Please refer to the [Solr Operator](https://solr.apache.org/operator) home for details, tutorials and instructions.

## Building Solr from Source
Download the Java 11 JDK (Java Development Kit) or later. We recommend the OpenJDK
distribution Eclipse Temurin available from https://adoptium.net/.
You will need the JDK installed, and the $JAVA_HOME/bin (Windows: %JAVA_HOME%\bin)
folder included on your command path. To test this, issue a "java -version" command
from your shell (command prompt) and verify that the Java version is 11 or later.
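
The version check can also be scripted. The following is a minimal sketch, assuming the first line of `java -version` output has the usual `... version "X.Y.Z" ...` shape; the helper name `java_major_version` is hypothetical, and the sample string stands in for real output.

```shell
# Hypothetical helper: extract the major Java version from the first line
# of `java -version` output, e.g. openjdk version "17.0.2" 2022-01-18
java_major_version() {
  version=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$version" in
    1.*) echo "$version" | cut -d. -f2 ;;  # legacy scheme: 1.8.0_312 -> 8
    *)   echo "$version" | cut -d. -f1 ;;  # current scheme: 17.0.2 -> 17
  esac
}

# Sample input; on a real machine use: first_line=$(java -version 2>&1 | head -n 1)
first_line='openjdk version "17.0.2" 2022-01-18'
major=$(java_major_version "$first_line")
if [ "$major" -ge 11 ]; then
  echo "Java $major satisfies the Java 11 or later requirement"
else
  echo "Java $major is too old; Java 11 or later is needed"
fi
```

Note that `java -version` writes to standard error, which is why the comment above redirects it with `2>&1`.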

Download the Apache Solr distribution from https://solr.apache.org/downloads.html.
Unzip the distribution to a folder of your choice, e.g. C:\solr or ~/solr.
Alternatively, you can obtain a copy of the latest Apache Solr source code
directly from the Git repository:

<https://solr.apache.org/community.html#version-control>

Solr uses [Gradle](https://gradle.org/) as the build
system. Navigate to the root of your source tree folder and issue the `./gradlew tasks`
command to see the available options for building, testing, and packaging Solr.

`./gradlew dev` will create a Solr executable suitable for development.
cd to `./solr/packaging/build/dev` and run the `bin/solr` script
to start Solr.

NOTE: `gradlew` is the "Gradle Wrapper" and will automatically download and
start using the correct version of Gradle for Solr.

NOTE: `./gradlew help` will print a list of high-level tasks. There are also a
number of plain-text files in <source folder root>/help.

The first time you run Gradle, it will create a file "gradle.properties" that
contains machine-specific settings. Normally you can use this file as-is, but it
can be modified if necessary.

Note as well that the gradle build does not create or copy binaries throughout the
source repository so you need to switch to the packaging output folder `./solr/packaging/build`;
the rest of the instructions below remain identical. The packaging directory
is rewritten on each build.

If you want to build the documentation, type `./gradlew -p solr documentation`.

`./gradlew check` will assemble Solr and run all validation tasks, including unit tests.

To build the final Solr artifacts run `./gradlew assemble`.

Lastly, there is developer oriented documentation in `./dev-docs/README.adoc` that
you may find useful in working with Solr.


### Gradle build and IDE support
Contributor

General about building from source and gradle - perhaps we should make some really good and thorough guides about that topic in dev-docs, and just have a few really basic examples here in the main readme, like ./gradlew dev and cd solr/packaging/dev/ && bin/solr -e techproducts...

But I see that the main focus here is merging the two READMEs, so feel free to postpone dev-docs work to later.

Contributor Author

I agree with the sentiment! We should rationalize more the dev-docs and the stuff in ref guide related to development. Or just put dev-docs etc in Ref Guide!

Contributor

That can be done in followup issues. I think we plan for a separate dev-guide, and not mash everything into ref-guide.


- *IntelliJ* - IntelliJ IDEA can import the project out of the box.
Code formatting conventions should be manually adjusted.
- *Eclipse* - Not tested.
- *Netbeans* - Not tested.

## Contributing

Please review the [Contributing to Solr Guide](https://cwiki.apache.org/confluence/display/solr/HowToContribute)
Contributor

Later, the result of https://issues.apache.org/jira/browse/SOLR-15682 will be linked here...

for information on contributing.

## Discussion and Support

- [Mailing Lists](https://solr.apache.org/community.html#mailing-lists-chat)
- [Issue Tracker (JIRA)](https://issues.apache.org/jira/browse/SOLR)
- IRC: `#solr` and `#solr-dev` on libera.chat
- [Slack](https://solr.apache.org/community.html#slack)

## Export control

This distribution includes cryptographic software. The country in
Contributor

[Q] Is there a legal requirement to have this here, or is it here for practical purposes?

(i.e. This mostly looks like legal boilerplate to me, but I'm not an expert so maybe there's actual nuggets in here useful to sysadmin/security folks?)

Assuming this is here for legal reasons: looking at other projects (e.g. Elasticsearch, Vespa, Zookeeper) who use cryptography similar to Solr, none of them mention export controls in their top level repositories. Maybe we're the rare case of a project actually doing what's legally required, but I suspect we actually don't need this here.

(Also, ./gradlew dependencies makes me think we no longer pull in bouncycastle from TIKA as this suggests.)

which you currently reside may have restrictions on the import,
possession, use, and/or re-export to another country, of
encryption software. BEFORE using any encryption software, please
check your country's laws, regulations and policies concerning the
import, possession, or use, and re-export of encryption software, to
see if this is permitted. See <https://www.wassenaar.org/> for more
information.

The U.S. Government Department of Commerce, Bureau of Industry and
Security (BIS), has classified this software as Export Commodity
Control Number (ECCN) 5D002.C.1, which includes information security
software using or performing cryptographic functions with asymmetric
algorithms. The form and manner of this Apache Software Foundation
distribution makes it eligible for export under the License Exception
ENC Technology Software Unrestricted (TSU) exception (see the BIS
Export Administration Regulations, Section 740.13) for both object
code and source code.

The following provides more details on the included cryptographic
software:

Apache Solr uses Apache Tika which uses the Bouncy Castle generic encryption libraries for
extracting text content and metadata from encrypted PDF files.
See https://www.bouncycastle.org/ for more details on Bouncy Castle.
2 changes: 1 addition & 1 deletion solr/bin/solr
@@ -394,7 +394,7 @@ function print_usage() {
echo " -e <example> Name of the example to run; available examples:"
echo " cloud: SolrCloud example"
echo " techproducts: Comprehensive example illustrating many of Solr's core capabilities"
- echo " schemaless: Schema-less example"
+ echo " schemaless: Schema-less example (schema is inferred from data during indexing)"
echo " films: Example of starting with _default configset and adding explicit fields dynamically"
echo ""
echo " -a Additional parameters to pass to the JVM when starting Solr, such as to setup"
2 changes: 1 addition & 1 deletion solr/bin/solr.cmd
@@ -370,7 +370,7 @@ goto done
@echo -e example Name of the example to run; available examples:
@echo cloud: SolrCloud example
@echo techproducts: Comprehensive example illustrating many of Solr's core capabilities
- @echo schemaless: Schema-less example
+ @echo schemaless: Schema-less example (schema is inferred from data during indexing)
@echo films: Example of starting with _default configset and defining explicit fields dynamically
@echo.
@echo -a opts Additional parameters to pass to the JVM when starting Solr, such as to setup