All notable changes to this project will be documented in this file. This project adheres to Semantic Versioning. To read more about upgrading and BC breaks, have a look at the UPGRADE document.
- #55 Added Indonesian language
- Changed max char length for builder and index tables `content` field to 16,777,215. According to migrations even 4,294,967,295 characters would be supported.
- New `$encode` option for the crawl command. If linkcheck is true, links are added to a list; this option controls whether a link is encoded when it is added to that list.
- #48 Added events `beforeProcess` and `afterIndex` in order to interact with search results from a non-crawled source.
- #46 Prevent the crawler from purging the full index when the builder index is empty. This can be disabled with the new option `--purging=1`.
- #45 Use a transaction to sync the index table when the crawler finishes the process.
- Updated deps to the latest version of `smalot/pdfparser`, which now requires at least PHP 7.1. Therefore the PHP version requirement for the LUYA crawler module is raised to 7.1 as well (which has been outdated for a long time already: https://www.php.net/supported-versions.php).
- Small changes in docs, translations, composer dependencies
- #40 Add keywords to content string in order to make them searchable.
- Adjusted the default url rule for the crawler, the action was missing: before `crawler/default`, now `crawler/default/index`.
- #39 Added Bulgarian translations
- Added default views for the crawler index action
- #38 Added max length validator for content in order to fix mysql error `SQLSTATE[22001]: String data, right truncated: 1406 Data too long for column 'content' at row 1`.
- #37 Added link check support for relative paths on the website. Use the HEAD method for link checks instead of GET, and follow those links if needed. Added PHP 8 tests.
- #36 Add concurrent requests configuration option for crawl command.
This release contains new migrations and requires running the migrate command after updating. Check the UPGRADE document to read more about breaking changes.
- Crawl mechanism refactoring using https://github.com/nadar/crawler.
- Dropped unused module properties and crawler classes, see Upgrade
- Indexing of PDFs is now activated by default.
- #29 Improve performance, create new indexes, improve handling of group conditions.
- #28 Ensure levenshtein input string does not exceed 255 chars.
- #26 Improve handling of large amounts of data, add more verbosity, add unit tests.
- New FR translations
- New PT translations
- #23 Changed did you mean behavior with empty input values.
- Added new statistics overview
- #14 Add relation between suggestions and search results.
- #1 Add indexer interface with property to provide class which implement the interface.
- #20 Added new link status list.
- #19 Fixed bug when regex delimiter is used in search keyword.
- #17 Fixed PHP warning thrown in PHP 7.2 environments when using an empty search.
- #15 Added dashboard object with latest keywords without results.
- Added some missing translation keys.
- #12 Fixed bug with ending whitespace.
- #11 Switched from htmlentities to htmlspecialchars for content crawling.
- #10 Improved the order of pages with a new relevance to query score.
- #3 Added new did you mean widget which returns suggestions based on search history.
- #9 Fix bug with double encoding of preview content.
- #8 Fix issue with utf8 chars for result previews.
- #5 Add option to provide group search in default controller.
- #4 Add info when base url does not return status code 200.
- #2 Add database index keys for builder and index table.
- Use LUYA Testsuite for unit tests.
- Added PHPDocs.
- Added table output summary when the crawler finishes.
- First stable release