Apache Solr

Solr is the popular, blazing fast open source enterprise search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing distributed search and index replication, and it powers the search and navigation features of many of the world's largest internet sites.

Solr is written in Java and runs as a standalone full-text search server within a servlet container such as Tomcat. Solr uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it easy to use from virtually any programming language. Solr's powerful external configuration allows it to be tailored to almost any type of application without Java coding, and it has an extensive plugin architecture when more advanced customization is required.
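As a rough illustration of the HTTP API described above, the following Python sketch builds a query URL for the standard /select handler and asks for a JSON response. The base URL is an assumption (a local instance on the example port), and build_select_url is a hypothetical helper name, not part of Solr.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance; adjust host, port, and path as needed.
SOLR_BASE = "http://localhost:8983/solr"


def build_select_url(query, rows=10, response_format="json"):
    """Return a URL for the /select handler with the given query string."""
    # q = the query, rows = number of results, wt = response writer (xml, json, ...)
    params = {"q": query, "rows": rows, "wt": response_format}
    return SOLR_BASE + "/select?" + urlencode(params)


print(build_select_url("title:lucene"))
```

Any HTTP client can then fetch the resulting URL, which is why no Solr-specific library is needed on the client side.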

See the complete feature list for more details.

For more information about Solr, please see the Solr wiki.

Solr News

27 November 2011 - Solr 3.5.0 Available

The Lucene PMC is pleased to announce the availability of Apache Solr 3.5.0.

Solr can be downloaded from http://www.apache.org/dyn/closer.cgi/lucene/solr/

Highlights of the Solr release include:

Bug fixes and improvements from Apache Lucene 3.5.0, including a very substantial (3-5X) reduction in the RAM required to hold the terms index when opening an IndexReader. (LUCENE-2205)

Added support for distributed result grouping. (SOLR-2066, SOLR-2776)

Added a Hunspell stemmer TokenFilter that supports stemming for 99 languages. (SOLR-2769)

A new contrib module "langid" adds language identification capabilities as an Update Processor, using Tika's LanguageIdentifier or the Cybozu language-detection library. (SOLR-1979)

Numeric types including Trie and date types now support sortMissingFirst/Last. (SOLR-2881)

Added the optional hl.q parameter. If specified, it overrides the q parameter in the Highlighter. (SOLR-1926)

Several minor bug fixes, such as date parsing for years 0001-1000 and ignored configurations when using QueryAnalyzer with SpellCheckComponent, and many more. See CHANGES.txt entries for full details.
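To make the hl.q highlight above more concrete, here is a small Python sketch that assembles request parameters where the text to highlight differs from the main query. The field name "content" and the helper name highlight_params are assumptions chosen for illustration only.

```python
from urllib.parse import urlencode


def highlight_params(q, hl_q=None, fields="content"):
    """Return an encoded parameter string with highlighting enabled."""
    # hl=true turns highlighting on; hl.fl lists the fields to highlight.
    params = {"q": q, "hl": "true", "hl.fl": fields}
    if hl_q is not None:
        # When present, hl.q is used by the highlighter instead of q.
        params["hl.q"] = hl_q
    return urlencode(params)


print(highlight_params("solr", hl_q="lucene"))
```

Leaving hl_q unset keeps the earlier behavior, in which the highlighter works from q.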

26 October 2011 - Java 7u1 fixes index corruption and crash bugs in Apache Lucene Core and Apache Solr

Oracle released Java 7u1 on October 19. According to the release notes and tests done by the Lucene committers, all bugs reported on July 28 are fixed in this release, so code using the Porter stemmer no longer crashes with SIGSEGV. We could no longer reproduce any index corruption, so it is safe to use Java 7u1 with Lucene Core and Solr. On the same day, Oracle released Java 6u29, which fixes the same problems in Java 6 that occurred when the JVM switches -XX:+AggressiveOpts or -XX:+OptimizeStringConcat were used. Of course, you should not use experimental JVM options like -XX:+AggressiveOpts in production environments! We recommend that everybody upgrade to the latest version, 6u29. If you upgrade to Java 7, remember that you may have to reindex, because the Unicode version shipped with Java 7 changed and tokenization behaves differently (e.g. lowercasing). For more information, read JRE_VERSION_MIGRATION.txt in your distribution package!

14 September 2011 - Lucene Core 3.4.0 and Solr 3.4.0 Available

The Lucene PMC is pleased to announce the availability of Apache Solr 3.4.0.

Solr can be downloaded from http://www.apache.org/dyn/closer.cgi/lucene/solr/

Highlights of the Solr release include:

SolrJ client can now parse grouped and range facets results (SOLR-2523).

A new XsltUpdateRequestHandler allows posting XML that's transformed by a provided XSLT into a valid Solr document (SOLR-2630).

The post-group faceting option (group.truncate) can now compute facet counts for only the highest-ranking documents per group (SOLR-2665).

Add commitWithin update request parameter to all update handlers that were previously missing it. This tells Solr to commit the change within the specified amount of time (SOLR-2540).

You can now specify NIOFSDirectory (SOLR-2670).

New parameter hl.phraseLimit speeds up FastVectorHighlighter (LUCENE-3234).

The query cache and filter cache can now be disabled per request. See this wiki page (SOLR-2429).

Improved memory usage, build time, and performance of SynonymFilterFactory (LUCENE-3233).

Added omitPositions to the schema, so you can omit position information while still indexing term frequencies (LUCENE-2048).

Various fixes for multi-threaded DataImportHandler.
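Two of the request-level options listed above, commitWithin and per-request cache control, can be sketched in Python as follows. The base URL and helper names are assumptions, commitWithin is taken to be in milliseconds, and the {!cache=false} local-parameter form is assumed to be the syntax introduced by SOLR-2429; check the wiki page referenced above before relying on it.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance.
SOLR_BASE = "http://localhost:8983/solr"


def update_url(commit_within_ms):
    """URL for the /update handler asking for a commit within the given time."""
    return SOLR_BASE + "/update?" + urlencode({"commitWithin": commit_within_ms})


def uncached_filter_url(query, filter_query):
    """URL for /select whose filter query is marked as not to be cached."""
    # The local parameter prefix is prepended to the filter query before encoding.
    params = [("q", query), ("fq", "{!cache=false}" + filter_query)]
    return SOLR_BASE + "/select?" + urlencode(params)


print(update_url(10000))
print(uncached_filter_url("solr", "inStock:true"))
```

Documents posted to the first URL would still need to be sent in the request body; the sketch covers only the parameters.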

28 July 2011 - WARNING: Index corruption and crashes in Apache Lucene Core / Apache Solr with Java 7

Oracle released Java 7 today. Unfortunately, it contains HotSpot compiler optimizations that miscompile some loops. This can affect code of several Apache projects. Sometimes the JVM only crashes, but in several cases the calculated results can be incorrect, leading to bugs in applications (see HotSpot bugs 7070134, 7044738, 7068051). Apache Lucene Core and Apache Solr are two Apache projects affected by these bugs, in all versions released until today. Solr users with the default configuration will have Java crashing with SIGSEGV as soon as they start to index documents, because one affected part is the well-known Porter stemmer (see LUCENE-3335). Other loops in Lucene may be miscompiled too, leading to index corruption (especially on Lucene trunk with the pulsing codec; see LUCENE-3346).

These problems were detected only 5 days before the official Java 7 release, so Oracle had no time to fix them, and many more applications are affected as well. In response to our questions, Oracle proposed to include the fixes in service release u2 (possibly already in service release u1, see this mail). This means you cannot use Apache Lucene/Solr with Java 7 releases before Update 2! If you do, please don't open bug reports; it is not the committers' fault! At the very least, disable loop optimizations using the -XX:-UseLoopPredicate JVM option to avoid risking index corruption.

Please note: Java 6 users are also affected if they use one of these JVM options, which are not enabled by default: -XX:+OptimizeStringConcat or -XX:+AggressiveOpts. It is strongly recommended not to use any HotSpot optimization switches in any Java version without extensive testing! If you upgrade to Java 7, remember that you may have to reindex, because the Unicode version shipped with Java 7 changed and tokenization behaves differently (e.g. lowercasing). For more information, read JRE_VERSION_MIGRATION.txt in your distribution package!
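For readers who launch the server from a script, the workaround option mentioned above could be passed along these lines. This is only a sketch: "start.jar" is an assumption about how the bundled example server is started, and solr_java_command is a hypothetical helper name.

```python
def solr_java_command(extra_jvm_options=()):
    """Build a java argument list that disables the affected loop optimization."""
    # -XX:-UseLoopPredicate is the workaround named in the announcement.
    # "start.jar" is an assumed launcher jar; substitute your own start command.
    return ["java", "-XX:-UseLoopPredicate", *extra_jvm_options, "-jar", "start.jar"]


print(" ".join(solr_java_command()))
```

The returned list can be handed to subprocess.run or joined into a shell command line; other JVM options go in extra_jvm_options.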

1 July 2011 - Solr 3.3 Available

The Lucene PMC is pleased to announce the availability of Apache Solr 3.3.

Solr can be downloaded from http://www.apache.org/dyn/closer.cgi/lucene/solr/

Highlights of the Solr release include:

Grouping / Field Collapsing.

A new, automaton-based suggest/autocomplete implementation offering an order of magnitude smaller RAM consumption.

KStemFilterFactory, an optimized implementation of a less aggressive stemmer for English.

Solr defaults to a new, more efficient merge policy (TieredMergePolicy). See http://s.apache.org/merging for more information.

Important bugfixes, including extremely high RAM usage in spellchecking.

Bugfixes and improvements from Apache Lucene 3.3.

4 June 2011 - Lucene Core 3.2 and Solr 3.2 Available

The Lucene PMC is pleased to announce the availability of Apache Solr 3.2.

Solr can be downloaded from http://www.apache.org/dyn/closer.cgi/lucene/solr/

Highlights of the Solr release include:

Ability to specify overwrite and commitWithin as request parameters when using the JSON update format.

TermQParserPlugin, useful when generating filter queries from terms returned from field faceting or the terms component.

DebugComponent now supports using a NamedList to model Explanation objects in its responses instead of Explanation.toString.

Improvements to the UIMA and Carrot2 integrations.

Highlighting performance improvements.

A test-framework jar for easy testing of Solr extensions.

Bugfixes and improvements from Apache Lucene 3.2.

31 March 2011 - Solr 3.1 Available

The Lucene PMC is pleased to announce the availability of Apache Solr 3.1. The version number for Solr 3.1 was chosen to reflect the merge of development with Lucene, which is currently also on 3.1. Going forward, we expect the Solr version to be the same as the Lucene version. Solr 3.1 contains Lucene 3.1 and is the release after Solr 1.4.1.

Solr can be downloaded from http://www.apache.org/dyn/closer.cgi/lucene/solr/

Highlights of the Solr release include:

Numeric range facets (similar to date faceting).

New spatial search, including spatial filtering, boosting and sorting capabilities.

Example Velocity-driven search UI at http://localhost:8983/solr/browse

A new termvector-based highlighter.

Extended dismax (edismax) query parser, which addresses some missing features in the dismax query parser along with some extensions.

Several more components now support distributed mode: TermsComponent, SpellCheckComponent.

A new Auto Suggest component.

Ability to sort by functions.

JSON document indexing.

CSV response format.

Apache UIMA integration for metadata extraction.

Leverages Lucene 3.1 and its inherent optimizations and bug fixes as well as new analysis capabilities.

Numerous improvements, bug fixes, and optimizations.

The Apache Software Foundation

The Apache Software Foundation provides support for the Apache community of open-source software projects. The Apache projects are defined by collaborative consensus based processes, an open, pragmatic software license and a desire to create high quality software that leads the way in its field. Apache Lucene, Apache Solr, Apache PyLucene, Apache Open Relevance Project and their respective logos are trademarks of The Apache Software Foundation. All other marks mentioned may be trademarks or registered trademarks of their respective owners.

