Apache Lucene

Source: Wikipedia, the free encyclopedia.
Lucene
Initial release: 1999
Stable release: 9.10.0 / February 20, 2024[1]
Written in: Java
License: Apache License 2.0
Website: lucene.apache.org

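Apache Lucene is a free and open-source search engine software library, originally written in Java by Doug Cutting. It is supported by the Apache Software Foundation and is released under the Apache Software License. Lucene is widely used as a standard foundation for production search applications.[2][3][4]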

Lucene has been ported to other programming languages including Object Pascal, Perl, C#, C++, Python, Ruby and PHP.[5]

History

Doug Cutting originally wrote Lucene in 1999. It was his fifth search engine: he had previously written two at Xerox PARC, one at Apple, and a fourth at Excite.[7] Lucene was initially available for download from its home on the SourceForge website. It joined the Apache Software Foundation's Jakarta family of open-source Java products in September 2001 and became its own top-level Apache project in February 2005. The name Lucene is Doug Cutting's wife's middle name and her maternal grandmother's first name.[8]

Lucene formerly included a number of sub-projects, such as Lucene.NET and Nutch. These are now independent top-level projects.

In March 2010, the Apache Solr search server joined as a Lucene sub-project, merging the developer communities.

Version 4.0 was released on October 12, 2012.[9]

In March 2021, Lucene changed its logo, and Apache Solr became a top-level Apache project again, independent of Lucene.

Features and common use

While suitable for any application that requires full-text indexing and searching capability, Lucene is recognized for its utility in the implementation of Internet search engines and local, single-site searching.[10][11]

Lucene includes a feature to perform a fuzzy search based on edit distance.[12]
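To make the idea of edit distance concrete, the following is a minimal sketch of the classic Levenshtein distance computed by dynamic programming. It illustrates the concept only and is not Lucene's own implementation; the class name `EditDistance` and the helper `fuzzyMatch` are hypothetical names chosen for this example.

```java
class EditDistance {
    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions, and substitutions that turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into b[0..j) takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            // Turning a[0..i) into the empty string takes i deletions.
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Reuse the two rows instead of allocating a full matrix.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Hypothetical helper: a candidate term matches if it lies within
    // maxEdits edits of the query term.
    static boolean fuzzyMatch(String term, String candidate, int maxEdits) {
        return levenshtein(term, candidate) <= maxEdits;
    }
}
```

With this sketch, `EditDistance.fuzzyMatch("roam", "foam", 1)` is true because a single substitution separates the two terms, while a threshold of 0 accepts only exact matches.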

Lucene has also been used to implement recommendation systems.[13] For example, Lucene's 'MoreLikeThis' class can generate recommendations for similar documents. In a comparison of the term vector-based similarity approach of 'MoreLikeThis' with citation-based document similarity measures, such as co-citation and co-citation proximity analysis, Lucene's approach excelled at recommending documents with very similar structural characteristics and narrower relatedness.[14] In contrast, citation-based measures tended to be better suited to recommending more broadly related documents,[14] which suggests that citation-based approaches may be more suitable for generating serendipitous recommendations, as long as the documents to be recommended contain in-text citations.
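As a rough illustration of what comparing documents by their term vectors means, the sketch below computes the cosine similarity between raw term-frequency vectors. It is not the 'MoreLikeThis' implementation, and it makes simplifying assumptions: tokens are split on non-word characters, and no stop-word removal or inverse-document-frequency weighting is applied. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class TermVectorSimilarity {
    // Build a term-frequency vector: lowercase the text, split on
    // non-word characters, and count how often each token occurs.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Cosine similarity: dot product of the two vectors divided by the
    // product of their Euclidean norms. Returns 0.0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

Documents that share many terms score close to 1, and documents with no terms in common score 0. This is consistent with the observation above that a term-based approach favors closely related documents over more broadly related ones.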

Lucene-based projects

Lucene itself is just an indexing and search library and does not contain crawling and HTML parsing functionality. However, several projects extend Lucene's capability.

References

  1. "Welcome to Apache Lucene". Lucene™ News section. Archived from the original on 12 February 2021. Retrieved 12 February 2020.
  2. PMC 7148026
  3. .
  4. "LuceneImplementations". apache.org. Archived from the original on 6 October 2015. Retrieved 23 September 2015.
  5. KeywordAnalyzer "Better Search with Apache Lucene and Solr" (PDF). 19 November 2007. Archived from the original (PDF) on 31 January 2012.
  6. Cutting, Doug (2019-06-07). "I wrote a couple of search engines at Xerox PARC, then V-Twin at Apple, then re-wrote Excite's search, then Lucene. So, Lucene might be considered V-Twin 3.0? Almost 25 years later, V-Twin still lives on as Mac OS X Search Kit!". @cutting. Retrieved 2019-06-19.
  7. .
  8. "Apache Lucene - Welcome to Apache Lucene". apache.org. Archived from the original on 4 February 2016. Retrieved 4 February 2016.
  9. .
  10. "GNU/Linux Semantic Storage System" (PDF). glscube.org. Archived from the original (PDF) on 2010-06-01.
  11. "Apache Lucene - Query Parser Syntax". lucene.apache.org. Archived from the original on 2017-05-02.
  12. J. Beel, S. Langer, and B. Gipp, “The Architecture and Datasets of Docear’s Research Paper Recommender System,” in Proceedings of the 3rd International Workshop on Mining Scientific Publications (WOSP 2014) at the ACM/IEEE Joint Conference on Digital Libraries (JCDL 2014), London, UK, 2014.
  13. M. Schwarzer, M. Schubotz, N. Meuschke, C. Breitinger, V. Markl, and B. Gipp, "Evaluating Link-based Recommendations for Wikipedia," in Proceedings of the 16th ACM/IEEE-CS Joint Conference on Digital Libraries (JCDL), New York, NY, USA, 2016, pp. 191-200. https://www.gipp.com/wp-content/papercite-data/pdf/schwarzer2016.pdf
  14. Wayner, Peter. "11 cutting-edge databases worth exploring now". InfoWorld. Archived from the original on 21 September 2015. Retrieved 21 September 2015.
  15. "Elasticsearch: RESTful, Distributed Search & Analytics - Elastic". elastic.co. Archived from the original on 8 October 2015. Retrieved 23 September 2015.
  16. "The Future of Compass & Elasticsearch". the dude abides. Archived from the original on 2015-10-15. Retrieved 2015-10-14.
  17. Natividad, Angela. "Socialtext Updates Search, Goes Kino". CMS Wire. Archived from the original on 2012-09-29. Retrieved 2011-05-31.
  18. Marvin Humphrey. "KinoSearch - Search engine library. - metacpan.org". p3rl.org. Retrieved 23 September 2015.
  19. .
  20. .
  21. .
