Installation

  • New in mongo-connector 2.5.0: to install mongo-connector together with the solr-doc-manager, run:
pip install 'mongo-connector[solr]'

Setup

  • Create Solr cores

Make sure the LukeRequestHandler is enabled

  • This line should be present in your solrconfig.xml file:
<requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler" />

Set up your Schema

  • Mongo Connector stores metadata in every document to help handle rollbacks.
  • To support this metadata, add the following fields to your schema.xml:
<field name="_ts" type="long" indexed="true" stored="true" />
<field name="ns" type="string" indexed="true" stored="true"/>

The Basics:

  • Mongo Connector can replicate to the Apache Solr search engine by using the Solr DocManager.
  • To start the connector, you must pass in the base URL of the Solr core to which you want to synchronize.

The most basic usage is the following:

mongo-connector -m localhost:27017 -t http://localhost:8983/solr -d solr_doc_manager

Mongo Connector and schema.xml

Configuring Solr

  • Refer to the Apache documentation for configuring Solr and SolrCloud.

N.B.: Key Names and Document Flattening

  • Mongo Connector automatically “flattens” MongoDB documents.
  • Fields within sub-documents can be referenced by their “dot-separated path” within the document.
  • Likewise, array fields are unrolled, so that individual elements are accessible by the field’s original name, plus a “.”, plus the index within the array that the element occupied.

An example:

{
"subdoc": {
"a": 1,
"b": 2,
"c": 3,
"array": [
{"name": "elmo"},
{"name": "oscar"}
]
}
}
  • The document above becomes the following in Solr:
{
"subdoc.a": 1,
"subdoc.b": 2,
"subdoc.c": 3,
"subdoc.array.0.name": "elmo",
"subdoc.array.1.name": "oscar"
}
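
The flattening rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Mongo Connector's own code: the function name flatten and its prefix parameter are hypothetical, and the sketch simply drops empty sub-documents and empty arrays, which may differ from what the connector does.

```python
def flatten(document, prefix=""):
    """Return a flat dict keyed by dot-separated paths into `document`.

    Hypothetical helper for illustration; not part of mongo-connector's API.
    """
    flat = {}
    if isinstance(document, dict):
        items = document.items()
    else:
        # Lists are unrolled: each element is keyed by its index.
        items = ((str(index), value) for index, value in enumerate(document))
    for key, value in items:
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, (dict, list)):
            # Recurse into sub-documents and arrays, extending the path.
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


doc = {
    "subdoc": {
        "a": 1,
        "b": 2,
        "c": 3,
        "array": [{"name": "elmo"}, {"name": "oscar"}],
    }
}
flat = flatten(doc)
for path, value in flat.items():
    print(path, "=", value)
```

Running the sketch on the example document yields the same five keys shown in the flattened listing, which is a quick way to predict which field names your schema.xml needs to define.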

Schema.xml

  • Mongo Connector comes with an example schema.xml file that can help you get started integrating MongoDB with Apache Solr.
  • Solr reads schema.xml to determine field types, the fields that documents may have, the primary key, and more.
  • Mongo Connector tries to acquire the Solr schema through the LukeRequestHandler, at the path admin/luke/?show=schema&wt=json appended to the base Solr URL.
  • So, in the example above, Mongo Connector tries to get the schema by sending a GET request to http://localhost:8983/solr/admin/luke/?show=schema&wt=json.
  • Mongo Connector drops fields from MongoDB documents that are not defined in your Solr core’s schema, so that Solr does not throw exceptions and fail to add those documents.
  • Until you define the fields you want in schema.xml and reload the Solr core, Mongo Connector will merrily continue stripping your MongoDB documents of the offending fields.
  • You can check what Solr thinks the schema of your core is by visiting the aforementioned endpoint in a browser.
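
To make the schema lookup and field stripping concrete, here is a minimal Python sketch. It assumes the base URL from the example command; the helper names luke_schema_url and strip_unknown_fields are hypothetical and not part of Mongo Connector, and the sketch ignores any special cases (such as dynamic fields) that the real doc manager may handle.

```python
def luke_schema_url(base_url):
    """Build the LukeRequestHandler schema URL for a Solr base URL."""
    # The path below is the one quoted in the text, appended to the base URL.
    return base_url.rstrip("/") + "/admin/luke/?show=schema&wt=json"


def strip_unknown_fields(flat_doc, schema_fields):
    """Keep only the keys of a flattened document that the schema defines.

    Hypothetical helper: exact-name matching only, for illustration.
    """
    return {key: value for key, value in flat_doc.items() if key in schema_fields}


url = luke_schema_url("http://localhost:8983/solr")
print(url)

# Field names assumed to be defined in schema.xml for this illustration.
schema_fields = {"_ts", "ns", "subdoc.a"}
flat_doc = {"_ts": 1, "ns": "test.coll", "subdoc.a": 1, "subdoc.b": 2}
kept = strip_unknown_fields(flat_doc, schema_fields)
print(kept)  # "subdoc.b" is dropped because the schema does not define it
```

If a field you expect in Solr is missing, comparing your flattened key names against the field list returned by that URL is a reasonable first check.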
