Solr: Creating an Index with the XML Format

細細.魚 · Posted on 2016-12-15 08:24:24

Solr receives commands and possibly document data through HTTP POST. One way to send an HTTP POST is through the Unix command line program curl (also available on Windows through Cygwin: http://www.cygwin.com) and that's what we'll use here in the examples. An alternative cross-platform option that comes with Solr is post.jar located in Solr's example/exampledocs directory. To get some basic help on how to use it, run the following command:

>> java -jar example/exampledocs/post.jar -help
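　　You'll see in a bit that you can post name-value pair options as HTML form data. However, post.jar doesn't support that, so you'll be forced to specify the URL and put the options in the query string. (Note: opening post.jar shows that it contains a single class, SimplePostTool, which forwards the index-creation requests. It defines the Solr server URL as public static final String DEFAULT_POST_URL = "http://localhost:8983/solr/update". As the constant's name suggests, this is only the default, so it does not work unchanged against a Solr service deployed elsewhere; in that case the target URL has to be overridden, which the tool allows through the -Durl system property.)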
　　There are several ways to tell Solr to index data, and all of them are through HTTP POST:
·    Send the data as the entire POST payload. curl does this with --data-binary (or some similar options) and an appropriate content-type header for whatever the format is.

·    Send some name-value pairs akin to an HTML form submission. With curl, such pairs are preceded by -F. If you're giving data to Solr to be indexed as opposed to it looking for it in a database, then there are a few ways to do that:

    ° Put the data into the stream.body parameter. If it's small, perhaps less than a megabyte, then this approach is fine. The limit is configured with the multipartUploadLimitInKB setting in solrconfig.xml, defaulting to 2GB. If you're tempted to increase this limit, you should reconsider your approach.

    ° Refer to the data through either a local file on the Solr server using the stream.file parameter or a URL that Solr will fetch through the stream.url parameter. These choices are a feature that Solr calls remote streaming.
　　Here is an example of the first choice. Let's say we have a Solr Update-XML file named artists.xml in the current directory. We can post it to Solr using the following command line:

>> curl http://localhost:8983/solr/update -H 'Content-type:text/xml; charset=utf-8' --data-binary @artists.xml
　　If it succeeds, then you'll have output that looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader">
<int name="status">0</int><int name="QTime">128</int>
</lst>
</response>
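　　The same request can be issued from a program instead of curl. The following is a minimal Python sketch, not part of the original text: the function names (build_update_request, response_status, post_file) are hypothetical, and the URL is simply the one used in the curl example. It builds a POST whose entire body is the Update-XML and reads the status value out of a response shaped like the one shown above.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Assumed update URL, copied from the curl example; adjust for your deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_bytes, url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST whose whole body is the Update-XML."""
    return urllib.request.Request(
        url,
        data=xml_bytes,
        headers={"Content-type": "text/xml; charset=utf-8"},
        method="POST",
    )


def response_status(response_xml):
    """Return the integer status from a Solr XML response (0 means success)."""
    root = ET.fromstring(response_xml)
    status = root.find("./lst[@name='responseHeader']/int[@name='status']")
    if status is None:
        raise ValueError("no responseHeader/status element in response")
    return int(status.text)


def post_file(path, url=SOLR_UPDATE_URL):
    """Send the file at `path` to Solr and return the response status."""
    with open(path, "rb") as f:
        request = build_update_request(f.read(), url)
    with urllib.request.urlopen(request) as response:
        return response_status(response.read())
```

　　Calling post_file("artists.xml") against a running Solr instance would be the rough equivalent of the curl command above; building the request separately from sending it keeps the first two functions usable without a server.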
　　To use the stream.body feature for the preceding example, you would do this:

>> curl http://localhost:8983/solr/update -F stream.body=@artists.xml
　　In both cases, the @ character instructs curl to read the data from the named file instead of sending the literal string @artists.xml. If the XML is short, then you can just as easily specify it literally on the command line:

>> curl http://localhost:8983/solr/update -F stream.body=' <commit />'
　　Notice the leading space in the value. This was intentional. With -F, curl gives a leading @ or < a special meaning (read the value from a file), which we don't want here. In this case, it might be more appropriate to use --form-string instead of -F. However, it's more typing, and I'm feeling lazy.
　　Remote streaming

In the preceding examples, we've given Solr the data to index in the HTTP message. Alternatively, the POST request can give Solr a pointer to the data in the form of either a file path accessible to Solr or an HTTP URL to it.
　　Just as before, the originating request does not return a response until Solr has finished processing it. If the file is of a decent size or is already at some known URL, then you may find remote streaming faster and/or more convenient, depending on your situation.
　　Here is an example of Solr accessing a local file:

>> curl http://localhost:8983/solr/update -F stream.file=/tmp/artists.xml
　　To use a URL, the parameter would change to stream.url, and we'd specify a URL. We're passing a name-value parameter (stream.file and the path), not the actual data.
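　　As a rough illustration of "a pointer, not the data", the Python sketch below builds an update URL carrying either stream.file or stream.url. It is not from the original text: remote_stream_url is a hypothetical helper, and it passes the parameter in the query string rather than as the form field that curl -F sends, on the assumption that Solr reads request parameters from either place. Remote streaming also has to be enabled in solrconfig.xml for such a request to be honoured.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed update URL, copied from the curl examples.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def remote_stream_url(source, base=SOLR_UPDATE_URL):
    """Return an update URL that points Solr at `source` instead of carrying data."""
    # http(s) sources go in stream.url; anything else is treated as a path
    # on the Solr server and goes in stream.file.
    is_url = source.startswith(("http://", "https://"))
    key = "stream.url" if is_url else "stream.file"
    return base + "?" + urlencode({key: source})


if __name__ == "__main__":
    # Show the decoded parameters for the file path used in the text.
    url = remote_stream_url("/tmp/artists.xml")
    print(parse_qs(urlsplit(url).query))
```

　　Only the parameter name and the location travel in the request; Solr itself opens the file or fetches the URL.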
　　Solr's Update-XML format
　　Using an XML formatted message, you can supply documents to be indexed, tell Solr to commit changes, optimize the index, and delete documents. Here is a sample XML file you can HTTP POST to Solr that adds (or replaces) a couple of documents:

<add overwrite="true">
<doc boost="2.0">
<field name="id">5432a</field>
<field name="type" ...</field>
<field name="a_name" boost="0.5"></field>

<field name="begin_date">2007-12-31T09:40:00Z</field>
</doc>
<doc>
<field name="id">myid</field>
<field name="type" ...
<field name="begin_date">2007-12-31T09:40:00Z</field>
</doc>

</add>
　　The overwrite attribute defaults to true to guarantee the uniqueness of values in the field that you have designated as the unique field in the schema, assuming you have such a field. If you were to add another document that has the same value for the unique field, then this document would overwrite the previous document. You will not get an error.
　　The boost attribute affects the scores of matching documents in order to affect ranking in score-sorted search results. Providing a boost value, whether at the document or field level, is optional. The default value is 1.0, which is effectively a non-boost. Technically, documents are not boosted, only fields are. The effective boost value of a field is that specified for the document multiplied by that specified for the field.
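　　When the documents come from a program rather than a hand-written file, an add message can be generated with an XML library so that escaping is handled for you. The Python sketch below is one possible way to do it, not the book's method: build_add_xml and effective_boost are hypothetical helpers, and the input shape (a list of (doc_boost, fields) pairs) is an assumption made for the example. The second function simply restates the boost rule described above as arithmetic.

```python
import xml.etree.ElementTree as ET


def build_add_xml(docs, overwrite=True):
    """Serialize documents into an <add> message.

    `docs` is a list of (doc_boost, fields) pairs, where `fields` is a list of
    (name, value, field_boost) tuples. A boost of None omits the attribute,
    which leaves Solr's default of 1.0 in effect.
    """
    add = ET.Element("add", overwrite="true" if overwrite else "false")
    for doc_boost, fields in docs:
        doc = ET.SubElement(add, "doc")
        if doc_boost is not None:
            doc.set("boost", str(doc_boost))
        for name, value, field_boost in fields:
            field = ET.SubElement(doc, "field", name=name)
            if field_boost is not None:
                field.set("boost", str(field_boost))
            field.text = value
    return ET.tostring(add, encoding="unicode")


def effective_boost(doc_boost=1.0, field_boost=1.0):
    """Effective field boost: document boost multiplied by field boost."""
    return doc_boost * field_boost
```

　　For the sample above, a document boost of 2.0 combined with a field boost of 0.5 gives effective_boost(2.0, 0.5), that is 1.0, so that particular field ends up with no net boost.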
　　Deleting documents
　　You can delete a document by its unique field. Here we delete two documents:

<delete><id>Artist:11604</id><id>Artist:11603</id></delete>
　　To more flexibly specify which documents to delete, you can alternatively use a Lucene/Solr query:

<delete><query>timestamp:[* TO NOW-12HOUR]</query></delete>
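　　Delete messages can be generated the same way. The following Python sketch (build_delete_xml is a hypothetical helper, not something Solr provides) produces both forms shown above and lets the XML library escape characters such as & or < that may appear in a query.

```python
import xml.etree.ElementTree as ET


def build_delete_xml(ids=(), queries=()):
    """Serialize a <delete> message from unique-field values and/or queries."""
    delete = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(delete, "id").text = doc_id
    for query in queries:
        ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


if __name__ == "__main__":
    # Reproduce the two delete messages from the text.
    print(build_delete_xml(ids=["Artist:11604", "Artist:11603"]))
    print(build_delete_xml(queries=["timestamp:[* TO NOW-12HOUR]"]))
```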
　　Commit
　　Data sent to Solr is not immediately searchable, nor do deletions take immediate effect. Like a database, changes must be committed first. The easiest way to do this is to add a commit=true request parameter to a Solr update URL. The request to Solr could be the same request that contains data to be indexed then committed or an empty request—it doesn't matter. For example, you can visit this URL to issue a commit on our mbreleases core: http://localhost:8983/solr/update?commit=true. You can also commit changes using the XML syntax by simply sending this to Solr:

<commit />
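　　If update URLs are assembled in code, the commit=true parameter can be appended without disturbing parameters that are already present. This is a small Python sketch under that assumption; with_commit is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_commit(url):
    """Return `url` with commit=true set, keeping any other query parameters."""
    parts = urlsplit(url)
    # Drop any existing commit parameter so it is not duplicated.
    params = [(k, v) for k, v in parse_qsl(parts.query) if k != "commit"]
    params.append(("commit", "true"))
    return urlunsplit(parts._replace(query=urlencode(params)))


if __name__ == "__main__":
    print(with_commit("http://localhost:8983/solr/update"))
```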
