/**
*
* An instance of entity processor serves an entity. It is reused throughout the
* import process.
*
* Implementations of this abstract class must provide a public no-args constructor.
*
* Refer to http://wiki.apache.org/solr/DataImportHandler
* for more details.
*
* This API is experimental and may change in the future.
*
* @version $Id: EntityProcessor.java 824359 2009-10-12 14:31:54Z ehatcher $
* @since solr 1.3
*/
public abstract class EntityProcessor {
/**
* This method is called when processing of an entity starts. It is called again
* each time processing returns to the entity, so any state can be reset at that point.
* For the root-most entity it is called only once per ingestion. For sub-entities, it
* is called once for each row from the parent entity.
*
* @param context The current context
*/
public abstract void init(Context context);
/**
* This method streams the data one row at a time. The implementation
* may fetch as many rows as it needs but returns one 'row' per call. Only this
* method is used during a full import.
*
* @return A 'row'. The 'key' for the map is the column name and the 'value'
* is the value of that column. If there are no more rows to be
* returned, return 'null'
*/
public abstract Map nextRow();
/**
* This is used for delta-import. It gives the primary keys of the changed rows in this
* entity.
*
* @return a map of primary key name to value for the changed rows
*/
public abstract Map nextModifiedRowKey();
/**
* This is used during delta-import. It gives the primary keys of the rows
* that are deleted from this entity. If this entity is the root entity, the Solr
* document is deleted. If this is a sub-entity, the Solr document is
* considered 'changed' and will be recreated.
*
* @return a map of primary key name to value for the deleted rows
*/
public abstract Map nextDeletedRowKey();
/**
* This is used during delta-import. This gives the primary keys and their
* values of all the rows changed in a parent entity due to changes in this
* entity.
*
* @return a map of primary key name to value for the changed rows in the parent entity
*/
public abstract Map nextModifiedParentRowKey();
/**
* Invoked for each parent-row after the last row for this entity is processed. If this is the root-most
* entity, it will be called only once in the import, at the very end.
*
*/
public abstract void destroy();
/**
* Invoked after the transformers are invoked. EntityProcessors can add, remove or modify values
* added by Transformers in this method.
*
* @param r The transformed row
* @since solr 1.4
*/
public void postTransform(Map r) {
}
/**
* Invoked when the Entity processor is destroyed towards the end of import.
*
* @since solr 1.4
*/
public void close() {
//no-op
}
}
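The lifecycle described in the Javadoc above (init once for the root entity, init again for a sub-entity on every parent row, nextRow until it returns null, then destroy) can be sketched as follows. This is only an illustration under stated assumptions: `SketchEntityProcessor`, `ListEntityProcessor` and `LifecycleSketch` are hypothetical stand-ins, not Solr classes. The real `Context` is replaced by a plain parent-row map, and each parent row is simply joined with each child row rather than building a Solr document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, trimmed-down stand-in for the abstract class (no Context, no delta methods). */
abstract class SketchEntityProcessor {
    abstract void init(Map<String, Object> parentRow); // parentRow is null for the root entity
    abstract Map<String, Object> nextRow();            // null signals "no more rows"
    abstract void destroy();
}

/** Serves rows from an in-memory list and counts lifecycle calls. */
class ListEntityProcessor extends SketchEntityProcessor {
    private final List<Map<String, Object>> rows;
    private Iterator<Map<String, Object>> it;
    int initCalls = 0;
    int destroyCalls = 0;

    ListEntityProcessor(List<Map<String, Object>> rows) { this.rows = rows; }

    @Override void init(Map<String, Object> parentRow) {
        initCalls++;
        it = rows.iterator(); // reset state every time the entity is (re)entered
    }
    @Override Map<String, Object> nextRow() { return it.hasNext() ? it.next() : null; }
    @Override void destroy() { destroyCalls++; it = null; }
}

class LifecycleSketch {
    /** Convenience helper that builds a one-column row. */
    static Map<String, Object> row(String key, Object value) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(key, value);
        return m;
    }

    /** Drives a root entity and one sub-entity; returns one joined row per (parent, child) pair. */
    static List<Map<String, Object>> fullImport(SketchEntityProcessor root, SketchEntityProcessor child) {
        List<Map<String, Object>> joined = new ArrayList<>();
        root.init(null);                       // root: initialised once per import
        Map<String, Object> parent;
        while ((parent = root.nextRow()) != null) {
            child.init(parent);                // sub-entity: initialised once per parent row
            Map<String, Object> c;
            while ((c = child.nextRow()) != null) {
                Map<String, Object> merged = new LinkedHashMap<>(parent);
                merged.putAll(c);
                joined.add(merged);
            }
            child.destroy();                   // after the last child row for this parent
        }
        root.destroy();                        // root: destroyed once, at the very end
        return joined;
    }
}
```

With two parent rows and two child rows, the sub-entity's `init` and `destroy` each run twice while the root's run once, which is the asymmetry the Javadoc describes.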
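The subclass EntityProcessorBase is the base class of all concrete entity processors and defines their shared methods. The most important of these is Map getNext(), which fetches the next Map-typed data record from the data iterator Iterator rowIterator (the DIHCacheSupport cacheSupport object is used for caching).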
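The row-fetching behaviour described above can be sketched roughly as follows. This is a simplified model, not the Solr source: `IteratorBackedProcessor` is a hypothetical class, and the caching path through `cacheSupport` and all error handling are deliberately left out. Only the `rowIterator` field and the `getNext()` name are taken from the text.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the shared base class; only the row-iteration part is modelled. */
class IteratorBackedProcessor {
    // Mirrors the rowIterator field named in the text; a concrete subclass would set it.
    protected Iterator<Map<String, Object>> rowIterator;

    IteratorBackedProcessor(Iterator<Map<String, Object>> rowIterator) {
        this.rowIterator = rowIterator;
    }

    /** Returns the next row, or null once the iterator is missing or exhausted. */
    protected Map<String, Object> getNext() {
        if (rowIterator == null) {
            return null;
        }
        if (rowIterator.hasNext()) {
            return rowIterator.next();
        }
        rowIterator = null; // drop the exhausted iterator so later calls return null immediately
        return null;
    }

    /** A concrete processor could implement nextRow() by simply delegating to getNext(). */
    public Map<String, Object> nextRow() {
        return getNext();
    }
}
```

Returning `null` at the end matches the `nextRow()` contract in the abstract class, so callers can loop until they see `null`.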
@Override
public void init(Context context) {
    rowcache = null;
    this.context = context;
    resolver = (VariableResolverImpl) context.getVariableResolver();
    // The context has to be set correctly. Keep a copy of the old one so that it can be restored in destroy.
    if (entityName == null) {
        onError = resolver.replaceTokens(context.getEntityAttribute(ON_ERROR));
        if (onError == null) {
            onError = ABORT;
        }
        entityName = context.getEntityAttribute(DataConfig.NAME);
    }
    delegate.init(context);
}
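The other related methods all call the corresponding method of the decorated concrete entity processor, adding features such as data transformation on top. This article does not analyze them in detail.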
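To make the decorator idea concrete, here is a minimal sketch under stated assumptions. `RowSource`, `ListRowSource` and `TransformingRowSource` are hypothetical names. The entity attributes are passed as a plain map instead of a `Context`, and the attribute keys `"name"` and `"onError"` as well as the `"abort"` default value are assumed for illustration. The wrapper resolves its per-entity settings only on the first `init`, forwards every call to the delegate, and applies a transformation to each row on the way back.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Hypothetical minimal processor interface standing in for the abstract class. */
interface RowSource {
    void init(Map<String, String> entityAttributes);
    Map<String, Object> nextRow();
}

/** Concrete processor: serves rows from an in-memory list. */
class ListRowSource implements RowSource {
    private final List<Map<String, Object>> rows;
    private Iterator<Map<String, Object>> it;

    ListRowSource(List<Map<String, Object>> rows) { this.rows = rows; }

    public void init(Map<String, String> entityAttributes) { it = rows.iterator(); }
    public Map<String, Object> nextRow() { return it.hasNext() ? it.next() : null; }
}

/** Decorator: resolves settings once, delegates, then post-processes each row. */
class TransformingRowSource implements RowSource {
    static final String ABORT = "abort"; // assumed default, mirroring the ABORT constant above

    private final RowSource delegate;
    private final UnaryOperator<Map<String, Object>> transformer;
    String entityName; // resolved on the first init only
    String onError;

    TransformingRowSource(RowSource delegate, UnaryOperator<Map<String, Object>> transformer) {
        this.delegate = delegate;
        this.transformer = transformer;
    }

    public void init(Map<String, String> entityAttributes) {
        if (entityName == null) {            // same lazy, one-time resolution as in init() above
            onError = entityAttributes.get("onError");
            if (onError == null) {
                onError = ABORT;
            }
            entityName = entityAttributes.get("name");
        }
        delegate.init(entityAttributes);     // always forward to the decorated processor
    }

    public Map<String, Object> nextRow() {
        Map<String, Object> row = delegate.nextRow();
        // Transform a copy so the delegate's own row object is left untouched.
        return row == null ? null : transformer.apply(new HashMap<>(row));
    }
}
```

Because the wrapper implements the same interface as the processor it decorates, the caller does not need to know whether transformation is in play.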