
solr-php-client - issue #5
Strip control characters from document before sending it to solr
What steps will reproduce the problem? 1. Index a document containing at least one control character
What is the expected output? What do you see instead? The document fails to index in Solr and an XML parsing exception occurs.
I attached a patch to solve this issue.
Comment #1
Posted on Jul 24, 2009 by Grumpy Bear
For the Drupal module, we are applying this regex to all fields - otherwise we frequently see encoding issues.
Note that, as the author of those comments and code, which come from the Drupal module, I'm happy to release them for inclusion here under the BSD license.
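The regex mentioned in this comment is not reproduced in the thread. As a rough illustration of the idea only, the sketch below replaces the control characters that XML 1.0 does not allow with a space before a field value is sent to Solr. The function name stripCtrlChars and the sample string are hypothetical; this is neither the attached patch nor necessarily the exact expression used in the Drupal module or in r14.

```php
<?php
// Hypothetical helper for illustration; not part of the solr-php-client API.
function stripCtrlChars(string $value): string
{
    // XML 1.0 permits only tab (\x09), line feed (\x0A) and carriage
    // return (\x0D) below \x20. Every other C0 control character is
    // replaced with a space so that adjacent words do not run together.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]/', ' ', $value);
}

// Assumed sample input: a NUL byte and a vertical tab mixed into the text.
$dirty = "Title\x00with\x0Bcontrol\tchars";
$clean = stripCtrlChars($dirty);
// $clean is now "Title with control\tchars" (the tab is kept).
```

Replacing with a space rather than deleting is a design choice: it keeps word boundaries intact for tokenizing, at the cost of the occasional extra blank.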
Comment #2
Posted on Aug 4, 2009 by Happy Bird
Patch (with minor changes) has been applied in r14.
Status: Fixed
Labels:
Type-Defect
Priority-Medium