
google-refine - issue #478

All importers should support option to import without changing data


Posted on Nov 5, 2011 by Swift Bird

It should be possible for users to turn off all unnecessary transformations of input data. These transformations are often irreversible, so the initial import is the time to deal with them.

One example: the XML importer has no way to turn off string->number conversion. Other converters strip leading and trailing double quotes that they find.
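To illustrate why these transformations cannot be undone after import, here is a minimal Python sketch. It is not Refine's actual importer code: `guess_number`, `strip_outer_quotes`, `import_cell`, and the `transform` flag are hypothetical names made up for this example.

```python
def guess_number(text):
    """Hypothetical type guesser: try int, then float, else keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def strip_outer_quotes(text):
    """Hypothetical cleanup: drop one pair of surrounding double quotes."""
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        return text[1:-1]
    return text


def import_cell(text, transform=False):
    """Return the cell untouched unless transformations are requested."""
    if not transform:
        return text
    return guess_number(strip_outer_quotes(text))


# With transformations on, distinct inputs collapse to the same value,
# so the original text cannot be recovered afterwards.
print(import_cell("007", transform=True))    # leading zeros are lost
print(import_cell("7", transform=True))      # same result as "007"
print(import_cell('"abc"', transform=True))  # quotes are lost
print(import_cell("007"))                    # untouched when transform is off
```

Because `"007"` and `"7"` both become the same number, no later step can tell them apart, which is why the option has to exist at import time.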

Comment #1

Posted on Dec 7, 2011 by Happy Giraffe

Comment deleted

Comment #2

Posted on Dec 20, 2011 by Swift Bird

Issue 511 has been merged into this issue.

Comment #3

Posted on Sep 7, 2012 by Swift Bird

(No comment was entered for this change.)

Comment #4

Posted on Sep 7, 2012 by Swift Bird

In r2451, TabularImportingParserBase now defaults guessCellValueTypes to false, so importers that don't specify the option no longer get it turned on automatically. The old default was adversely affecting the Excel, Open Office Calc, and Google Spreadsheets importers: because those formats have data types built in, the importers offered no control to turn the default off.
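The effect of that default can be sketched roughly as follows, in Python rather than Refine's Java. Only the option name `guessCellValueTypes` comes from this issue; the dictionary-based options and the helper `should_guess_types` are assumptions for illustration, not the real TabularImportingParserBase API.

```python
def should_guess_types(options):
    """Hypothetical lookup: guessing is off unless an importer opts in."""
    # The second argument is the default used when the importer
    # never set the option at all.
    return bool(options.get("guessCellValueTypes", False))


# An importer for a format with built-in types passes no such option,
# so its cells are left alone.
spreadsheet_options = {}
# An importer that exposes the checkbox can still turn guessing on.
text_options = {"guessCellValueTypes": True}

print(should_guess_types(spreadsheet_options))
print(should_guess_types(text_options))
```

With the default flipped to off, an importer that exposes no control simply inherits the non-destructive behaviour.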

Comment #5

Posted on Sep 7, 2012 by Swift Bird

(No comment was entered for this change.)

Comment #6

Posted on Sep 7, 2012 by Swift Bird

(No comment was entered for this change.)

Comment #7

Posted on Sep 18, 2012 by Swift Bird

I think most of the work for this is done, and I'd like to clean up any loose ends and get it included in Refine 2.6.

Status: Started

Labels: Type-Defect Priority-Medium Milestone-2.6