How to load data into INQLE
Introduction
INQLE works with semantic (RDF) data. This lets INQLE recognize the relationships within the data and select a random sample of learnable data. INQLE therefore requires data to be imported as RDF, and it can import data from a number of sources.
Import from Comma-Separated (CSV) or other Delimited Text File
Much data is stored in spreadsheets. Spreadsheets are great for collecting and working with a set of data, but a spreadsheet carries little semantic information, such as the identities of the things represented or the relationships between them. To facilitate importing data into INQLE, we provide a File Data Importer Wizard. This wizard walks you through capturing both the data explicitly stored in the spreadsheet and the implicit data about the spreadsheet that does not appear in it.
Before using the wizard, open your data in a spreadsheet program, select File -> Save as, and save the spreadsheet as a Comma-Separated Values (CSV) or delimited text file.
To start the wizard, right-click a dataset in your INQLE Administrator and select "Load data from a file...". You will now see the File Data Importer Wizard.
Click Next to get past the intro page. Click Browse and find your CSV file, and click Open. Click Upload.
Click Next. You should see the first few rows of your CSV data.
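If you want to check a file before uploading it, a preview like the wizard's can be approximated outside INQLE. The Python sketch below is only an illustration: the function name `preview_rows`, the sample columns, and the sample values are assumptions made for this example, and nothing here reflects how INQLE itself reads the file.

```python
import csv
import io
from itertools import islice


def preview_rows(text, delimiter=",", limit=5):
    """Return the header row and up to `limit` data rows of delimited text."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)  # the first row is assumed to hold column names
    return header, list(islice(reader, limit))


# Hypothetical sample data, invented for this illustration.
sample = "id,diagnosis,radius\n1001,malignant,17.99\n1002,benign,13.54\n"
header, rows = preview_rows(sample, limit=1)
print(header)
print(rows)
```

For a tab-delimited file, pass `delimiter="\t"` instead.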
Click Next.
Here you can specify date information about your data. If your entire CSV pertains to a single date or time, select the option indicating that all imported data has the same date and time, and enter this date (and optionally a time). Alternatively, if each row pertains to a different date, specify below which column contains the date information. If your data does not pertain to any dates, you can skip this page. Click Next.
The remaining steps of the wizard specify the things that your data is about, plus the attributes of those things. Subjects are the things that each row in the table describes. When importing subjects, you must specify what type of thing each subject is. Example: we are importing a spreadsheet of companies. The subject type is "company", and each row represents a different instance of that type (i.e., each row is a different company).
So how do we import these things and their attributes? First, decide which subject your data is about. In general, start with your table: what does each row represent? You need to add a subject of that type. For this tutorial, we will add a subject of type "Breast Tumor".
First, we can search to see whether others have already defined the type "Breast Tumor". Type "tumor" and click the Search button. This queries both your INQLE server and the Central INQLE Server, running at inqle.org. If the type of thing has not yet been defined, you can define it (and register it with the Central INQLE Server) by clicking the "Enter a new subject" button. In this case, we have found the correct type of thing (Breast Tumor), so we select it and move on to the next page.
On this page, we identify the subject (thing) of each row. Specifically, we need to decide how to create the unique identifier (URI) for each subject.
Next, we specify how the columns of the CSV table map to each breast tumor.
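To make the idea of a per-row unique identifier concrete, here is a minimal Python sketch that builds a URI from a subject type and a row's identifier value. The base namespace `http://example.org/data/`, the function name `subject_uri`, and the URI layout are all assumptions chosen for illustration. INQLE's actual URI scheme may differ.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/data/"


def subject_uri(subject_type, row_id, base=BASE):
    """Build a URI for one subject from its type and an identifier value."""
    # Replace spaces in the type name, then percent-encode anything unsafe.
    type_slug = quote(subject_type.strip().replace(" ", "_"), safe="")
    id_slug = quote(str(row_id).strip(), safe="")
    return f"{base}{type_slug}/{id_slug}"


print(subject_uri("Breast Tumor", 1001))
# prints http://example.org/data/Breast_Tumor/1001
```

The point of the sketch is that the identifier column must hold a distinct value for each row. Otherwise two rows would collapse into the same subject.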
Currently, neither our INQLE server nor the Central INQLE Server knows about any properties of this type, so we must enter them. Click the "Add new properties for BREAST TUMOR" button.
In this dialog, we will create new properties for the Breast Tumor type. We have simplified this process: click the checkbox under the "Create" column to enable the property in question. Edit the values for the name and description of each label. These values will be used when other INQLE users import data about breast tumors. If a property is an identifier, rather than a data property, then select the "Identifier?" checkbox. Click Save.
Now our Breast Tumor properties page has selectors for our newly entered properties. For each property, select which column (if any) contains the value for that property. The form automatically assumes that if a column's header matches the property name, then that column's values should populate the matching property.
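The automatic pre-selection described above can be pictured with a short Python sketch. The matching rule used here (ignoring case and extra whitespace) is an assumption, since the exact rule the wizard applies is not stated. The function name `guess_column_mapping` and the sample headers are likewise hypothetical.

```python
def _normalize(name):
    """Collapse whitespace and ignore case so 'Mean  Radius' == 'mean radius'."""
    return " ".join(name.split()).casefold()


def guess_column_mapping(headers, property_names):
    """Map each property name to the index of the matching column, or None."""
    by_header = {}
    for index, header in enumerate(headers):
        # Keep the first column if two headers normalize to the same text.
        by_header.setdefault(_normalize(header), index)
    return {prop: by_header.get(_normalize(prop)) for prop in property_names}


# Hypothetical headers and property names, invented for this illustration.
headers = ["ID", "Diagnosis", "Mean Radius"]
properties = ["diagnosis", "mean radius", "texture"]
print(guess_column_mapping(headers, properties))
# prints {'diagnosis': 1, 'mean radius': 2, 'texture': None}
```

A property mapped to `None` corresponds to a selector left empty in the form, which you would then set by hand or leave unmapped.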
On this page, we capture static values, which all subjects (rows in the table) share. For instance, if this dataset were about breast tumors in women who were positive for the BRCA gene, then we might add a property "BRCA Status" and specify here that the static value (for all rows) is "positive".
In this case, we can skip this page. Click Next.
On this page, you can specify a name and a description for the mapping you have created. In a future version, INQLE will be able to reuse these mappings to automatically import CSV tables with matching column names. Click Finish. INQLE has imported your data. You are finished with the File Data Importer Wizard. From here, you could create a custom sampler, run the experimenter agent, and view results of experiments.
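To picture what a finished mapping amounts to, the following Python sketch turns one CSV row into subject-property-value statements and attaches the static values shared by all rows (such as "BRCA Status") to the subject. It is a conceptual illustration only: the function name `row_to_triples`, the example URI, and the column names are assumptions, and this is not INQLE's import code.

```python
def row_to_triples(subject, row, column_for_property, static_values):
    """Return (subject, property, value) statements for one row.

    `row` maps column names to cell text, `column_for_property` maps each
    property name to the column that supplies it (or None if unmapped), and
    `static_values` holds properties that every row shares.
    """
    triples = []
    for prop, column in column_for_property.items():
        if column is None:
            continue  # property not mapped to any column
        value = row.get(column, "")
        if value != "":  # skip empty cells rather than store blank values
            triples.append((subject, prop, value))
    for prop, value in static_values.items():
        triples.append((subject, prop, value))
    return triples


# Hypothetical row, mapping, and static value, invented for this illustration.
row = {"id": "1001", "diagnosis": "malignant", "radius": ""}
mapping = {"Diagnosis": "diagnosis", "Radius": "radius"}
static = {"BRCA Status": "positive"}
subject = "http://example.org/data/Breast_Tumor/1001"
for triple in row_to_triples(subject, row, mapping, static):
    print(triple)
```

Here the empty "radius" cell produces no statement, while "BRCA Status" is added even though no column in the table contains it.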