Commit cc5347d1 authored by R.W.Majeed

documentation and plans for automatic sorting

parent cf4b37c1
Allow interval values
Parse interval values, e.g. using ISO 31-11 notation
such as "[0,10]" or "]0,1]".
Allow specification of the interval separator (default ',')
for columns supporting intervals.
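A minimal sketch of how such interval values could be parsed. The class name `Interval` and its methods are hypothetical and not taken from the project. It assumes numeric bounds and the reversed-bracket form of ISO 31-11, in which `]` before the lower bound or `[` after the upper bound marks that bound as excluded.

```java
import java.util.regex.Pattern;

/** Hypothetical value type for illustration; not part of the project's API. */
final class Interval {
    final double lower;
    final double upper;
    final boolean lowerClosed;
    final boolean upperClosed;

    Interval(double lower, boolean lowerClosed, double upper, boolean upperClosed) {
        this.lower = lower;
        this.lowerClosed = lowerClosed;
        this.upper = upper;
        this.upperClosed = upperClosed;
    }

    /** Returns true if the value lies inside the interval, honoring open/closed bounds. */
    boolean contains(double v) {
        boolean aboveLower = lowerClosed ? v >= lower : v > lower;
        boolean belowUpper = upperClosed ? v <= upper : v < upper;
        return aboveLower && belowUpper;
    }

    /**
     * Parses text such as "[0,10]" or "]0,1]".
     * The separator is treated literally (assumption: it is never a regular expression).
     */
    static Interval parse(String text, String separator) {
        String s = text.trim();
        if (s.length() < 3) {
            throw new IllegalArgumentException("Not an interval: " + text);
        }
        char open = s.charAt(0);
        char close = s.charAt(s.length() - 1);
        if ((open != '[' && open != ']') || (close != '[' && close != ']')) {
            throw new IllegalArgumentException("Missing interval brackets: " + text);
        }
        // '[' at the start includes the lower bound, ']' at the start excludes it.
        boolean lowerClosed = open == '[';
        // ']' at the end includes the upper bound, '[' at the end excludes it.
        boolean upperClosed = close == ']';
        String inner = s.substring(1, s.length() - 1);
        String[] parts = inner.split(Pattern.quote(separator), -1);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected exactly two bounds: " + text);
        }
        // NumberFormatException is a subclass of IllegalArgumentException.
        double lower = Double.parseDouble(parts[0].trim());
        double upper = Double.parseDouble(parts[1].trim());
        if (lower > upper) {
            throw new IllegalArgumentException("Lower bound exceeds upper bound: " + text);
        }
        return new Interval(lower, lowerClosed, upper, upperClosed);
    }

    public static void main(String[] args) {
        Interval halfOpen = Interval.parse("]0,1]", ",");
        System.out.println(halfOpen.contains(0.0));
        System.out.println(halfOpen.contains(1.0));
    }
}
```

Passing the separator as a parameter mirrors the plan of a per-column setting; a column using decimal commas could pass ";" instead.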
Validate exceptions
Test for specific errors in data files,
e.g. missing visit start timestamps in rows.
To do so, add data files with errors to test/resources.
Test for correct order and grouping of ids: patient id in the same order over all tables,
visit id in the same order over all tables, and references to an id all grouped together (e.g. not 1, 1, 1, 2, 3, 1).
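The grouping part of this check could look roughly like the sketch below. The class `IdGrouping` and its method are hypothetical names; the sketch assumes the id column of one table has already been read into a list, and it covers only grouping, not the cross-table order comparison.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the project's API. */
final class IdGrouping {

    /**
     * Returns the index of the first row whose id reappears after a different id
     * was seen in between, or -1 if all equal ids are adjacent.
     */
    static int firstUngroupedIndex(List<String> ids) {
        Set<String> finished = new HashSet<>(); // ids whose group has already ended
        String current = null;                  // id of the group being read
        for (int i = 0; i < ids.size(); i++) {
            String id = ids.get(i);
            if (id.equals(current)) {
                continue; // still inside the same group
            }
            if (current != null) {
                finished.add(current); // the previous group is now closed
            }
            if (finished.contains(id)) {
                return i; // this id was grouped earlier and shows up again
            }
            current = id;
        }
        return -1;
    }

    public static void main(String[] args) {
        // The counter-example from the text: id 1 reappears at the end.
        List<String> ids = Arrays.asList("1", "1", "1", "2", "3", "1");
        System.out.println(firstUngroupedIndex(ids));
    }
}
```

A test against an intentionally broken file in test/resources could then assert that the returned index is not -1.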
Automatic sorting of table data via temporary files. The proprietary sorted intermediate stream uses
Protocol Buffers encoding (by Google).
    string originalLocation;
    uint64 lastModified;
    repeated string header;
    repeated message row {
        string rowid;
        repeated string field;
    }
Use parseDelimitedFrom (a static method on the generated message class) and MessageLite.writeDelimitedTo to read/write rows iteratively.
An additional column is stored with location information (e.g. row number for text tables or table primary key for SQL tables).
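The idea of reading and writing rows one at a time can be illustrated without the protobuf dependency. The sketch below is a stand-in that uses only java.io: each row is written with a fixed 4-byte length prefix, whereas protobuf's delimited format uses a varint prefix. The class `DelimitedRows` and the row layout (rowid followed by fields) are assumptions made for illustration, not the project's format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for protobuf's delimited framing; stdlib only. */
final class DelimitedRows {

    /** Writes one row as: 4-byte record length, then rowid, field count, fields. */
    static void writeRow(DataOutputStream out, String rowId, List<String> fields)
            throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream body = new DataOutputStream(buf);
        body.writeUTF(rowId);
        body.writeInt(fields.size());
        for (String field : fields) {
            body.writeUTF(field);
        }
        body.flush();
        out.writeInt(buf.size()); // length prefix lets the reader skip or buffer a record
        buf.writeTo(out);
    }

    /**
     * Reads the next row, returning the rowid followed by its fields,
     * or null at end of stream (a truncated length prefix also ends reading here).
     */
    static List<String> readRow(DataInputStream in) throws IOException {
        int length;
        try {
            length = in.readInt();
        } catch (EOFException e) {
            return null;
        }
        byte[] record = new byte[length];
        in.readFully(record);
        DataInputStream body = new DataInputStream(new ByteArrayInputStream(record));
        List<String> row = new ArrayList<>();
        row.add(body.readUTF()); // rowid carries the location information
        int count = body.readInt();
        for (int i = 0; i < count; i++) {
            row.add(body.readUTF());
        }
        return row;
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(sink);
        writeRow(out, "row-1", Arrays.asList("p1", "v1"));
        writeRow(out, "row-2", Arrays.asList("p1", "v2"));
        out.flush();

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(sink.toByteArray()));
        for (List<String> row = readRow(in); row != null; row = readRow(in)) {
            System.out.println(row);
        }
    }
}
```

In the planned implementation the in-memory streams would be replaced by streams over a temporary file, and the hand-written framing by the generated protobuf message class.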
TODO abstract table source via Path or URL,
String getLocation,
FileTableSource getRelativeSource(String spec)
Import configuration
Allow multiple visit tables with different IDs. Fact tables
@@ -47,6 +47,12 @@
<!-- add later for sorting data tables
</dependency> -->
