Time Series Data
InfrastructureSystems.jl implements containers and routines to efficiently manage time series data. This document contains content for developers of new time series data. For usage, refer to the documentation in PowerSystems.jl.
InfrastructureSystems.jl provides a mechanism to store time series data for components. Here are reasons to consider using it:
- Time series data, by default, is stored independently of components in HDF5 files. Components store references to that data.
- System memory is not depleted by loading all time series data at once. Only data that you need is loaded.
- Multiple components can share the same time series data by sharing references instead of making expensive copies.
- Supports serialization and deserialization.
- Supports parsing raw data files of several formats as well as data stored in TimeSeries.TimeArray and DataFrames.DataFrame objects.
Your package must reimplement a deepcopy method if you use HDF5 storage for TimeSeriesData.
If you store an instance of InfrastructureSystems.SystemData within your system and a user then calls deepcopy on that system, the .h5 file will not be copied. The new and old instances will hold references to the same file. You will need to reimplement deepcopy to handle this. One solution is to serialize and then deserialize the system.
Notes:
- Time series data can optionally be stored fully in memory. Refer to the InfrastructureSystems.SystemData documentation.
- InfrastructureSystems.jl creates HDF5 files on the tmp filesystem by default, using the location obtained from tempdir(). This can be changed if the time series data is larger than the amount of tmp space available. Refer to the InfrastructureSystems.SystemData documentation.
- By default, the call to add_time_series! will open the .h5 file, write the data to the file, and close the file. Opening and closing the file has overhead. If you will add thousands of time series arrays, consider using open_time_series_store! to add all the arrays with one file handle.
Instructions
- Ensure that supports_time_series(::MyComponent) returns true for the struct. It may be implemented on a supertype of the struct.
Data Format
Time series arrays are stored in an HDF5 file according to the format described here.
The root path /time_series defines these HDF5 attributes to control deserialization:
- data_format_version: Designates the InfrastructureSystems format for the file.
- compression_enabled: Specifies whether compression is enabled and will be used for new time series.
- compression_type: Specifies the type of compression being used.
- compression_level: Specifies the level of compression being used.
- compression_shuffle: Specifies whether the shuffle filter is being used.
Each time series array is stored in an HDF5 group named with the array's UUID. Each group contains a dataset called data which contains the actual data. Each group also contains a group called component_references which contains an HDF5 attribute for each component reference. The component reference uses the format <component_uuid>__<time_series_name>.
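The component reference format can be illustrated with a short Python sketch. It is not part of InfrastructureSystems.jl: the helper names make_component_reference and parse_component_reference are hypothetical, and the sketch assumes the component UUID is written in its canonical 36-character form, so the first double underscore separates it from the time series name.

```python
import uuid

SEPARATOR = "__"  # delimiter between the component UUID and the time series name


def make_component_reference(component_uuid: uuid.UUID, name: str) -> str:
    """Build a reference string of the form <component_uuid>__<time_series_name>."""
    return f"{component_uuid}{SEPARATOR}{name}"


def parse_component_reference(reference: str) -> tuple[uuid.UUID, str]:
    """Split a reference string back into (component UUID, time series name)."""
    # Canonical UUID text contains no underscores, so the first "__" is the
    # delimiter even if the time series name itself contains underscores.
    raw_uuid, sep, name = reference.partition(SEPARATOR)
    if not sep or not name:
        raise ValueError(f"not a component reference: {reference!r}")
    return uuid.UUID(raw_uuid), name


# Reference string taken from the example layout in this document.
ref = "0b6ecb61-8e8d-4563-b795-f001246c3ea5__max_active_power"
component_uuid, ts_name = parse_component_reference(ref)
print(component_uuid, ts_name)
```

Splitting on the first separator rather than the last keeps names such as max__power intact, which is why the sketch uses partition instead of rsplit.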
Each time series group defines attributes that control how the data will be deserialized into a TimeSeriesData instance.
- initial_timestamp: Defines the first timestamp of the array. (The remaining timestamps are not stored.)
- resolution: Resolution of the time series in milliseconds.
- type: Type of the time series. Subtype of TimeSeriesData.
- module: Module that defines the type of the time series.
- data_type: Describes the type of the array stored.
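Because only initial_timestamp and resolution are stored, a reader of the file has to rebuild the time index. The following Python sketch shows that arithmetic, assuming evenly spaced timestamps. The function name rebuild_timestamps and the sample values are hypothetical, and the sketch does not cover how initial_timestamp is encoded on disk.

```python
from datetime import datetime, timedelta


def rebuild_timestamps(initial_timestamp: datetime, resolution_ms: int, length: int) -> list[datetime]:
    """Return `length` evenly spaced timestamps starting at `initial_timestamp`."""
    step = timedelta(milliseconds=resolution_ms)
    return [initial_timestamp + i * step for i in range(length)]


# Hypothetical values: an hourly series (3_600_000 ms) with four rows.
timestamps = rebuild_timestamps(datetime(2020, 1, 1), 3_600_000, 4)
for ts in timestamps:
    print(ts.isoformat())
```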
Example:
/time_series
    data_format_version = "1.0.1"
    compression_enabled = 1
    /9f02f706-3394-4af3-8084-8903d302cbba
        /component_references
            0b6ecb61-8e8d-4563-b795-f001246c3ea5__max_active_power
            613ddbc2-b666-4c9d-adb5-fa69e7f40a95__max_active_power
        /data
Debugging
The HDF Group provides tools to inspect and manipulate files. Refer to their website.
HDFView is especially useful for viewing data. Note that using h5ls and h5dump in a terminal combined with UNIX tools like grep can sometimes be faster.
Maintenance
If you delete time series arrays in your system, you may notice that the actual size of the HDF5 file does not decrease. The only way to recover this space is to build a new file with only the active objects. The HDF5 tools package provides the tool h5repack for this purpose.
$ h5repack time_series.h5 new.h5
$ mv new.h5 time_series.h5
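If the repack step needs to run from a script, the two commands can be wrapped as in the Python sketch below. It assumes h5repack is on the PATH and that no process has the file open. The function name repack_in_place, the ".repacked" suffix, and the runner parameter are hypothetical choices for illustration.

```python
import os
import subprocess
from pathlib import Path


def repack_in_place(path, runner=subprocess.run):
    """Rewrite an HDF5 file with h5repack and replace the original with the result."""
    src = Path(path)
    tmp = src.with_name(src.name + ".repacked")  # temporary output next to the source
    # check=True raises CalledProcessError if h5repack fails, so the original
    # file is only replaced after a successful repack.
    runner(["h5repack", str(src), str(tmp)], check=True)
    os.replace(tmp, src)  # same effect as `mv new.h5 time_series.h5`
    return src
```

Calling repack_in_place("time_series.h5") performs the same two steps as the shell commands. The runner parameter exists only so the function can be exercised without h5repack installed.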