Building a System from CSV Files

If you have input data on the component specifications and time series data formatted in CSV files, rather than a Matpower or PSS/e file that can be parsed automatically, the basic formula to build a System manually is:

  1. Format all .csv data into row-column format, where there is a row for each component and a column for each input parameter. Load each .csv into a DataFrame.
  2. Define a System.
  3. Starting with the buses, write a for loop for each component Type to loop over the DataFrame, using a constructor from PowerSystems.jl's Model Library to define a component for each row. Hard-code any required parameters that are missing in your dataset. Use add_component! to add each component to the System.
  4. Similarly, add cost and time series data either within each for loop, or after the components have been defined using begin_time_series_update.
  5. Save your System to a JSON once you are finished

The following example demonstrates this process for selected component types (e.g., ACBus, ThermalStandard, RenewableDispatch), but the same principles apply to build any of the components in PowerSystems.jl's Model Library.

Feel free to reuse the code below, ensuring that you customize the exact file names, data columns, column names, and hard-coded parameters in the for loops based on the data you have available.

Prerequisites

In this example, it is assumed that the CSV files are stored in a directory called MyData. In each section below, ensure your data follows the row-column format before beginning, and that your data are in the given units.

These are the depedencies needed for this how-to:

using PowerSystems
using CSV
using DataFrames
using Dates
using TimeSeries

Build the base System

system_base_power = 100.0
sys = System(system_base_power)

Begin by building the base System using the base power in MVA.

Add Buses and Network Topology

In this example, we assume that the component data for the buses are contained in a CSV file, Buses.csv. Each row is an individual bus, and there is a column for each input parameter:

Bus NumberBusTypeMagnitude (p.u.)Voltage-Max (p.u.)Voltage-Min (p.u.)Base Voltage (kV)Region
1ref11.060.94138R1
2ref11.060.94345R2
.....................

Read in the contents of the CSV file Buses.csv to a data frame and customize the parameter names based on the column names in your CSV file.

bus_params = CSV.read("MyData/Buses.csv", DataFrame)

min_volt = "Voltage-Min (p.u.)"
max_volt = "Voltage-Max (p.u.)"
base_volt = "Base Voltage (kV)"
bus_number = "Bus Number"
region = "Region"

In this example, we assume that the buses are sorted into Areas, where Area is an optional parameter in the ACBus constructor. Because we will be sorting our buses into these areas as we construct the buses, we must first attach the areas to our System.

regions = unique(bus_params[:, region])

for reg in regions
    area = Area(reg)
    add_component!(sys, area)
end

You could similarly add LoadZones at this point, if they are relevant for your system.

Now, we are ready to build the buses in the for loop, using the ACBus constructor, using the input data available in our Buses.csv, and hard coding any missing data.

for row in eachrow(bus_params)
    bus = ACBus(;
        number = row[bus_number],
        name = "bus$(row[bus_number])",
        bustype = ACBusTypes.PQ,
        angle = 0.0,
        magnitude = 1.0,
        voltage_limits = (min = row[min_volt], max = row[max_volt]),
        base_voltage = row[base_volt],
        area = get_component(Area, sys, row[region]),
    )
    add_component!(sys, bus)
end

Add Branches

The next step is to build the Branch components in the system. In this example, we only have two branch types: a Line, if the connecting buses have the same base voltage; and a Transformer2W if they have different base voltages. You may need to implement additional logic if you have other branch types as well.

We assume the data for each Branch is contained in a Branches.csv. The conventions used in Bus from and Bus to must be consistent with the conventions used in the Bus Number column of Buses.csv.

Branch NumberBus fromBus toReactance (p.u.)Resistance (p.u.)Max Flow (MW)Min Flow (MW)
1120.09990.0303600-600
2450.007980.001761700-1700
.....................

Read in the contents of the CSV file Branches.csv to a data frame and customize the parameter names based on the column names in your CSV file.

branch_params = CSV.read("MyData/Branches.csv", DataFrame)

branch_num = "Branch Number"
bus_from_col = "Bus from"
bus_to_col = "Bus to"
reactance = "Reactance (p.u.)"
resistance = "Resistance (p.u.)"
max_flow = "Max Flow (MW)"

Build the lines and transformers using the Line and Transformer2W constructors.

for row in eachrow(branch_params)
    bus_from = get_bus(sys, row[bus_from_col])
    bus_to = get_bus(sys, row[bus_to_col])
    if get_base_voltage(bus_to) == get_base_voltage(bus_from)
        branch = Line(;
            name = "line$(row[branch_num])",
            available = true,
            active_power_flow = 0.0,
            reactive_power_flow = 0.0,
            arc = Arc(; from = bus_from, to = bus_to),
            r = row[resistance],
            x = row[reactance],
            b = (from = 0.0, to = 0.0),
            rating = row[max_flow] / system_base_power,
            angle_limits = (min = 0.0, max = 0.0),
        )
    else
        branch = Transformer2W(;
            name = "tline$(row[branch_num])",
            available = true,
            active_power_flow = 0.0,
            reactive_power_flow = 0.0,
            arc = Arc(; from = bus_from, to = bus_to),
            r = row[resistance],
            x = row[reactance],
            primary_shunt = 0.0,
            rating = row[max_flow] / system_base_power,
        )
    end
    add_component!(sys, branch)
end
Warning

When defining a branch that isn't attached to a System yet, you must define the thermal rating of the transmission line per-unitized in "SYSTEM_BASE" using the base power of the System you plan to connect it to – defined above as system_base_power.

Adding Thermal Generators and their Costs

Build Thermal Generators

We assume the data needed to build each ThermalStandard unit is found in a CSV file Thermal_Gens.csv. The following table is a snapshot of the first 7 columns of an example Thermal_Gens.csv:

Gen NameBusRating (MVA)Min Stable Level (MW)Max Capacity (MW)PrimeMoveTypeFuel Type...
Biomass 011230.93.0OTAG_BIPRODUCT...
Biomass 021031.20.361.2OTAG_BIPRODUCT...
........................

The naming convention used for the contents of the Bus column must be consistent with the convention used in the Bus Number column of Buses.csv.

Read in the contents of the CSV file Thermal_Gens.csv to a data frame and customize the parameter names based on the column names in your CSV file. Note that the PrimeMovers in the thermal_gens data frame are consistent with EIA Form 923, and the Fuel Type column data are consistent with the ThermalFuels naming convention.

thermal_gens = CSV.read("MyData/Thermal_Gens.csv", DataFrame)

name = "Gen Name"
bus_connection = "Bus"
rate = "Rating (MVA)"
min_active_power = "Min Stable Level (MW)"
max_active_power = "Max Capacity (MW)"
ramp_up = "Max Ramp Up (MW/min)"
ramp_down = "Max Ramp Down (MW/min)"
min_up = "Min Up Time (h)"
min_down = "Min Down Time (h)"
prime_move = "PrimeMoveType"
fuel = "Fuel Type"

Build the thermal generator components using the ThermalStandard constructor and data stored in the thermal_gens data frame.

Warning

When you define components that aren't attached to a System yet, the constructors assume define all fields related to power are per-unitized in "DEVICE_BASE". Divide all fields with units such as MW, MVA, MVAR, or MW/min using the base_power of the component (with the exception of base_power itself, which is in MVA).

Since the example data is missing base power, we set base power equal to the rating data, and the rating parameter to 1.0:

for row in eachrow(thermal_gens)
    base = row[rate]
    thermal = ThermalStandard(;
        name = row[name],
        available = true,
        status = true,
        bus = get_bus(sys, row[bus_connection]),
        active_power = 0.0,
        reactive_power = 0.0,
        rating = 1.0,
        active_power_limits = (
            min = row[min_active_power] / base,
            max = row[max_active_power] / base,
        ),
        reactive_power_limits = (min = 0.0, max = 0.0),
        ramp_limits = (up = row[ramp_up] / base, down = row[ramp_down] / base),
        operation_cost = ThermalGenerationCost(nothing),
        base_power = base,
        time_limits = (up = row[min_up], down = row[min_down]),
        prime_mover_type = row[prime_move],
        fuel = row[fuel],
    )
    add_component!(sys, thermal)
end

Add ThermalGenerationCost

In this example the ThermalGenerationCost constructor is defined by a FuelCurve. We are also assuming that the data needed to build each FuelCurve can be found in two separate CSV files. The first, Thermal_Gens.csv, has already been defined in the previous step. The second, Thermal_Fuel_Rates.csv, contains information regarding each FuelType and its cost:

Fuel TypeFuel PriceCO2 rateNOX rateSO2 rate
COAL1.8203.50.3820.33
NATURAL_GAS5.41180.0790.001
OIL21123.10.1760.006
AG_BIPRODUCT2.41300.1770.006
GEOTHERMAL000.1770.006

Read in Thermal_Fuel_Rates.csv, customize parameter names based on the column names in your CSV file, and create a dictionary pairing fuel types and fuel prices. Ensure that the strings populating the Fuel Type column match the strings populating the Fuel Type column of the thermal_gens data frame.

fuel_params = CSV.read("MyData/Thermal_Fuels_Rates.csv", DataFrame)

type = "Fuel Type"
price = "Fuel Price"

fuel_cost_dict = Dict{ThermalFuels, Float64}()
for row in eachrow(fuel_params)
    fuel_cost_dict[row[type]] = row[price]
end

Next, we will create column name variables for heat rate bases, heat rates, load points, fixed, start up, and shut down costs from the thermal_gens data frame. Use this data to build the PiecewiseIncrementalCurve and the FuelCurve functions, and add them to their associated thermal generator.

gen_name = "Generator Name"
heat_rate_base = "Heat Rate Base (MMBTU/hr)"
heat_rate = "Heat Rate (MMBTU/hr)"
load_point = "Load Point Band (MW)"
fixed_cost = "Fixed Cost (dollar)"
start_up_cost = "Start Up Cost (dollar)"
shut_down_cost = "Shut Down Cost (dollar)"

for row in eachrow(thermal_gens)
    thermal = get_component(ThermalStandard, sys, row[gen_name])
    heat_rate_curve =
        PieceWiseIncrementalCurve(row[heat_rate_base], row[load_point], row[heat_rate])
    fuel_curve =
        FuelCurve(;
            value_curve = heat_rate_curve,
            fuel_cost = fuel_cost_dict[row[fuel]],
        )
    cost_thermal = ThermalGenerationCost(;
        variable = fuel_curve,
        fixed = row[fixed_cost],
        start_up = row[start_up_cost],
        shut_down = row[shut_down_cost],
    )
    set_operation_cost!(thermal, cost_thermal)
end

For more information regarding thermal cost functions please visit ThermalGenerationCost.

Adding Renewable Generators and Their Time Series

The following section demonstrates how to add solar generators and their time series to a System. However, if you desire to add other RenewableDispatch generator types, such as wind, the process is exactly the same, with changing the prime mover type from PrimeMovers.PVe to PrimeMovers.WT.

We assume the data needed to build each solar powered RenewableDispatch unit is found in the CSV file Solar_Gens.csv, snapshotted below. The convention used for the contents of the Bus column must be consistent with the convention used in the Bus Number column of Buses.csv:

Gen NameNumberBusRating (MVA)
Solar 01132746.76
Solar 02292369.26
............

Read in the contents of the CSV file Solar_Gens.csv to a data frame, and customize the parameter names based on the column names in your CSV file and the desired prime mover type.

solar_gens = CSV.read("MyData/Solar_Gens.csv", DataFrame)

name = "Gen Name"
num = "Number"
bus_connection = "Bus"
rate = "Rating (MVA)"
prime_mover = PrimeMovers.PVe

Before building the solar generators, we must also set up the solar generators' time series. In this example, we assume that each solar generator has a unique time series, and all of these time series are contained in one file: MyData/Solar_Time_Series.csv. Here is a snapshot of this time series CSV file, where the first column contains time stamps, and the remaining columns are titled with its respective generator's name, and contain the time series values in MW for each solar generator:

Time StampSolar 01Solar 02Solar 03...
...............
1/1/23 09:0051.36118.1154.92...
1/1/23 10:00138.61200.326.85...
...............

Each time series for its respective solar generator has an hourly resolution, and is for the year 2023, plus one full day into 2024, for a total of 366 days, and 8784 values per column.

Create a data frame from Solar_Time_Series.csv and define variables for the aforementioned resolution and time stamps:

solar_time_series = CSV.read("MyData/Solar_Time_Series.csv", DataFrame)

resolution = Dates.Hour(1);
timestamps = range(DateTime("2023-01-01T00:00:00"); step = resolution, length = 8784);

In this example, we assume that the RenewableGenerationCost is at zero marginal cost. If the marginal cost is not zero, follow similar steps to building the ThermalGenerationCost constructor from above, but for the RenewableGenerationCost constructor instead.

Moreover, since the example data is missing base power, we use the range of values in the time series data to determine the value of the base power which will be used to normalize the time series, and set the generator's rating parameter to 1.0. (If your data reports a base power, use that for the generator's base power, and rating data for the generator's rating).

In the same for loop, we will build the solar generator components using the RenewableDispatch constructor and data stored in the solar_gens data frame, and build and attach each solar generator's time series:

for row in eachrow(solar_gens)
    norm = maximum(solar_time_series[:, row[name]])
    solar_data_MW = solar_time_series[:, row[name]]
    base = row[rate]
    if any((solar_data_MW ./ base) .> 1.0)
        @warn "Generator $(row[name]) has a production larger than its base power. Normalizing by its maximum"
        base = norm
    end
    solar_array = TimeArray(timestamps, (solar_data_MW ./ base)) # normalize data
    solar_TS = SingleTimeSeries(;
        name = "max_active_power",
        data = solar_array,
        scaling_factor_multiplier = get_max_active_power,
    )
    solar = RenewableDispatch(;
        name = row[name],
        available = true,
        bus = get_bus(sys, row[bus_connection]),
        active_power = 0.0,
        reactive_power = 0.0,
        rating = 1.0,
        prime_mover_type = prime_mover,
        reactive_power_limits = (min = 0.0, max = 0.0),
        power_factor = 1.0,
        operation_cost = RenewableGenerationCost(zero(CostCurve)),
        base_power = base,
    )
    add_component!(sys, solar)
    add_time_series!(sys, solar, solar_TS)
end

Adding Hydro Generators and Their Time Series

We assume the data needed to build each HydroDispatch unit is found in a CSV file Hydro_Gens.csv, snapshotted below. The convention used for the contents of the Bus column must be consistent with the convention used in the Bus Number column of Buses.csv.

Gen NameNumberBusMin Stable Level (MW)Max Capacity (MW)Max Ramp Up (MW/min)...
Hydro 011560.075.00.83...
Hydro 022660.077.00.86...
.....................

Read in the contents of the CSV file Hydro_Gens.csv to a data frame and customize the parameter names based on the column names in your CSV file.

hydro_gens = CSV.read("MyData/Hydro_Gens.csv", DataFrame)

name = "Gen Name"
bus_connection = "Bus"
num = "Number"
rate = "Rating (MVA)"
min_active_power = "Min Stable Level (MW)"
max_active_power = "Max Capacity (MW)"
ramp_up = "Max Ramp Up (MW/min)"
ramp_down = "Max Ramp Down (MW/min)"
min_up = "Min Up Time (h)"
min_down = "Min Down Time (h)"

Before building the hydro generators, we must also set up the hydro generators' time series. In this example, we assume that each hydro generator has a unique time series, and all of these time series are contained in one file: MyData/Hydro_Time_Series.csv. Here is a snapshot of this time series CSV file, where the first column contains time stamps, and the remaining columns are titled with its respective generator's name, and contain the time series values in MW for each hydro generator:

Time StampHydro 01Hydro 02Hydro 03...
1/1/23 0:000.3253860.3254090.314454...
1/1/23 1:000.3253860.3254090.314454...
...............

Each time series for its respective hydro generator has an hourly resolution, and is for the year 2023, plus one full day into 2024, for a total of 366 days, and 8784 values per column.

Create a data frame from Hydro_Time_Series.csv and define variables for the aforementioned resolution and time stamps:

hydro_time_series = CSV.read("MyData/Hydro_Time_Series.csv", DataFrame)

resolution = Dates.Hour(1);
timestamps = range(DateTime("2023-01-01T00:00:00"); step = resolution, length = 8784);

In this example, we assume that the HydroGenerationCost has both zero fixed and variable costs. Moreover, since the example data is missing base power, we use the range of values in the time series data to determine the value of the base power which will be used to normalize the time series and other parameters, and set the generator's rating parameter to 1.0. (If your data reports a base power, use that for the generator's base power, and rating data for the generator's rating).

In the same for loop, we will build the hydro generator components using the HydroDispatch constructor and data stored in the hydro_gens data frame, and build and attach each hydro generator's time series:

for row in eachrow(hydro_gens)
    norm = maximum(hydro_time_series[:, row[name]])
    hydro_data_MW = hydro_time_series[:, row[name]]
    base = row[rate]
    if any((hydro_data_MW ./ base) .> 1.0)
        @warn "Generator $(row[name]) has a production larger than its base power. Normalizing by its maximum"
        base = norm
    end
    hydro_array = TimeArray(timestamps, (hydro_data_MW ./ base)) # normalize data
    hydro_TS = SingleTimeSeries(;
        name = "max_active_power",
        data = hydro_array,
        scaling_factor_multiplier = get_max_active_power,
    )
    hydro = HydroDispatch(;
        name = row[name],
        available = true,
        bus = get_bus(sys, row[bus_connection]),
        active_power = 0.0,
        reactive_power = 0.0,
        rating = 1.0,
        prime_mover_type = PrimeMovers.HA,
        active_power_limits = (
            min = row[min_active_power] / base,
            max = row[max_active_power] / base,
        ),
        reactive_power_limits = (min = 0.0, max = 0.0),
        ramp_limits = (up = row[ramp_up] / base, down = row[ramp_down] / base),
        time_limits = (up = row[min_up], down = row[min_down]),
        base_power = base,
        operation_cost = HydroGenerationCost(zero(LinearCurve), 0.0),
    )
    add_component!(sys, hydro)
    add_time_series!(sys, hydro, hydro_TS)
end

Adding Loads and Their Time Series

In this example, we assume that all loads will be represented as PowerLoad components, and that the loads are located in three regions. Each region has one unique time series, and every load in each region is assigned its region's normalized time series profile.

The data needed to build each PowerLoad unit is found in the CSV file Loads.csv, snapshotted below. The convention used for the contents of the Bus column must be consistent with the convention used in the Bus Number column of Buses.csv, and similarly for the Region column.

Load NumberBusRegionParticipation Factor
11R10.047168669
22R20.0184963
............

Read in the contents of the CSV file Loads.csv to a data frame and customize the parameter names based on the column names in your CSV file.

load_params = CSV.read("MyData/Loads.csv", DataFrame)

region = "Region"
bus_connection = "Bus"
factor = "Load Participation Factor"
number = "Load Number"

Before building the loads, we must also set up the loads' time series. In this example, we assume that each load region has a unique time series, and all of these time series are contained in one file: MyData/Load_Time_Series.csv. Here is a snapshot of this time series CSV file, where the first column contains time stamps, and the remaining columns are titled with its respective region's name, and contain the time series values in MW for each region:

Time StampR1R2R3
1/1/23 0:005465.72961904.44482486.38409
1/1/23 1:004994.538211726.221342273.63274
............

Each time series for its respective load region has an hourly resolution, and is for the year 2023, plus one full day into 2024, for a total of 366 days, and 8784 values per column. Ensure that your time series file is similarly formatted.

Create a data frame from Load_Time_Series.csv and define variables for the aforementioned resolution and time stamps:

load_time_series = CSV.read("MyData/Load_Time_Series.csv", DataFrame)

resolution = Dates.Hour(1);
timestamps = range(DateTime("2023-01-01T00:00:00"); step = resolution, length = 8784);

Construct and attach the loads to the system using the PowerLoad constructor according to their regions. Because all loads in each region share a time series profile, the max_active_power of each load will depend on the maximum time series value of its region and its load participation factor.

for row in eachrow(load_params)
    num = row[number]
    max = maximum(load_time_series[:, row[region]])
    load = PowerLoad(;
        name = "load$num",
        available = true,
        bus = get_bus(sys, row[bus_connection]),
        active_power = 0.0,
        reactive_power = 0.0,
        base_power = system_base_power,
        max_active_power = (max) * (row[factor]) / system_base_power,
        max_reactive_power = 0.0,
    )
    add_component!(sys, load)
end

In a for loop, iterate over the regions in your System, and use the begin_time_series_update function to create and attach the respective loads' time series to every load in a region at once. Note: due to how max_active_power is defined, the time series values are normalized to its maximum.

regions = unique(load_params[:, region])

for reg in regions
    norm = maximum(load_time_series[:, reg])
    load_array = TimeArray(
        timestamps,
        (load_time_series[:, reg] ./ norm),
    )
    load_TS = SingleTimeSeries(;
        name = "max_active_power",
        data = load_array,
        scaling_factor_multiplier = get_max_active_power,
    )
    region = get_component(Area, sys, reg)
    begin_time_series_update(sys) do
        for component in get_components_in_aggregation_topology(PowerLoad, sys, region)
            add_time_series!(sys, component, load_TS)
        end
    end
end

Customizing and Expanding

Additional resources to help you built your own custom System: