Storing/Retrieving time resolved data

Post by eduran »

I would like to store time-stamped key-value pairs and be able to retrieve average values for arbitrary time intervals.

Example: Read the amount of copper ore in a certain chest once a second and store it. An hour later I want to know what the average amount of copper ore was during every minute of the last hour.

Why? To fill a graph like this with data:
[Attachment: graph.JPG]

Functionality I need:
  • set_count(key, value) - should record value as the count for key at the current tick
  • get_count_array(key, first_tick, last_tick, N) - should split the interval between first_tick and last_tick into N equal parts and return an array of size N holding the average value of key in each part
Ideally, this should work for a large number of keys (think all items in a modded game) and long time frames (tens of hours) with a time resolution of a few minutes.
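
To pin down what the two functions are expected to return, here is a naive reference sketch in Lua. It keeps every sample, so it does not meet the performance or save-size goals; it only illustrates the intended semantics. The `samples` table and the explicit `tick` argument (standing in for the current game tick) are assumptions made so the sketch runs outside the game, and returning 0 for intervals without samples is an arbitrary choice.

```lua
-- Naive reference only: stores every sample (hypothetical layout).
local samples = {} -- samples[key] = { { tick = ..., value = ... }, ... }

-- `tick` is passed in explicitly; in a mod it would be the current game tick.
local function set_count(key, value, tick)
  local list = samples[key]
  if not list then
    list = {}
    samples[key] = list
  end
  list[#list + 1] = { tick = tick, value = value }
end

local function get_count_array(key, first_tick, last_tick, n)
  -- Width of one output interval, in ticks (may be fractional).
  local width = (last_tick - first_tick + 1) / n
  local sums, counts = {}, {}
  for i = 1, n do
    sums[i], counts[i] = 0, 0
  end
  for _, s in ipairs(samples[key] or {}) do
    if s.tick >= first_tick and s.tick <= last_tick then
      -- Map the tick to an interval index in 1..n.
      local i = math.min(n, math.floor((s.tick - first_tick) / width) + 1)
      sums[i] = sums[i] + s.value
      counts[i] = counts[i] + 1
    end
  end
  local result = {}
  for i = 1, n do
    -- Assumption: intervals without samples report 0.
    result[i] = counts[i] > 0 and sums[i] / counts[i] or 0
  end
  return result
end
```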

Any suggestions on how to tackle this without killing performance or blowing up save file size?

Post by pleegwat »

I'd pre-aggregate by truncating the timestamps to the desired resolution. On the first tick of a minute, store the current count as that minute's running sum and set that minute's sample count to 1. On every later tick in the same minute, add the current count to the running sum and increment the sample count. On display, divide the sum by the sample count to get the average.
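
A minimal Lua sketch of this pre-aggregation might look as follows. The names `buckets`, `record` and `average` are made up for illustration, a one-minute resolution is assumed, and `tick` is passed in explicitly so the code runs outside the game.

```lua
-- Hypothetical names throughout; one-minute resolution assumed.
local TICKS_PER_MINUTE = 60 * 60 -- Factorio runs at 60 ticks per second

local buckets = {} -- buckets[minute][key] = { sum = ..., samples = ... }

local function record(key, value, tick)
  -- Truncate the timestamp to the desired resolution.
  local minute = math.floor(tick / TICKS_PER_MINUTE)
  local per_minute = buckets[minute]
  if not per_minute then
    per_minute = {}
    buckets[minute] = per_minute
  end
  local entry = per_minute[key]
  if entry then
    -- Later samples in the same minute: accumulate.
    entry.sum = entry.sum + value
    entry.samples = entry.samples + 1
  else
    -- First sample in this minute for this key.
    per_minute[key] = { sum = value, samples = 1 }
  end
end

-- Returns the average for `key` in the given minute, or nil if nothing was recorded.
local function average(key, minute)
  local entry = buckets[minute] and buckets[minute][key]
  if not entry then
    return nil
  end
  return entry.sum / entry.samples
end
```

Only two numbers are kept per key and minute, regardless of how often the value is sampled.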

If you want to display multiple resolutions, it may be best (both for space used and speed) to just keep a separate dataset for each resolution.

You may want to limit how many intervals you keep for each resolution.
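
As a rough sketch of both suggestions, the Lua below keeps one dataset per resolution and caps each one at a fixed number of intervals. All names (`new_series`, `add_sample`, `resolutions`, `record`) and the two chosen resolutions are assumptions for illustration, and the pruning relies on at least one sample arriving in every interval.

```lua
-- Hypothetical sketch: one dataset per resolution, each capped at `max_bins` intervals.
local function new_series(bin_length, max_bins)
  return { bin_length = bin_length, max_bins = max_bins, sum = {}, samples = {} }
end

-- `tick` is passed in explicitly; in a mod it would be the current game tick.
local function add_sample(series, value, tick)
  local bin = math.floor(tick / series.bin_length)
  if series.sum[bin] == nil then
    -- A new interval starts: drop the one that just left the window.
    -- Assumes at least one sample arrives in every interval.
    local expired = bin - series.max_bins
    series.sum[expired] = nil
    series.samples[expired] = nil
  end
  series.sum[bin] = (series.sum[bin] or 0) + value
  series.samples[bin] = (series.samples[bin] or 0) + 1
end

local resolutions = {
  new_series(60 * 60, 60),      -- 1-minute bins, last hour
  new_series(60 * 60 * 60, 24), -- 1-hour bins, last day
}

-- Every sample is fed to every resolution.
local function record(value, tick)
  for _, series in ipairs(resolutions) do
    add_sample(series, value, tick)
  end
end
```

Memory use is then bounded by the sum of `max_bins` over all resolutions (here 60 + 24 intervals per tracked value), independent of play time.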

Post by eradicator »

Do you really need the data to be available for arbitrary timeframes as opposed to a timeframe that always ends "now" (like in vanilla)?

If you don't:
every second: table.insert(one_minute, 1, new_data); one_minute[61] = nil
every minute: table.insert(one_hour, 1, sum(one_minute)); one_hour[61] = nil
Rinse and repeat for every resolution you need.
table.insert(t, 1, value) shifts every existing element, so it should be replaced by a proper ring structure for better performance.

I.e. the older the data, the lower the resolution you store.
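
One possible shape for such a ring structure, sketched in Lua: a fixed-size table plus a head index, so adding a value overwrites the oldest slot instead of shifting anything. The helper names (`new_ring`, `push`, `newest_first`, `sum`, `every_second`) are hypothetical, and in a mod the rings would have to live in the mod's persisted data to survive save and load.

```lua
-- Hypothetical ring buffer: fixed size, overwrites the oldest entry, no shifting.
local function new_ring(size)
  return { size = size, head = 0, count = 0, values = {} }
end

local function push(ring, value)
  ring.head = ring.head % ring.size + 1 -- advance through 1..size, wrapping around
  ring.values[ring.head] = value
  if ring.count < ring.size then
    ring.count = ring.count + 1
  end
end

-- Returns the stored values as a new array, newest first.
local function newest_first(ring)
  local out = {}
  for i = 0, ring.count - 1 do
    out[i + 1] = ring.values[(ring.head - 1 - i) % ring.size + 1]
  end
  return out
end

local function sum(ring)
  local total = 0
  for i = 1, ring.count do
    total = total + ring.values[i]
  end
  return total
end

local one_minute = new_ring(60)
local one_hour = new_ring(60)
local seconds_seen = 0

-- Call once per second with the new reading.
local function every_second(new_data)
  push(one_minute, new_data)
  seconds_seen = seconds_seen + 1
  if seconds_seen % 60 == 0 then
    push(one_hour, sum(one_minute)) -- cascade into the coarser ring
  end
end
```

Each push is constant time, whereas inserting at index 1 moves every element already in the table.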
My code in the post above is dedicated to the public domain under CC0.

Post by eduran »

Thank you, some good ideas. So this is what I came up with:

local function new_flow(bin_length, bin_count)
  return {
    bin_length = bin_length,
    bin_count = bin_count,
    data = {},
    sample_count = {},
  }
end

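local function add(flow, key, amount)
  local timestamp = math.floor(game.tick / flow.bin_length)
  -- create both per-bin tables on first use; sample_count was never initialized before
  flow.data[timestamp] = flow.data[timestamp] or {}
  flow.sample_count[timestamp] = flow.sample_count[timestamp] or {}
  flow.data[timestamp][key] = (flow.data[timestamp][key] or 0) + amount
  flow.sample_count[timestamp][key] = (flow.sample_count[timestamp][key] or 0) + 1
end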

local function get_total_counts(flow, key)
  local timestamp = math.floor(game.tick / flow.bin_length)
  local offset = timestamp - flow.bin_count
  local total = {}
  for i = 1 + offset, flow.bin_count + offset do
    total[i - offset] = flow.data[i] and flow.data[i][key] or 0
  end
  return total
end

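local function get_average_counts(flow, key)
  local timestamp = math.floor(game.tick / flow.bin_length)
  local offset = timestamp - flow.bin_count
  local avg = {}
  for i = 1 + offset, flow.bin_count + offset do
    -- look up the per-key sum and per-key sample count; either may be missing
    local sum = flow.data[i] and flow.data[i][key]
    local samples = flow.sample_count[i] and flow.sample_count[i][key]
    -- bins without samples for this key report 0
    avg[i - offset] = (sum and samples) and (sum / samples) or 0
  end
  return avg
end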

-- to be called from an on_nth_tick handler, once every bin_length ticks:
local function remove_old_data(flow)
  local timestamp = math.floor(game.tick / flow.bin_length)
  flow.data[timestamp - flow.bin_count] = nil
  flow.sample_count[timestamp - flow.bin_count] = nil
end
Do you see any obvious flaws or room for improvement?
